Science.gov

Sample records for rna instability sequences

  1. The role of topoisomerase I in suppressing genome instability associated with a highly transcribed guanine-rich sequence is not restricted to preventing RNA:DNA hybrid accumulation

    PubMed Central

    Yadav, Puja; Owiti, Norah; Kim, Nayun

    2016-01-01

    Highly transcribed sequences containing guanine runs become unstable in Saccharomyces cerevisiae when topoisomerase I (Top1) is disrupted. Topological changes, such as the formation of extended RNA:DNA hybrids or R-loops, or of non-canonical DNA structures including G-quadruplexes, have been proposed as the major underlying cause of this transcription-linked genome instability. Here, we report that R-loop accumulation at a guanine-rich sequence, which is capable of assembling into the four-stranded G4 DNA structure, is dependent on the level and the orientation of transcription. In the absence of Top1 or RNase Hs, R-loops accumulated to a substantially higher extent when guanine runs were located on the non-transcribed strand. This coincides with the orientation in which higher genome instability was observed. However, we further report that there are significant differences between the disruption of RNase Hs and Top1 with regard to the orientation-specific elevation in genome instability at the guanine-rich sequence. Additionally, genome instability in Top1-deficient yeast is not completely suppressed by removal of negative supercoils and is further aggravated by expression of mutant Top1. Together, our data provide strong support for a function of Top1 in suppressing genome instability at the guanine-run-containing sequence that goes beyond preventing transcription-associated RNA:DNA hybrid formation. PMID:26527723

  2. RNA Sequencing in Schizophrenia

    PubMed Central

    Li, Xin; Teng, Shaolei

    2015-01-01

    Schizophrenia (SCZ) is a serious psychiatric disorder that affects 1% of the general population and places a heavy burden worldwide. The underlying genetic mechanism of SCZ remains unknown, but studies indicate that the disease is associated with a global disturbance of expression across many genes. Next-generation sequencing, particularly RNA sequencing (RNA-Seq), provides a powerful genome-scale technology to investigate the pathological processes of SCZ. RNA-Seq has been used to analyze gene expression and to identify novel splice isoforms and rare transcripts associated with SCZ. This paper provides an overview of the genetics of SCZ, the advantages of RNA-Seq for transcriptome analysis, the accomplishments of RNA-Seq in SCZ cohorts, and the applications of induced pluripotent stem cells and RNA-Seq in SCZ research. PMID:27053919

  3. Nuclear RNA Isolation and Sequencing.

    PubMed

    Dhaliwal, Navroop K; Mitchell, Jennifer A

    2016-01-01

    Most transcriptome studies involve sequencing and quantification of steady-state mRNA by isolating and sequencing poly(A) RNA. Although this type of sequencing data is informative for determining steady-state mRNA levels, it does not provide information on transcriptional output and thus may not always reflect changes in transcriptional regulation of gene expression. Furthermore, sequencing poly(A) RNA may miss transcribed regions of the genome that are not usually modified by polyadenylation, which include many long noncoding RNAs. Here, we describe nuclear-RNA sequencing (nucRNA-seq), which investigates the transcriptional landscape through sequencing and quantification of nuclear RNAs, comprising both unspliced and spliced transcripts of protein-coding genes as well as nuclear-retained long noncoding RNAs.

  4. AMPLIFICATION OF RIBOSOMAL RNA SEQUENCES

    EPA Science Inventory

    This book chapter offers an overview of the use of ribosomal RNA sequences. A history of the technology traces the evolution of techniques to measure bacterial phylogenetic relationships and recent advances in obtaining rRNA sequence information. The manual also describes procedu...

  5. CTLA-8, cloned from an activated T cell, bearing AU-rich messenger RNA instability sequences, and homologous to a herpesvirus Saimiri gene

    SciTech Connect

    Rouvier, E.; Luciani, M.F.; Golstein, P.; Mattei, M.G.; Denizot, F.

    1993-06-15

    To detect novel molecules involved in immune functions, a subtracted cDNA library between closely related murine lymphoid cells was prepared using improved technology. Differential screening of this library yielded several clones with a very restricted tissue specificity, including one that was named CTLA-8. CTLA-8 transcripts could be detected only in T cell hybridoma clones related to the one used to prepare the library. Southern blots showed that the CTLA-8 gene was single copy in mice, rats, and humans. By radioactive in situ hybridization, the CTLA-8 gene was mapped at a single site on mouse chromosome 1A and human chromosome 2q31, in a known interspecific syntenic region. The CTLA-8 cDNA sequence indicated the presence, in the 3'-untranslated region of the mRNA, of AU-rich repeats previously found in the mRNA of various cytokines, growth factors, and oncogenes. The CTLA-8 cDNA contained an open reading frame encoding a putative protein of 150 amino acids. This protein was 57% homologous to the putative protein encoded by the ORF13 gene of herpesvirus Saimiri, a T lymphotropic virus. These findings are discussed in the context of other genes of this herpesvirus homologous to known immunologically active molecules. More generally, CTLA-8 may belong to the growing set of virus-captured functionally important cellular genes related to the immune system or to cell death and cell survival. 69 refs., 5 figs.
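
    The abstract above does not say which AU-rich repeat was found in the CTLA-8 3'-untranslated region. As a minimal sketch, assuming the commonly cited AUUUA pentamer as the motif of interest, the following Python snippet locates overlapping pentamers in a UTR sequence. The function name and the example sequence are hypothetical and are not taken from the paper.

```python
import re


def find_au_rich_pentamers(utr: str) -> list[int]:
    """Return 0-based start positions of AUUUA pentamers (overlaps allowed).

    Assumption: AUUUA is used as the AU-rich motif; the abstract itself
    does not name a specific motif. DNA input (T) is converted to RNA (U).
    """
    rna = utr.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping motifs (AUUUAUUUA) each be reported.
    return [m.start() for m in re.finditer(r"(?=AUUUA)", rna)]


# Hypothetical 3'-UTR fragment, for illustration only.
example_utr = "GCAUUUAUUUAUUUAGC"
print(find_au_rich_pentamers(example_utr))
```

    Overlapping matches are reported deliberately, since tandem AU-rich repeats share their flanking A residues.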

  6. Deciphering the RNA landscape by RNAome sequencing.

    PubMed

    Derks, Kasper W J; Misovic, Branislav; van den Hout, Mirjam C G N; Kockx, Christel E M; Gomez, Cesar Payan; Brouwer, Rutger W W; Vrieling, Harry; Hoeijmakers, Jan H J; van IJcken, Wilfred F J; Pothof, Joris

    2015-01-01

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a single sequence run. Since current analysis pipelines cannot reliably analyze small and large RNAs simultaneously, we developed TRAP, Total Rna Analysis Pipeline, a robust interface that is also compatible with existing RNA sequencing protocols. RNAome sequencing quantitatively preserved all RNA classes, allowing cross-class comparisons that facilitate the identification of relationships between different RNA classes. We demonstrate the strength of RNAome sequencing in mouse embryonic stem cells treated with cisplatin. MicroRNA and mRNA expression in RNAome sequencing significantly correlated between replicates and was in concordance with both existing RNA sequencing methods and gene expression arrays generated from the same samples. Moreover, RNAome sequencing also detected additional RNA classes such as enhancer RNAs, anti-sense RNAs, novel RNA species and numerous differentially expressed RNAs undetectable by other methods. At the level of complete RNA classes, RNAome sequencing also identified a specific global repression of the microRNA and microRNA isoform classes after cisplatin treatment, whereas all other classes, such as mRNAs, were unchanged. These characteristics of RNAome sequencing will significantly improve expression analysis as well as studies on RNA biology not covered by existing methods. PMID:25826412

  8. RNAome sequencing delineates the complete RNA landscape.

    PubMed

    Derks, Kasper W J; Pothof, Joris

    2015-09-01

    Standard RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species. For example, small and large RNAs from the same sample cannot be sequenced in a single sequence run. We designed RNAome sequencing, which is a strand-specific method to determine the expression of small and large RNAs from ribosomal RNA-depleted total RNA in a single sequence run. RNAome sequencing quantitatively preserves all RNA classes. This characteristic allows comparisons between RNA classes, thereby facilitating the identification of relationships between different RNA classes. Here, we describe in detail the experimental procedure associated with RNAome sequencing published by Derks and colleagues in RNA Biology (2015) [1]. We also provide the R code for the developed Total Rna Analysis Pipeline (TRAP), an algorithm to analyze RNAome sequencing datasets (deposited at the Gene Expression Omnibus data repository, accession number GSE48084). PMID:26484291

  9. RNA sequencing: advances, challenges and opportunities

    PubMed Central

    Ozsolak, Fatih; Milos, Patrice M.

    2011-01-01

    In the few years since its initial application, massively parallel cDNA sequencing, or RNA-seq, has allowed many advances in the characterization and quantification of transcriptomes. Recently, several developments in RNA-seq methods have provided an even more complete characterization of RNA transcripts. These developments include improvements in transcription start site mapping, strand-specific measurements, gene fusion detection, small RNA characterization and detection of alternative splicing events. Ongoing developments promise further advances in the application of RNA-seq, particularly direct RNA sequencing and approaches that allow RNA quantification from very small amounts of cellular materials. PMID:21191423

  11. Mechanisms of genome instability induced by RNA processing defects

    PubMed Central

    Chan, Yujia A.; Hieter, Philip

    2014-01-01

    The role of normal transcription and RNA processing in maintaining genome integrity is becoming increasingly appreciated in organisms ranging from bacteria to humans. Several mutations in RNA biogenesis factors have been implicated in human cancers, but the mechanisms and potential connections to tumor genome instability are not clear. Here we discuss how RNA processing defects could destabilize genomes through mutagenic R-loop structures and by altering expression of genes required for genome stability. A compelling body of evidence now suggests that researchers should be directly testing these mechanisms in models of human cancer. PMID:24794811

  12. Mechanisms of genome instability induced by RNA-processing defects.

    PubMed

    Chan, Yujia A; Hieter, Philip; Stirling, Peter C

    2014-06-01

    The role of normal transcription and RNA processing in maintaining genome integrity is becoming increasingly appreciated in organisms ranging from bacteria to humans. Several mutations in RNA biogenesis factors have been implicated in human cancers, but the mechanisms and potential connections to tumor genome instability are not clear. Here, we discuss how RNA-processing defects could destabilize genomes through mutagenic R-loop structures and by altering expression of genes required for genome stability. A compelling body of evidence now suggests that researchers should be directly testing these mechanisms in models of human cancer. PMID:24794811

  13. Biases in small RNA deep sequencing data

    PubMed Central

    Raabe, Carsten A.; Tang, Thean-Hock; Brosius, Juergen; Rozhdestvensky, Timofey S.

    2014-01-01

    High-throughput RNA sequencing (RNA-seq) is considered a powerful tool for novel gene discovery and fine-tuned transcriptional profiling. The digital nature of RNA-seq is also believed to simplify meta-analysis and to reduce background noise associated with hybridization-based approaches. The development of multiplex sequencing enables efficient and economic parallel analysis of gene expression. In addition, RNA-seq is of particular value when low RNA expression or modest changes between samples are monitored. However, recent data uncovered severe bias in the sequencing of small non-protein coding RNA (small RNA-seq or sRNA-seq), such that the expression levels of some RNAs appeared to be artificially enhanced and others diminished or even undetectable. The use of different adapters and barcodes during ligation as well as complex RNA structures and modifications drastically influence cDNA synthesis efficacies and exemplify sources of bias in deep sequencing. In addition, variable specific RNA G/C-content is associated with unequal polymerase chain reaction amplification efficiencies. Given the central importance of RNA-seq to molecular biology and personalized medicine, we review recent findings that challenge small non-protein coding RNA-seq data and suggest approaches and precautions to overcome or minimize bias. PMID:24198247

  14. antaRNA: ant colony-based RNA sequence design

    PubMed Central

    Kleinkauf, Robert; Mann, Martin; Backofen, Rolf

    2015-01-01

    Motivation: RNA sequence design has been studied for at least as long as the classical folding problem. Whereas in the latter the functional fold of an RNA molecule is to be found, inverse folding tries to identify RNA sequences that fold into a function-specific target structure. In combination with RNA-based biotechnology and synthetic biology, reliable RNA sequence design becomes a crucial step to generate novel biochemical components. Results: In this article, the computational tool antaRNA is presented. It is capable of compiling RNA sequences for a given structure that additionally comply with an adjustable full-range objective GC-content distribution, specific sequence constraints and additional fuzzy structure constraints. antaRNA applies ant colony optimization meta-heuristics, and its superior performance is shown on biological datasets. Availability and implementation: http://www.bioinf.uni-freiburg.de/Software/antaRNA Contact: backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26023105

  15. Stem kernels for RNA sequence analyses.

    PubMed

    Sakakibara, Yasubumi; Popendorf, Kris; Ogawa, Nana; Asai, Kiyoshi; Sato, Kengo

    2007-10-01

    Several computational methods based on stochastic context-free grammars have been developed for modeling and analyzing functional RNA sequences. These grammatical methods have succeeded in modeling typical secondary structures of RNA, and are used for structural alignment of RNA sequences. However, such stochastic models cannot sufficiently discriminate member sequences of an RNA family from nonmembers and hence detect noncoding RNA regions from genome sequences. A novel kernel function, stem kernel, for the discrimination and detection of functional RNA sequences using support vector machines (SVMs) is proposed. The stem kernel is a natural extension of the string kernel, specifically the all-subsequences kernel, and is tailored to measure the similarity of two RNA sequences from the viewpoint of secondary structures. The stem kernel examines all possible common base pairs and stem structures of arbitrary lengths, including pseudoknots between two RNA sequences, and calculates the inner product of common stem structure counts. An efficient algorithm is developed to calculate the stem kernels based on dynamic programming. The stem kernels are then applied to discriminate members of an RNA family from nonmembers using SVMs. The study indicates that the discrimination ability of the stem kernel is strong compared with conventional methods. Furthermore, the potential application of the stem kernel is demonstrated by the detection of remotely homologous RNA families in terms of secondary structures. This is because the string kernel is proven to work for the remote homology detection of protein sequences. These experimental results have convinced us to apply the stem kernel in order to find novel RNA families from genome sequences. PMID:17933013
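
    The abstract above describes the stem kernel as an extension of the all-subsequences string kernel, computed by dynamic programming. As a rough illustration of that starting point only (this is the plain all-subsequences kernel, not the stem kernel itself, which additionally accounts for base pairs and stems), here is a minimal Python sketch. The function name is hypothetical.

```python
def all_subsequences_kernel(s: str, t: str) -> int:
    """Count pairs of (possibly non-contiguous) matching subsequences of s and t.

    This is the classic all-subsequences string kernel; the empty
    subsequence is counted, so the result is always at least 1.
    """
    n, m = len(s), len(t)
    # k[i][j] holds the kernel value for the prefixes s[:i] and t[:j].
    # Any prefix paired with an empty string shares only the empty subsequence.
    k = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Inclusion-exclusion: pairs that do not use both s[i-1] and t[j-1].
            k[i][j] = k[i - 1][j] + k[i][j - 1] - k[i - 1][j - 1]
            # Pairs that end with s[i-1] matched to t[j-1].
            if s[i - 1] == t[j - 1]:
                k[i][j] += k[i - 1][j - 1]
    return k[n][m]


print(all_subsequences_kernel("GCAU", "GCAU"))
```

    The table has (n + 1) x (m + 1) entries, so the cost is proportional to the product of the two sequence lengths. A kernel value of this kind could then be supplied to an SVM as a precomputed similarity matrix.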

  16. RNA Polymerase Backtracking in Gene Regulation and Genome Instability

    PubMed Central

    Nudler, Evgeny

    2013-01-01

    RNA polymerase is a ratchet machine that oscillates between productive and backtracked states at numerous DNA positions. The amount of backtracking (reversible sliding of the enzyme along DNA and RNA) varies from one to many nucleotides. Since its first description 15 years ago, backtracking has been implicated in a plethora of critical processes in bacteria and eukaryotic cells. Here we review the most fundamental roles of this phenomenon in controlling transcription elongation, pausing, termination, fidelity, and genome instability. We also discuss recent progress in understanding the structural and mechanistic properties of the backtracking process. PMID:22726433

  17. Experimental investigation of an RNA sequence space

    NASA Technical Reports Server (NTRS)

    Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

    1993-01-01

    Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.

  18. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    PubMed

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  19. Discovering New Biology through Sequencing of RNA.

    PubMed

    Weber, Andreas P M

    2015-11-01

    Sequencing of RNA (RNA-Seq) was invented approximately one decade ago and has since revolutionized biological research. This update provides a brief historic perspective on the development of RNA-Seq and then focuses on the application of RNA-Seq in qualitative and quantitative analyses of transcriptomes. Particular emphasis is given to aspects of data analysis. Since the wet-lab and data analysis aspects of RNA-Seq are still rapidly evolving and novel applications are continuously reported, a printed review will be rapidly outdated and can only serve to provide some examples and general guidelines for planning and conducting RNA-Seq studies. Hence, selected references to frequently updated online resources are given.

  20. Alternative applications for distinct RNA sequencing strategies

    PubMed Central

    Han, Leng; Vickers, Kasey C.; Samuels, David C.

    2015-01-01

    Recent advances in RNA library preparation methods, platform accessibility and cost efficiency have allowed high-throughput RNA sequencing (RNAseq) to replace conventional hybridization microarray platforms as the method of choice for mRNA profiling and transcriptome analyses. RNAseq is a powerful technique to profile both long and short RNA expression, and the depth of information gained from distinct RNAseq methods is striking and facilitates discovery. In addition to expression analysis, distinct RNAseq approaches also allow investigators to assess transcriptional elongation, DNA variance and exogenous RNA content. Here we review the current state of the art in transcriptome sequencing and address epigenetic regulation, quantification of transcription activation, RNAseq output and a diverse set of applications for RNAseq data. We detail how RNAseq can be used to identify allele-specific expression, single-nucleotide polymorphisms and somatic mutations and discuss the benefits and limitations of using RNAseq to monitor DNA characteristics. Moreover, we highlight the power of combining RNA- and DNAseq methods for genomic analysis. In summary, RNAseq provides the opportunity to gain greater insight into transcriptional regulation and output than simply miRNA and mRNA profiling. PMID:25246237

  1. Sequence fingerprints of microRNA conservation.

    PubMed

    Shi, Bing; Gao, Wei; Wang, Juan

    2012-01-01

    It is known that the conservation of protein-coding genes is associated with their sequences in various species, such as animals and plants. However, the association between microRNA (miRNA) conservation and their sequences in various species remains unexplored. Here we report the association of miRNA conservation with its sequence features, such as base content and cleavage sites, suggesting that miRNA sequences contain the fingerprints for miRNA conservation. More interestingly, different species show different and even opposite patterns between miRNA conservation and sequence features. For example, mammalian miRNAs show a positive/negative correlation between conservation and AU/GC content, whereas plant miRNAs show a negative/positive correlation between conservation and AU/GC content. Further analysis puts forward the hypothesis that the introns of protein-coding genes may be a main driving force for the origin and evolution of mammalian miRNAs. At the 5' end, conserved miRNAs have a preference for base U, while less-conserved miRNAs have a preference for a non-U base in mammals. This difference does not exist in insects and plants, in which both conserved miRNAs and less-conserved miRNAs have a preference for base U at the 5' end. We further revealed that the non-U preference at the 5' end of less-conserved mammalian miRNAs is associated with miRNA function diversity, which may have evolved from the pressure of a highly sophisticated environmental stimulus the mammals encountered during evolution. These results indicate that miRNA sequences contain the fingerprints for conservation, and that these fingerprints vary according to species. More importantly, the results suggest that although species share common mechanisms by which miRNAs originate and evolve, mammals may have developed a novel mechanism for miRNA origin and evolution. In addition, the fingerprint found in this study can be a predictor of miRNA conservation, and the findings are helpful in achieving a

  2. Transcriptional profiling of Dictyostelium with RNA sequencing

    PubMed Central

    Miranda, Edward Roshan; Rot, Gregor; Toplak, Marko; Santhanam, Balaji; Curk, Tomaz; Shaulsky, Gad; Zupan, Blaz

    2014-01-01

    Summary: Transcriptional profiling methods have been utilized in the analysis of various biological processes in Dictyostelium. Recent advances in high-throughput sequencing have increased the resolution and the dynamic range of transcriptional profiling. Here we describe the utility of RNA-sequencing with the Illumina technology for production of transcriptional profiles. We also describe methods for data mapping and storage as well as common and specialized tools for data analysis, both online and offline. PMID:23494306

  3. Compilation of tRNA sequences.

    PubMed

    Sprinzl, M; Grueter, F; Spelzhaus, A; Gauss, D H

    1980-01-11

    This compilation presents in a small space the tRNA sequences so far published. The numbering of tRNAPhe from yeast is used following the rules proposed by the participants of the Cold Spring Harbor Meeting on tRNA 1978 (1,2; Fig. 1). This numbering allows comparisons with the three dimensional structure of tRNAPhe. The secondary structure of tRNAs is indicated by specific underlining. In the primary structure a nucleoside followed by a nucleoside in brackets or a modification in brackets denotes that both types of nucleosides can occupy this position. Part of a sequence in brackets designates a piece of sequence not unambiguously analyzed. Rare nucleosides are named according to the IUPAC-IUB rules (for complicated rare nucleosides and their identification see Table 1); those with lengthy names are given with the prefix x and specified in the footnotes. Footnotes are numbered according to the coordinates of the corresponding nucleoside and are indicated in the sequence by an asterisk. The references are restricted to the citation of the latest publication in those cases where several papers deal with one sequence. For additional information the reader is referred either to the original literature or to other tRNA sequence compilations (3-7). Mutant tRNAs are dealt with in a compilation by J. Celis (8). The compilers would welcome any information by the readers regarding missing material or erroneous presentation. On the basis of this numbering system computer printed compilations of tRNA sequences in a linear form and in cloverleaf form are in preparation. PMID:6986608

  4. Compilation of tRNA sequences.

    PubMed

    Gauss, D H; Grüter, F; Sprinzl, M

    1979-01-01

    This compilation presents in a small space the tRNA sequences so far published in order to enable rapid orientation and comparison. The numbering of tRNAPhe from yeast is used as has been done earlier (1) but following the rules proposed by the participants of the Cold Spring Harbor Meeting on tRNA 1978 (2) (Fig. 1). This numbering allows comparisons with the three dimensional structure of tRNAPhe, the only structure known from X-ray analysis. The secondary structure of tRNAs is indicated by specific underlining. In the primary structure a nucleoside followed by a nucleoside in brackets or a modification in brackets denotes that both types of nucleosides can occupy this position. Part of a sequence in brackets designates a piece of sequence not unambiguously analyzed. Rare nucleosides are named according to the IUPAC-IUB rules (for some more complicated rare nucleosides and their identification see Table 1); those with lengthy names are given with the prefix x and specified in the footnotes. Footnotes are numbered according to the coordinates of the corresponding nucleoside and are indicated in the sequence by an asterisk. The references are restricted to the citation of the latest publication in those cases where several papers deal with one sequence. For additional information the reader is referred either to the original literature or to other tRNA sequence compilations (3-7). Mutant tRNAs are dealt with in a separate compilation prepared by J. Celis (see below). The compilers would welcome any information by the readers regarding missing material or erroneous presentation. On the basis of this numbering system computer printed compilations of tRNA sequences in a linear form and in cloverleaf form are in preparation. PMID:424282

  5. Dis3- and exosome subunit-responsive 3′ mRNA instability elements

    SciTech Connect

    Kiss, Daniel L.; Hou, Dezhi; Gross, Robert H.; Andrulis, Erik D.

    2012-07-06

    Highlights: ► Successful use of a novel RNA-specific bioinformatic tool, RNA SCOPE. ► Identified novel 3′ UTR cis-acting element that destabilizes a reporter mRNA. ► Show exosome subunits are required for cis-acting element-mediated mRNA instability. ► Define precise sequence requirements of novel cis-acting element. ► Show that microarray-defined exosome subunit-regulated mRNAs have novel element. Abstract: Eukaryotic RNA turnover is regulated in part by the exosome, a nuclear and cytoplasmic complex of ribonucleases (RNases) and RNA-binding proteins. The major RNase of the complex is thought to be Dis3, a multi-functional 3′-5′ exoribonuclease and endoribonuclease. Although it is known that Dis3 and core exosome subunits are recruited to transcriptionally active genes and to messenger RNA (mRNA) substrates, this recruitment is thought to occur indirectly. We sought to discover cis-acting elements that recruit Dis3 or other exosome subunits. Using a bioinformatic tool called RNA SCOPE to screen the 3′ untranslated regions of up-regulated transcripts from our published Dis3 depletion-derived transcriptomic data set, we identified several motifs as candidate instability elements. Secondary screening using a luciferase reporter system revealed that one cassette, harboring four elements, destabilized the reporter transcript. RNAi-based depletion of Dis3, Rrp6, Rrp4, Rrp40, or Rrp46 diminished the efficacy of cassette-mediated destabilization. Truncation analysis of the cassette showed that two exosome subunit-sensitive elements (ESSEs) destabilized the reporter. Point-directed mutagenesis of ESSE abrogated the destabilization effect. An examination of the transcriptomic data from exosome subunit depletion-based microarrays revealed that mRNAs with ESSEs are found in every up-regulated mRNA data set but are

  6. Ribosomal RNA sequence suggests microsporidia are extremely ancient eukaryotes

    NASA Technical Reports Server (NTRS)

    Vossbrinck, C. R.; Maddox, J. V.; Friedman, S.; Debrunner-Vossbrinck, B. A.; Woese, C. R.

    1987-01-01

    A comparative sequence analysis of the 18S small subunit ribosomal RNA (rRNA) of the microsporidium Vairimorpha necatrix is presented. The results show that this rRNA sequence is more unlike those of other eukaryotes than any known eukaryote rRNA sequence. It is concluded that the lineage leading to microsporidia branched very early from that leading to other eukaryotes.

  7. RNA Target Sequences Promote Spreading of RNA Silencing

    PubMed Central

    Van Houdt, Helena; Bleys, Annick; Depicker, Anna

    2003-01-01

    It is generally recognized that a silencing-inducing locus can efficiently reduce the expression of genes that give rise to transcripts partially homologous to those produced by the silencing-inducing locus (primary targets). Interestingly, the expression of genes that produce transcripts without homology to the silencing-inducing locus (secondary targets) can also be decreased dramatically via transitive RNA silencing. This phenomenon requires primary target RNAs that contain sequences homologous to secondary target RNAs. Sequences upstream from the region homologous to the silencing inducer in the primary target transcripts give rise to approximately 22-nucleotide small RNAs, coinciding with the region homologous to the secondary target. The presence of these small RNAs corresponds with reduced expression of the secondary target whose transcripts are not homologous to the silencing inducer. The data suggest that in transgenic plants, targets of RNA silencing are involved in the expansion of the pool of functional small interfering RNAs. Furthermore, methylation of target genes in sequences without homology to the initial silencing inducer indicates not only that RNA silencing can expand across target RNAs but also that methylation can spread along target genes. PMID:12529532

  8. Advanced Applications of RNA Sequencing and Challenges.

    PubMed

    Han, Yixing; Gao, Shouguo; Muegge, Kathrin; Zhang, Wei; Zhou, Bing

    2015-01-01

    Next-generation sequencing technologies have revolutionarily advanced sequence-based research with the advantages of high-throughput, high-sensitivity, and high-speed. RNA-seq is now being used widely for uncovering multiple facets of the transcriptome to facilitate biological applications. However, the large-scale data analyses associated with RNA-seq harbor challenges. In this study, we present a detailed overview of the applications of this technology and the challenges that need to be addressed, including data preprocessing, differential gene expression analysis, alternative splicing analysis, variant detection and allele-specific expression, pathway analysis, co-expression network analysis, and applications combining various experimental procedures beyond the achievements that have been made. Specifically, we discuss essential principles of computational methods that are required to meet the key challenges of the RNA-seq data analyses, development of various bioinformatics tools, challenges associated with the RNA-seq applications, and examples that represent the advances made so far in the characterization of the transcriptome.

  9. Advanced Applications of RNA Sequencing and Challenges

    PubMed Central

    Han, Yixing; Gao, Shouguo; Muegge, Kathrin; Zhang, Wei; Zhou, Bing

    2015-01-01

    Next-generation sequencing technologies have revolutionarily advanced sequence-based research with the advantages of high-throughput, high-sensitivity, and high-speed. RNA-seq is now being used widely for uncovering multiple facets of the transcriptome to facilitate biological applications. However, the large-scale data analyses associated with RNA-seq harbor challenges. In this study, we present a detailed overview of the applications of this technology and the challenges that need to be addressed, including data preprocessing, differential gene expression analysis, alternative splicing analysis, variant detection and allele-specific expression, pathway analysis, co-expression network analysis, and applications combining various experimental procedures beyond the achievements that have been made. Specifically, we discuss essential principles of computational methods that are required to meet the key challenges of the RNA-seq data analyses, development of various bioinformatics tools, challenges associated with the RNA-seq applications, and examples that represent the advances made so far in the characterization of the transcriptome. PMID:26609224

  10. Equilibrium Sequences and Gravitational Instability of Rotating Isothermal Rings

    NASA Astrophysics Data System (ADS)

    Kim, Woong-Tae; Moon, Sanghyuk

    2016-09-01

    Nuclear rings at the centers of barred galaxies exhibit strong star formation activities. They are thought to undergo gravitational instability when they are sufficiently massive. We approximate them as rigidly rotating isothermal objects and investigate their gravitational instability. Using a self-consistent field method, we first construct their equilibrium sequences specified by two parameters: α corresponding to the thermal energy relative to gravitational potential energy, and $\widehat{R}_B$ measuring the ellipticity or ring thickness. Unlike in the incompressible case, not all values of $\widehat{R}_B$ yield an isothermal equilibrium, and the range of $\widehat{R}_B$ for such equilibria shrinks with decreasing α. The density distributions in the meridional plane are steeper for smaller α, and well approximated by those of infinite cylinders for slender rings. We also calculate the dispersion relations of non-axisymmetric modes in rigidly rotating slender rings with angular frequency $\Omega_0$ and central density $\rho_c$. Rings with smaller α are found more unstable with a larger unstable range of the azimuthal mode number. The instability is completely suppressed by rotation when $\Omega_0$ exceeds the critical value. The critical angular frequency is found to be almost constant at $\sim 0.7\,(G\rho_c)^{1/2}$ for α ≳ 0.01 and increases rapidly for smaller α. We apply our results to a sample of observed star-forming rings and confirm that rings without a noticeable azimuthal age gradient of young star clusters are indeed gravitationally unstable.

  11. Insertion of a Telomere Repeat Sequence into a Mammalian Gene Causes Chromosome Instability

    PubMed Central

    Kilburn, April E.; Shea, Martin J.; Sargent, R. Geoffrey; Wilson, John H.

    2001-01-01

    Telomere repeat sequences cap the ends of eucaryotic chromosomes and help stabilize them. At interstitial sites, however, they may destabilize chromosomes, as suggested by cytogenetic studies in mammalian cells that correlate interstitial telomere sequence with sites of spontaneous and radiation-induced chromosome rearrangements. In no instance is the length, purity, or orientation of the telomere repeats at these potentially destabilizing interstitial sites known. To determine the effects of a defined interstitial telomere sequence on chromosome instability, as well as other aspects of DNA metabolism, we deposited 800 bp of the functional vertebrate telomere repeat, TTAGGG, in two orientations in the second intron of the adenosine phosphoribosyltransferase (APRT) gene in Chinese hamster ovary cells. In one orientation, the deposited telomere sequence did not interfere with expression of the APRT gene, whereas in the other it reduced mRNA levels slightly. The telomere sequence did not induce chromosome truncation and the seeding of a new telomere at a frequency above the limits of detection. Similarly, the telomere sequence did not alter the rate or distribution of homologous recombination events. The interstitial telomere repeat sequence in both orientations, however, dramatically increased gene rearrangements some 30-fold. Analysis of individual rearrangements confirmed the involvement of the telomere sequence. These studies define the telomere repeat sequence as a destabilizing element in the interior of chromosomes in mammalian cells. PMID:11113187

  12. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-03-19

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data.

  13. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  14. Sequencing Degraded RNA Addressed by 3' Tag Counting

    PubMed Central

    Sigurgeirsson, Benjamín; Emanuelsson, Olof; Lundeberg, Joakim

    2014-01-01

    RNA sequencing has become widely used in gene expression profiling experiments. Prior to any RNA sequencing experiment the quality of the RNA must be measured to assess whether or not it can be used for further downstream analysis. The RNA integrity number (RIN) is a scale used to measure the quality of RNA that runs from 1 (completely degraded) to 10 (intact). Ideally, samples with high RIN (8) are used in RNA sequencing experiments. RNA, however, is a fragile molecule which is susceptible to degradation and obtaining high quality RNA is often hard, or even impossible when extracting RNA from certain clinical tissues. Thus, occasionally, working with low quality RNA is the only option the researcher has. Here we investigate the effects of RIN on RNA sequencing and suggest a computational method to handle data from samples with low quality RNA which also enables reanalysis of published datasets. Using RNA from a human cell line we generated and sequenced samples with varying RINs and illustrate what effect the RIN has on the basic procedure of RNA sequencing; both quality aspects and differential expression. We show that the RIN has systematic effects on gene coverage, false positives in differential expression and the quantification of duplicate reads. We introduce 3' tag counting (3TC) as a computational approach to reliably estimate differential expression for samples with low RIN. We show that using the 3TC method in differential expression analysis significantly reduces false positives when comparing samples with different RIN, while retaining reasonable sensitivity. PMID:24632678
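
    The idea of counting only tags near the 3' end can be illustrated with a minimal sketch. The abstract does not give the details of 3TC, so everything here is an assumption made for illustration: reads are represented as (transcript id, start position along the transcript), and the `window` size of 250 bases is a hypothetical parameter, not a value from the paper.

```python
from collections import Counter

def three_prime_tag_counts(reads, transcript_lengths, window=250):
    """Count reads that start within the last `window` bases of each transcript.

    reads: iterable of (transcript_id, start) pairs, where start is the
           0-based read start measured from the transcript's 5' end.
    transcript_lengths: dict mapping transcript_id -> transcript length.
    window: hypothetical size of the 3' region that is counted.
    """
    counts = Counter()
    for transcript_id, start in reads:
        length = transcript_lengths.get(transcript_id)
        if length is None:
            continue  # read maps to a transcript we have no length for
        # Transcripts shorter than the window are counted over their full length.
        if start >= max(0, length - window):
            counts[transcript_id] += 1
    return counts

example_reads = [("A", 10), ("A", 900), ("A", 990), ("B", 50)]
example_lengths = {"A": 1000, "B": 100}
example_counts = three_prime_tag_counts(example_reads, example_lengths)
```

    Restricting the count to a fixed 3' window is one plausible way to make samples with different degradation levels more comparable, since reads lost from the 5' portion of degraded transcripts no longer affect the count.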

  15. RNA-sequencing from single nuclei.

    PubMed

    Grindberg, Rashel V; Yee-Greenbaum, Joyclyn L; McConnell, Michael J; Novotny, Mark; O'Shaughnessy, Andy L; Lambert, Georgina M; Araúzo-Bravo, Marcos J; Lee, Jun; Fishman, Max; Robbins, Gillian E; Lin, Xiaoying; Venepally, Pratap; Badger, Jonathan H; Galbraith, David W; Gage, Fred H; Lasken, Roger S

    2013-12-01

    It has recently been established that synthesis of double-stranded cDNA can be done from a single cell for use in DNA sequencing. Global gene expression can be quantified from the number of reads mapping to each gene, and mutations and mRNA splicing variants determined from the sequence reads. Here we demonstrate that this method of transcriptomic analysis can be done using the extremely low levels of mRNA in a single nucleus, isolated from a mouse neural progenitor cell line and from dissected hippocampal tissue. This method is characterized by excellent coverage and technical reproducibility. On average, more than 16,000 of the 24,057 mouse protein-coding genes were detected from single nuclei, and the amount of gene-expression variation was similar when measured between single nuclei and single cells. Several major advantages of the method exist: first, nuclei, compared with whole cells, have the advantage of being easily isolated from complex tissues and organs, such as those in the CNS. Second, the method can be widely applied to eukaryotic species, including those of different kingdoms. The method also provides insight into regulatory mechanisms specific to the nucleus. Finally, the method enables dissection of regulatory events at the single-cell level; pooling of 10 nuclei or 10 cells obscures some of the variability measured in transcript levels, implying that single nuclei and cells will be extremely useful in revealing the physiological state and interconnectedness of gene regulation in a manner that avoids the masking inherent to conventional transcriptomics using bulk cells or tissues.

  16. Sequence of Echinochloa hoja blanca tenuivirus RNA-4.

    PubMed

    de Miranda, J R; Muñoz, M; Wu, R; Espinoza, A M

    1996-01-01

    The sequence is presented of RNA-4 of Echinochloa hoja blanca tenuivirus (EHBV), one of two tenuiviruses associated with rice cultivation in Latin America (together with rice hoja blanca virus; RHBV). Analysis of the sequence shows that the coding regions of EHBV RNA-4 are closely related to those of RHBV RNA-4. However, the intergenic regions separating the two ambisense open reading frames are highly distinct for the two viruses. The features of the RNA and the comparisons with the sequences of RNA-4 of RHBV, rice stripe virus (RStV) and maize stripe virus (MStV) are discussed. PMID:8938980

  17. Sequence complementarity-driven nonenzymatic ligation of RNA.

    PubMed

    Pino, Samanta; Costanzo, Giovanna; Giorgi, Alessandra; Di Mauro, Ernesto

    2011-04-12

    We report two reactions of RNA G:C sequences occurring nonenzymatically in water in the absence of any added cofactor or metal ion: (a) sequence complementarity-driven terminal ligation and (b) complementary sequence adaptor-driven multiple tandemization. The two abiotic reactions increase the chemical complexity of the resulting pool of RNA molecules and change the Shannon information of the initial population of sequences.

  18. Evolution of tRNA-like sequences and genome variability.

    PubMed

    Frenkel, Felix E; Chaley, Maria B; Korotkov, Eugene V; Skryabin, Konstantin G

    2004-06-23

    Transfer RNA (tRNA)-like sequences were searched for in the nine basic taxonomic divisions of GenBank-121 (viruses, phages, bacteria, plants, invertebrates, vertebrates, rodents, mammals, and primates) by an original program package implementing a dynamic profile alignment approach for the analysis of genetic texts, using 22 profiles of tRNAs of different isotypes. In total, 175,901 previously unknown tRNA-like sequences were revealed. The locations of the tRNA-likes were considered over the regions whose functional meaning is described by standard Feature Keys in GenBank. Many regions containing the tRNA-like sequences were recognized as known repeats. A mode of distribution of the tRNA-like sequences in a genome was proposed as expansion in a content of the various transposable elements. An analysis of the integrity of RNA polymerase III inner promoters in the tRNA-like sequences over the GenBank divisions has shown a high possibility of generating new copies of short interspersed nuclear element (SINE) repeats in all divisions, excepting primates. The numerous tRNA-likes found in the regions of RNA polymerase II promoters have suggested an adaptation of RNA polymerase III promoter to a binding of RNA polymerase II. PMID:15194190

  19. Empirical insights into the stochasticity of small RNA sequencing.

    PubMed

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-01-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic nature of exponential PCR amplification. We validated our findings with two independent datasets, one for microRNA sequencing and another for RNA sequencing. Motivated by the gamma distributed stochasticity, we provided a simple method for the analysis of RNA sequencing data and showed its superiority to three existing methods for differential expression analysis using three data examples of technical replicate data and biological replicate data. PMID:27052356
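
    The contrast between Poisson and gamma noise can be made concrete with a short sketch. This is not the authors' method, which the abstract does not describe; it only shows two standard quantities one might compute on technical replicates: the variance-to-mean ratio (equal to 1 in expectation for Poisson data) and a method-of-moments gamma fit. The function names are illustrative.

```python
import statistics

def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 indicate overdispersion
    relative to a Poisson model."""
    mean = statistics.fmean(counts)
    if mean <= 0:
        raise ValueError("mean count must be positive")
    return statistics.variance(counts) / mean

def gamma_method_of_moments(counts):
    """Fit gamma shape and scale by matching the sample mean and variance."""
    mean = statistics.fmean(counts)
    var = statistics.variance(counts)
    if mean <= 0 or var <= 0:
        raise ValueError("need positive mean and variance")
    shape = mean ** 2 / var   # k = mean^2 / variance
    scale = var / mean        # theta = variance / mean
    return shape, scale

replicate_counts = [10, 20, 30]  # toy technical-replicate counts
index = dispersion_index(replicate_counts)
shape, scale = gamma_method_of_moments(replicate_counts)
```

    For a gamma fit obtained this way, shape times scale reproduces the sample mean, and the scale equals the dispersion index, so a scale far from 1 is a quick signal that a Poisson assumption may be too restrictive.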

  20. Empirical insights into the stochasticity of small RNA sequencing

    NASA Astrophysics Data System (ADS)

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-04-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic nature of exponential PCR amplification. We validated our findings with two independent datasets, one for microRNA sequencing and another for RNA sequencing. Motivated by the gamma distributed stochasticity, we provided a simple method for the analysis of RNA sequencing data and showed its superiority to three existing methods for differential expression analysis using three data examples of technical replicate data and biological replicate data.

  1. Molecular Evolution of Multi-subunit RNA Polymerases: Sequence Analysis

    PubMed Central

    Lane, William J.; Darst, Seth A.

    2009-01-01

    Transcription in all cellular organisms is performed by multi-subunit, DNA-dependent RNA polymerases that synthesize RNA from DNA templates. Previous sequence and structural studies have elucidated the importance of shared regions common to all multi-subunit RNA polymerases. In addition RNA polymerases contain multiple lineage-specific domain insertions involved in protein-protein and protein-nucleic acid interactions. We have created comprehensive multiple sequence alignments using all available sequence data for the multi-subunit RNA polymerase large subunits, including the bacterial β and β′ subunits and their homologues from archaebacterial RNA polymerases, the eukaryotic RNA polymerases I, II, and III, the nuclear-cytoplasmic large double-stranded DNA Virus RNA polymerases, and plant plastid RNA polymerases. In order to overcome technical difficulties inherent to the large subunit sequences, including large sequence length, small and large lineage-specific insertions, split subunits, and fused proteins, we created an automated and customizable sequence retrieval and processing system. In addition, we used our alignments to create a more expansive set of shared sequence regions and bacterial lineage-specific domain insertions. We also analyzed the intergenic gap between the bacterial β and β′ genes. PMID:19895820

  2. Genome sequence-independent identification of RNA editing sites.

    PubMed

    Zhang, Qing; Xiao, Xinshu

    2015-04-01

    RNA editing generates post-transcriptional sequence changes that can be deduced from RNA-seq data, but detection typically requires matched genomic sequence or multiple related expression data sets. We developed the GIREMI tool (genome-independent identification of RNA editing by mutual information; https://www.ibp.ucla.edu/research/xiao/GIREMI.html) to predict adenosine-to-inosine editing accurately and sensitively from a single RNA-seq data set of modest sequencing depth. Using GIREMI on existing data, we observed tissue-specific and evolutionary patterns in editing sites in the human population.
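
    As a rough illustration of the quantity in the tool's name, the sketch below computes the mutual information between the alleles observed at two variant sites on the same reads. It is not GIREMI's implementation, and the input representation (one allele pair per read covering both sites) is an assumption made for this example.

```python
import math
from collections import Counter

def mutual_information(allele_pairs):
    """Mutual information (in bits) between alleles at two sites.

    allele_pairs: iterable of (allele_at_site_1, allele_at_site_2),
                  one pair per read that covers both sites.
    """
    pairs = list(allele_pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one read covering both sites")
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to c * n / (left[a] * right[b])
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi

# Alleles that always travel together on a read: maximal dependence.
linked = [("A", "C")] * 10 + [("G", "T")] * 10
# Alleles that combine freely: no dependence.
unlinked = [("A", "C"), ("A", "T"), ("G", "C"), ("G", "T")] * 5
```

    In this toy input, the linked pair carries 1 bit of mutual information and the unlinked pair carries none; a pattern of this kind is what could distinguish genomically encoded variants from sites that vary independently of their neighbours.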

  3. Novel Approach to Analyzing MFE of Noncoding RNA Sequences

    PubMed Central

    George, Tina P.; Thomas, Tessamma

    2016-01-01

    Genomic studies have become noncoding RNA (ncRNA) centric after the study of different genomes provided enormous information on ncRNA over the past decades. The function of ncRNA is decided by its secondary structure, and across organisms, the secondary structure is more conserved than the sequence itself. In this study, the optimal secondary structure or the minimum free energy (MFE) structure of ncRNA was found based on the thermodynamic nearest neighbor model. MFE of over 2600 ncRNA sequences was analyzed in view of its signal properties. Mathematical models linking MFE to the signal properties were found for each of the four classes of ncRNA analyzed. MFE values computed with the proposed models were in concordance with those obtained with the standard web servers. A total of 95% of the sequences analyzed had deviation of MFE values within ±15% relative to those obtained from standard web servers. PMID:27695341

  4. Next-Generation Sequencing RNA-Seq Library Construction.

    PubMed

    Podnar, Jessica; Deiderick, Heather; Huerta, Gabriella; Hunicke-Smith, Scott

    2014-01-01

    This unit presents protocols for construction of next-generation sequencing (NGS) directional RNA sequencing libraries for the Illumina HiSeq and MiSeq from a wide variety of input RNA sources. The protocols are based on the New England Biolabs (NEB) small RNA library preparation set for Illumina, although similar kits exist from different vendors. The protocol preserves the orientation of the original RNA in the final sequencing library, enabling strand-specific analysis of the resulting data. These libraries have been used for differential gene expression analysis and small RNA discovery and are currently being tested for de novo transcriptome assembly. The protocol is robust and applicable to a broad range of RNA input types and RNA quality, making it ideal for high-throughput laboratories.

  5. Novel Approach to Analyzing MFE of Noncoding RNA Sequences

    PubMed Central

    George, Tina P.; Thomas, Tessamma

    2016-01-01

    Genomic studies have become noncoding RNA (ncRNA) centric after the study of different genomes provided enormous information on ncRNA over the past decades. The function of ncRNA is decided by its secondary structure, and across organisms, the secondary structure is more conserved than the sequence itself. In this study, the optimal secondary structure or the minimum free energy (MFE) structure of ncRNA was found based on the thermodynamic nearest neighbor model. MFE of over 2600 ncRNA sequences was analyzed in view of its signal properties. Mathematical models linking MFE to the signal properties were found for each of the four classes of ncRNA analyzed. MFE values computed with the proposed models were in concordance with those obtained with the standard web servers. A total of 95% of the sequences analyzed had deviation of MFE values within ±15% relative to those obtained from standard web servers.

  6. Tristetraprolin recruits functional mRNA decay complexes to ARE sequences.

    PubMed

    Hau, Heidi H; Walsh, Richard J; Ogilvie, Rachel L; Williams, Darlisha A; Reilly, Cavan S; Bohjanen, Paul R

    2007-04-15

    AU-rich elements (AREs) in the 3' untranslated region (UTR) of numerous mammalian transcripts function as instability elements that promote rapid mRNA degradation. Tristetraprolin (TTP) is an ARE-binding protein that promotes rapid mRNA decay through mechanisms that are poorly understood. A 31-nucleotide ARE sequence from the TNF-alpha 3' UTR promoted TTP-dependent mRNA decay when it was inserted into the 3' UTR of a beta-globin reporter transcript, indicating that this short sequence was sufficient for TTP function. We used a gel shift assay to identify a TTP-containing complex in cytoplasmic extracts from TTP-transfected HeLa cells that bound specifically to short ARE sequences. This TTP-containing complex also contained the 5'-3' exonuclease Xrn1 and the exosome component PM-scl75 because it was super-shifted with anti-Xrn1 or anti-PM-scl75 antibodies. RNA affinity purification verified that these proteins associated specifically with ARE sequences in a TTP-dependent manner. Using a competition binding assay, we found that the TTP-containing complex bound with high affinity to short ARE sequences from GM-CSF, IL-3, TNF-alpha, IL-2, and c-fos, but did not bind to a U-rich sequence from c-myc, a 22-nucleotide poly U sequence or a mutated GM-CSF control sequence. High affinity binding by the TTP-containing complex correlated with TTP-dependent deadenylation and decay of capped, polyadenylated transcripts in a cell-free mRNA decay assay, suggesting that the TTP-containing complex was functional. These data support a model whereby TTP functions to enhance mRNA decay by recruiting components of the cellular mRNA decay machinery to the transcript.

  7. Quality Control and Analysis of NGS RNA Sequencing Data.

    PubMed

    Quinn, Emma M; McManus, Ross

    2015-01-01

    Transcriptome sequencing, where RNA is isolated, converted to a library of cDNA fragments, and sequenced using next-generation sequencing technology, has become the method of choice for the genome-wide characterization of mRNA levels. It offers a more accurate quantification of transcript levels than array-based methods, but also has the added benefit of allowing the discovery of novel genes/transcripts, alternative splice junctions, and novel RNAs. In addition, RNA sequencing may be used to investigate differential gene expression, allelic imbalance, eQTL mapping, RNA editing, RNA-protein interactions, and alternative splicing. A number of statistical methods and tools are available for differential expression analysis using RNA sequencing data and these are continually being developed and improved to handle more complex experimental designs. This chapter describes an example workflow for the quality control and analysis of raw RNA sequencing reads for the purposes of differential gene expression analysis, followed by pathway/enrichment analysis of significantly different genes. The methods and tools described are just one example of how this analysis can be conducted, but they can be applied to most standard RNA sequencing studies of differential gene expression. The methods covered are based on Illumina HiSeq single-end 50 bp reads. However, all programs used are capable of working with paired-end data, subsequent to minor adaptations.

  8. Prediction of Secondary Structures Conserved in Multiple RNA Sequences.

    PubMed

    Xu, Zhenjiang Zech; Mathews, David H

    2016-01-01

    RNA structure is conserved by evolution to a greater extent than sequence. Predicting the conserved structure for multiple homologous sequences can be much more accurate than predicting the structure for a single sequence. RNAstructure is a software package that includes the programs Dynalign, Multilign, TurboFold, and PARTS for predicting conserved RNA secondary structure. This chapter provides protocols for using these programs. PMID:27665591

  9. RNA self-processing towards changed topology and sequence oligomerization.

    PubMed

    Pieper, Stefan; Vauléon, Stéphanie; Müller, Sabine

    2007-07-01

    Reversible chemistry, allowing for chain-forming as well as chain-breaking steps, is important for biological self-organization. In this context, ribozymes, catalyzing both RNA cleavage and ligation, may have significantly contributed to extending the sequence space and length of RNA molecules in early life forms. Here we present an engineered RNA that self-processes by passing through a number of cleavage and ligation steps. Intermolecular reactions compete with intramolecular reactions, resulting in a variety of products. Our results demonstrate that RNA can undergo self-oligomerization, which may have been important for extending the RNA genome size in RNA world scenarios.

  10. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
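
    Steps (i) and (ii) above can be sketched in a few lines. This is a simplified illustration, not DSAP's code: the adaptor sequence, the `min_len` cutoff, and the rule of discarding tags made of a single repeated nucleotide are assumptions chosen for the example. The input follows the tab-delimited "sequence, copy number" format described in the abstract.

```python
from collections import Counter

def trim_adaptor(seq, adaptor):
    """Cut the read at the first occurrence of the adaptor, if present."""
    idx = seq.find(adaptor)
    return seq if idx == -1 else seq[:idx]

def is_homopolymer(seq):
    """True for tags made of one repeated nucleotide (e.g. poly-A or poly-N)."""
    return len(set(seq)) <= 1

def cleanup_and_cluster(lines, adaptor, min_len=15):
    """Clean tags and merge identical cleaned sequences, summing copy numbers.

    lines: iterable of 'SEQUENCE<TAB>COUNT' strings.
    adaptor: adaptor sequence to remove (hypothetical in this example).
    min_len: hypothetical minimum length for a cleaned tag to be kept.
    """
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tag, copies = line.split("\t")
        cleaned = trim_adaptor(tag.upper(), adaptor)
        if len(cleaned) < min_len or is_homopolymer(cleaned):
            continue
        clusters[cleaned] += int(copies)
    return clusters

example_lines = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGC\t5",  # tag followed by the adaptor
    "TGAGGTAGTAGGTTGTATAGTT\t3",          # same tag, adaptor already absent
    "AAAAAAAAAAAAAAAAAAAA\t7",            # poly-A, discarded
    "ACGT\t2",                            # too short, discarded
]
example_clusters = cleanup_and_cluster(example_lines, adaptor="TCGTATGC")
```

    After cleanup, the first two input tags collapse into a single cluster whose count is the sum of their copy numbers, which is the form in which tags would then be matched against ncRNA and miRNA references in steps (iii) and (iv).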

  11. RNase P-Mediated Sequence-Specific Cleavage of RNA by Engineered External Guide Sequences.

    PubMed

    Derksen, Merel; Mertens, Vicky; Pruijn, Ger J M

    2015-11-09

    The RNA cleavage activity of RNase P can be employed to decrease the levels of specific RNAs and to study their function or even to eradicate pathogens. Two different technologies have been developed to use RNase P as a tool for RNA knockdown. In one of these, an external guide sequence, which mimics a tRNA precursor, a well-known natural RNase P substrate, is used to target an RNA molecule for cleavage by endogenous RNase P. Alternatively, a guide sequence can be attached to M1 RNA, the (catalytic) RNase P RNA subunit of Escherichia coli. The guide sequence is specific for an RNA target, which is subsequently cleaved by the bacterial M1 RNA moiety. These approaches are applicable in both bacteria and eukaryotes. In this review, we will discuss the two technologies in which RNase P is used to reduce RNA expression levels.

  12. Unbiased Deep Sequencing of RNA Viruses from Clinical Samples.

    PubMed

    Matranga, Christian B; Gladden-Young, Adrianne; Qu, James; Winnicki, Sarah; Nosamiefan, Dolo; Levin, Joshua Z; Sabeti, Pardis C

    2016-01-01

    Here we outline a next-generation RNA sequencing protocol that enables de novo assemblies and intra-host variant calls of viral genomes collected from clinical and biological sources. The method is unbiased and universal; it uses random primers for cDNA synthesis and requires no prior knowledge of the viral sequence content. Before library construction, selective RNase H-based digestion is used to deplete unwanted RNA, including poly(rA) carrier and ribosomal RNA, from the viral RNA sample. Selective depletion improves both the data quality and the number of unique reads in viral RNA sequencing libraries. Moreover, a transposase-based 'tagmentation' step is used in the protocol as it reduces overall library construction time. The protocol has enabled rapid deep sequencing of over 600 Lassa and Ebola virus samples, including collections from both blood and tissue isolates, and is broadly applicable to other microbial genomics studies. PMID:27403729

  13. Unbiased Deep Sequencing of RNA Viruses from Clinical Samples

    PubMed Central

    Matranga, Christian B.; Gladden-Young, Adrianne; Qu, James; Winnicki, Sarah; Nosamiefan, Dolo; Levin, Joshua Z.; Sabeti, Pardis C.

    2016-01-01

    Here we outline a next-generation RNA sequencing protocol that enables de novo assemblies and intra-host variant calls of viral genomes collected from clinical and biological sources. The method is unbiased and universal; it uses random primers for cDNA synthesis and requires no prior knowledge of the viral sequence content. Before library construction, selective RNase H-based digestion is used to deplete unwanted RNA, including poly(rA) carrier and ribosomal RNA, from the viral RNA sample. Selective depletion improves both the data quality and the number of unique reads in viral RNA sequencing libraries. Moreover, a transposase-based 'tagmentation' step is used in the protocol as it reduces overall library construction time. The protocol has enabled rapid deep sequencing of over 600 Lassa and Ebola virus samples, including collections from both blood and tissue isolates, and is broadly applicable to other microbial genomics studies. PMID:27403729

  14. The primary nucleotide sequence of U4 RNA.

    PubMed

    Reddy, R; Henning, D; Busch, H

    1981-04-10

    U4 RNA is one of the "capped" nuclear snRNAs recently found to be precipitable by anti-Sm antibodies as ribonucleoprotein particles. U4 RNA, along with other snRNAs, has been implicated in hnRNA processing, mRNA transport, or both (Lerner, M. R., Boyle, J., Mount, S., Wolin, S., and Steitz, J. A. (1980) Nature 283, 220-224). Since the proteins bound to different snRNAs appear to be the same, the functions of different snRNPs might be dependent on the RNA components. To help understand the function of U4 RNP, the nucleotide sequence of U4 RNA was determined. The sequence is (formula see text). In addition to the modified nucleotides in the "cap," U4 RNA contains Am at position 63 and m6A at position 98. It also exhibited A-C microheterogeneity at position 97. PMID:6162848

  15. Sequence of Echinochloa hoja blanca tenuivirus RNA-3.

    PubMed

    de Miranda, J R; Muñoz, M; Madriz, J; Wu, R; Espinoza, A M

    1996-01-01

    Analysis of the sequence of the 2336 nucleotide RNA-3 of Echinochloa hoja blanca tenuivirus shows that it is closely related to RNA-3 of rice hoja blanca tenuivirus, the principal virus disease of rice in Latin America. This is especially true for the coding regions, where the viruses are almost 90% similar. However, the non-coding regions of RNA-3 of these viruses, principally the intergenic region separating the two ambisense open reading frames, are only about 50% similar, suggesting that these are distinct viruses. The results closely resemble those obtained for the analysis of RNA-4 of these viruses, both in the absolute and relative percentage similarities of the coding and non-coding regions. This implies a coordinated evolution of the different tenuivirus RNA segments. The features of the RNA and the comparisons with the sequences of RNA-3 of RHBV, rice stripe virus (RStV) and maize stripe virus (MStV) are discussed. PMID:8938981

  16. Annotating RNA motifs in sequences and alignments

    PubMed Central

    Gardner, Paul P.; Eldai, Hisham

    2015-01-01

    RNA performs a diverse array of important functions across all cellular life. These functions include important roles in translation, building translational machinery and maturing messenger RNA. More recent discoveries include the miRNAs and bacterial sRNAs that regulate gene expression, the thermosensors, riboswitches and other cis-regulatory elements that help prokaryotes sense their environment and eukaryotic piRNAs that suppress transposition. However, there can be a long period between the initial discovery of a RNA and determining its function. We present a bioinformatic approach to characterize RNA motifs, which are critical components of many RNA structure–function relationships. These motifs can, in some instances, provide researchers with functional hypotheses for uncharacterized RNAs. Moreover, we introduce a new profile-based database of RNA motifs—RMfam—and illustrate some applications for investigating the evolution and functional characterization of RNA. All the data and scripts associated with this work are available from: https://github.com/ppgardne/RMfam. PMID:25520192

  17. The In Vitro Synthesis of Avian Myeloblastosis Viral RNA Sequences

    PubMed Central

    Jacquet, Michel; Groner, Yoram; Monroy, Gladys; Hurwitz, Jerard

    1974-01-01

    Isolated nuclei, prepared from myeloblasts of chicks infected with avian myeloblastosis virus, synthesize RNA sequences present in avian myeloblastosis viral RNA. These sequences are also formed during transcription of chromatin, isolated from myeloblasts, by DNA-dependent RNA polymerases purified from Escherichia coli or calf thymus. In the latter case, transcription is α-amanitin sensitive. Formation of hybrids between RNA and avian myeloblastosis virus DNA probes has been monitored by the combined use of ribonucleases A, T1, and H, and ribonucleases specific for single strands. PMID:4370472

  18. Sequence of echinochloa hoja blanca tenuivirus RNA-5.

    PubMed

    de Miranda, J R; Muñoz, M; Wu, R; Espinoza, A M

    1996-01-01

    The sequence is presented of RNA-5 of Echinochloa hoja blanca tenuivirus, a second tenuivirus associated with rice cultivation in Latin America (after rice hoja blanca virus). The RNA is 1334 nucleotides long and contains in the complementary sense RNA a single long open reading frame. The deduced amino acid sequence of this open reading frame shows that it encodes a highly basic and hydrophilic 44 kD protein (pc5) with about 50% similarity to the pc5 protein of maize stripe virus (MStV). This and other features of the RNA are discussed. PMID:8879129

  19. Quantifying RNA allelic ratios by microfluidic multiplex PCR and sequencing.

    PubMed

    Zhang, Rui; Li, Xin; Ramaswami, Gokul; Smith, Kevin S; Turecki, Gustavo; Montgomery, Stephen B; Li, Jin Billy

    2014-01-01

    We developed a targeted RNA sequencing method that couples microfluidics-based multiplex PCR and deep sequencing (mmPCR-seq) to uniformly and simultaneously amplify up to 960 loci in 48 samples independently of their gene expression levels and to accurately and cost-effectively measure allelic ratios even for low-quantity or low-quality RNA samples. We applied mmPCR-seq to RNA editing and allele-specific expression studies. mmPCR-seq complements RNA-seq for studying allelic variations in the transcriptome.

  20. RNAcentral: an international database of ncRNA sequences

    DOE PAGESBeta

    Williams, Kelly Porter

    2014-10-28

    The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.

  1. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  2. Compilation of 5S rRNA and 5S rRNA gene sequences

    PubMed Central

    Specht, Thomas; Wolters, Jörn; Erdmann, Volker A.

    1990-01-01

    The BERLIN RNA DATABANK as of December 31, 1989, contains a total of 667 sequences of 5S rRNAs or their genes, which is an increase of 114 new sequence entries over the last compilation (1). It covers sequences from 44 archaebacteria, 267 eubacteria, 20 plastids, 6 mitochondria, 319 eukaryotes and 11 eukaryotic pseudogenes. The hardcopy shows only the list (Table 1) of those organisms whose sequences have been determined. The BERLIN RNA DATABANK uses the format of the EMBL Nucleotide Sequence Data Library complemented by a Sequence Alignment (SA) field including secondary structure information. PMID:1692116

  3. Complete sequence of RNA1 and subgenomic RNA3 of Atlantic halibut nodavirus (AHNV).

    PubMed

    Sommerset, Ingunn; Nerland, Audun H

    2004-03-10

    The Nodaviridae are divided into the alphanodavirus genus, which infects insects, and the betanodavirus genus, which infects fishes. Betanodaviruses are the causative agent of viral encephalopathy and retinopathy (VER) in a number of cultivated marine fish species. The Nodaviridae are small non-enveloped RNA viruses that contain a genome consisting of 2 single-stranded positive-sense RNA segments: RNA1 (3.1 kb), which encodes the viral part of the RNA-dependent RNA polymerase (RdRp); and RNA2 (1.4 kb), which encodes the capsid protein. In addition to RNA1 and RNA2, a subgenomic transcript of RNA1, RNA3, is present in infected cells. We have cloned and sequenced RNA1 from the Atlantic halibut Hippoglossus hippoglossus nodavirus (AHNV), and for the first time, the sequence of a betanodaviral subgenomic RNA3 has been determined. AHNV RNA1 was 3100 nucleotides in length and contained a main open reading frame encoding a polypeptide of 981 amino acids. Conservative motifs for RdRp were found in the deduced amino acid sequence. RNA3 was 371 nucleotides in length, and contained an open reading frame encoding a peptide of 75 amino acids corresponding to a hypothetical B2 protein, although sequence alignments with the alphanodavirus B2 proteins showed only marginal similarities. AHNV RNA replication in the fish cell-line SSN-1 (derived from striped snakehead) was analysed by Northern blot analysis, which indicated that RNA3 was synthesised in large amounts (compared to RNA1) at an early point in time post-infection. PMID:15109133

  4. Translating RNA sequencing into clinical diagnostics: opportunities and challenges.

    PubMed

    Byron, Sara A; Van Keuren-Jensen, Kendall R; Engelthaler, David M; Carpten, John D; Craig, David W

    2016-05-01

    With the emergence of RNA sequencing (RNA-seq) technologies, RNA-based biomolecules hold expanded promise for their diagnostic, prognostic and therapeutic applicability in various diseases, including cancers and infectious diseases. Detection of gene fusions and differential expression of known disease-causing transcripts by RNA-seq represent some of the most immediate opportunities. However, it is the diversity of RNA species detected through RNA-seq that holds new promise for the multi-faceted clinical applicability of RNA-based measures, including the potential of extracellular RNAs as non-invasive diagnostic indicators of disease. Ongoing efforts towards the establishment of benchmark standards, assay optimization for clinical conditions and demonstration of assay reproducibility are required to expand the clinical utility of RNA-seq.

  5. Profiling tissue-resident T cell repertoires by RNA sequencing.

    PubMed

    Brown, Scott D; Raeburn, Lisa A; Holt, Robert A

    2015-01-01

    Deep sequencing of recombined T cell receptor (TCR) genes and transcripts has provided a view of T cell repertoire diversity at an unprecedented resolution. Beyond profiling peripheral blood, analysis of tissue-resident T cells provides further insight into immune-related diseases. We describe the extraction of TCR sequence information directly from RNA-sequencing data from 6738 tumor and 604 control tissues, with a typical yield of 1 TCR per 10 million reads. This method circumvents the need for PCR amplification of the TCR template and provides TCR information in the context of global gene expression, allowing integrated analysis of extensive RNA-sequencing data resources. PMID:26620832

  6. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision

    PubMed Central

    Denise, Hubert; Moschos, Sterghios A.; Sidders, Benjamin; Burden, Frances; Perkins, Hannah; Carter, Nikki; Stroud, Tim; Kennedy, Michael; Fancy, Sally-Ann; Lapthorn, Cris; Lavender, Helen; Kinloch, Ross; Suhy, David; Corbau, Romu

    2014-01-01

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNA maturation products, we observed that each TT-034–encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5′ RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC). PMID:24496437

  7. FLDS: A Comprehensive dsRNA Sequencing Method for Intracellular RNA Virus Surveillance

    PubMed Central

    Urayama, Syun-ichi; Takaki, Yoshihiro; Nunoura, Takuro

    2016-01-01

    Knowledge of the distribution and diversity of RNA viruses is still limited in spite of their possible environmental and epidemiological impacts because RNA virus-specific metagenomic methods have not yet been developed. We herein constructed an effective metagenomic method for RNA viruses by targeting long double-stranded (ds)RNA in cellular organisms, which is a hallmark of infection, or the replication of dsRNA and single-stranded (ss)RNA viruses, except for retroviruses. This novel dsRNA targeting metagenomic method is characterized by an extremely high recovery rate of viral RNA sequences, the retrieval of terminal sequences, and uniform read coverage, which has not previously been reported in other metagenomic methods targeting RNA viruses. This method revealed a previously unidentified viral RNA diversity of more than 20 complete RNA viral genomes including dsRNA and ssRNA viruses associated with an environmental diatom colony. Our approach will be a powerful tool for cataloging RNA viruses associated with organisms of interest. PMID:26877136

  8. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision.

    PubMed

    Denise, Hubert; Moschos, Sterghios A; Sidders, Benjamin; Burden, Frances; Perkins, Hannah; Carter, Nikki; Stroud, Tim; Kennedy, Michael; Fancy, Sally-Ann; Lapthorn, Cris; Lavender, Helen; Kinloch, Ross; Suhy, David; Corbau, Romu

    2014-01-01

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNA maturation products, we observed that each TT-034-encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5' RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC). Molecular Therapy-Nucleic Acids (2014) 3, e145; doi:10.1038/mtna.2013.73; published online 4 February 2014.

  9. Sequence-non-specific effects of RNA interference triggers and microRNA regulators

    PubMed Central

    Olejniczak, Marta; Galka, Paulina; Krzyzosiak, Wlodzimierz J.

    2010-01-01

    RNA reagents of diverse lengths and structures, unmodified or containing various chemical modifications are powerful tools of RNA interference and microRNA technologies. These reagents which are either delivered to cells using appropriate carriers or are expressed in cells from suitable vectors often cause unintended sequence-non-specific immune responses besides triggering intended sequence-specific silencing effects. This article reviews the present state of knowledge regarding the cellular sensors of foreign RNA, the signaling pathways these sensors mobilize and shows which specific features of the RNA reagents set the responsive systems on alert. The representative examples of toxic effects caused in the investigated cell lines and tissues by the RNAs of specific types and structures are collected and may be instructive for further studies of sequence-non-specific responses to foreign RNA in human cells. PMID:19843612

  10. The chemical structure of DNA sequence signals for RNA transcription

    NASA Technical Reports Server (NTRS)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  11. The genome of RNA tumor viruses contains polyadenylic acid sequences.

    PubMed

    Green, M; Cartas, M

    1972-04-01

    The 70S genome of two RNA tumor viruses, murine sarcoma virus and avian myeloblastosis virus, binds to Millipore filters in buffer with high salt concentration and to glass fiber filters containing poly(U). These observations suggest that 70S RNA contains adenylic acid-rich sequences. When digested by pancreatic RNase, 70S RNA of murine sarcoma virus yielded poly(A) sequences that contain 91% adenylic acid. These poly(A) sequences sedimented as a relatively homogenous peak in sucrose gradients with a sedimentation coefficient of 4-5 S, but had a mobility during polyacrylamide gel electrophoresis that corresponds to molecules that sediment at 6-7 S. If we estimate a molecular weight for each sequence of 30,000-60,000 (100-200 nucleotides) and a molecular weight for viral 70S RNA of 3-12 million, each viral genome could contain 1-8 poly(A) sequences. Possible functions of poly(A) in the infecting viral RNA may include a role in the initiation of viral DNA or RNA synthesis, in protein maturation, or in the assembly of the viral genome.

  12. Spliced synthetic genes as internal controls in RNA sequencing experiments.

    PubMed

    Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R

    2016-09-01

    RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome. PMID:27502218

  13. RNA-Pareto: interactive analysis of Pareto-optimal RNA sequence-structure alignments.

    PubMed

    Schnattinger, Thomas; Schöning, Uwe; Marchfelder, Anita; Kestler, Hans A

    2013-12-01

    Incorporating secondary structure information into the alignment process improves the quality of RNA sequence alignments. Instead of using fixed weighting parameters, sequence and structure components can be treated as different objectives and optimized simultaneously. The result is not a single, but a Pareto-set of equally optimal solutions, which all represent different possible weighting parameters. We now provide the interactive graphical software tool RNA-Pareto, which allows a direct inspection of all feasible results to the pairwise RNA sequence-structure alignment problem and greatly facilitates the exploration of the optimal solution set.
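
    The Pareto-set idea in the abstract above can be illustrated with a minimal sketch. It assumes each candidate alignment has already been reduced to a pair of scores (sequence score, structure score), both to be maximized; the function name and the toy scores are hypothetical and are not taken from RNA-Pareto itself.

```python
from typing import List, Tuple

# (sequence_score, structure_score); higher is better for both (an assumption).
Score = Tuple[float, float]


def pareto_front(candidates: List[Score]) -> List[Score]:
    """Return the candidates that no other candidate dominates."""

    def dominates(a: Score, b: Score) -> bool:
        # a dominates b if it is at least as good in both objectives
        # and is not the identical score pair.
        return a[0] >= b[0] and a[1] >= b[1] and a != b

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical scores for five candidate alignments.
alignments = [(10.0, 2.0), (8.0, 5.0), (7.0, 4.0), (3.0, 9.0), (3.0, 1.0)]
front = pareto_front(alignments)
```

    Each surviving pair corresponds to a different trade-off between the two objectives, which is why no single fixed weighting is needed to enumerate them.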

  14. The nucleotide sequence of cowpea mosaic virus B RNA

    PubMed Central

    Lomonossoff, G.P.; Shanks, M.

    1983-01-01

    The complete sequence of the bottom component RNA (B RNA) of cowpea mosaic virus (CPMV) has been determined. Restriction enzyme fragments of double-stranded cDNA were cloned in M13 and the sequence of the inserts was determined by a combination of enzymatic and chemical sequencing techniques. Additional sequence information was obtained by primed synthesis on first strand cDNA. The complete sequence deduced is 5889 nucleotides long excluding the 3' poly(A), and contains an open reading frame sufficient to code for a polypeptide of mol. wt. 207 760. The coding region is flanked by a 5' leader sequence of 206 nucleotides and a 3' non-coding region of 82 residues which does not contain a polyadenylation signal. PMID:16453487

  15. Statistical design and analysis of RNA sequencing data.

    PubMed

    Auer, Paul L; Doerge, R W

    2010-06-01

    Next-generation sequencing technologies are quickly becoming the preferred approach for characterizing and quantifying entire genomes. Even though data produced from these technologies are proving to be the most informative of any thus far, very little attention has been paid to fundamental design aspects of data collection and analysis, namely sampling, randomization, replication, and blocking. We discuss these concepts in an RNA sequencing framework. Using simulations we demonstrate the benefits of collecting replicated RNA sequencing data according to well known statistical designs that partition the sources of biological and technical variation. Examples of these designs and their corresponding models are presented with the goal of testing differential expression.

  16. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  17. IVT-seq reveals extreme bias in RNA sequencing

    PubMed Central

    2014-01-01

    Background: RNA-seq is a powerful technique for identifying and quantifying transcription and splicing events, both known and novel. However, given its recent development and the proliferation of library construction methods, understanding the bias it introduces is incomplete but critical to realizing its value. Results: We present a method, in vitro transcription sequencing (IVT-seq), for identifying and assessing the technical biases in RNA-seq library generation and sequencing at scale. We created a pool of over 1,000 in vitro transcribed RNAs from a full-length human cDNA library and sequenced them with polyA and total RNA-seq, the most common protocols. Because each cDNA is full length, and we show in vitro transcription is incredibly processive, each base in each transcript should be equivalently represented. However, with common RNA-seq applications and platforms, we find 50% of transcripts have more than two-fold and 10% have more than 10-fold differences in within-transcript sequence coverage. We also find greater than 6% of transcripts have regions of dramatically unpredictable sequencing coverage between samples, confounding accurate determination of their expression. We use a combination of experimental and computational approaches to show rRNA depletion is responsible for the most significant variability in coverage, and several sequence determinants also strongly influence representation. Conclusions: These results show the utility of IVT-seq for promoting better understanding of bias introduced by RNA-seq. We find rRNA depletion is responsible for substantial, unappreciated biases in coverage introduced during library preparation. These biases suggest exon-level expression analysis may be inadvisable, and we recommend caution when interpreting RNA-seq results. PMID:24981968
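
    To make the "fold difference in within-transcript coverage" statistic concrete, here is a small sketch. The abstract does not state the exact metric, so the definition used here (maximum per-base coverage divided by minimum per-base coverage) is an assumption, and the function names and toy coverage vectors are hypothetical.

```python
import math
from typing import Dict, Sequence


def coverage_fold_difference(coverage: Sequence[int]) -> float:
    """Max/min per-base coverage; infinite if any base is uncovered."""
    lo, hi = min(coverage), max(coverage)
    if hi == 0:
        raise ValueError("transcript has no coverage")
    return math.inf if lo == 0 else hi / lo


def fraction_exceeding(transcripts: Dict[str, Sequence[int]], fold: float) -> float:
    """Fraction of transcripts whose fold difference is strictly above `fold`."""
    ratios = [coverage_fold_difference(c) for c in transcripts.values()]
    return sum(r > fold for r in ratios) / len(ratios)


# Made-up per-base coverage for four transcripts.
toy = {
    "tx1": [100, 110, 95, 105],
    "tx2": [200, 90, 210, 60],
    "tx3": [50, 50, 4, 50],
    "tx4": [30, 40, 35, 45],
}
share_over_2x = fraction_exceeding(toy, 2.0)
share_over_10x = fraction_exceeding(toy, 10.0)
```

    With evenly represented bases, as expected for full-length in vitro transcripts, every ratio would sit near 1, so the shares above any threshold would be close to zero.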

  18. Library preparation for highly accurate population sequencing of RNA viruses

    PubMed Central

    Acevedo, Ashley; Andino, Raul

    2015-01-01

    Circular resequencing (CirSeq) is a novel technique for efficient and highly accurate next-generation sequencing (NGS) of RNA virus populations. The foundation of this approach is the circularization of fragmented viral RNAs, which are then redundantly encoded into tandem repeats by ‘rolling-circle’ reverse transcription. When sequenced, the redundant copies within each read are aligned to derive a consensus sequence of their initial RNA template. This process yields sequencing data with error rates far below the variant frequencies observed for RNA viruses, facilitating ultra-rare variant detection and accurate measurement of low-frequency variants. Although library preparation takes ~5 d, the high-quality data generated by CirSeq simplifies downstream data analysis, making this approach substantially more tractable for experimentalists. PMID:24967624
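
    The consensus step described above can be sketched as a per-position majority vote across the tandem copies in one read. This is a simplified illustration, not the CirSeq software: it assumes the repeat length is already known, that the first copy starts at the first base of the read, and that copies differ only by substitutions. The function name and the example read are hypothetical.

```python
from collections import Counter


def repeat_consensus(read: str, repeat_len: int) -> str:
    """Majority-vote consensus over complete tandem copies within a read."""
    # Cut the read into consecutive full-length copies; drop any partial tail.
    copies = [read[i:i + repeat_len]
              for i in range(0, len(read) - repeat_len + 1, repeat_len)]
    consensus = []
    for column in zip(*copies):
        # Most frequent base at this position across the copies.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Three copies of a 6-nt template; the third copy carries one sequencing error.
read = "ACGTAC" + "ACGTAC" + "ACCTAC"
template = repeat_consensus(read, 6)
```

    An error that appears in only one copy is outvoted by the other two, which is the intuition behind the low error rates the abstract reports; a true variant in the template appears in every copy and survives the vote.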

  19. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage.

  20. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    PubMed Central

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  1. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  2. Evaluation of commercially available RNA amplification kits for RNA sequencing using very low input amounts of total RNA.

    PubMed

    Shanker, Savita; Paulson, Ariel; Edenberg, Howard J; Peak, Allison; Perera, Anoja; Alekseyev, Yuriy O; Beckloff, Nicholas; Bivens, Nathan J; Donnelly, Robert; Gillaspy, Allison F; Grove, Deborah; Gu, Weikuan; Jafari, Nadereh; Kerley-Hamilton, Joanna S; Lyons, Robert H; Tepper, Clifford; Nicolet, Charles M

    2015-04-01

    This article includes supplemental data. Please visit http://www.fasebj.org to obtain this information. Multiple recent publications on RNA sequencing (RNA-seq) have demonstrated the power of next-generation sequencing technologies in whole-transcriptome analysis. Vendor-specific protocols used for RNA library construction often require at least 100 ng total RNA. However, under certain conditions, much less RNA is available for library construction. In these cases, effective transcriptome profiling requires amplification of subnanogram amounts of RNA. Several commercial RNA amplification kits are available for amplification prior to library construction for next-generation sequencing, but these kits have not been comprehensively field evaluated for accuracy and performance of RNA-seq for picogram amounts of RNA. To address this, 4 types of amplification kits were tested with 3 different concentrations, from 5 ng to 50 pg, of a commercially available RNA. Kits were tested at multiple sites to assess reproducibility and ease of use. The human total reference RNA used was spiked with a control pool of RNA molecules in order to further evaluate quantitative recovery of input material. Additional control data sets were generated from libraries constructed following polyA selection or ribosomal depletion using established kits and protocols. cDNA was collected from the different sites, and libraries were synthesized at a single site using established protocols. Sequencing runs were carried out on the Illumina platform. Numerous metrics were compared among the kits and dilutions used. Overall, no single kit appeared to meet all the challenges of small input material. However, it is encouraging that excellent data can be recovered with even the 50 pg input total RNA. PMID:25649271

  3. Antisense Transcript and RNA Processing Alterations Suppress Instability of Polyadenylated mRNA in Chlamydomonas Chloroplasts

    PubMed Central

    Nishimura, Yoshiki; Kikis, Elise A.; Zimmer, Sara L.; Komine, Yutaka; Stern, David B.

    2004-01-01

    In chloroplasts, the control of mRNA stability is of critical importance for proper regulation of gene expression. The Chlamydomonas reinhardtii strain Δ26pAtE is engineered such that the atpB mRNA terminates with an mRNA destabilizing polyadenylate tract, resulting in this strain being unable to conduct photosynthesis. A collection of photosynthetic revertants was obtained from Δ26pAtE, and gel blot hybridizations revealed RNA processing alterations in the majority of these suppressor of polyadenylation (spa) strains, resulting in a failure to expose the atpB mRNA 3′ poly(A) tail. Two exceptions were spa19 and spa23, which maintained unusual heteroplasmic chloroplast genomes. One genome type, termed PS+, conferred photosynthetic competence by contributing to the stability of atpB mRNA; the other, termed PS−, was required for viability but could not produce stable atpB transcripts. Based on strand-specific RT-PCR, S1 nuclease protection, and RNA gel blots, evidence was obtained that the PS+ genome stabilizes atpB mRNA by generating an atpB antisense transcript, which attenuates the degradation of the polyadenylated form. The accumulation of double-stranded RNA was confirmed by insensitivity of atpB mRNA from PS+ genome-containing cells to S1 nuclease digestion. To obtain additional evidence for antisense RNA function in chloroplasts, we used strain Δ26, in which atpB mRNA is unstable because of the lack of a 3′ stem-loop structure. In this context, when a 121-nucleotide segment of atpB antisense RNA was expressed from an ectopic site, an elevated accumulation of atpB mRNA resulted. Finally, when spa19 was placed in a genetic background in which expression of the chloroplast exoribonuclease polynucleotide phosphorylase was diminished, the PS+ genome and the antisense transcript were no longer required for photosynthesis. Taken together, our results suggest that antisense RNA in chloroplasts can protect otherwise unstable transcripts from 3′→5

  4. Small RNA cloning and sequencing strategy affects host and viral microRNA expression signatures.

    PubMed

    Stik, Grégoire; Muylkens, Benoît; Coupeau, Damien; Laurent, Sylvie; Dambrine, Ginette; Messmer, Mélanie; Chane-Woon-Ming, Béatrice; Pfeffer, Sébastien; Rasschaert, Denis

    2014-07-10

    The establishment of microRNA (miRNA) expression signatures is the basic element for investigating the role played by these regulatory molecules in the biology of an organism. Marek's disease virus 1 (MDV-1) is an avian herpesvirus that naturally infects chickens and induces T-cell lymphomas. During latency, MDV-1, like other herpesviruses, expresses a limited subset of transcripts. These include three miRNA clusters. Several studies identified the expression of virus- and host-encoded miRNAs from MDV-1-infected cell cultures and chickens. However, a large discrepancy was observed when miRNA cloning frequencies obtained from different cloning and sequencing protocols were compared. Thus, we analyzed the effect of small RNA library preparation and sequencing on the miRNA frequencies obtained from the same RNA samples collected during MDV-1 infection of chickens at different steps of the oncoviral pathogenesis. Qualitative and quantitative variations were found in the data, depending on the strategy used. One of the mature miRNAs derived from the latency-associated transcript (LAT), mdv1-miR-M7-5p, showed the highest variation. Its cloning frequency was 50% of the viral miRNA counts when a small-scale sequencing approach was used. Its frequency was 100 times lower when determined through the deep sequencing approach. Northern blot analysis showed a better correlation with the miRNA frequencies found by the small-scale sequencing approach. By analyzing the cellular miRNA repertoire, we also found a gap between the two sequencing approaches. Collectively, our study indicates that next-generation sequencing data considered alone are limited for assessing the absolute copy number of transcripts. Thus, the quantification of small RNA should be addressed by compiling data obtained by using different techniques such as microarrays, qRT-PCR and NB analysis in support of high-throughput sequencing data. These observations should be considered when miRNA variations are studied

  5. Transcriptome and small RNA deep sequencing reveals deregulation of miRNA biogenesis in human glioma.

    PubMed

    Moore, Lynette M; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N; Zhang, Wei; Nykter, Matti

    2013-02-01

    Altered expression of oncogenic and tumour-suppressing microRNAs (miRNAs) is widely associated with tumourigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumours. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and examined expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression.

  6. Transcriptome and Small RNA Deep Sequencing Reveals Deregulation of miRNA Biogenesis in Human Glioma

    PubMed Central

    Moore, Lynette M.; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N.; Zhang, Wei; Nykter, Matti

    2013-01-01

    Altered expression of oncogenic and tumor-suppressing microRNAs (miRNAs) is widely associated with tumorigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumors. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and interrogated expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression. PMID:23007860

  7. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  8. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  9. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  10. Reentrant Melting of RNA with Quenched Sequence Randomness

    NASA Astrophysics Data System (ADS)

    Hayrapetyan, G. N.; Iannelli, F.; Lekscha, J.; Morozov, V. F.; Netz, R. R.; Mamasakhlisov, Y. Sh.

    2014-08-01

    The effect of quenched sequence disorder on the thermodynamics of RNA secondary structure formation is investigated for two- and four-letter alphabet models using the constrained annealing approach, from which the temperature behavior of the free energy, specific heat, and helicity is analytically obtained. For competing base pairing energies, the calculations reveal reentrant melting at low temperatures, in excellent agreement with numerical results. Our results suggest an additional mechanism for the experimental phenomenon of RNA cold denaturation.

  11. Sequence determinants of improved CRISPR sgRNA design

    PubMed Central

    Xu, Han; Xiao, Tengfei; Chen, Chen-Hao; Li, Wei; Meyer, Clifford A.; Wu, Qiu; Wu, Di; Cong, Le; Zhang, Feng; Liu, Jun S.; Brown, Myles; Liu, X. Shirley

    2015-01-01

    The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies. PMID:26063738

  12. Quantifying sequence and structural features of protein-RNA interactions.

    PubMed

    Li, Songling; Yamashita, Kazuo; Amada, Karlou Mar; Standley, Daron M

    2014-09-01

    Increasing awareness of the importance of protein-RNA interactions has motivated many approaches to predict residue-level RNA binding sites in proteins based on sequence or structural characteristics. Sequence-based predictors are usually high in sensitivity but low in specificity; conversely structure-based predictors tend to have high specificity, but lower sensitivity. Here we quantified the contribution of both sequence- and structure-based features as indicators of RNA-binding propensity using a machine-learning approach. In order to capture structural information for proteins without a known structure, we used homology modeling to extract the relevant structural features. Several novel and modified features enhanced the accuracy of residue-level RNA-binding propensity beyond what has been reported previously, including by meta-prediction servers. These features include: hidden Markov model-based evolutionary conservation, surface deformations based on the Laplacian norm formalism, and relative solvent accessibility partitioned into backbone and side chain contributions. We constructed a web server called aaRNA that implements the proposed method and demonstrate its use in identifying putative RNA binding sites. PMID:25063293

  13. Phylogenetic relationships of Cryptosporidium determined by ribosomal RNA sequence comparison.

    PubMed

    Johnson, A M; Fielke, R; Lumb, R; Baverstock, P R

    1990-04-01

    Reverse transcription of total cellular RNA was used to obtain a partial sequence of the small subunit ribosomal RNA of Cryptosporidium, a protist currently placed in the phylum Apicomplexa. The semi-conserved regions were aligned with homologous sequences in a range of other eukaryotes, and the evolutionary relationships of Cryptosporidium were determined by two different methods of phylogenetic analysis. The prokaryotes Escherichia coli and Halobacterium cuti were included as outgroups. The results do not show an especially close relationship of Cryptosporidium to other members of the phylum Apicomplexa. PMID:2332273

  14. The technology and biology of single-cell RNA sequencing.

    PubMed

    Kolodziejczyk, Aleksandra A; Kim, Jong Kyoung; Svensson, Valentine; Marioni, John C; Teichmann, Sarah A

    2015-05-21

    The differences between individual cells can have profound functional consequences, in both unicellular and multicellular organisms. Recently developed single-cell mRNA-sequencing methods enable unbiased, high-throughput, and high-resolution transcriptomic analysis of individual cells. This provides an additional dimension to transcriptomic information relative to traditional methods that profile bulk populations of cells. Already, single-cell RNA-sequencing methods have revealed new biology in terms of the composition of tissues, the dynamics of transcription, and the regulatory relationships between genes. Rapid technological developments at the level of cell capture, phenotyping, molecular biology, and bioinformatics promise an exciting future with numerous biological and medical applications. PMID:26000846

  15. Modeling RNA loops using sequence homology and geometric constraints

    PubMed Central

    Schudoma, Christian; May, Patrick; Walther, Dirk

    2010-01-01

    Summary: RNA loop regions are essential structural elements of RNA molecules influencing both their structural and functional properties. We developed RLooM, a web application for homology-based modeling of RNA loops utilizing template structures extracted from the PDB. RLooM allows the insertion and replacement of loop structures of a desired sequence into an existing RNA structure. Furthermore, a comprehensive database of loops in RNA structures can be accessed through the web interface. Availability and Implementation: The application was implemented in Python, MySQL and Apache. A web interface to the database and loop modeling application is freely available at http://rloom.mpimp-golm.mpg.de Contact: schudoma@mpimp-golm.mpg.de; may@mpimp-golm.mpg.de; walther@mpimp-golm.mpg.de PMID:20427516

  16. Probing dimensionality beyond the linear sequence of mRNA.

    PubMed

    Del Campo, Cristian; Ignatova, Zoya

    2016-05-01

    mRNA is a nexus entity between DNA and translating ribosomes. Recent developments in deep sequencing technologies coupled with structural probing have revealed new insights beyond the classic role of mRNA and place it more centrally as a direct effector of a variety of processes, including translation, cellular localization, and mRNA degradation. Here, we highlight emerging approaches to probe mRNA secondary structure on a global transcriptome-wide level and compare their potential and resolution. Combined approaches deliver a richer and more complex picture. While our understanding on the effect of secondary structure for various cellular processes is quite advanced, the next challenge is to unravel more complex mRNA architectures and tertiary interactions. PMID:26650615

  17. Replication and packaging of Turnip yellow mosaic virus RNA containing Flock house virus RNA1 sequence.

    PubMed

    Kim, Hui-Bae; Kim, Do-Yeong; Cho, Tae-Ju

    2014-06-01

    Turnip yellow mosaic virus (TYMV) is a spherical plant virus that has a single 6.3 kb positive strand RNA as a genome. In this study, RNA1 sequence of Flock house virus (FHV) was inserted into the TYMV genome to test whether TYMV can accommodate and express another viral entity. In the resulting construct, designated TY-FHV, the FHV RNA1 sequence was expressed as a TYMV subgenomic RNA. Northern analysis of the Nicotiana benthamiana leaves agroinfiltrated with the TY-FHV showed that both genomic and subgenomic FHV RNAs were abundantly produced. This indicates that the FHV RNA1 sequence was correctly expressed and translated to produce a functional FHV replicase. Although these FHV RNAs were not encapsidated, the FHV RNA having a TYMV CP sequence at the 3'-end was efficiently encapsidated. When an eGFP gene was inserted into the B2 ORF of the FHV sequence, a fusion protein of B2-eGFP was produced as expected.

  18. High-throughput RNA interference screening using pooled shRNA libraries and next generation sequencing

    PubMed Central

    2011-01-01

    RNA interference (RNAi) screening is a state-of-the-art technology that enables the dissection of biological processes and disease-related phenotypes. The commercial availability of genome-wide, short hairpin RNA (shRNA) libraries has fueled interest in this area but the generation and analysis of these complex data remain a challenge. Here, we describe complete experimental protocols and novel open source computational methodologies, shALIGN and shRNAseq, that allow RNAi screens to be rapidly deconvoluted using next generation sequencing. Our computational pipeline offers efficient screen analysis and the flexibility and scalability to quickly incorporate future developments in shRNA library technology. PMID:22018332

  19. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc. PMID:27322404

  20. MicroRNA Expression Profile in Penile Cancer Revealed by Next-Generation Small RNA Sequencing

    PubMed Central

    Zhang, Yuanwei; Xu, Bo; Zhou, Jun; Fan, Song; Hao, Zongyao; Shi, Haoqiang; Zhang, Xiansheng; Kong, Rui; Xu, Lingfan; Gao, Jingjing; Zou, Duohong; Liang, Chaozhao

    2015-01-01

    Penile cancer (PeCa) is a relatively rare tumor entity but possesses higher morbidity and mortality rates, especially in developing countries. To date, the concrete pathogenic signaling pathways and core machineries involved in tumorigenesis and progression of PeCa remain to be elucidated. Several studies suggested that miRNAs, which modulate gene expression at the posttranscriptional level, were frequently mis-regulated and aberrantly expressed in human cancers. However, the miRNA profile in human PeCa has not been reported before. In the present study, the miRNA profile was obtained from 10 fresh penile cancerous tissues and matched adjacent non-cancerous tissues via next-generation sequencing. As a result, a total of 751 and 806 annotated miRNAs were identified in normal and cancerous penile tissues, respectively. Among these, 56 miRNAs with significantly different expression levels between paired tissues were identified. Subsequently, several annotated miRNAs were selected randomly and validated using quantitative real-time PCR. Compared with previous publications regarding altered miRNA expression in various cancers, especially genitourinary (prostate, bladder, kidney, testis) cancers, the vast majority of deregulated miRNAs showed a similar expression pattern in penile cancer. Moreover, the bioinformatics analyses suggested that the putative target genes of differentially expressed miRNAs between cancerous and matched normal penile tissues were tightly associated with cell junction, proliferation, growth as well as genomic instability and so on, by modulating Wnt, MAPK, p53, PI3K-Akt, Notch and TGF-β signaling pathways, which were all well-established to participate in cancer initiation and progression. Our work presents a global view of the differentially expressed miRNAs and potentially regulatory networks of their target genes for clarifying the pathogenic transformation of normal penis to PeCa, which research resource also provides new insights

  1. Assessment of microRNA differential expression and detection in multiplexed small RNA sequencing data.

    PubMed

    Campbell, Joshua D; Liu, Gang; Luo, Lingqi; Xiao, Ji; Gerrein, Joseph; Juan-Guardela, Brenda; Tedrow, John; Alekseyev, Yuriy O; Yang, Ivana V; Correll, Mick; Geraci, Mark; Quackenbush, John; Sciurba, Frank; Schwartz, David A; Kaminski, Naftali; Johnson, W Evan; Monti, Stefano; Spira, Avrum; Beane, Jennifer; Lenburg, Marc E

    2015-02-01

    Small RNA sequencing can be used to gain an unprecedented amount of detail into the microRNA transcriptome. The relatively high cost and low throughput of sequencing-based technologies can potentially be offset by the use of multiplexing. However, multiplexing involves a trade-off between increased number of sequenced samples and reduced number of reads per sample (i.e., lower depth of coverage). To assess the effect of different sequencing depths owing to multiplexing on microRNA differential expression and detection, we sequenced the small RNA of lung tissue samples collected in a clinical setting by multiplexing one, three, six, nine, or 12 samples per lane using the Illumina HiSeq 2000. As expected, the numbers of reads obtained per sample decreased as the number of samples in a multiplex increased. Furthermore, after normalization, replicate samples included in distinct multiplexes were highly correlated (R > 0.97). When detecting differential microRNA expression between groups of samples, microRNAs with average expression >1 reads per million (RPM) had reproducible fold change estimates (signal to noise) independent of the degree of multiplexing. The number of microRNAs detected was strongly correlated with the log2 number of reads aligning to microRNA loci (R = 0.96). However, most additional microRNAs detected in samples with greater sequencing depth were in the range of expression which had lower fold change reproducibility. These findings elucidate the trade-off between increasing the number of samples in a multiplex with decreasing sequencing depth and will aid in the design of large-scale clinical studies exploring microRNA expression and its role in disease.
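
    The reads-per-million (RPM) threshold mentioned in this abstract can be illustrated with a minimal sketch. It assumes that RPM is computed against the total reads assigned to microRNAs in each sample, which the abstract does not specify; the function names, microRNA names, and counts below are hypothetical and are not taken from the study.

```python
def reads_per_million(counts):
    """Scale raw per-microRNA read counts to reads per million (RPM).

    Assumption: the denominator is the sum of reads assigned to microRNAs
    in this sample; the abstract does not state which total was used.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no aligned reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


def above_threshold(rpm_by_sample, threshold=1.0):
    """Return microRNAs whose average RPM across samples exceeds the threshold."""
    names = set().union(*(sample.keys() for sample in rpm_by_sample))
    n_samples = len(rpm_by_sample)
    return sorted(
        name
        for name in names
        if sum(sample.get(name, 0.0) for sample in rpm_by_sample) / n_samples > threshold
    )


# Hypothetical counts for two samples (illustration only, not study data).
sample_a = {"miR-A": 900_000, "miR-B": 99_999, "miR-C": 1}
sample_b = {"miR-A": 450_000, "miR-B": 50_000}

rpm = [reads_per_million(sample_a), reads_per_million(sample_b)]
kept = above_threshold(rpm)  # microRNAs with average expression > 1 RPM
print(kept)
```

    In this toy input, the hypothetical miR-C averages 0.5 RPM across the two samples and is therefore excluded by the >1 RPM filter.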

  2. Nicotiana Small RNA Sequences Support a Host Genome Origin of Cucumber Mosaic Virus Satellite RNA

    PubMed Central

    Smith, Neil A.; Schumann, Ulrike; Fang, Yuan-Yuan; Dennis, Elizabeth S.; Zhang, Ren; Guo, Hui-Shan; Wang, Ming-Bo

    2015-01-01

    Satellite RNAs (satRNAs) are small noncoding subviral RNA pathogens in plants that depend on helper viruses for replication and spread. Despite many decades of research, the origin of satRNAs remains unknown. In this study we show that a β-glucuronidase (GUS) transgene fused with a Cucumber mosaic virus (CMV) Y satellite RNA (Y-Sat) sequence (35S-GUS:Sat) was transcriptionally repressed in N. tabacum in comparison to a 35S-GUS transgene that did not contain the Y-Sat sequence. This repression was not due to DNA methylation at the 35S promoter, but was associated with specific DNA methylation at the Y-Sat sequence. Both northern blot hybridization and small RNA deep sequencing detected 24-nt siRNAs in wild-type Nicotiana plants with sequence homology to Y-Sat, suggesting that the N. tabacum genome contains Y-Sat-like sequences that give rise to 24-nt sRNAs capable of guiding RNA-directed DNA methylation (RdDM) to the Y-Sat sequence in the 35S-GUS:Sat transgene. Consistent with this, Southern blot hybridization detected multiple DNA bands in Nicotiana plants that had sequence homology to Y-Sat, suggesting that Y-Sat-like sequences exist in the Nicotiana genome as repetitive DNA, a DNA feature associated with 24-nt sRNAs. Our results point to a host genome origin for CMV satRNAs, and suggest a novel approach of using small RNA sequences for finding the origin of other satRNAs. PMID:25568943

  3. The effect of sequences with high AU content on mRNA stability in tobacco.

    PubMed Central

    Ohme-Takagi, M; Taylor, C B; Newman, T C; Green, P J

    1993-01-01

    Little is known about the mechanisms that target transcripts for rapid degradation in plants. In mammalian cells, sequences with a high AU content and multiple AUUUA motifs have been shown to cause mRNA instability when present in the 3' untranslated regions of several transcripts. This precedent, coupled with the poor accumulation of AU-rich foreign transcripts in plants (e.g., BT-toxin mRNAs), prompted us to test whether AU sequences could destabilize transcripts in tobacco. To address this question, we made a set of constructs containing sequences with high AU content inserted into the 3' untranslated regions of reporter genes. The stability of the corresponding transcripts was then assayed in stably transformed cell lines of tobacco. These experiments showed that a 60-base sequence containing 11 copies of the AUUUA motif (AUUUA repeat) markedly destabilized a beta-glucuronidase reporter transcript compared to a no-insert control or a 60-base spacer sequence (GC control). Another sequence with an identical A+U content had little effect. The same results were obtained when each sequence was assayed within the 3' untranslated region of a beta-globin reporter transcript. In regenerated transgenic plants, the AUUUA repeat decreased the accumulation of the beta-globin transcript by approximately 14-fold, compared to the GC control. Taken together, our results indicate that the AUUUA repeat is recognized as an instability determinant in plant cells and that the effect is due to the sequence of the element, not simply to the high AU content. PMID:8265631
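
    The distinction this abstract draws between AUUUA motif copies and overall A+U content can be made concrete with a short sketch. The two test sequences below are hypothetical 15-base strings chosen for illustration; they are not the 60-base constructs used in the study, and the helper names are assumptions.

```python
def count_motif(seq, motif="AUUUA", overlapping=True):
    """Count occurrences of a motif in an RNA sequence (DNA 'T' is read as 'U')."""
    seq = seq.upper().replace("T", "U")
    step = 1 if overlapping else len(motif)
    count, start = 0, 0
    while True:
        idx = seq.find(motif, start)
        if idx == -1:
            return count
        count += 1
        start = idx + step


def au_fraction(seq):
    """Return the fraction of bases that are A or U (0.0 for an empty sequence)."""
    seq = seq.upper().replace("T", "U")
    if not seq:
        return 0.0
    return sum(base in "AU" for base in seq) / len(seq)


# Hypothetical sequences with identical A+U content but different motif counts.
auuua_repeat = "AUUUA" * 3          # three tandem AUUUA copies
au_control = "UAUAUAUAUAUAUAU"      # all A/U, but no AUUUA motif

for name, seq in [("AUUUA repeat", auuua_repeat), ("AU control", au_control)]:
    print(name, count_motif(seq), au_fraction(seq))
```

    Both toy sequences are 100% A+U, yet only the first contains AUUUA copies, which mirrors the abstract's point that motif sequence and base composition are separable properties.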

  4. Using RNA Sequencing to Classify Organisms into Three Primary Kingdoms.

    ERIC Educational Resources Information Center

    Evans, Robert H.

    1983-01-01

    Using the biochemical record to class archaebacteria, eukaryotes, and eubacteria involves abstractions difficult for the concrete learner. Therefore, a method is provided in which students discover some basic tenets of biochemical classification and apply them in a "hands-on" classification problem. The method involves use of RNA sequencing. (JN)

  5. Sequence specificity of mRNA N6-adenosine methyltransferase.

    PubMed

    Csepany, T; Lin, A; Baldick, C J; Beemon, K

    1990-11-25

    The sequence specificity of chicken mRNA N6-adenosine methyltransferase has been investigated in vivo. Localization of six new N6-methyladenosine sites on Rous sarcoma virus (RSV) virion RNA has confirmed our extended consensus sequence for methylation: RGACU, where R is usually a G (7/12). We have also observed A (2/12) and U (3/12) at the -2 position (relative to m6A at +1) but never a C. At the +3 position, the U was observed 10/12 times; an A and a C were observed once each in weakly methylated sequences. The extent of methylation varied between the different sites up to a maximum of about 90%. To test the significance of this consensus sequence, it was altered by site-specific mutagenesis, and methylation was assayed after transfection of mutated RSV DNA into chicken embryo fibroblasts. We found that changing the G at -1 or the U at +3 to any other residue inhibited methylation. However, inhibition of methylation at all four of the major sites in the RSV src gene did not detectably alter the steady-state levels of the three viral RNA species or viral infectivity. Additional mutants that inactivated the src protein kinase activity produced less virus and exhibited relatively less src mRNA in infected cells. PMID:2173695
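
    As a rough illustration of how such a consensus could be searched for in a sequence, the sketch below scans an RNA string with regular expressions. The strict pattern encodes RGACU; the relaxed pattern is one possible reading of the observed variation described above (G, A, or U but never C at -2; U, A, or C at +3) and is an assumption, as are the function name `find_sites` and the example sequence.

```python
import re

# Strict consensus RGACU (R = A or G); lookahead allows overlapping hits.
STRICT = re.compile(r"(?=([AG]GACU))")
# Relaxed pattern (assumed reading of the abstract): G/A/U at -2, U/A/C at +3.
RELAXED = re.compile(r"(?=([AGU]GAC[ACU]))")


def find_sites(rna: str, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Return (1-based position of the candidate methylated A, pentamer)."""
    # The A at +1 is the third base of the five-base match.
    return [(m.start() + 3, m.group(1)) for m in pattern.finditer(rna)]


# Hypothetical RNA fragment, not taken from the RSV genome.
rna = "CCGGACUAAUGACAGGCGACU"

print(find_sites(rna, STRICT))   # [(5, 'GGACU')]
print(find_sites(rna, RELAXED))  # [(5, 'GGACU'), (12, 'UGACA')]
```

    The final CGACU in the example is rejected by both patterns because of the C at -2. A match only flags a candidate site; the abstract reports that the extent of methylation varied between sites.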

  6. Toward Rare Blood Cell Preservation for RNA Sequencing.

    PubMed

    Vickovic, Sanja; Ahmadian, Afshin; Lewensohn, Rolf; Lundeberg, Joakim

    2015-07-01

    Cancer is driven by various events leading to cell differentiation and disease progression. Molecular tools are powerful approaches for describing how and why these events occur. With the growing field of next-generation DNA sequencing, there is an increasing need for high-quality nucleic acids derived from human cells and tissues, a prerequisite for successful cell profiling. Although advances in RNA preservation have been made, some of the largest biobanks still do not employ RNA blood preservation as standard because of limitations in low blood-input volume and RNA stability over the whole gene body. Therefore, we have developed a robust protocol for blood preservation and long-term storage while maintaining RNA integrity. Furthermore, we explored the possibility of using the protocol for preserving rare cell samples, such as circulating tumor cells. The results of our study confirmed that gene expression was not impacted by the preservation procedure (r² > 0.88) or by long-term storage (r² = 0.95), with RNA integrity number values averaging over 8. Similarly, cell surface antigens were still available for antibody selection (r² = 0.95). Lastly, data mining for fusion events showed that it was possible to detect rare tumor cells among a background of other cells present in blood irrespective of fixation. Thus, the developed protocol would be suitable for rare blood cell preservation followed by RNA sequencing analysis.

  7. Ribosomal RNA sequence suggests microsporidia are extremely ancient eukaryotes.

    PubMed

    Vossbrinck, C R; Maddox, J V; Friedman, S; Debrunner-Vossbrinck, B A; Woese, C R

    The microsporidia are a group of unusual, obligately parasitic protists that infect a great variety of other eukaryotes, including vertebrates, arthropods, molluscs, annelids, nematodes, cnidaria and even various ciliates, myxosporidia and gregarines. They possess a number of unusual cytological and molecular characteristics. Their nuclear division is considered to be primitive, they have no mitochondria, their ribosomes and ribosomal RNAs are reported to be of prokaryotic size and their large ribosomal subunit contains no 5.8S rRNA. The uniqueness of the microsporidia may reflect their phylogenetic position, because comparative sequence analysis shows that the small subunit rRNA of the microsporidium Vairimorpha necatrix is more unlike those of other eukaryotes than any known eukaryote 18S rRNA sequence. We conclude that the lineage leading to microsporidia branched very early from that leading to other eukaryotes.

  8. siRNA release from pri-miRNA scaffolds is controlled by the sequence and structure of RNA.

    PubMed

    Galka-Marciniak, Paulina; Olejniczak, Marta; Starega-Roslan, Julia; Szczesniak, Michal W; Makalowska, Izabela; Krzyzosiak, Wlodzimierz J

    2016-04-01

    shmiRs are pri-miRNA-based RNA interference triggers from which exogenous siRNAs are expressed in cells to silence target genes. These reagents are very promising tools in RNAi in vivo applications due to their good activity profile and lower toxicity than observed for other vector-based reagents such as shRNAs. In this study, using high-resolution northern blotting and small RNA sequencing, we investigated the precision with which RNases Drosha and Dicer process shmiRs. The fidelity of siRNA release from the commonly used pri-miRNA shuttles was found to depend on both the siRNA insert and the pri-miR scaffold. Then, we searched for specific factors that may affect the precision of siRNA release and found that both the structural features of shmiR hairpins and the nucleotide sequence at Drosha and Dicer processing sites contribute to cleavage site selection and cleavage precision. An analysis of multiple shRNA intermediates generated from several reagents revealed the complexity of shmiR processing by Drosha and demonstrated that Dicer selects substrates for further processing. Aside from providing new basic knowledge regarding the specificity of nucleases involved in miRNA biogenesis, our results facilitate the rational design of more efficient genetic reagents for RNAi technology. PMID:26921501

  9. Learning to Predict miRNA-mRNA Interactions from AGO CLIP Sequencing and CLASH Data

    PubMed Central

    Lu, Yuheng; Leslie, Christina S.

    2016-01-01

    Recent technologies like AGO CLIP sequencing and CLASH enable direct transcriptome-wide identification of AGO binding and miRNA target sites, but the most widely used miRNA target prediction algorithms do not exploit these data. Here we use discriminative learning on AGO CLIP and CLASH interactions to train a novel miRNA target prediction model. Our method combines two SVM classifiers, one to predict miRNA-mRNA duplexes and a second to learn a binding model of AGO’s local UTR sequence preferences and positional bias in 3’UTR isoforms. The duplex SVM model enables the prediction of non-canonical target sites and more accurately resolves miRNA interactions from AGO CLIP data than previous methods. The binding model is trained using a multi-task strategy to learn context-specific and common AGO sequence preferences. The duplex and common AGO binding models together outperform existing miRNA target prediction algorithms on held-out binding data. Open source code is available at https://bitbucket.org/leslielab/chimiric. PMID:27438777

  10. Learning to Predict miRNA-mRNA Interactions from AGO CLIP Sequencing and CLASH Data.

    PubMed

    Lu, Yuheng; Leslie, Christina S

    2016-07-01

    Recent technologies like AGO CLIP sequencing and CLASH enable direct transcriptome-wide identification of AGO binding and miRNA target sites, but the most widely used miRNA target prediction algorithms do not exploit these data. Here we use discriminative learning on AGO CLIP and CLASH interactions to train a novel miRNA target prediction model. Our method combines two SVM classifiers, one to predict miRNA-mRNA duplexes and a second to learn a binding model of AGO's local UTR sequence preferences and positional bias in 3'UTR isoforms. The duplex SVM model enables the prediction of non-canonical target sites and more accurately resolves miRNA interactions from AGO CLIP data than previous methods. The binding model is trained using a multi-task strategy to learn context-specific and common AGO sequence preferences. The duplex and common AGO binding models together outperform existing miRNA target prediction algorithms on held-out binding data. Open source code is available at https://bitbucket.org/leslielab/chimiric. PMID:27438777

  11. Dynamics in Sequence Space for RNA Secondary Structure Design.

    PubMed

    Matthies, Marco C; Bienert, Stefan; Torda, Andrew E

    2012-10-01

    We have implemented a method for the design of RNA sequences that should fold to arbitrary secondary structures. A popular energy model allows one to take the derivative with respect to composition, which can then be interpreted as a force and used for Newtonian dynamics in sequence space. Combined with a negative design term, one can rapidly sample sequences which are compatible with a desired secondary structure via simulated annealing. Results for 360 structures were compared with those from another nucleic acid design program using measures such as the probability of the target structure and an ensemble-weighted distance to the target structure.

  12. Direct Sequence Detection of Structured H5 Influenza Viral RNA

    PubMed Central

    Kerby, Matthew B.; Freeman, Sarah; Prachanronarong, Kristina; Artenstein, Andrew W.; Opal, Steven M.; Tripathi, Anubhav

    2008-01-01

    We describe the development of sequence-specific molecular beacons (dual-labeled DNA probes) for identification of the H5 influenza subtype, cleavage motif, and receptor specificity when hybridized directly with in vitro transcribed viral RNA (vRNA). The cloned hemagglutinin segment from a highly pathogenic H5N1 strain, A/Hanoi/30408/2005(H5N1), isolated from humans was used as template for in vitro transcription of sense-strand vRNA. The hybridization behavior of vRNA and a conserved subtype probe was characterized experimentally by varying conditions of time, temperature, and Mg2+ to optimize detection. Comparison of the hybridization rates of probe to DNA and RNA targets indicates that conformational switching of influenza RNA structure is a rate-limiting step and that the secondary structure of vRNA dominates the binding kinetics. The sensitivity and specificity of probe recognition of other H5 strains was calculated from sequence matches to the National Center for Biotechnology Information influenza database. The hybridization specificity of the subtype probes was experimentally verified with point mutations within the probe loop at five locations corresponding to the other human H5 strains. The abundance frequencies of the hemagglutinin cleavage motif and sialic acid recognition sequences were experimentally tested for H5 in all host viral species. Although the detection assay must be coupled with isothermal amplification on the chip, the new probes form the basis of a portable point-of-care diagnostic device for influenza subtyping. PMID:18403607

  13. Principles of miRNA-mRNA interactions: beyond sequence complementarity.

    PubMed

    Afonso-Grunz, Fabian; Müller, Sören

    2015-08-01

    MicroRNAs (miRNAs) are small non-coding RNAs that post-transcriptionally regulate gene expression by altering the translation efficiency and/or stability of targeted mRNAs. In vertebrates, more than 50% of all protein-coding RNAs are assumed to be subject to miRNA-mediated control, but current high-throughput methods that reliably measure miRNA-mRNA interactions either require prior knowledge of target mRNAs or elaborate preparation procedures. Consequently, experimentally validated interactions are relatively rare. Furthermore, in silico prediction based on sequence complementarity of miRNAs and their corresponding target sites suffers from extremely high false positive rates. Apparently, sequence complementarity alone is often insufficient to reflect the complex post-transcriptional regulation of mRNAs by miRNAs, which is especially true for animals. Therefore, combined analysis of small non-coding and protein-coding RNAs is indispensable to better understand and predict the complex dynamics of miRNA-regulated gene expression. Single-nucleotide polymorphisms (SNPs) and alternative polyadenylation (APA) can affect miRNA binding of a given transcript from different individuals and tissues, and especially APA is currently emerging as a major factor that contributes to variations in miRNA-mRNA interplay in animals. In this review, we focus on the influence of APA and SNPs on miRNA-mediated gene regulation and discuss the computational approaches that take these mechanisms into account.

  14. Structurally complex and highly active RNA ligases derived from random RNA sequences

    NASA Technical Reports Server (NTRS)

    Ekland, E. H.; Szostak, J. W.; Bartel, D. P.

    1995-01-01

    Seven families of RNA ligases, previously isolated from random RNA sequences, fall into three classes on the basis of secondary structure and regiospecificity of ligation. Two of the three classes of ribozymes have been engineered to act as true enzymes, catalyzing the multiple-turnover transformation of substrates into products. The most complex of these ribozymes has a minimal catalytic domain of 93 nucleotides. An optimized version of this ribozyme has a kcat exceeding one per second, a value far greater than that of most natural RNA catalysts and approaching that of comparable protein enzymes. The fact that such a large and complex ligase emerged from a very limited sampling of sequence space implies the existence of a large number of distinct RNA structures of equivalent complexity and activity.

  15. Finding the most significant common sequence and structure motifs in a set of RNA sequences.

    PubMed Central

    Gorodkin, J; Heyer, L J; Stormo, G D

    1997-01-01

    We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences, but maintains tractability by constructing multi-sequence alignments from pairwise comparisons. The algorithm finds the multiple alignments using a greedy approach and has similarities to both CLUSTAL and CONSENSUS, but the core algorithm assures that the pairwise alignments are optimized for both sequence and structure conservation. The choice of scoring system and the method of progressively constructing the final solution are important considerations that are discussed. Example solutions, and comparisons with other approaches, are provided. The solutions include finding consensus structures identical to published ones. PMID:9278497

  16. Cell growth inhibition by sequence-specific RNA minihelices.

    PubMed Central

    Hipps, D; Schimmel, P

    1995-01-01

    RNA minihelices which reconstruct the 12 base pair acceptor-T psi C domains of transfer RNAs interact with their cognate tRNA synthetases. These substrates lack the anticodons of the genetic code and, therefore, cannot participate in steps of protein synthesis subsequent to aminoacylation. We report here that expression in Escherichia coli of either of two minihelices, each specific for a different amino acid, inhibited cell growth. Inhibition appears to be due to direct competition between the minihelix and its related tRNA for binding to their common synthetase. This competition, in turn, sharply lowers the pool of the specific charged tRNA for protein synthesis. Inhibition is relieved by single nucleotide changes which disrupt the minihelix-synthetase interaction. The results suggest that sequence-specific RNA minihelix substrates bind to cognate synthetases in vivo and can, in principle, act as cell growth regulators. Naturally occurring non-tRNA substrates for aminoacylation may serve a similar purpose. PMID:7664744

  17. Using Small RNA Deep Sequencing Data to Detect Human Viruses.

    PubMed

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F; Fei, ZhangJun; Zhu, Xiao; Gao, Shan

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans. PMID:27066498

  18. Using Small RNA Deep Sequencing Data to Detect Human Viruses

    PubMed Central

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F.; Fei, ZhangJun; Zhu, Xiao

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans. PMID:27066498

  19. DNA slip-outs cause RNA polymerase II arrest in vitro: potential implications for genetic instability

    PubMed Central

    Salinas-Rios, Viviana; Belotserkovskii, Boris P.; Hanawalt, Philip C.

    2011-01-01

    The abnormal number of repeats found in triplet repeat diseases arises from ‘repeat instability’, in which the repetitive section of DNA is subject to a change in copy number. Recent studies implicate transcription in a mechanism for repeat instability proposed to involve RNA polymerase II (RNAPII) arrest caused by a CTG slip-out, triggering transcription-coupled repair (TCR), futile cycles of which may lead to repeat expansion or contraction. In the present study, we use defined DNA constructs to directly test whether the structures formed by CAG and CTG repeat slip-outs can cause transcription arrest in vitro. We found that a slip-out of (CAG)20 or (CTG)20 repeats on either strand causes RNAPII arrest in HeLa cell nuclear extracts. Perfect hairpins and loops on either strand also cause RNAPII arrest. These findings are consistent with a transcription-induced repeat instability model in which transcription arrest in mammalian cells may initiate a ‘gratuitous’ TCR event leading to a change in repeat copy number. An understanding of the underlying mechanism of repeat instability could lead to intervention to slow down expansion and delay the onset of many neurodegenerative diseases in which triplet repeat expansion is implicated. PMID:21666257

  20. HLA typing from RNA-Seq sequence reads.

    PubMed

    Boegel, Sebastian; Löwer, Martin; Schäfer, Michael; Bukur, Thomas; de Graaf, Jos; Boisguérin, Valesca; Türeci, Ozlem; Diken, Mustafa; Castle, John C; Sahin, Ugur

    2012-01-01

    We present a method, seq2HLA, for obtaining an individual's human leukocyte antigen (HLA) class I and II type and expression using standard next generation sequencing RNA-Seq data. RNA-Seq reads are mapped against a reference database of HLA alleles, and HLA type, confidence score and locus-specific expression level are determined. We successfully applied seq2HLA to 50 individuals included in the HapMap project, yielding 100% specificity and 94% sensitivity at a P-value of 0.1 for two-digit HLA types. We determined HLA type and expression for previously un-typed Illumina Body Map tissues and a cohort of Korean patients with lung cancer. Because the algorithm uses standard RNA-Seq reads and requires no change to laboratory protocols, it can be used for both existing datasets and future studies, thus adding a new dimension for HLA typing and biomarker studies. PMID:23259685

  1. Understanding mechanisms underlying human gene expression variation with RNA sequencing

    PubMed Central

    Pickrell, Joseph K.; Marioni, John C.; Pai, Athma A.; Degner, Jacob F.; Engelhardt, Barbara E.; Nkadori, Everlyne; Veyrieras, Jean-Baptiste; Stephens, Matthew; Gilad, Yoav; Pritchard, Jonathan K.

    2011-01-01

    Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal1. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project2. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals. PMID:20220758

  2. Sequence of rice hoja blanca tenuivirus RNA-2.

    PubMed

    De Miranda, J R; Muñoz, M; Wu, R; Hull, R; Espinoza, A M

    1996-01-01

    The sequence of rice hoja blanca tenuivirus RNA-2 is analysed and compared to its counterpart in rice stripe tenuivirus. The RNA encodes two proteins, in an ambisense arrangement. The 94 kD pc2, located in the complementary sense RNA, has several features typical of viral membrane (glyco)proteins, and also has regions of local homology to the glycoproteins of the Phleboviruses (Bunyaviridae). The 23 kD pv2 lies in the viral sense RNA and has two small conserved domains that are almost exclusively found in retroviral membrane glycoproteins. Its genome location is analogous to the NSm protein of several of the Bunyaviridae species, which is thought to have a membrane-related function. The two open reading frames are separated by a large intergenic region which, in common with the other tenuivirus ambisense RNA segments, has a short region that is highly conserved between RStV and RHBV. The significance of these results with respect to the virus structure and gene expression is discussed. PMID:8883360

  3. tRNA-Related Sequences Trigger Systemic mRNA Transport in Plants

    PubMed Central

    Zhang, Wenna; Kollwig, Gregor; Apelt, Federico; Walther, Dirk

    2016-01-01

    In plants, protein-coding mRNAs can move via the phloem vasculature to distant tissues, where they may act as non-cell-autonomous signals. Emerging work has identified many phloem-mobile mRNAs, but little is known regarding RNA motifs triggering mobility, the extent of mRNA transport, and the potential of transported mRNAs to be translated into functional proteins after transport. To address these aspects, we produced reporter transcripts harboring tRNA-like structures (TLSs) that were found to be enriched in the phloem stream and in mRNAs moving over chimeric graft junctions. Phenotypic and enzymatic assays on grafted plants indicated that mRNAs harboring a distinctive TLS can move from transgenic roots into wild-type leaves and from transgenic leaves into wild-type flowers or roots; these mRNAs can also be translated into proteins after transport. In addition, we provide evidence that dicistronic mRNA:tRNA transcripts are frequently produced in Arabidopsis thaliana and are enriched in the population of graft-mobile mRNAs. Our results suggest that tRNA-derived sequences with predicted stem-bulge-stem-loop structures are sufficient to mediate mRNA transport and seem to be necessary for the mobility of a large number of endogenous transcripts that can move through graft junctions. PMID:27268430

  4. Sequence and expression of ferredoxin mRNA in barley

    SciTech Connect

    Zielinski, R.; Funder, P.M.; Ling, V.

    1990-05-01

    We have isolated and structurally characterized a full-length cDNA clone encoding ferredoxin from a λgt10 cDNA library prepared from barley leaf mRNA. The ferredoxin clone (pBFD-1) was fused head-to-head with a partial-length cDNA clone encoding calmodulin, and was fortuitously isolated by screening the library with a calmodulin-specific oligonucleotide probe. The mRNA sequence from which pBFD-1 was derived is expressed exclusively in the leaf tissues of 7-d old barley seedlings. Barley pre-ferredoxin has a predicted size of 15.3 kDal, of which 4.6 kDal are accounted for by the transit peptide. The polypeptide encoded by pBFD-1 is identical to wheat ferredoxin, and shares slightly more amino acid sequence similarity with spinach ferredoxin I than with ferredoxin II. Ferredoxin mRNA levels are rapidly increased 10-fold by white light in etiolated barley leaves.

  5. Analysis of microRNA transcriptome by deep sequencing of small RNA libraries of peripheral blood

    PubMed Central

    2010-01-01

    Background: MicroRNAs are a class of small non-coding RNAs that regulate mRNA expression at the post-transcriptional level and thereby many fundamental biological processes. A number of methods, such as multiplex polymerase chain reaction and microarrays, have been developed for profiling levels of known miRNAs. These methods lack the ability to identify novel miRNAs and accurately determine expression at a range of concentrations. Deep or massively parallel sequencing methods are providing suitable platforms for genome-wide transcriptome analysis and have the ability to identify novel transcripts. Results: We present an analysis of small RNA sequences, obtained by Solexa technology, from normal peripheral blood mononuclear cells and the tumor cell lines K562 and HL60. In general, K562 cells displayed an overall low level of miRNAs and also low levels of DICER. Some of the highly expressed miRNAs in the leukocytes include several members of the let-7 family, miR-21, 103, 185, 191 and 320a. Comparison of the miRNA profiles of normal versus K562 or HL60 cells revealed a specific set of differentially expressed molecules. Correlation of the miRNA profiles with mRNA expression profiles, obtained by microarray, revealed a set of target genes showing inverse correlation with miRNA levels. Relative expression levels of individual miRNAs belonging to a cluster were found to be highly variable. Our computational pipeline also predicted a number of novel miRNAs. Some of the predictions were validated by real-time RT-PCR and/or RNase protection assay. Organization of some of the novel miRNAs in the human genome suggests that these may also be part of existing clusters or form new clusters. Conclusions: We conclude that about 904 miRNAs are expressed in human leukocytes. Of these, 370 are novel miRNAs. We have identified miRNAs that are differentially regulated in normal PBMC with respect to cancer cells, K562 and HL60. Our results suggest that post-transcriptional

  6. Adenylylation of small RNA sequencing adapters using the TS2126 RNA ligase I.

    PubMed

    Lama, Lodoe; Ryan, Kevin

    2016-01-01

    Many high-throughput small RNA next-generation sequencing protocols use 5' preadenylylated DNA oligonucleotide adapters during cDNA library preparation. Preadenylylation of the DNA adapter's 5' end frees from ATP-dependence the ligation of the adapter to RNA collections, thereby avoiding ATP-dependent side reactions. However, preadenylylation of the DNA adapters can be costly and difficult. The currently available method for chemical adenylylation of DNA adapters is inefficient and uses techniques not typically practiced in laboratories profiling cellular RNA expression. An alternative enzymatic method using a commercial RNA ligase was recently introduced, but this enzyme works best as a stoichiometric adenylylating reagent rather than a catalyst and can therefore prove costly when several variant adapters are needed or during scale-up or high-throughput adenylylation procedures. Here, we describe a simple, scalable, and highly efficient method for the 5' adenylylation of DNA oligonucleotides using the thermostable RNA ligase 1 from bacteriophage TS2126. Adapters with 3' blocking groups are adenylylated at >95% yield at catalytic enzyme-to-adapter ratios and need not be gel purified before ligation to RNA acceptors. Experimental conditions are also reported that enable DNA adapters with free 3' ends to be 5' adenylylated at >90% efficiency. PMID:26567315

  7. Adenylylation of small RNA sequencing adapters using the TS2126 RNA ligase I.

    PubMed

    Lama, Lodoe; Ryan, Kevin

    2016-01-01

    Many high-throughput small RNA next-generation sequencing protocols use 5' preadenylylated DNA oligonucleotide adapters during cDNA library preparation. Preadenylylation of the DNA adapter's 5' end frees from ATP-dependence the ligation of the adapter to RNA collections, thereby avoiding ATP-dependent side reactions. However, preadenylylation of the DNA adapters can be costly and difficult. The currently available method for chemical adenylylation of DNA adapters is inefficient and uses techniques not typically practiced in laboratories profiling cellular RNA expression. An alternative enzymatic method using a commercial RNA ligase was recently introduced, but this enzyme works best as a stoichiometric adenylylating reagent rather than a catalyst and can therefore prove costly when several variant adapters are needed or during scale-up or high-throughput adenylylation procedures. Here, we describe a simple, scalable, and highly efficient method for the 5' adenylylation of DNA oligonucleotides using the thermostable RNA ligase 1 from bacteriophage TS2126. Adapters with 3' blocking groups are adenylylated at >95% yield at catalytic enzyme-to-adapter ratios and need not be gel purified before ligation to RNA acceptors. Experimental conditions are also reported that enable DNA adapters with free 3' ends to be 5' adenylylated at >90% efficiency.

  8. Avian retroviral RNA encapsidation: reexamination of functional 5' RNA sequences and the role of nucleocapsid Cys-His motifs.

    PubMed Central

    Aronoff, R; Hajjar, A M; Linial, M L

    1993-01-01

    RNA packaging signals (psi) from the 5' ends of murine and avian retroviral genomes have previously been shown to direct encapsidation of heterologous mRNA into the retroviral virion. The avian 5' packaging region has now been further characterized, and we have defined a 270-nucleotide sequence, A psi, which is sufficient to direct packaging of heterologous RNA. Identification of the A psi sequence suggests that several retroviral cis-acting sequences contained in psi+ (the primer binding site, the putative dimer linkage sequence, and the splice donor site) are dispensable for specific RNA encapsidation. Subgenomic env mRNA is not efficiently encapsidated into particles, even though the A psi sequence is present in this RNA. In contrast, spliced heterologous psi-containing RNA is packaged into virions as efficiently as unspliced species; thus splicing per se is not responsible for the failure of env mRNA to be encapsidated. We also found that an avian retroviral mutant deleted for both nucleocapsid Cys-His boxes retains the capacity to encapsidate RNA containing psi sequences, although this RNA is unstable and is thus difficult to detect in mature particles. Electron microscopy reveals that virions produced by this mutant lack a condensed core, which may allow the RNA to be accessible to nucleases. PMID:8380070

  9. Analysis on the preference for sequence matching between mRNA sequences and the corresponding introns in ribosomal protein genes.

    PubMed

    Zhang, Qiang; Li, Hong; Zhao, Xiaoqing; Zheng, Yan; Meng, Hu; Jia, Yun; Xue, Hui; Bo, Sulin

    2016-03-01

    Introns continue to play an important role after splicing. They can contribute to gene expression and regulation by interacting with the corresponding mRNA sequences. Using Smith-Waterman local alignment, we obtained the optimal matched segments between intron sequences and mRNA sequences. Analyzing how the optimal matching regions are distributed along the mRNA sequences of ribosomal protein genes across 27 species, we find a strong interaction between UTR sequences and introns. The mRNAs contain many optimal matching regions as well as low matching regions, and the latter are presumed to be binding regions for protein complexes. The optimal matching frequency distributions differ markedly near mRNA functional sites such as translation initiation and termination sites, exon-exon junctions and EJC regions. These results indicate that intron sequences and mature mRNA sequences have co-evolved and interact to carry out their functions. PMID:26707402
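
    The Smith-Waterman local alignment named in this abstract can be illustrated with a minimal sketch. The scoring values (match +2, mismatch -1, linear gap -2) and the two short test sequences are assumptions chosen for illustration only; the abstract does not state the parameters or data the authors used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring parameters are illustrative assumptions, not values from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            # Clamping at zero is what makes the alignment local.
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical intron and mRNA fragments sharing the segment "ACGU".
    print(smith_waterman("GGGACGUAAA", "CCACGUCC"))
```

    In a study of this kind, the function would be run for each intron against its corresponding mRNA, and the position of the best-scoring segment on the mRNA recorded.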

  10. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background: The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions: This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  11. Assessing long-distance RNA sequence connectivity via RNA-templated DNA–DNA ligation

    PubMed Central

    Roy, Christian K; Olson, Sara; Graveley, Brenton R; Zamore, Phillip D; Moore, Melissa J

    2015-01-01

    Many RNAs, including pre-mRNAs and long non-coding RNAs, can be thousands of nucleotides long and undergo complex post-transcriptional processing. Multiple sites of alternative splicing within a single gene exponentially increase the number of possible spliced isoforms, with most human genes currently estimated to express at least ten. To understand the mechanisms underlying these complex isoform expression patterns, methods are needed that faithfully maintain long-range exon connectivity information in individual RNA molecules. In this study, we describe SeqZip, a methodology that uses RNA-templated DNA–DNA ligation to retain and compress connectivity between distant sequences within single RNA molecules. Using this assay, we test proposed coordination between distant sites of alternative exon utilization in mouse Fn1, and we characterize the extraordinary exon diversity of Drosophila melanogaster Dscam1. DOI: http://dx.doi.org/10.7554/eLife.03700.001 PMID:25866926
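
    The exponential growth in isoform number mentioned in this abstract can be made concrete with a short worked calculation. This sketch assumes every alternative splicing choice is made independently, which is a simplification; the function names are hypothetical, and the Dscam1 cluster sizes (12, 48, 33 and 2 mutually exclusive exon variants) are the commonly cited figures rather than values taken from the abstract.

```python
from math import prod


def cassette_isoforms(n_exons):
    """Isoforms possible when each of n cassette exons is independently kept or skipped."""
    return 2 ** n_exons


def mutually_exclusive_isoforms(cluster_sizes):
    """Isoforms possible when exactly one exon is chosen from each cluster."""
    return prod(cluster_sizes)


if __name__ == "__main__":
    # Four independent cassette exons already give 16 combinations.
    print(cassette_isoforms(4))
    # Commonly cited Dscam1 cluster sizes (assumption, not from the abstract).
    print(mutually_exclusive_isoforms([12, 48, 33, 2]))
```

    Counting combinations this way shows only the theoretical upper bound; which combinations actually occur in single molecules is the question that connectivity-preserving methods are meant to answer.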

  12. Rapid ribosomal RNA sequencing and the phylogenetic analysis of protists.

    PubMed

    Johnson, A M; Baverstock, P R

    1989-04-01

    A newly described technique for rapidly obtaining the partial nucleotide sequence of ribosomal RNA is being applied to investigate phylogenetic relationships among living organisms. Alan Johnson and Peter Baverstock describe the importance of this method to parasitology in providing new information on the phylogenetic relationships of parasitic organisms previously placed in groups of convenience. The phylum Apicomplexa, in particular, has been the object of much study using this technique, but the technology is likely to extend soon to the restructuring of the phylogenetic trees of many groups of parasites.

  13. Small RNA sequencing identifies miRNA roles in ovule and fibre development.

    PubMed

    Xie, Fuliang; Jones, Don C; Wang, Qinglian; Sun, Runrun; Zhang, Baohong

    2015-04-01

    MicroRNAs (miRNAs) have been found to be differentially expressed during cotton fibre development. However, which specific miRNAs are involved in fibre development, and how, is unclear. Here, using deep sequencing, 65 conserved miRNA families were identified and 32 families were differentially expressed between leaf and ovule. At least 40 miRNAs were either leaf or ovule specific, whereas 62 miRNAs were shared in both leaf and ovule. qRT-PCR confirmed these miRNAs were differentially expressed during early fibre development. A total of 820 genes were potentially targeted by the identified miRNAs, whose functions are involved in a series of biological processes including fibre development, metabolism and signal transduction. Many predicted miRNA-target pairs were subsequently validated by degradome sequencing analysis. GO and KEGG analyses showed that the identified miRNAs and their targets were classified to 1027 GO terms including 568 biological processes, 324 molecular functions and 135 cellular components and were enriched to 78 KEGG pathways. At least seven unique miRNAs participate in the trichome regulatory interaction network. Eleven trans-acting siRNA (tasiRNA) candidate genes were also identified in cotton. One has never been found in other plant species and two of them were derived from MYB and ARF, both of which play important roles in cotton fibre development. Sixteen genes were predicted to be tasiRNA targets, including sucrose synthase and MYB2. Together, this study discovered new miRNAs in cotton and offered evidence that miRNAs play important roles in cotton ovule/fibre development. The identification of tasiRNA genes and their targets broadens our understanding of the complicated regulatory mechanism of miRNAs in cotton.

  14. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    PubMed Central

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected controls brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  15. Use of Unamplified RNA/cDNA–Hybrid Nanopore Sequencing for Rapid Detection and Characterization of RNA Viruses

    PubMed Central

    Kilianski, Andy; Roth, Pierce A.; Liem, Alvin T.; Hill, Jessica M.; Willis, Kristen L.; Rossmaier, Rebecca D.; Marinich, Andrew V.; Maughan, Michele N.; Karavis, Mark A.; Kuhn, Jens H.; Honko, Anna N.

    2016-01-01

    Nanopore sequencing, a novel genomics technology, has potential applications for routine biosurveillance, clinical diagnosis, and outbreak investigation of virus infections. Using rapid sequencing of unamplified RNA/cDNA hybrids, we identified Venezuelan equine encephalitis virus and Ebola virus in 3 hours from sample receipt to data acquisition, demonstrating a fieldable technique for RNA virus characterization. PMID:27191483

  16. Use of Unamplified RNA/cDNA-Hybrid Nanopore Sequencing for Rapid Detection and Characterization of RNA Viruses.

    PubMed

    Kilianski, Andy; Roth, Pierce A; Liem, Alvin T; Hill, Jessica M; Willis, Kristen L; Rossmaier, Rebecca D; Marinich, Andrew V; Maughan, Michele N; Karavis, Mark A; Kuhn, Jens H; Honko, Anna N; Rosenzweig, C Nicole

    2016-08-01

    Nanopore sequencing, a novel genomics technology, has potential applications for routine biosurveillance, clinical diagnosis, and outbreak investigation of virus infections. Using rapid sequencing of unamplified RNA/cDNA hybrids, we identified Venezuelan equine encephalitis virus and Ebola virus in 3 hours from sample receipt to data acquisition, demonstrating a fieldable technique for RNA virus characterization. PMID:27191483

  17. PlantMirnaT: miRNA and mRNA integrated analysis fully utilizing characteristics of plant sequencing data.

    PubMed

    Rhee, S; Chae, H; Kim, S

    2015-07-15

    A miRNA is known to regulate up to several hundred coding genes, thus the integrated analysis of miRNA and mRNA expression data is an important problem. Unfortunately, the integrated analysis is challenging since it needs to consider expression data of two different types, miRNA and mRNA, and the target relationship between miRNA and mRNA is not clear, especially when microarray data is used. Fortunately, due to the low sequencing cost, small RNA and RNA sequencing are routinely processed and we may be able to infer regulation relationships between miRNAs and mRNAs more accurately by using sequencing data. However, no method has been developed specifically for sequencing data. Thus we developed PlantMirnaT, a new miRNA-mRNA integrated analysis system. To fully leverage the power of sequencing data, three major features are developed and implemented in PlantMirnaT. First, we implemented a plant-specific short read mapping tool based on recent discoveries on miRNA target relationship in plant. Second, we designed and implemented an algorithm considering miRNA targets in the full intragenic region, not just 3' UTR. Lastly but most importantly, our algorithm is designed to consider quantity of miRNA expression and its distribution on target mRNAs. The new algorithm was used to characterize rice under drought condition using our proprietary data. Our algorithm successfully discovered that two miRNAs, miRNA1425-5p and miRNA 398b, are involved in suppression of glucose pathway in a naturally drought resistant rice, Vandana. The system can be downloaded at https://sites.google.com/site/biohealthinformaticslab/resources. PMID:25863133

  18. Molecular subtyping of leiomyosarcoma with 3′ end RNA sequencing

    PubMed Central

    Guo, Xiangqian; Forgó, Erna; van de Rijn, Matt

    2015-01-01

    Leiomyosarcoma (LMS) is a malignant neoplasm with smooth muscle differentiation. Little is known about its molecular heterogeneity and no targeted therapy currently exists for LMS. We performed expression profiling on 99 cases of LMS with 3′ end RNA sequencing (3SEQ) and demonstrated the existence of 3 molecular subtypes in this cohort. We consequently showed that these molecular subtypes are reproducible using an independent cohort of 82 LMS cases from TCGA. Two new formalin-fixed, paraffin-embedded (FFPE) tissue-compatible diagnostic immunohistochemical markers were identified for two of the three subtypes: LMOD1 for subtype I LMS and ARL4C for subtype II LMS. Subtype I LMS and subtype II LMS were associated with good and poor prognosis, respectively. Here, we describe the details of LMS diagnosis, RNA isolation, 3SEQ library construction, 3SEQ sequencing data analysis and molecular subtype determination. The 3SEQ data produced in this study was deposited into Gene Expression Omnibus (GEO) under GSE45510. PMID:26240788

  19. Nascent RNA sequencing reveals distinct features in plant transcription

    PubMed Central

    Hetzel, Jonathan; Duttke, Sascha H.; Benner, Christopher; Chory, Joanne

    2016-01-01

    Transcriptional regulation of gene expression is a major mechanism used by plants to confer phenotypic plasticity, and yet compared with other eukaryotes or bacteria, little is known about the design principles. We generated an extensive catalog of nascent and steady-state transcripts in Arabidopsis thaliana seedlings using global nuclear run-on sequencing (GRO-seq), 5′GRO-seq, and RNA-seq and reanalyzed published maize data to capture characteristics of plant transcription. De novo annotation of nascent transcripts accurately mapped start sites and unstable transcripts. Examining the promoters of coding and noncoding transcripts identified comparable chromatin signatures, a conserved “TGT” core promoter motif and unreported transcription factor-binding sites. Mapping of engaged RNA polymerases showed a lack of enhancer RNAs, promoter-proximal pausing, and divergent transcription in Arabidopsis seedlings and maize, which are commonly present in yeast and humans. In contrast, Arabidopsis and maize genes accumulate RNA polymerases in proximity of the polyadenylation site, a trend that coincided with longer genes and CpG hypomethylation. Lack of promoter-proximal pausing and a higher correlation of nascent and steady-state transcripts indicate Arabidopsis may regulate transcription predominantly at the level of initiation. Our findings provide insight into plant transcription and eukaryotic gene expression as a whole. PMID:27729530

  20. A method for clustering of miRNA sequences using fragmented programming.

    PubMed

    Ivashchenko, Anatoly; Pyrkova, Anna; Niyazova, Raigul

    2016-01-01

    Clustering of miRNA sequences is an important problem in molecular genetics and the associated cellular biology. Thousands of such sequences are known today through advancement in sophisticated molecular tools, sequencing techniques, computational resources and rule based mathematical models. Analysis of such large-scale miRNA sequences for inferring patterns towards deducing cellular function is a great challenge in modern molecular biology. Therefore, it is of interest to develop mathematical models specific for miRNA sequences. The process is to group (cluster) such miRNA sequences using well-defined known features. We describe a method for clustering of miRNA sequences using fragmented programming. Subsequently, we illustrated the utility of the model using a dendrogram (a tree diagram) for publicly known A. thaliana miRNA nucleotide sequences towards the inference of observed conserved patterns. PMID:27212839

  1. A method for clustering of miRNA sequences using fragmented programming

    PubMed Central

    Ivashchenko, Anatoly; Pyrkova, Anna; Niyazova, Raigul

    2016-01-01

    Clustering of miRNA sequences is an important problem in molecular genetics and the associated cellular biology. Thousands of such sequences are known today through advancement in sophisticated molecular tools, sequencing techniques, computational resources and rule based mathematical models. Analysis of such large-scale miRNA sequences for inferring patterns towards deducing cellular function is a great challenge in modern molecular biology. Therefore, it is of interest to develop mathematical models specific for miRNA sequences. The process is to group (cluster) such miRNA sequences using well-defined known features. We describe a method for clustering of miRNA sequences using fragmented programming. Subsequently, we illustrated the utility of the model using a dendrogram (a tree diagram) for publicly known A. thaliana miRNA nucleotide sequences towards the inference of observed conserved patterns. PMID:27212839

  2. Complete nucleotide sequence of a 16S ribosomal RNA gene from Escherichia coli.

    PubMed Central

    Brosius, J; Palmer, M L; Kennedy, P J; Noller, H F

    1978-01-01

    The complete nucleotide sequence of the 16S RNA gene from the rrnB cistron of Escherichia coli has been determined by using three rapid DNA sequencing methods. Nearly all of the structure has been confirmed by two to six independent sequence determinations on both DNA strands. The length of the 16S rRNA chain inferred from the DNA sequence is 1541 nucleotides, in close agreement with previous estimates. We note discrepancies between this sequence and the most recent version of it reported from direct RNA sequencing [Ehresmann, C., Stiegler, P., Carbon, P. & Ebel, J.P. (1977) FEBS Lett. 84, 337-341]. A few of these may be explained by heterogeneity among 16S rRNA sequences from different cistrons. No nucleotide sequences were found in the 16S rRNA gene that cannot be reconciled with RNase digestion products of mature 16S rRNA. PMID:368799

  3. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability

    PubMed Central

    Hamperl, Stephan; Cimprich, Karlene A.

    2014-01-01

    Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability are not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. PMID:24746923

  4. Improved definition of the mouse transcriptome via targeted RNA sequencing.

    PubMed

    Bussotti, Giovanni; Leonardi, Tommaso; Clark, Michael B; Mercer, Tim R; Crawford, Joanna; Malquori, Lorenzo; Notredame, Cedric; Dinger, Marcel E; Mattick, John S; Enright, Anton J

    2016-05-01

    Targeted RNA sequencing (CaptureSeq) uses oligonucleotide probes to capture RNAs for sequencing, providing enriched read coverage, accurate measurement of gene expression, and quantitative expression data. We applied CaptureSeq to refine transcript annotations in the current murine GRCm38 assembly. More than 23,000 regions corresponding to putative or annotated long noncoding RNAs (lncRNAs) and 154,281 known splicing junction sites were selected for targeted sequencing across five mouse tissues and three brain subregions. The results illustrate that the mouse transcriptome is considerably more complex than previously thought. We assemble more complete transcript isoforms than GENCODE, expand transcript boundaries, and connect interspersed islands of mapped reads. We describe a novel filtering pipeline that identifies previously unannotated but high-quality transcript isoforms. In this set, 911 GENCODE neighboring genes are condensed into 400 expanded gene models. Additionally, 594 GENCODE lncRNAs acquire an open reading frame (ORF) when their structure is extended with CaptureSeq. Finally, we validate our observations using current FANTOM and Mouse ENCODE resources. PMID:27197243

  6. Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA

    PubMed Central

    2013-01-01

    Background: Epstein-Barr virus (EBV) is a human herpesvirus implicated in cancer and autoimmune disorders. Little is known concerning the roles of RNA structure in this important human pathogen. This study provides the first comprehensive genome-wide survey of RNA and RNA structure in EBV. Results: Novel EBV RNAs and RNA structures were identified by computational modeling and RNA-Seq analyses of EBV. Scans of the genomic sequences of four EBV strains (EBV-1, EBV-2, GD1, and GD2) and of the closely related Macacine herpesvirus 4 using the RNAz program discovered 265 regions with high probability of forming conserved RNA structures. Secondary structure models are proposed for these regions based on a combination of free energy minimization and comparative sequence analysis. The analysis of RNA-Seq data uncovered the first observation of a stable intronic sequence RNA (sisRNA) in EBV. The abundance of this sisRNA rivals that of the well-known and highly expressed EBV-encoded non-coding RNAs (EBERs). Conclusion: This work identifies regions of the EBV genome likely to generate functional RNAs and RNA structures, provides structural models for these regions, and discusses potential functions suggested by the modeled structures. Enhanced understanding of the EBV transcriptome will guide future experimental analyses of the discovered RNAs and RNA structures. PMID:23937650

  7. Optimization of shRNA inhibitors by variation of the terminal loop sequence.

    PubMed

    Schopman, Nick C T; Liu, Ying Poi; Konstantinova, Pavlina; ter Brake, Olivier; Berkhout, Ben

    2010-05-01

    Gene silencing by RNA interference (RNAi) can be achieved by intracellular expression of a short hairpin RNA (shRNA) that is processed into the effective small interfering RNA (siRNA) inhibitor by the RNAi machinery. Previous studies indicate that shRNA molecules do not always reflect the activity of corresponding synthetic siRNAs that attack the same target sequence. One obvious difference between these two effector molecules is the hairpin loop of the shRNA. Most studies use the original shRNA design of the pSuper system, but no extensive study regarding optimization of the shRNA loop sequence has been performed. We tested the impact of different hairpin loop sequences, varying in size and structure, on the activity of a set of shRNAs targeting HIV-1. We were able to transform weak inhibitors into intermediate or even strong shRNA inhibitors by replacing the loop sequence. We demonstrate that the efficacy of these optimized shRNA inhibitors is improved significantly in different cell types due to increased siRNA production. These results indicate that the loop sequence is an essential part of the shRNA design. The optimized shRNA loop sequence is generally applicable for RNAi knockdown studies, and will allow us to develop a more potent gene therapy against HIV-1. PMID:20188764

  8. Comparative RNA sequencing reveals substantial genetic variation in endangered primates

    PubMed Central

    Perry, George H.; Melsted, Páll; Marioni, John C.; Wang, Ying; Bainer, Russell; Pickrell, Joseph K.; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D.; Stephens, Matthew; Pritchard, Jonathan K.; Gilad, Yoav

    2012-01-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success. PMID:22207615

  9. The Microglial Sensome Revealed by Direct RNA Sequencing

    PubMed Central

    Hickman, Suzanne E.; Kingery, Nathan D.; Ohsumi, Toshiro; Borowsky, Mark; Wang, Li-chong; Means, Terry K.; Khoury, Joseph El

    2013-01-01

    Microglia, the principal neuroimmune sentinels of the brain, continuously sense changes in their environment and respond to invading pathogens, toxins and cellular debris. Microglia exhibit plasticity and can assume neurotoxic or neuroprotective priming states that determine their responses to danger. We used direct RNA sequencing, without amplification or cDNA synthesis, to determine the quantitative transcriptomes of microglia of healthy adult and aged mice. We validated our findings by fluorescent dual in-situ hybridization, unbiased proteomic analysis and quantitative PCR. We report here that microglia have a distinct transcriptomic signature and express a unique cluster of transcripts encoding proteins for sensing endogenous ligands and microbes that we term the “sensome”. With aging, sensome transcripts for endogenous ligand recognition are downregulated, whereas those involved in microbe recognition and host defense are upregulated. In addition, aging is associated with an overall increase in expression of microglial genes involved in neuroprotection. PMID:24162652

  10. Modulations of RNA sequences by cytokinin in pumpkin cotyledons

    SciTech Connect

    Chang, C.; Ertl, J.; Chen, C.

    1987-04-01

    Polyadenylated mRNAs from excised pumpkin cotyledons treated with or without 10⁻⁴ M benzyladenine (BA) for various time periods in suspension culture were assayed by in vitro translation in the presence of [³⁵S]methionine. The radioactive polypeptides were analyzed by one- and two-dimensional polyacrylamide gel electrophoresis. Specific sequences of mRNAs were enhanced, reduced, induced, or suppressed by the hormone within 60 min of the application of BA to the cotyledons. Four independent cDNA clones of cytokinin-modulated mRNAs have been selected and characterized. RNA blot hybridization using the four cDNA probes also indicates that the levels of specific mRNAs are modulated upward or downward by the hormone.

  11. Sequence of the 16S ribosomal RNA from Halobacterium volcanii, an archaebacterium

    NASA Technical Reports Server (NTRS)

    Gupta, R.; Lanter, J. M.; Woese, C. R.

    1983-01-01

    The sequence of the 16S ribosomal RNA (rRNA) from the archaebacterium Halobacterium volcanii has been determined by DNA sequencing methods. The archaebacterial rRNA is similar to its eubacterial counterpart in secondary structure. Although it is closer in sequence to the eubacterial 16S rRNA than to the eukaryotic 16S-like rRNA, the H. volcanii sequence also shows certain points of specific similarity to its eukaryotic counterpart. Since the H. volcanii sequence is closer to both the eubacterial and the eukaryotic sequences than these two are to one another, it follows that the archaebacterial sequence resembles their common ancestral sequence more closely than does either of the other two versions.

  12. High-throughput illumina strand-specific RNA sequencing library preparation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Conventional Illumina RNA-Seq does not have the resolution to decode the complex eukaryote transcriptome due to the lack of RNA polarity information. Strand-specific RNA sequencing (ssRNA-Seq) can overcome these limitations and as such is better suited for genome annotation, de novo transcriptome as...

  13. Sequence complementarity of sonchus yellow net virus RNA with RNA isolated from the polysomes of infected tobacco.

    PubMed

    Milner, J J; Jackson, A O

    1979-08-01

    Polyribosomal RNA from tobacco infected with sonchus yellow net virus (SYNV) contained sequences which hybridized to 125I-labeled SYNV RNA and which were complementary to 80 to 100% of the viral RNA genome. The poly(A)-containing RNA from polyribosomes was complementary to over 90% of the viral genome, but the polyribosomal RNA lacking poly(A) hybridized to approximately 40 to 60% of the genome. The kinetics of hybridization of all three fractions are best explained by the presence of a single abundance class of viral-complementary RNA. However, titration hybridization of poly(A)+ RNA to an excess of SYNV RNA suggested that viral-complementary sequences which contain poly(A) may vary in concentration over a factor of about fivefold. About 1.5 to 4.6% of the fraction containing poly(A), 0.02 to 0.06% of the fraction lacking poly(A) and 0.04 to 0.18% of the total polyribosomal RNA was complementary to viral RNA as estimated from the kinetics of hybridization. The viral-complementary RNA (vcRNA) was heterogeneous in size, with a modal sedimentation coefficient of 12 S and a profile in sucrose density gradients similar to that of the polyadenylated polyribosomal RNA.

  14. PASTA: splice junction identification from RNA-Sequencing data

    PubMed Central

    2013-01-01

    Background: Next-generation transcriptome sequencing (RNA-Seq) is emerging as a powerful experimental tool for the study of alternative splicing and its regulation, but requires ad-hoc analysis methods and tools. PASTA (Patterned Alignments for Splicing and Transcriptome Analysis) is a splice junction detection algorithm specifically designed for RNA-Seq data, relying on a highly accurate alignment strategy and on a combination of heuristic and statistical methods to identify exon-intron junctions with high accuracy. Results: Comparisons against TopHat and other splice junction prediction software on real and simulated datasets show that PASTA exhibits high specificity and sensitivity, especially at lower coverage levels. Moreover, PASTA is highly configurable and flexible, and can therefore be applied in a wide range of analysis scenarios: it is able to handle both single-end and paired-end reads, it does not rely on the presence of canonical splicing signals, and it uses organism-specific regression models to accurately identify junctions. Conclusions: PASTA is a highly efficient and sensitive tool to identify splicing junctions from RNA-Seq data. Compared to similar programs, it has the ability to identify a higher number of real splicing junctions, and provides highly annotated output files containing detailed information about their location and characteristics. Accurate junction data in turn facilitates the reconstruction of the splicing isoforms and the analysis of their expression levels, which will be performed by the remaining modules of the PASTA pipeline, still under development. Use of PASTA can therefore enable the large-scale investigation of transcription and alternative splicing. PMID:23557086

  15. Tracking Cryptosporidium parvum by sequence analysis of small double-stranded RNA.

    PubMed Central

    Xiao, L.; Limor, J.; Bern, C.; Lal, A. A.

    2001-01-01

    We sequenced a 173-nucleotide fragment of the small double-stranded viruslike RNA of Cryptosporidium parvum isolates from 23 calves and 38 humans. Sequence diversity was detected at 17 sites. Isolates from the same outbreak had identical double-stranded RNA sequences, suggesting that this technique may be useful for tracking Cryptosporidium infection sources. PMID:11266306

  16. FASTR: A novel data format for concomitant representation of RNA sequence and secondary structure information.

    PubMed

    Bose, Tungadri; Dutta, Anirban; Mh, Mohammed; Gandhi, Hemang; Mande, Sharmila S

    2015-09-01

    Given the importance of RNA secondary structures in defining their biological role, it would be convenient for researchers seeking RNA data if both sequence and structural information pertaining to RNA molecules were made available together. Current nucleotide data repositories archive only RNA sequence data. Furthermore, storage formats which can frugally represent RNA sequence as well as structure data in a single file are currently unavailable. This article proposes a novel storage format, 'FASTR', for concomitant representation of RNA sequence and structure. The storage efficiency of the proposed FASTR format has been evaluated using RNA data from various microorganisms. Results indicate that the size of FASTR-formatted files (containing both RNA sequence and structure information) is equivalent to that of FASTA-format files, which contain only RNA sequence information. RNA secondary structure is typically represented using a combination of a string of nucleotide characters along with the corresponding dot-bracket notation indicating structural attributes. 'FASTR', the novel storage format proposed in the present study, enables a frugal representation of both RNA sequence and structural information in the form of a single string. In spite of having a relatively smaller storage footprint, the resultant 'fastr' string(s) retain all sequence as well as secondary structural information that could be stored using a dot-bracket notation. An implementation of the 'FASTR' methodology is available for download at http://metagenomics.atc.tcs.com/compression/fastr.
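
    The dot-bracket notation mentioned in this abstract can be illustrated with a short sketch. This is not the FASTR encoding itself, whose details are not given here; it only shows how a sequence string and a dot-bracket string together describe base pairs. The function name and the toy hairpin are hypothetical, chosen for illustration.

```python
def dot_bracket_pairs(sequence, structure):
    """Return sorted (i, j) index pairs of paired bases from dot-bracket notation.

    '(' opens a pair, ')' closes the most recently opened pair,
    and '.' marks an unpaired base. Indices are 0-based.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return sorted(pairs)


# Hypothetical toy hairpin: three G-C pairs closing a four-base loop.
toy_sequence = "GGGAAAUCCC"
toy_structure = "(((....)))"
for i, j in dot_bracket_pairs(toy_sequence, toy_structure):
    print(i, toy_sequence[i], j, toy_sequence[j])
```

    Storing the sequence and the structure as two parallel strings, as above, doubles the character count relative to the sequence alone, which is the overhead a single-string format aims to avoid.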

  17. Structural instability of human tandemly repeated DNA sequences cloned in yeast artificial chromosome vectors.

    PubMed Central

    Neil, D L; Villasante, A; Fisher, R B; Vetrie, D; Cox, B; Tyler-Smith, C

    1990-01-01

    The suitability of yeast artificial chromosome vectors (YACs) for cloning human Y chromosome tandemly repeated DNA sequences has been investigated. Clones containing DYZ3 or DYZ5 sequences were found in libraries at about the frequency anticipated on the basis of their abundance in the genome, but clones containing DYZ1 sequences were under-represented and the three clones examined contained junctions between DYZ1 and DYZ2. One DYZ3 clone was quite stable and had a long-range structure corresponding to genomic DNA. All other clones had long-range structures which either did not correspond to genomic DNA, or were too unstable to allow a simple comparison. The effects of the transformation process and host genotype on YAC structural stability were investigated. Gross structural rearrangements were often associated with re-transformation of yeast by a YAC. rad1-deficient yeast strains showed levels of instability similar to wild-type for all YAC clones tested. In rad52-deficient strains, DYZ5-containing YACs were as unstable as in the wild-type host, but DYZ1/DYZ2- or DYZ3-containing YACs were more stable. Thus the use of rad52 hosts for future library construction is recommended, but some sequences will still be unstable. PMID:2183192

  18. Targeted RNA Sequencing Assay to Characterize Gene Expression and Genomic Alterations.

    PubMed

    Martin, Dorrelyn P; Miya, Jharna; Reeser, Julie W; Roychowdhury, Sameek

    2016-01-01

    RNA sequencing (RNAseq) is a versatile method that can be utilized to detect and characterize gene expression, mutations, gene fusions, and noncoding RNAs. Standard RNAseq requires 30 - 100 million sequencing reads and can include multiple RNA products such as mRNA and noncoding RNAs. We demonstrate how targeted RNAseq (capture) permits a focused study on selected RNA products using a desktop sequencer. RNAseq capture can characterize unannotated, low, or transiently expressed transcripts that may otherwise be missed using traditional RNAseq methods. Here we describe the extraction of RNA from cell lines, ribosomal RNA depletion, cDNA synthesis, preparation of barcoded libraries, hybridization and capture of targeted transcripts and multiplex sequencing on a desktop sequencer. We also outline the computational analysis pipeline, which includes quality control assessment, alignment, fusion detection, gene expression quantification and identification of single nucleotide variants. This assay allows for targeted transcript sequencing to characterize gene expression, gene fusions, and mutations. PMID:27585245

  19. The nucleotide sequence of spinach chloroplast tryptophan transfer RNA.

    PubMed Central

    Canaday, J; Guillemaut, P; Gloeckler, R; Weil, J H

    1981-01-01

    Spinach chloroplast tRNATrp, purified by column chromatography and two-dimensional gel electrophoresis, has been sequenced using in vitro labeling techniques. The sequence is: pG-C-G-C-U-C-U-U-A-G-U-U-C-A-G-U-U-C-Gm-G-D-A-G-A-A-C-m2G-psi-G-G-G-psi-C-U-C-A-A*-A-A-C-C-C-G-A-U-G-N-C-G-U-A-G-G-T-psi-C-A-A-G-U-C-C-U-A-C-A-G-A-G-C-G-U-G-C-C-AOH. Like the E. coli suppressor tRNA psu+UGA, which translates both the opal terminator codon U-G-A and the tryptophan codon U-G-G, spinach chloroplast tRNATrp has C-C-A as an anticodon and contains an A-U pair in the D-stem. PMID:6907845
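
    The relationship between the C-C-A anticodon and the U-G-G tryptophan codon can be checked with a minimal sketch. It assumes strict Watson-Crick pairing only, so it does not model the non-standard pairing that lets the suppressor tRNA also read U-G-A; the function name is hypothetical.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon):
    """Return the codon (5'->3') that pairs with an anticodon (5'->3').

    Codon and anticodon pair antiparallel, so the codon is the
    reverse complement of the anticodon.
    """
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


print(codon_for_anticodon("CCA"))  # UGG, the tryptophan codon
```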

  20. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons.

    PubMed

    Olson, Nathan D; Lund, Steven P; Zook, Justin M; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  2. The structure of the yeast ribosomal RNA genes. I. The complete nucleotide sequence of the 18S ribosomal RNA gene from Saccharomyces cerevisiae.

    PubMed

    Rubtsov, P M; Musakhanov, M M; Zakharyev, V M; Krayev, A S; Skryabin, K G; Bayev, A A

    1980-12-11

    The cloned 18 S ribosomal RNA gene from Saccharomyces cerevisiae has been sequenced using the Maxam-Gilbert procedure. From these data the complete sequence of 1789 nucleotides of the 18 S RNA was deduced. Extensive homology with many eucaryotic as well as E. coli ribosomal small subunit rRNAs (S-rRNAs) has been observed in the 3'-end region of the rRNA molecule. Comparison of the yeast 18 S rRNA sequence with partial sequence data available for rRNAs of other eucaryotes provides strong evidence that a substantial portion of the 18 S RNA sequence has been conserved in evolution.

  3. Two sequence classes of kinetoplastid 5S ribosomal RNA gene revealed among bodonid spliced leader RNA gene arrays.

    PubMed

    Santana, D M; Lukes, J; Sturm, N R; Campbell, D A

    2001-11-13

    The spliced leader RNA genes of Bodo saltans, Cryptobia helicis and Dimastigella trypaniformis were analyzed as molecular markers for additional taxa within the suborder Bodonina. The non-transcribed spacer regions were distinctive for each organism, and 5S rRNA genes were present in Bodo and Dimastigella but not in C. helicis. Two sequence classes of 5S rRNA were evident from analysis of the bodonid genes. The two classes of 5S rRNA genes were found in other Kinetoplastids independent of co-localization with the spliced leader RNA gene.

  4. Comparison of Ribosomal RNA Removal Methods for Transcriptome Sequencing Workflows in Teleost Fish.

    PubMed

    Abernathy, Jason; Overturf, Ken

    2016-01-01

    RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. These methods have been well documented in mammals but typically require some optimization for lower vertebrates. Three commercial kits, the Dynabeads mRNA Purification Kit, the RiboMinus Eukaryote System v2, and the Ribo-Zero Gold rRNA Removal Kit, were examined for their ability to remove rRNAs from rainbow trout (Oncorhynchus mykiss) RNA isolations. Total RNA was isolated from liver and muscle tissue samples (n = 24) and rRNAs were removed using one of the three kits. Samples were analyzed visually on the Agilent Bioanalyzer and by Illumina RNA-Seq, screening for Oncorhynchus rRNAs. There were significant differences between the kits with regard to their ability to remove rRNA, with per-kit averages of 2.74% to 10.94% rRNA sequences left behind. Using the Bioanalyzer to evaluate ribosomal contamination in rRNA-depleted samples for RNA-Seq was good for detecting samples with higher concentrations of rRNA (>5%), but not very accurate at lower levels. Although all three kits were able to remove a substantial portion of the rRNA from different fish tissues, the Ribo-Zero Gold rRNA Removal Kit eliminated significantly more contaminating ribosomal RNAs than the others.

  5. Nucleolar localization elements in U8 snoRNA differ from sequences required for rRNA processing.

    PubMed Central

    Lange, T S; Borovjagin, A V; Gerbi, S A

    1998-01-01

    U8 small nucleolar RNA (snoRNA) is essential for metazoan ribosomal RNA (rRNA) processing in nucleoli. The sequences and structural features in Xenopus U8 snoRNA that are required for its nucleolar localization were analyzed. Fluorescein-labeled U8 snoRNA was injected into Xenopus oocyte nuclei, and fluorescence microscopy of nucleolar preparations revealed that wild-type Xenopus U8 snoRNA localized to nucleoli, regardless of the presence or nature of the 5' cap on the injected U8 snoRNA. Nucleolar localization was observed when loops or stems in the 5' portion of U8 that are critical for U8 snoRNA function in rRNA processing were mutated. Therefore, sites of interaction in U8 snoRNA that potentially tether it to pre-rRNA are not essential for nucleolar localization of U8. Boxes C and D are known to be nucleolar localization elements (NoLEs) for U8 snoRNA and other snoRNAs of the Box C/D family. However, the spatial relationship of Box C to Box D was not crucial for U8 nucleolar localization, as demonstrated here by deletion of sequences in the two stems that separate them. These U8 mutants can localize to nucleoli and function in rRNA processing as well. The single-stranded Cup region in U8, adjacent to evolutionarily conserved Box C, functions as a NoLE in addition to Boxes C and D. Cup is unique to U8 snoRNA and may help bind putative protein(s) needed for nucleolar localization. Alternatively, Cup may help to retain U8 snoRNA within the nucleolus. PMID:9671052

  6. Differences in genome-wide repeat sequence instability conferred by proofreading and mismatch repair defects

    PubMed Central

    Lujan, Scott A.; Clark, Alan B.; Kunkel, Thomas A.

    2015-01-01

    Mutation rates are used to calibrate molecular clocks and to link genetic variants with human disease. However, mutation rates are not uniform across each eukaryotic genome. Rates for insertion/deletion (indel) mutations have been found to vary widely when examined in vitro and at specific loci in vivo. Here, we report the genome-wide rates of formation and repair of indels made during replication of yeast nuclear DNA. Using over 6000 indels accumulated in four mismatch repair (MMR) defective strains, and statistical corrections for false negatives, we find that indel rates increase by 100 000-fold with increasing homonucleotide run length, representing the greatest effect on replication fidelity of any known genomic parameter. Nonetheless, long genomic homopolymer runs are overrepresented relative to random chance, implying positive selection. Proofreading defects in the replicative polymerases selectively increase indel rates in short repetitive tracts, likely reflecting the distance over which Pols δ and ϵ interact with duplex DNA upstream of the polymerase active site. In contrast, MMR defects hugely increase indel mutagenesis in long repetitive sequences. Because repetitive sequences are not uniformly distributed among genomic functional elements, the quantitatively different consequences on genome-wide repeat sequence instability conferred by defects in proofreading and MMR have important biological implications. PMID:25824945
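
    Because the abstract relates indel rates to homonucleotide run length, a first step in any such analysis is locating the runs. The sketch below is one simple way to do that and is not the authors' pipeline; the function name, the `min_length` parameter, and the toy sequence are hypothetical.

```python
from itertools import groupby


def homopolymer_runs(sequence, min_length=2):
    """Return (start, base, length) for each run of identical bases.

    Only runs of at least `min_length` bases are reported;
    `start` is the 0-based position of the first base in the run.
    """
    runs = []
    position = 0
    for base, group in groupby(sequence.upper()):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


# Hypothetical toy sequence with a 6-base T run and a 3-base A run.
toy = "ACG" + "T" * 6 + "GC" + "A" * 3 + "C"
print(homopolymer_runs(toy, min_length=3))
```

    Binning observed indels by the `length` value of the run they fall in would then give a per-length rate, given a suitable count of runs of each length in the genome.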

  7. A modified RNA-Seq approach for whole genome sequencing of RNA viruses from faecal and blood samples.

    PubMed

    Batty, Elizabeth M; Wong, T H Nicholas; Trebes, Amy; Argoud, Karène; Attar, Moustafa; Buck, David; Ip, Camilla L C; Golubchik, Tanya; Cule, Madeleine; Bowden, Rory; Manganis, Charis; Klenerman, Paul; Barnes, Eleanor; Walker, A Sarah; Wyllie, David H; Wilson, Daniel J; Dingle, Kate E; Peto, Tim E A; Crook, Derrick W; Piazza, Paolo

    2013-01-01

    To date, very large scale sequencing of many clinically important RNA viruses has been complicated by their high population molecular variation, which creates challenges for polymerase chain reaction and sequencing primer design. Many RNA viruses are also difficult or currently not possible to culture, severely limiting the amount and purity of available starting material. Here, we describe a simple, novel, high-throughput approach to Norovirus and Hepatitis C virus whole genome sequence determination based on RNA shotgun sequencing (also known as RNA-Seq). We demonstrate the effectiveness of this method by sequencing three Norovirus samples from faeces and two Hepatitis C virus samples from blood, on an Illumina MiSeq benchtop sequencer. More than 97% of reference genomes were recovered. Compared with Sanger sequencing, our method had no nucleotide differences in 14,019 nucleotides (nt) for Noroviruses (from a total of 2 Norovirus genomes obtained with Sanger sequencing), and 8 variants in 9,542 nt for Hepatitis C virus (1 variant per 1,193 nt). The three Norovirus samples had 2, 3, and 2 distinct positions called as heterozygous, while the two Hepatitis C virus samples had 117 and 131 positions called as heterozygous. To confirm that our sample and library preparation could be scaled to true high-throughput, we prepared and sequenced an additional 77 Norovirus samples in a single batch on an Illumina HiSeq 2000 sequencer, recovering >90% of the reference genome in all but one sample. No discrepancies were observed across 118,757 nt compared between Sanger and our custom RNA-Seq method in 16 samples. By generating viral genomic sequences that are not biased by primer-specific amplification or enrichment, this method offers the prospect of large-scale, affordable studies of RNA viruses which could be adapted to routine diagnostic laboratory workflows in the near future, with the potential to directly characterize within-host viral diversity.

  8. Analysis of minimal promoter sequences for plus-strand synthesis by the Cucumber necrosis virus RNA-dependent RNA polymerase.

    PubMed

    Panavas, T; Pogany, J; Nagy, P D

    2002-05-10

    Tombusviruses are small, plus-sense, single-stranded RNA viruses of plants. A partially purified RNA-dependent RNA polymerase (RdRp) preparation of Cucumber necrosis virus (CNV), which is capable of de novo initiation of complementary RNA synthesis from either plus-strand or minus-strand templates, was used to dissect minimal promoter sequences for tombusviruses and their defective interfering (DI) RNAs. In vitro RdRp assay revealed that the core plus-strand initiation promoter included only the 3'-terminal 11 nucleotides. A hypothetical promoter-like sequence, which has been termed consensus sequence by Wu and White (1998, J. Virol. 72, 9897-9905), is recognized less efficiently by the CNV RdRp than the core plus-strand initiation promoter. The CNV RdRp can efficiently recognize the core plus-strand initiation promoter for a satellite RNA associated with the distantly related Turnip crinkle virus, while artificial AU- or GC-rich 3'-terminal sequences make poor templates in the in vitro assays. Comparison of the "strength" of minimal plus-strand and minus-strand initiation promoters reveals that the latter is almost twice as efficient in promoting complementary RNA synthesis. Template competition experiments, however, suggest that the minimal plus-strand initiation promoter makes an RNA template more competitive than the minimal minus-strand initiation promoter. Taken together, these results demonstrate that promoter recognition by the tombusvirus RdRp requires only short sequences present at the 3' end of templates.

  9. Optimal terminal sequences for continuous or serial isothermal amplification of dsRNA with norovirus RNA replicase.

    PubMed

    Arai, Hidenao; Nishigaki, Koichi; Nemoto, Naoto; Suzuki, Miho; Husimi, Yuzuru

    2014-01-01

    The norovirus RNA replicase (NV3D(pol), 56 kDa, single chain monomeric protein) can amplify double-stranded (ds) RNA isothermally. In in vitro evolution it can serve as an alternative to the traditional Qβ RNA replicase, which cannot amplify dsRNA and consists of four subunits, three of which are borrowed from the host E. coli. In order to identify the optimal 3'-terminal sequence of the RNA template for NV3D(pol), an in vitro selection using serial transfer was performed for a random library having the 3'-terminal sequence of ---UUUUUUNNNN-3'. The population landscape on the 4-dimensional sequence space at the 17th round of transfer gave a main peak around ---CAAC-3'. In the preceding studies on the batch amplification reaction starting from a single-stranded RNA, a template with a 3'-terminal C-stretch was amplified effectively. It was confirmed that in the batch amplification ---CCC-3' was much more effective than ---CAAC-3', but under the serial transfer condition, in which ---CAAC-3' was sustained stably, ---CCC-3' was washed out. Based on these results we proposed the existence of the "shuttle mode" replication of dsRNA. We also proposed the optimal terminal sequences of RNA for in vitro evolution with NV3D(pol). PMID:27493494

  10. Equally parsimonious pathways through an RNA sequence space are not equally likely

    NASA Technical Reports Server (NTRS)

    Lee, Y. H.; DSouza, L. M.; Fox, G. E.

    1997-01-01

    An experimental system for determining the potential ability of sequences resembling 5S ribosomal RNA (rRNA) to perform as functional 5S rRNAs in vivo in the Escherichia coli cellular environment was devised previously. Presumably, the only 5S rRNA sequences that would have been fixed by ancestral populations are ones that were functionally valid, and hence the actual historical paths taken through RNA sequence space during 5S rRNA evolution would have most likely utilized valid sequences. Herein, we examine the potential validity of all sequence intermediates along alternative equally parsimonious trajectories through RNA sequence space which connect two pairs of sequences that had previously been shown to behave as valid 5S rRNAs in E. coli. The first trajectory requires a total of four changes. The 14 sequence intermediates provide 24 apparently equally parsimonious paths by which the transition could occur. The second trajectory involves three changes, six intermediate sequences, and six potentially equally parsimonious paths. In total, only eight of the 20 sequence intermediates were found to be clearly invalid. As a consequence of the position of these invalid intermediates in the sequence space, seven of the 30 possible paths consisted of exclusively valid sequences. In several cases, the apparent validity/invalidity of the intermediate sequences could not be anticipated on the basis of current knowledge of the 5S rRNA structure. This suggests that the interdependencies in RNA sequence space may be more complex than currently appreciated. If ancestral sequences predicted by parsimony are to be regarded as actual historical sequences, then the present results would suggest that they should also satisfy a validity requirement and that, in at least limited cases, this conjecture can be tested experimentally.
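
    The counts quoted in this abstract follow from simple combinatorics, which the sketch below reproduces. It assumes each trajectory consists of k independent single-site changes that can be made in any order, so that an intermediate is any sequence carrying some but not all of the changes; the function name is hypothetical.

```python
from itertools import combinations, permutations


def count_intermediates_and_paths(n_changes):
    """Count intermediates and shortest paths between two sequences.

    With n_changes differing sites, an intermediate carries a non-empty
    proper subset of the changes (2**n - 2 of them), and each ordering of
    the changes is one equally parsimonious path (n! of them).
    """
    sites = range(n_changes)
    intermediates = sum(
        1 for k in range(1, n_changes) for _ in combinations(sites, k)
    )
    paths = sum(1 for _ in permutations(sites))
    return intermediates, paths


first = count_intermediates_and_paths(4)   # four-change trajectory
second = count_intermediates_and_paths(3)  # three-change trajectory
print(first, second)
print(first[0] + second[0], first[1] + second[1])  # totals across both
```

    Under that assumption the four-change trajectory has 14 intermediates and 24 paths and the three-change trajectory has 6 of each, giving the 20 intermediates and 30 paths reported above.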

  11. Identification and characterization of microRNA sequences from bovine mammary epithelial cells.

    PubMed

    Bu, D P; Nan, X M; Wang, F; Loor, J J; Wang, J Q

    2015-03-01

    The bovine mammary gland is composed of various cell types including bovine mammary epithelial cells (BMEC). The use of BMEC to uncover the microRNA (miRNA) profile would allow us to obtain a more specific profile of miRNA sequences that could be associated with lactation and avoid interference from other cell types. The objective of this study was to characterize the miRNA sequences expressed in isolated BMEC. The miRNA were identified by Solexa sequencing technology (Illumina Inc., San Diego, CA). Furthermore, novel miRNA were uncovered by stem-loop reverse transcription-PCR and sequencing of PCR products. To detect tissue specificity, expression of novel miRNA sequences was measured by stem-loop RT-PCR and sequencing of PCR products in mammary gland, liver, adipose, ileum, spleen and kidney tissue from 3 lactating Holstein cows (50±10 d postpartum). After bioinformatics analysis, 12,323,451 reads were obtained by Solexa sequencing, of which 11,979,706 were clean reads, matching the bovine genome. Among clean reads, 9,428,122 belonged to miRNA sequences. Further analysis revealed that the miRNA bta-mir-184 had the most abundant expression, and 388 loci possessed the typical stem-loop structures matching known miRNA hairpins. In total, 38 loci with novel hairpins were identified as novel miRNA and were numbered from bta-U1 to bta-U38. One novel miRNA (bta-U21) was specific to mammary gland. Seven novel miRNA, including bta-U21, had tissue-restricted distribution. Uncovering the specific roles of these novel miRNA during lactation appears warranted.
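
    The read counts reported in this abstract can be turned into retention fractions with a few lines of arithmetic. The sketch uses only the three counts given above; the variable and function names are hypothetical.

```python
# Read counts as reported in the abstract.
total_reads = 12_323_451
clean_reads = 11_979_706
mirna_reads = 9_428_122


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


clean_pct = percent(clean_reads, total_reads)  # share of reads kept as clean
mirna_pct = percent(mirna_reads, clean_reads)  # share of clean reads that are miRNA

print(f"clean reads: {clean_pct:.1f}% of all reads")
print(f"miRNA reads: {mirna_pct:.1f}% of clean reads")
```

    Roughly 97% of reads pass as clean reads, and close to 79% of those clean reads belong to miRNA sequences.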

  12. Identification of extracellular miRNA in human cerebrospinal fluid by next-generation sequencing.

    PubMed

    Burgos, Kasandra Lovette; Javaherian, Ashkan; Bomprezzi, Roberto; Ghaffari, Layla; Rhodes, Susan; Courtright, Amanda; Tembe, Waibhav; Kim, Seungchan; Metpally, Raghu; Van Keuren-Jensen, Kendall

    2013-05-01

    There has been a growing interest in using next-generation sequencing (NGS) to profile extracellular small RNAs from the blood and cerebrospinal fluid (CSF) of patients with neurological diseases, CNS tumors, or traumatic brain injury for biomarker discovery. Small sample volumes and samples with low RNA abundance create challenges for downstream small RNA sequencing assays. Plasma, serum, and CSF contain low amounts of total RNA, of which small RNAs make up a fraction. The purpose of this study was to maximize RNA isolation from RNA-limited samples and apply these methods to profile the miRNA in human CSF by small RNA deep sequencing. We systematically tested RNA isolation efficiency using ten commercially available kits and compared their performance on human plasma samples. We used RiboGreen to quantify total RNA yield and custom TaqMan assays to determine the efficiency of small RNA isolation for each of the kits. We significantly increased the recovery of small RNA by repeating the aqueous extraction during the phenol-chloroform purification in the top performing kits. We subsequently used the methods with the highest small RNA yield to purify RNA from CSF and serum samples from the same individual. We then prepared small RNA sequencing libraries using Illumina's TruSeq sample preparation kit and sequenced the samples on the HiSeq 2000. Not surprisingly, we found that the miRNA expression profile of CSF is substantially different from that of serum. To our knowledge, this is the first time that the small RNA fraction from CSF has been profiled using next-generation sequencing.

  13. Prolonged exposure to methylglyoxal causes disruption of vascular KATP channel by mRNA instability

    PubMed Central

    Yang, Yang; Li, Shanshan; Konduru, Anuhya S.; Zhang, Shuang; Trower, Timothy C.; Shi, Weiwei; Cui, Ningren; Yu, Lei; Wang, Yali; Zhu, Daling

    2012-01-01

    Diabetes mellitus is characterized by hyperglycemia and excessive production of intermediary metabolites including methylglyoxal (MGO), a reactive carbonyl species that can lead to cell injuries. Interacting with proteins, lipids, and DNA, excessive MGO can cause dysfunction of various tissues, especially the vascular walls where diabetic complications often take place. However, the potential vascular targets of excessive MGO remain to be fully understood. Here we show that the vascular Kir6.1/SUR2B isoform of ATP-sensitive K+ (KATP) channels is likely to be disrupted by exposure to submillimolar MGO. Up to 90% of the Kir6.1/SUR2B currents were suppressed by 1 mM MGO with a time constant of ∼2 h. Consistently, MGO treatment caused a vast reduction of both Kir6.1 and SUR2B mRNAs endogenously expressed in the A10 vascular smooth muscle cells. In the presence of the transcriptional inhibitor actinomycin-D, MGO still lowered the Kir6.1 and SUR2B mRNAs to the same degree as MGO alone, suggesting that MGO is likely to compromise mRNA stability. Luciferase reporter assays indicated that the 3′-untranslated region (UTR) of the Kir6.1 but not the SUR2 mRNA was targeted by MGO. In contrast, the SUR2B mRNAs obtained with in vitro transcription were disrupted by MGO directly, while the Kir6.1 transcripts were unaffected. Consistent with these results, the constriction of mesenteric arterial rings was markedly augmented by exposure to 1 mM MGO for 2 h, and this MGO effect was totally eliminated in the presence of glibenclamide. These results therefore suggest that acting on the 3′-UTR of Kir6.1 and the coding region of SUR2B, MGO causes instability of Kir6.1 and SUR2B mRNAs, disruption of vascular KATP channels, and impairment of arterial function. PMID:22972803

  15. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    PubMed

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA. PMID:27117661

  16. Characterization of MazF-Mediated Sequence-Specific RNA Cleavage in Pseudomonas putida Using Massive Parallel Sequencing.

    PubMed

    Miyamoto, Tatsuki; Kato, Yuka; Sekiguchi, Yuji; Tsuneda, Satoshi; Noda, Naohiro

    2016-01-01

    Under environmental stress, microbes are known to alter their translation patterns using sequence-specific endoribonucleases that we call RNA interferases. However, there has been limited insight regarding which RNAs are specifically cleaved by these RNA interferases, hence their physiological functions remain unknown. In the current study, we developed a novel method to effectively identify cleavage specificities with massive parallel sequencing. This approach uses artificially designed RNAs composed of diverse sequences, which do not form extensive secondary structures, and it correctly identified the cleavage sequence of a well-characterized Escherichia coli RNA interferase, MazF, as ACA. In addition, we also determined that an uncharacterized MazF homologue isolated from Pseudomonas putida specifically recognizes the unique triplet, UAC. Using a real-time fluorescence resonance energy transfer assay, the UAC triplet was further proved to be essential for cleavage in P. putida MazF. These results highlight an effective method to determine cleavage specificity of RNA interferases.
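To make concrete what recognition of a triplet such as ACA or UAC means at the sequence level, the sketch below scans an RNA string for every occurrence of a motif. It is a simple string search for illustration, not the sequencing-based method the abstract describes; the function name `find_motif_sites` and the example sequence are hypothetical.

```python
def find_motif_sites(rna, motif):
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of 'motif' in 'rna'. DNA-style T is treated as U."""
    rna = rna.upper().replace("T", "U")
    motif = motif.upper().replace("T", "U")
    if not motif:
        raise ValueError("motif must not be empty")
    sites = []
    start = rna.find(motif)
    while start != -1:
        sites.append(start)
        # Advance by one base so overlapping matches are also reported.
        start = rna.find(motif, start + 1)
    return sites


# Hypothetical sequence, made up for this example only.
example_rna = "GGACAUACAGUACCACAC"

if __name__ == "__main__":
    print("ACA sites:", find_motif_sites(example_rna, "ACA"))
    print("UAC sites:", find_motif_sites(example_rna, "UAC"))
```

A motif hit marks a candidate recognition site only; where within or next to the triplet cleavage occurs is not stated in the abstract and is not modelled here.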

  17. Distinct tmRNA sequence elements facilitate RNase R engagement on rescued ribosomes for selective nonstop mRNA decay.

    PubMed

    Venkataraman, Krithika; Zafar, Hina; Karzai, A Wali

    2014-01-01

    trans-Translation, orchestrated by SmpB and tmRNA, is the principal eubacterial pathway for resolving stalled translation complexes. RNase R, the leading nonstop mRNA surveillance factor, is recruited to stalled ribosomes in a trans-translation dependent process. To elucidate the contributions of SmpB and tmRNA to RNase R recruitment, we evaluated Escherichia coli-Francisella tularensis chimeric variants of tmRNA and SmpB. This evaluation showed that while the hybrid tmRNA supported nascent polypeptide tagging and ribosome rescue, it suffered defects in facilitating RNase R recruitment to stalled ribosomes. To gain further insights, we used established tmRNA and SmpB variants that impact distinct stages of the trans-translation process. Analysis of select tmRNA variants revealed that the sequence composition and positioning of the ultimate and penultimate codons of the tmRNA ORF play a crucial role in recruiting RNase R to rescued ribosomes. Evaluation of defined SmpB C-terminal tail variants highlighted the importance of establishing the tmRNA reading frame, and provided valuable clues into the timing of RNase R recruitment to rescued ribosomes. Taken together, these studies demonstrate that productive RNase R-ribosomes engagement requires active trans-translation, and suggest that RNase R captures the emerging nonstop mRNA at an early stage after establishment of the tmRNA ORF as the surrogate mRNA template.

  18. Evaluating methods for isolating total RNA and predicting the success of sequencing phylogenetically diverse plant transcriptomes.

    PubMed

    Johnson, Marc T J; Carpenter, Eric J; Tian, Zhijian; Bruskiewich, Richard; Burris, Jason N; Carrigan, Charlotte T; Chase, Mark W; Clarke, Neil D; Covshoff, Sarah; Depamphilis, Claude W; Edger, Patrick P; Goh, Falicia; Graham, Sean; Greiner, Stephan; Hibberd, Julian M; Jordon-Thaden, Ingrid; Kutchan, Toni M; Leebens-Mack, James; Melkonian, Michael; Miles, Nicholas; Myburg, Henrietta; Patterson, Jordan; Pires, J Chris; Ralph, Paula; Rolf, Megan; Sage, Rowan F; Soltis, Douglas; Soltis, Pamela; Stevenson, Dennis; Stewart, C Neal; Surek, Barbara; Thomsen, Christina J M; Villarreal, Juan Carlos; Wu, Xiaolei; Zhang, Yong; Deyholos, Michael K; Wong, Gane Ka-Shu

    2012-01-01

    Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate total RNA from 1115 samples from 695 plant species in 324 families, which represents >900 million years of phylogenetic diversity from green algae through flowering plants, including many plants of economic importance. We then sequenced 629 of these samples on Illumina GAIIx and HiSeq platforms and performed a large comparative analysis to identify predictors of RNA quality and the diversity of putative genes (scaffolds) expressed within samples. Tissue types (e.g., leaf vs. flower) varied in RNA quality, sequencing depth and the number of scaffolds. Tissue age also influenced RNA quality but not the number of scaffolds ≥ 1000 bp. Overall, 36% of the variation in the number of scaffolds was explained by metrics of RNA integrity (RIN score), RNA purity (OD 260/230), sequencing platform (GAIIx vs HiSeq) and the amount of total RNA used for sequencing. However, our results show that the most commonly used measures of RNA quality (e.g., RIN) are weak predictors of the number of scaffolds because Illumina sequencing is robust to variation in RNA quality. These results provide novel insight into the methods that are most important in isolating high quality RNA for sequencing and assembling plant transcriptomes. The methods and recommendations provided here could increase the efficiency and decrease the cost of RNA sequencing for individual labs and genome centers.

  19. Characterising the Canine Oral Microbiome by Direct Sequencing of Reverse-Transcribed rRNA Molecules.

    PubMed

    McDonald, James E; Larsen, Niels; Pennington, Andrea; Connolly, John; Wallis, Corrin; Rooks, David J; Hall, Neil; McCarthy, Alan J; Allison, Heather E

    2016-01-01

    PCR amplification and sequencing of phylogenetic markers, primarily Small Sub-Unit ribosomal RNA (SSU rRNA) genes, has been the paradigm for defining the taxonomic composition of microbiomes. However, 'universal' SSU rRNA gene PCR primer sets are likely to miss much of the diversity therein. We sequenced a library comprising purified and reverse-transcribed SSU rRNA (RT-SSU rRNA) molecules from the canine oral microbiome and compared it to a general bacterial 16S rRNA gene PCR amplicon library generated from the same biological sample. In addition, we have developed BIONmeta, a novel, open-source, computer package for the processing and taxonomic classification of the randomly fragmented RT-SSU rRNA reads produced. Direct RT-SSU rRNA sequencing revealed that 16S rRNA molecules belonging to the bacterial phyla Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria and Spirochaetes, were most abundant in the canine oral microbiome (92.5% of total bacterial SSU rRNA). The direct rRNA sequencing approach detected greater taxonomic diversity (1 additional phylum, 2 classes, 1 order, 10 families and 61 genera) when compared with general bacterial 16S rRNA amplicons from the same sample, simultaneously provided SSU rRNA gene inventories of Bacteria, Archaea and Eukarya, and detected significant numbers of sequences not recognised by 'universal' primer sets. Proteobacteria and Spirochaetes were found to be under-represented by PCR-based analysis of the microbiome, and this was due to primer mismatches and taxon-specific variations in amplification efficiency, validated by qPCR analysis of 16S rRNA amplicons from a mock community. This demonstrated the veracity of direct RT-SSU rRNA sequencing for molecular microbial ecology. PMID:27276347

  20. Characterising the Canine Oral Microbiome by Direct Sequencing of Reverse-Transcribed rRNA Molecules

    PubMed Central

    McDonald, James E.; Larsen, Niels; Pennington, Andrea; Connolly, John; Wallis, Corrin; Rooks, David J.; Hall, Neil; McCarthy, Alan J.; Allison, Heather E.

    2016-01-01

    PCR amplification and sequencing of phylogenetic markers, primarily Small Sub-Unit ribosomal RNA (SSU rRNA) genes, has been the paradigm for defining the taxonomic composition of microbiomes. However, ‘universal’ SSU rRNA gene PCR primer sets are likely to miss much of the diversity therein. We sequenced a library comprising purified and reverse-transcribed SSU rRNA (RT-SSU rRNA) molecules from the canine oral microbiome and compared it to a general bacterial 16S rRNA gene PCR amplicon library generated from the same biological sample. In addition, we have developed BIONmeta, a novel, open-source, computer package for the processing and taxonomic classification of the randomly fragmented RT-SSU rRNA reads produced. Direct RT-SSU rRNA sequencing revealed that 16S rRNA molecules belonging to the bacterial phyla Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria and Spirochaetes, were most abundant in the canine oral microbiome (92.5% of total bacterial SSU rRNA). The direct rRNA sequencing approach detected greater taxonomic diversity (1 additional phylum, 2 classes, 1 order, 10 families and 61 genera) when compared with general bacterial 16S rRNA amplicons from the same sample, simultaneously provided SSU rRNA gene inventories of Bacteria, Archaea and Eukarya, and detected significant numbers of sequences not recognised by ‘universal’ primer sets. Proteobacteria and Spirochaetes were found to be under-represented by PCR-based analysis of the microbiome, and this was due to primer mismatches and taxon-specific variations in amplification efficiency, validated by qPCR analysis of 16S rRNA amplicons from a mock community. This demonstrated the veracity of direct RT-SSU rRNA sequencing for molecular microbial ecology. PMID:27276347

  1. The genomic RNA1 and RNA2 sequences of the tobacco rattle virus isolates found in Polish potato fields.

    PubMed

    Yin, Zhimin; Pawełkowicz, Magdalena; Michalak, Krystyna; Chrzanowska, Mirosława; Zimnoch-Guzowska, Ewa

    2014-06-24

    Four tobacco rattle virus (TRV) isolates were identified from tobacco bait seedlings planted in soil samples from Polish potato fields. Sequence analysis of the genomic RNA1 of the isolates revealed significant similarity to the isolates Ho and AL recently found in Germany. Multiple sequence alignments of the genomic RNA2 indicated that the two isolates from northern Poland (Deb57 and Slu24) are in a cluster with the isolates PSG and PLB found in the Netherlands. The remaining two isolates, from central Poland (11r21 and Mlo7), are in a distinct group with the unique isolate SYM found in England. The RNA2 sequences of the studied isolates range from 1998 nt to 2739 nt in length, and all carry deletions of the 2b and/or 2c genes. The isolate Mlo7 has an atypical RNA2 structure, having its cp gene located in its central region. PMID:24637409

  2. The nucleotide sequence of 5S rRNA from a cellular slime mold Dictyostelium discoideum.

    PubMed

    Hori, H; Osawa, S; Iwabuchi, M

    1980-12-11

    The nucleotide sequence of ribosomal 5S rRNA from a cellular slime mold Dictyostelium discoideum is GUAUACGGCCAUACUAGGUUGGAAACACAUCAUCCCGUUCGAUCUGAUAAGUAAAUCGACCUCAGGCCUUCCAAGUACUCUGGUUGGAGACAACAGGGGAACAUAGGGUGCUGUAUACU. A model for the secondary structure of this 5S rRNA is proposed. The sequence is more similar to those of animals (62% similarity on average) than to those of yeasts (56%).

  3. Tetrathiobacter kashmirensis Strain CA-1 16S rRNA gene complete sequence.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This study used 1326 base pair 16S rRNA gene sequence methods to confirm the identification of a bacterium as Tetrathiobacter kashmirensis. Morphological, biochemical characteristics, and fatty acid profiles are consistent with the 16S rRNA gene sequence identification of the bacterium. The isolate...

  4. MicroRNA Target Site Identification by Integrating Sequence and Binding Information

    PubMed Central

    Majoros, William H.; Lekprasert, Parawee; Mukherjee, Neelanjan; Skalsky, Rebecca L.; Corcoran, David L.; Cullen, Bryan R.; Ohler, Uwe

    2013-01-01

    High-throughput sequencing has opened numerous possibilities for the identification of regulatory RNA-binding events. Cross-linking and immunoprecipitation of Argonaute protein members can pinpoint microRNA target sites within tens of bases, but leaves the identity of the microRNA unresolved. A flexible computational framework that integrates sequence with cross-linking features reliably identifies the microRNA family involved in each binding event, considerably outperforms sequence-only approaches, and quantifies the prevalence of noncanonical binding modes. PMID:23708386

  5. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  6. Strand-specific libraries for high throughput RNA sequencing (RNA-Seq) prepared without poly(A) selection

    PubMed Central

    2012-01-01

    Background: High throughput DNA sequencing technology has enabled quantification of all the RNAs in a cell or tissue, a method widely known as RNA sequencing (RNA-Seq). However, non-coding RNAs such as rRNA are highly abundant and can consume >70% of sequencing reads. A common approach is to extract only polyadenylated mRNA; however, such approaches are blind to RNAs with short or no poly(A) tails, leading to an incomplete view of the transcriptome. Another challenge of preparing RNA-Seq libraries is to preserve the strand information of the RNAs. Design: Here, we describe a procedure for preparing RNA-Seq libraries from 1 to 4 μg total RNA without poly(A) selection. Our method combines the deoxyuridine triphosphate (dUTP)/uracil-DNA glycosylase (UDG) strategy to achieve strand specificity with AMPure XP magnetic beads to perform size selection. Together, these steps eliminate gel purification, allowing a library to be made in less than two days. We barcode each library during the final PCR amplification step, allowing several samples to be sequenced in a single lane without sacrificing read length. Libraries prepared using this protocol are compatible with Illumina GAII, GAIIx and HiSeq 2000 platforms. Discussion: The RNA-Seq protocol described here yields strand-specific transcriptome libraries without poly(A) selection, which provide approximately 90% mappable sequences. Typically, more than 85% of mapped reads correspond to protein-coding genes and only 6% derive from non-coding RNAs. The protocol has been used to measure RNA transcript identity and abundance in tissues from flies, mice, rats, chickens, and frogs, demonstrating its general applicability. PMID:23273270

  7. Sequence analysis of 16S rRNA from mycoplasmas by direct solid-phase DNA sequencing.

    PubMed Central

    Pettersson, B; Johansson, K E; Uhlén, M

    1994-01-01

    Automated solid-phase DNA sequencing was used for determination of partial 16S ribosomal DNA sequences of mycoplasmas. The sequence information was used to establish phylogenetic relationships of 11 different mycoplasmas whose 16S rRNA sequences had not been determined earlier. A biotinylated fragment corresponding to positions 344 to 939 in the Escherichia coli sequence was generated by PCR. The PCR product was immobilized onto streptavidin-coated paramagnetic beads, and direct sequencing was performed in both directions. One previously unclassified avian mycoplasma was found to belong to the Mycoplasma lipophilum cluster of the hominis group. Microheterogeneities were discovered in the rRNA operons of Mycoplasma mycoides subsp. mycoides (SC type), confirming the existence of two different rRNA operons. The 16S rRNA sequence of M. mycoides subsp. capri was identical to that of M. mycoides subsp. mycoides (type SC), except that no microheterogeneities were revealed. Furthermore, automated solid-phase DNA sequencing was used to identify a mycoplasmal contamination of a cell culture as Mycoplasma hyorhinis, which proved to be very difficult by conventional methods. The results suggest that the direct solid-phase DNA sequencing procedure is a powerful tool for identification of mycoplasmas and is also useful in taxonomic studies. PMID:7521158

  8. Role of the 5' leader sequence of alfalfa mosaic virus RNA 3 in replication and translation of the viral RNA.

    PubMed Central

    van der Vossen, E A; Neeleman, L; Bol, J F

    1993-01-01

    RNA 3 of alfalfa mosaic virus (AlMV) encodes the movement protein P3 and the viral coat protein, which is translated from the subgenomic RNA 4. The 5'-leader sequences of RNA 3 of AlMV strains S, A, and Y differ in length from 314 to 392 nucleotides and contain a variable number of internal control regions of type 2 (ICR2 motifs), each located in a 27 nt repeat. Infectious cDNA clones were used to exchange the leader sequences of the three strains. This revealed that the leader sequence controls the specific ratio in which RNAs 3 and 4 are synthesized for each strain. In addition, it specifies strain-specific differences in the kinetics of P3 accumulation in plants. Subsequent deletion analysis revealed that a 5'-sequence of 112 nt containing one ICR2 motif was sufficient for a 10 to 20% level of RNA 3 accumulation in protoplasts and a delayed accumulation in plants. An additional leader sequence of maximally 114 nt, containing two ICR2 motifs, was required to permit wildtype levels of RNA 3 accumulation. The effect of deletions in the leader sequence on P3 synthesis in vitro and in vivo was investigated. PMID:8464726

  9. JAR3D Webserver: Scoring and aligning RNA loop sequences to known 3D motifs

    PubMed Central

    Roll, James; Zirbel, Craig L.; Sweeney, Blake; Petrov, Anton I.; Leontis, Neocles

    2016-01-01

    Many non-coding RNAs have been identified and may function by forming 2D and 3D structures. RNA hairpin and internal loops are often represented as unstructured on secondary structure diagrams, but RNA 3D structures show that most such loops are structured by non-Watson–Crick basepairs and base stacking. Moreover, different RNA sequences can form the same RNA 3D motif. JAR3D finds possible 3D geometries for hairpin and internal loops by matching loop sequences to motif groups from the RNA 3D Motif Atlas, by exact sequence match when possible, and by probabilistic scoring and edit distance for novel sequences. The scoring gauges the ability of the sequences to form the same pattern of interactions observed in 3D structures of the motif. The JAR3D webserver at http://rna.bgsu.edu/jar3d/ takes one or many sequences of a single loop as input, or else one or many sequences of longer RNAs with multiple loops. Each sequence is scored against all current motif groups. The output shows the ten best-matching motif groups. Users can align input sequences to each of the motif groups found by JAR3D. JAR3D will be updated with every release of the RNA 3D Motif Atlas, and so its performance is expected to improve over time. PMID:27235417
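The abstract mentions edit distance as one way of relating a novel loop sequence to known ones. As a minimal sketch of that general idea, the code below computes the standard Levenshtein distance between two sequences and ranks candidates by it. This is not JAR3D's implementation, which also uses probabilistic scoring; the function names and the example sequences are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning 'a' into 'b'."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def rank_by_edit_distance(query, known_sequences):
    """Return known sequences sorted from closest to farthest (stable sort)."""
    return sorted(known_sequences, key=lambda seq: edit_distance(query, seq))


if __name__ == "__main__":
    # Hypothetical loop sequences, not taken from the RNA 3D Motif Atlas.
    known = ["GAAA", "GCAA", "UUCG", "CUUGG"]
    print(rank_by_edit_distance("GAGA", known))
```

Plain edit distance treats all substitutions equally; a scoring scheme that reflects which bases can form the same interactions would need additional information beyond what this sketch uses.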

  10. Replicating satellite RNA induces sequence-specific DNA methylation and truncated transcripts in plants.

    PubMed Central

    Wang, M B; Wesley, S V; Finnegan, E J; Smith, N A; Waterhouse, P M

    2001-01-01

    Tobacco plants were transformed with a chimeric transgene comprising sequences encoding beta-glucuronidase (GUS) and the satellite RNA (satRNA) of cereal yellow dwarf luteovirus. When transgenic plants were infected with potato leafroll luteovirus (PLRV), which replicated the transgene-derived satRNA to a high level, the satellite sequence of the GUS:Sat transgene became densely methylated. Within the satellite region, all 86 cytosines in the upper strand and 73 of the 75 cytosines in the lower strand were either partially or fully methylated. In contrast, very low levels of DNA methylation were detected in the satellite sequence of the transgene in uninfected plants and in the flanking nonsatellite sequences in both infected and uninfected plants. Substantial amounts of truncated GUS:Sat RNA accumulated in the satRNA-replicating plants, and most of the molecules terminated at nucleotides within the first 60 bp of the satellite sequence. Whereas this RNA truncation was associated with high levels of satRNA replication, it appeared to be independent of the levels of DNA methylation in the satellite sequence, suggesting that it is not caused by methylation. All the sequenced GUS:Sat DNA molecules were hypermethylated in plants with replicating satRNA despite the phloem restriction of the helper PLRV. Also, small, sense and antisense approximately 22 nt RNAs, derived from the satRNA, were associated with the replicating satellite. These results suggest that the sequence-specific DNA methylation spread into cells in which no satRNA replication occurred and that this was mediated by the spread of unamplified satRNA and/or its associated 22 nt RNA molecules. PMID:11214177

  11. Highly efficient ligation of small RNA molecules for microRNA quantitation by high-throughput sequencing.

    PubMed

    Lee, Jerome E; Yi, Rui

    2014-01-01

    MiRNA cloning and high-throughput sequencing, termed miR-Seq, stands alone as a transcriptome-wide approach to quantify miRNAs with single nucleotide resolution. This technique captures miRNAs by attaching 3' and 5' oligonucleotide adapters to miRNA molecules and allows de novo miRNA discovery. Coupled with powerful next-generation sequencing platforms, miR-Seq has been instrumental in the study of miRNA biology. However, significant biases introduced by oligonucleotide ligation steps have prevented miR-Seq from being employed as an accurate quantitation tool. Previous studies demonstrate that biases in current miR-Seq methods often lead to inaccurate miRNA quantification, with errors up to 1,000-fold for some miRNAs. To resolve these biases imparted by RNA ligation, we have developed a small RNA ligation method that results in ligation efficiencies of over 95% for both 3' and 5' ligation steps. Benchmarking this improved library construction method with equimolar or differentially mixed synthetic miRNAs consistently yields read numbers with less than two-fold deviation from the expected value. Furthermore, this high-efficiency miR-Seq method permits accurate genome-wide miRNA profiling from in vivo total RNA samples. PMID:25490151
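The "fold deviation from the expected value" criterion in this abstract can be illustrated with a small calculation. The sketch below assumes an equimolar mix, so that each synthetic miRNA is expected to receive an equal share of the reads, and expresses deviation as a symmetric fold change. The function names and the read counts are hypothetical and are not data from the study.

```python
def fold_deviation(observed, expected):
    """Symmetric fold deviation: 2.0 means either twice or half the
    expected value. Both arguments must be positive."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    ratio = observed / expected
    return max(ratio, 1 / ratio)


def equimolar_fold_deviations(read_counts):
    """For an equimolar mix, the expected count of each miRNA is the
    mean count; return the fold deviation of each miRNA from that mean."""
    expected = sum(read_counts.values()) / len(read_counts)
    return {name: fold_deviation(count, expected)
            for name, count in read_counts.items()}


# Hypothetical read counts for three synthetic miRNAs.
example_counts = {"miR-a": 900, "miR-b": 1100, "miR-c": 1000}

if __name__ == "__main__":
    for name, deviation in equimolar_fold_deviations(example_counts).items():
        status = "within" if deviation < 2 else "outside"
        print(f"{name}: {deviation:.2f}-fold ({status} two-fold)")
```

For differentially mixed inputs, the expected value per miRNA would instead be its known input proportion times the total read count.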

  12. Integration of Expressed Sequence Tag Data Flanking Predicted RNA Secondary Structures Facilitates Novel Non-Coding RNA Discovery

    PubMed Central

    Krzyzanowski, Paul M.; Price, Feodor D.; Muro, Enrique M.; Rudnicki, Michael A.; Andrade-Navarro, Miguel A.

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, have explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking EST data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  13. The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing

    PubMed Central

    Cieslik, Marcin; Chugh, Rashmi; Wu, Yi-Mi; Wu, Ming; Brennan, Christine; Lonigro, Robert; Su, Fengyun; Wang, Rui; Siddiqui, Javed; Mehra, Rohit; Cao, Xuhong; Lucas, David; Chinnaiyan, Arul M.; Robinson, Dan

    2015-01-01

    RNA-seq by poly(A) selection is currently the most common protocol for whole transcriptome sequencing as it provides a broad, detailed, and accurate view of the RNA landscape. Unfortunately, the utility of poly(A) libraries is greatly limited when the input RNA is degraded, which is the norm for research tissues and clinical samples, especially when specimens are formalin-fixed. To facilitate the use of RNA sequencing beyond cell lines and in the clinical setting, we developed an exome-capture transcriptome protocol with greatly improved performance on degraded RNA. Capture transcriptome libraries enable measuring absolute and differential gene expression, calling genetic variants, and detecting gene fusions. Through validation against gold-standard poly(A) and Ribo-Zero libraries from intact RNA, we show that capture RNA-seq provides accurate and unbiased estimates of RNA abundance, uniform transcript coverage, and broad dynamic range. Unlike poly(A) selection and Ribo-Zero depletion, capture libraries retain these qualities regardless of RNA quality and provide excellent data from clinical specimens including formalin-fixed paraffin-embedded (FFPE) blocks. Systematic improvements across key applications of RNA-seq are shown on a cohort of prostate cancer patients and a set of clinical FFPE samples. Further, we demonstrate the utility of capture RNA-seq libraries in a patient with a highly malignant solitary fibrous tumor (SFT) enrolled in our clinical sequencing program called MI-ONCOSEQ. Capture transcriptome profiling from FFPE revealed two oncogenic fusions: the pathognomonic NAB2-STAT6 inversion and a therapeutically actionable BRAF fusion, which may drive this specific cancer's aggressive phenotype. PMID:26253700

  15. The use of exome capture RNA-seq for highly degraded RNA with application to clinical cancer sequencing.

    PubMed

    Cieslik, Marcin; Chugh, Rashmi; Wu, Yi-Mi; Wu, Ming; Brennan, Christine; Lonigro, Robert; Su, Fengyun; Wang, Rui; Siddiqui, Javed; Mehra, Rohit; Cao, Xuhong; Lucas, David; Chinnaiyan, Arul M; Robinson, Dan

    2015-09-01

    RNA-seq by poly(A) selection is currently the most common protocol for whole transcriptome sequencing as it provides a broad, detailed, and accurate view of the RNA landscape. Unfortunately, the utility of poly(A) libraries is greatly limited when the input RNA is degraded, which is the norm for research tissues and clinical samples, especially when specimens are formalin-fixed. To facilitate the use of RNA sequencing beyond cell lines and in the clinical setting, we developed an exome-capture transcriptome protocol with greatly improved performance on degraded RNA. Capture transcriptome libraries enable measuring absolute and differential gene expression, calling genetic variants, and detecting gene fusions. Through validation against gold-standard poly(A) and Ribo-Zero libraries from intact RNA, we show that capture RNA-seq provides accurate and unbiased estimates of RNA abundance, uniform transcript coverage, and broad dynamic range. Unlike poly(A) selection and Ribo-Zero depletion, capture libraries retain these qualities regardless of RNA quality and provide excellent data from clinical specimens including formalin-fixed paraffin-embedded (FFPE) blocks. Systematic improvements across key applications of RNA-seq are shown on a cohort of prostate cancer patients and a set of clinical FFPE samples. Further, we demonstrate the utility of capture RNA-seq libraries in a patient with a highly malignant solitary fibrous tumor (SFT) enrolled in our clinical sequencing program called MI-ONCOSEQ. Capture transcriptome profiling from FFPE revealed two oncogenic fusions: the pathognomonic NAB2-STAT6 inversion and a therapeutically actionable BRAF fusion, which may drive this specific cancer's aggressive phenotype. PMID:26253700

  16. Globin mRNA reduction for whole-blood transcriptome sequencing

    PubMed Central

    Krjutškov, Kaarel; Koel, Mariann; Roost, Anne Mari; Katayama, Shintaro; Einarsdottir, Elisabet; Jouhilahti, Eeva-Mari; Söderhäll, Cilla; Jaakma, Ülle; Plaas, Mario; Vesterlund, Liselotte; Lohi, Hannes; Salumets, Andres; Kere, Juha

    2016-01-01

    The transcriptome analysis of whole-blood RNA by sequencing holds promise for the identification and tracking of biomarkers; however, the high globin mRNA (gmRNA) content of erythrocytes hampers whole-blood and buffy coat analyses. We introduce a novel gmRNA locking assay (GlobinLock, GL) as a robust and simple gmRNA reduction tool to preserve RNA quality, save time and cost. GL consists of a pair of gmRNA-specific oligonucleotides in RNA initial denaturation buffer that is effective immediately after RNA denaturation and adds only ten minutes of incubation to the whole cDNA synthesis procedure when compared to non-blood RNA analysis. We show that GL is fully effective not only for human samples but also for mouse and rat, and so far incompletely studied cow, dog and zebrafish. PMID:27515369

  17. Globin mRNA reduction for whole-blood transcriptome sequencing.

    PubMed

    Krjutškov, Kaarel; Koel, Mariann; Roost, Anne Mari; Katayama, Shintaro; Einarsdottir, Elisabet; Jouhilahti, Eeva-Mari; Söderhäll, Cilla; Jaakma, Ülle; Plaas, Mario; Vesterlund, Liselotte; Lohi, Hannes; Salumets, Andres; Kere, Juha

    2016-01-01

    The transcriptome analysis of whole-blood RNA by sequencing holds promise for the identification and tracking of biomarkers; however, the high globin mRNA (gmRNA) content of erythrocytes hampers whole-blood and buffy coat analyses. We introduce a novel gmRNA locking assay (GlobinLock, GL) as a robust and simple gmRNA reduction tool to preserve RNA quality, save time and cost. GL consists of a pair of gmRNA-specific oligonucleotides in RNA initial denaturation buffer that is effective immediately after RNA denaturation and adds only ten minutes of incubation to the whole cDNA synthesis procedure when compared to non-blood RNA analysis. We show that GL is fully effective not only for human samples but also for mouse and rat, and so far incompletely studied cow, dog and zebrafish. PMID:27515369

  18. Sequence-specific cleavage of dsRNA by Mini-III RNase

    PubMed Central

    Głów, Dawid; Pianka, Dariusz; Sulej, Agata A.; Kozłowski, Łukasz P.; Czarnecka, Justyna; Chojnowski, Grzegorz; Skowronek, Krzysztof J.; Bujnicki, Janusz M.

    2015-01-01

    Ribonucleases (RNases) play a critical role in RNA processing and degradation by hydrolyzing phosphodiester bonds (exo- or endonucleolytically). Many RNases that cut RNA internally exhibit substrate specificity, but their target sites are usually limited to one or a few specific nucleotides in single-stranded RNA and often in a context of a particular three-dimensional structure of the substrate. Thus far, no RNase counterparts of restriction enzymes have been identified which could cleave double-stranded RNA (dsRNA) in a sequence-specific manner. Here, we present evidence for a sequence-dependent cleavage of long dsRNA by RNase Mini-III from Bacillus subtilis (BsMiniIII). Analysis of the sites cleaved by this enzyme in limited digest of bacteriophage Φ6 dsRNA led to the identification of a consensus target sequence. We defined nucleotide residues within the preferred cleavage site that affected the efficiency of the cleavage and were essential for the discrimination of cleavable versus non-cleavable dsRNA sequences. We have also determined that the loop α5b-α6, a distinctive structural element in Mini-III RNases, is crucial for the specific cleavage, but not for dsRNA binding. Our results suggest that BsMiniIII may serve as a prototype of a sequence-specific dsRNase that could possibly be used for targeted cleavage of dsRNA. PMID:25634891

  19. Discriminative Prediction of A-To-I RNA Editing Events from DNA Sequence

    PubMed Central

    Sun, Jiangming; Singh, Pratibha; Bagge, Annika; Valtat, Bérengère; Vikman, Petter; Spégel, Peter; Mulder, Hindrik

    2016-01-01

    RNA editing is a post-transcriptional alteration of RNA sequences that, via insertions, deletions or base substitutions, can affect protein structure as well as RNA and protein expression. Recently, it has been suggested that RNA editing may be more frequent than previously thought. A great impediment, however, to a deeper understanding of this process is the paramount sequencing effort that needs to be undertaken to identify RNA editing events. Here, we describe an in silico approach, based on machine learning, that ameliorates this problem. Using 41 nucleotide long DNA sequences, we show that novel A-to-I RNA editing events can be predicted from known A-to-I RNA editing events intra- and interspecies. The validity of the proposed method was verified in an independent experimental dataset. Using our approach, 203 202 putative A-to-I RNA editing events were predicted in the whole human genome. Out of these, 9% were previously reported. The remaining sites require further validation, e.g., by targeted deep sequencing. In conclusion, the approach described here is a useful tool to identify potential A-to-I RNA editing events without the requirement of extensive RNA sequencing. PMID:27764195
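
    As a rough illustration of how 41-nucleotide DNA inputs for such a classifier might be prepared, the sketch below extracts a window around every adenosine and one-hot encodes it. Centering the window on the candidate site, the one-hot encoding, and the function names are assumptions made for illustration; the abstract states only the window length and that machine learning was used.

```python
from typing import Iterator, List, Tuple

WINDOW = 41            # window length stated in the abstract
FLANK = WINDOW // 2    # assumption: 20 nt on each side of the candidate A
ALPHABET = "ACGT"


def adenosine_windows(seq: str) -> Iterator[Tuple[int, str]]:
    """Yield (position, 41-nt window) for every A with full flanks available."""
    seq = seq.upper()
    for i in range(FLANK, len(seq) - FLANK):
        if seq[i] == "A":
            yield i, seq[i - FLANK : i + FLANK + 1]


def one_hot(window: str) -> List[int]:
    """Encode a window as 4 binary features per base; unknown bases become all zeros."""
    vec: List[int] = []
    for base in window:
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec
```

    Feature vectors built this way (164 values per window) could be passed to any standard classifier; which features and model the authors actually used is not given here.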

  20. RNA Sequencing of the Exercise Transcriptome in Equine Athletes

    PubMed Central

    Capomaccio, Stefano; Vitulo, Nicola; Verini-Supplizi, Andrea; Barcaccia, Gianni; Albiero, Alessandro; D'Angelo, Michela; Campagna, Davide; Valle, Giorgio; Felicetti, Michela; Silvestrelli, Maurizio; Cappelli, Katia

    2013-01-01

    The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of not annotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream) of the genes, suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases), as well as “nucleic acid binding” and “signal transduction activity” functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress. In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise-induced stress

  1. RNA sequencing of the exercise transcriptome in equine athletes.

    PubMed

    Capomaccio, Stefano; Vitulo, Nicola; Verini-Supplizi, Andrea; Barcaccia, Gianni; Albiero, Alessandro; D'Angelo, Michela; Campagna, Davide; Valle, Giorgio; Felicetti, Michela; Silvestrelli, Maurizio; Cappelli, Katia

    2013-01-01

    The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of not annotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream) of the genes, suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases), as well as "nucleic acid binding" and "signal transduction activity" functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress. In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise-induced stress, including

  2. High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases.

    PubMed

    Qin, Yidan; Yao, Jun; Wu, Douglas C; Nottingham, Ryan M; Mohr, Sabine; Hunicke-Smith, Scott; Lambowitz, Alan M

    2016-01-01

    Next-generation RNA-sequencing (RNA-seq) has revolutionized transcriptome profiling, gene expression analysis, and RNA-based diagnostics. Here, we developed a new RNA-seq method that exploits thermostable group II intron reverse transcriptases (TGIRTs) and used it to profile human plasma RNAs. TGIRTs have higher thermostability, processivity, and fidelity than conventional reverse transcriptases, plus a novel template-switching activity that can efficiently attach RNA-seq adapters to target RNA sequences without RNA ligation. The new TGIRT-seq method enabled construction of RNA-seq libraries from <1 ng of plasma RNA in <5 h. TGIRT-seq of RNA in 1-mL plasma samples from a healthy individual revealed RNA fragments mapping to a diverse population of protein-coding gene and long ncRNAs, which are enriched in intron and antisense sequences, as well as nearly all known classes of small ncRNAs, some of which have never before been seen in plasma. Surprisingly, many of the small ncRNA species were present as full-length transcripts, suggesting that they are protected from plasma RNases in ribonucleoprotein (RNP) complexes and/or exosomes. This TGIRT-seq method is readily adaptable for profiling of whole-cell, exosomal, and miRNAs, and for related procedures, such as HITS-CLIP and ribosome profiling.

  3. Identification of Dirofilaria immitis miRNA using illumina deep sequencing

    PubMed Central

    2013-01-01

    The heartworm Dirofilaria immitis is the causative agent of cardiopulmonary dirofilariosis in dogs and cats, which also infects a wide range of wild mammals and humans. The complex life cycle of D. immitis with several developmental stages in its invertebrate mosquito vectors and its vertebrate hosts indicates the importance of miRNA in growth and development, and their ability to regulate infection of mammalian hosts. This study identified the miRNA profiles of D. immitis of zoonotic significance by deep sequencing. A total of 1063 conserved miRNA candidates, including 68 anti-sense miRNA (miRNA*) sequences, were predicted by computational methods and could be grouped into 808 miRNA families. A significant bias towards family members, family abundance and sequence nucleotides was observed. Thirteen novel miRNA candidates were predicted by alignment with the Brugia malayi genome. Eleven out of 13 predicted miRNA candidates were verified by using a PCR-based method. Target genes of the novel miRNA candidates were predicted by using the heartworm transcriptome dataset. To our knowledge, this is the first report of miRNA profiles in D. immitis, which will contribute to a better understanding of the complex biology of this zoonotic filarial nematode and the molecular regulation roles of miRNA involved. Our findings may also become a useful resource for small RNA studies in other filarial parasitic nematodes. PMID:23331513

  4. RPI-Pred: predicting ncRNA-protein interaction using sequence and structural information

    PubMed Central

    Suresh, V.; Liu, Liang; Adjeroh, Donald; Zhou, Xiaobo

    2015-01-01

    RNA-protein complexes are essential in mediating important fundamental cellular processes, such as transport and localization. In particular, ncRNA-protein interactions play an important role in post-transcriptional gene regulation like mRNA localization, mRNA stabilization, poly-adenylation, splicing and translation. The experimental methods to solve RNA-protein interaction prediction problem remain expensive and time-consuming. Here, we present the RPI-Pred (RNA-protein interaction predictor), a new support-vector machine-based method, to predict protein-RNA interaction pairs, based on both the sequences and structures. The results show that RPI-Pred can correctly predict RNA-protein interaction pairs with ∼94% prediction accuracy when using sequence and experimentally determined protein and RNA structures, and with ∼83% when using sequences and predicted protein and RNA structures. Further, our proposed method RPI-Pred was superior to other existing ones by predicting more experimentally validated ncRNA-protein interaction pairs from different organisms. Motivated by the improved performance of RPI-Pred, we further applied our method for reliable construction of ncRNA-protein interaction networks. The RPI-Pred is publicly available at: http://ctsb.is.wfubmc.edu/projects/rpi-pred. PMID:25609700

  5. New perspectives on the diversification of the RNA interference system: insights from comparative genomics and small RNA sequencing.

    PubMed

    Burroughs, Alexander Maxwell; Ando, Yoshinari; Aravind, L

    2014-01-01

    Our understanding of the pervasive involvement of small RNAs in regulating diverse biological processes has been greatly augmented by recent application of deep-sequencing technologies to small RNA across diverse eukaryotes. We review the currently known small RNA classes and place them in context of the reconstructed evolutionary history of the RNA interference (RNAi) protein machinery. This synthesis indicates that the earliest versions of eukaryotic RNAi systems likely utilized small RNA processed from three types of precursors: (1) sense-antisense transcriptional products, (2) genome-encoded, imperfectly complementary hairpin sequences, and (3) larger noncoding RNA precursor sequences. Structural dissection of PIWI proteins along with recent discovery of novel families (including Med13 of the Mediator complex) suggest that emergence of a distinct architecture with the N-terminal domains (also occurring separately fused to endoDNases in prokaryotes) formed via duplication of an ancestral unit was key to their recruitment as primary RNAi effectors and use of small RNAs of certain preferred lengths. Prokaryotic PIWI proteins are typically components of several RNA-directed DNA restriction or CRISPR/Cas systems. However, eukaryotic versions appear to have emerged from a subset that evolved RNA-directed RNAi. They were recruited alongside RNaseIII domains and RNA-dependent RNA polymerase (RdRP) domains, also from prokaryotic systems, to form the core eukaryotic RNAi system. Like certain regulatory systems, RNAi diversified into two distinct but linked arms concomitant with eukaryotic nucleocytoplasmic compartmentalization. Subsequent elaboration of RNAi proceeded via diversification of the core protein machinery through lineage-specific expansions and recruitment of new components from prokaryotes (nucleases and small RNA-modifying enzymes), allowing for diversification of associating small RNAs. PMID:24311560

  6. Nucleotide sequence of 3' untranslated portion of human alpha globin mRNA.

    PubMed Central

    Wilson, J T; deRiel, J K; Forget, B G; Marotta, C A; Weissman, S M

    1977-01-01

    We have determined the nucleotide sequence of 75 nucleotides of the 3'-untranslated portion of normal human alpha globin mRNA which corresponds to the elongated amino acid sequence of the chain termination mutant Hb Constant Spring. This was accomplished by sequence analysis of cDNA fragments obtained by restriction endonuclease or T4 endonuclease IV cleavage of human globin cDNA synthesized from globin mRNA by use of viral reverse transcriptase. Analysis of cRNA synthesized from cDNA by use of RNA polymerase provided additional confirmatory sequence information. Possible polymorphism has been identified at one site of the sequence. Our sequence overlaps with, and extends, the sequence of 43 nucleotides determined by Proudfoot and coworkers for the very 3'-terminal portion of human alpha globin mRNA. The complete 3'-untranslated sequence of human alpha globin mRNA (112 nucleotides including termination codon) shows little homology to that of the human or rabbit beta globin mRNAs except for the presence of the hexanucleotide sequence AAUAAA which is found in most eukaryotic mRNAs near the 3'-terminal poly(A). PMID:909779

  7. Genome-Wide Probing of RNA Structures In Vitro Using Nucleases and Deep Sequencing.

    PubMed

    Wan, Yue; Qu, Kun; Ouyang, Zhengqing; Chang, Howard Y

    2016-01-01

    RNA structure probing is an important technique for studying the secondary and tertiary conformations of an RNA. While it was traditionally performed on one RNA at a time, recent advances in deep sequencing have enabled the secondary structure mapping of thousands of RNAs simultaneously. Here, we describe the method Parallel Analysis of RNA Structures (PARS), which couples double- and single-strand-specific nuclease probing to high-throughput sequencing. Upon cloning of the cleavage sites into a cDNA library, deep sequencing and mapping of reads to the transcriptome, the positions of paired and unpaired bases along cellular RNAs can be identified. PARS can be performed under diverse solution conditions and on different organismal RNAs to provide genome-wide RNA structural information. This information can also be used to constrain computational predictions to provide better RNA structure models under different conditions. PMID:26483021
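
    One way per-base read counts from the two nuclease treatments could be turned into a pairing signal is sketched below. The log-ratio score, the pseudocount, and the function name are assumptions for illustration; the abstract does not specify how paired and unpaired positions are scored.

```python
import math
from typing import List, Sequence


def pairing_scores(ds_counts: Sequence[int],
                   ss_counts: Sequence[int],
                   pseudocount: float = 1.0) -> List[float]:
    """Per-base log2 ratio of double-strand to single-strand nuclease cut counts.

    Positive values suggest a paired base, negative values an unpaired one.
    The pseudocount (an illustrative choice) avoids division by zero.
    """
    if len(ds_counts) != len(ss_counts):
        raise ValueError("count vectors must cover the same positions")
    return [
        math.log2((d + pseudocount) / (s + pseudocount))
        for d, s in zip(ds_counts, ss_counts)
    ]
```

    In practice the two libraries would first need to be normalized for sequencing depth before such a ratio is meaningful; that step is omitted here.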

  8. Sequence specific detection of bacterial 23S ribosomal RNA by TLR13

    PubMed Central

    Li, Xiao-Dong; Chen, Zhijian J

    2012-01-01

    Toll-like receptors (TLRs) detect microbial infections and trigger innate immune responses. Among vertebrate TLRs, the role of TLR13 and its ligand are unknown. Here we show that TLR13 detects the 23S ribosomal RNA of both gram-positive and gram-negative bacteria. A sequence containing 13 nucleotides near the active site of 23S rRNA ribozyme, which catalyzes peptide bond synthesis, was both necessary and sufficient to trigger TLR13-dependent interleukin-1β production. Single point mutations within this sequence destroyed the ability of the 23S rRNA to stimulate the TLR13 pathway. Knockout of TLR13 in mice abolished the induction of interleukin-1β and other cytokines by the 23S rRNA sequence. Thus, TLR13 detects bacterial RNA with exquisite sequence specificity. DOI: http://dx.doi.org/10.7554/eLife.00102.001 PMID:23110254

  9. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    NASA Astrophysics Data System (ADS)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new methods for sequence-specific RNA degradation, which can inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence and forms a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction proceeds through an oxidative elimination mechanism at the nucleoside downstream of the U of the G•U wobble in the duplex to yield an unnatural RNA terminus, and the whole process is tightly controlled by using light as a switch, which means the cleavage can be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  10. Sequence arrangement of the rRNA genes of the dipteran Sarcophaga bullata.

    PubMed

    French, C K; Fouts, D L; Manning, J E

    1981-06-11

    Velocity sedimentation studies of RNA of Sarcophaga bullata show that the major rRNA species have sedimentation values of 26S and 18S. Analysis of the rRNA under denaturing conditions indicates that there is a hidden break centrally located in the 26S rRNA species. Saturation hybridization studies using total genomic DNA and rRNA show that 0.08% of the nuclear DNA is occupied by rRNA coding sequences and that the average repetition frequency of these coding sequences is approximately 144. The arrangement of the rRNA genes and their spacer sequences on long strands of purified rDNA was determined by the examination of the structure of rRNA:DNA hybrids in the electron microscope. Long DNA strands contain several gene sets (18S + 26S), with one repeat unit containing the following sequences in the order given: (a) an 18S gene of length 2.12 kb, (b) an internal transcribed spacer of length 2.01 kb, which contains a short sequence that may code for a 5.8S rRNA, (c) a 26S gene of length 4.06 kb which, in 20% of the cases, contains an intron with an average length of 5.62 kb, and (d) an external spacer of average length 9.23 kb.
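
    The segment lengths quoted above can be summed to estimate the size of one rDNA repeat unit. The short sketch below does that arithmetic; it assumes that the 4.06 kb figure for the 26S gene excludes the intron and that the averages can simply be added, neither of which the abstract states explicitly.

```python
# Segment lengths in kb, as quoted in the abstract.
GENE_18S = 2.12
INTERNAL_SPACER = 2.01
GENE_26S = 4.06          # assumption: length without the intron
EXTERNAL_SPACER = 9.23   # average length
INTRON_26S = 5.62        # average length, present in 20% of 26S genes
INTRON_FRACTION = 0.20


def repeat_unit_kb(with_intron: bool) -> float:
    """Length of one repeat unit, with or without the 26S intron."""
    total = GENE_18S + INTERNAL_SPACER + GENE_26S + EXTERNAL_SPACER
    return total + (INTRON_26S if with_intron else 0.0)


def mean_repeat_unit_kb() -> float:
    """Expected repeat length when 20% of units carry the intron."""
    return repeat_unit_kb(False) + INTRON_FRACTION * INTRON_26S
```

    Under these assumptions a repeat unit is about 17.4 kb without the intron, about 23.0 kb with it, and about 18.5 kb on average.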

  11. Plant RNA virus sequences identified in kimchi by microbial metatranscriptome analysis.

    PubMed

    Kim, Dong Seon; Jung, Ji Young; Wang, Yao; Oh, Hye Ji; Choi, Dongjin; Jeon, Che Ok; Hahn, Yoonsoo

    2014-07-01

    Plant pathogenic RNA viruses are present in a variety of plant-based foods. When ingested by humans, these viruses can survive the passage through the digestive tract, and are frequently detected in human feces. Kimchi is a traditional fermented Korean food made from cabbage or vegetables, with a variety of other plant-based ingredients, including ground red pepper and garlic paste. We analyzed microbial metatranscriptome data from kimchi at five fermentation stages to identify plant RNA virus-derived sequences. We successfully identified a substantial amount of plant RNA virus sequences, especially during the early stages of fermentation: 23.47% and 16.45% of total clean reads on days 7 and 13, respectively. The most abundant plant RNA virus sequences were from pepper mild mottle virus, a major pathogen of red peppers; this constituted 95% of the total RNA virus sequences identified throughout the fermentation period. We observed distinct sequencing read-depth distributions for plant RNA virus genomes, possibly implying intrinsic and/or technical biases during the metatranscriptome generation procedure. We also identified RNA virus sequences in publicly available microbial metatranscriptome data sets. We propose that metatranscriptome data may serve as a valuable resource for RNA virus detection, and a systematic screening of the ingredients may help prevent the use of virus-infected low-quality materials for food production. PMID:24836186

  13. Deep Sequencing of RNA from Ancient Maize Kernels

    PubMed Central

    Rasmussen, Morten; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Alquezar-Planas, David E.; Penfield, Steven; Brown, Terence A.; Vielle-Calzada, Jean-Philippe; Montiel, Rafael; Jørgensen, Tina; Odegaard, Nancy; Jacobs, Michael; Arriaza, Bernardo; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Willerslev, Eske; Gilbert, M. Thomas P.

    2013-01-01

    The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited – perhaps due to dogma associated with the fragility of RNA. We hypothesize that seeds offer a plausible refuge for long-term RNA survival, due to the fundamental role of RNA during seed germination. Using RNA-Seq on cDNA synthesized from nucleic acid extracts, we validate this hypothesis through demonstration of partial transcriptomal recovery from two sources of ancient maize kernels. The results suggest that ancient seed transcriptomics may offer a powerful new tool with which to study plant domestication. PMID:23326310

  14. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. 

  15. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    PubMed Central

    Jenior, Matthew L.; Koumpouras, Charles C.; Westcott, Sarah L.; Highlander, Sarah K.

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806
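
    The error-rate comparison above can be made concrete with a small calculation. This is only a sketch, assuming reads have already been aligned to their mock-community reference sequences and padded to equal length; the function name and the toy sequences are hypothetical and are not taken from the study.

```python
def error_rate_percent(aligned_pairs):
    """Percentage of aligned positions where a read differs from its reference.

    aligned_pairs: iterable of (read, reference) strings of equal length.
    This is an assumed input format, not the study's actual pipeline.
    """
    errors = 0
    total = 0
    for read, ref in aligned_pairs:
        if len(read) != len(ref):
            raise ValueError("read and reference must be aligned to equal length")
        errors += sum(1 for a, b in zip(read, ref) if a != b)
        total += len(ref)
    if total == 0:
        raise ValueError("no aligned positions to score")
    return 100.0 * errors / total


# Toy data: one mismatch in eight aligned positions.
toy_rate = error_rate_percent([("ACGT", "ACGT"), ("ACGA", "ACGT")])

# Fold reduction implied by the reported V1-V9 error rates (0.69% to 0.027%).
fold_reduction = 0.69 / 0.027
```

    On the reported figures, the reduction from 0.69% to 0.027% corresponds to roughly a 25-fold drop in observed error rate.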

  16. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    PubMed

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806

  18. The RNA structurome: transcriptome-wide structure probing with next-generation sequencing.

    PubMed

    Kwok, Chun Kit; Tang, Yin; Assmann, Sarah M; Bevilacqua, Philip C

    2015-04-01

    RNA folds into intricate structures that enable its pivotal roles in biology, ranging from regulation of gene expression to ligand sensing and enzymatic functions. Therefore, elucidating RNA structure can provide profound insights into living systems. A recent marriage between in vivo RNA structure probing and next-generation sequencing (NGS) has revolutionized the RNA field by enabling transcriptome-wide structure determination in vivo, which has been applied to date to human cells, yeast cells, and Arabidopsis seedlings. Analysis of resultant in vivo 'RNA structuromes' provides new and important information regarding myriad cellular processes, including control of translation, alternative splicing, alternative polyadenylation, energy-dependent unfolding of mRNA, and effects of proteins on RNA structure. An emerging view suggests potential links between RNA structure and stress and disease physiology across the tree of life. As we discuss here, these exciting findings open new frontiers into RNA biology, genome biology, and beyond.

  19. DNApi: A De Novo Adapter Prediction Algorithm for Small RNA Sequencing Data

    PubMed Central

    Tsuji, Junko; Weng, Zhiping

    2016-01-01

    With the rapid accumulation of publicly available small RNA sequencing datasets, third-party meta-analysis across many datasets is becoming increasingly powerful. Although removing the 3' adapter is an essential step for small RNA sequencing analysis, the adapter sequence information is not always available in the metadata, and even when it is available it can be erroneous. In this study, we developed DNApi, a lightweight Python software package that predicts the 3' adapter sequence de novo and provides the user with cleansed small RNA sequences ready for downstream analysis. Tested on 539 publicly available small RNA libraries accompanied by 3' adapter sequences in their metadata, DNApi shows near-perfect accuracy (98.5%) with fast runtime (~2.85 seconds per library) and efficient memory usage (~43 MB on average). In addition to 3' adapter prediction, it is also important to classify whether the input small RNA libraries were already processed, i.e. whether the 3' adapters were removed. Given another batch of datasets, DNApi correctly judged all 192 publicly available processed libraries to be “ready-to-map” small RNA sequences. DNApi is compatible with Python 2 and 3, and is available at https://github.com/jnktsj/DNApi. The 731 small RNA libraries used for DNApi evaluation were from human tissues and were carefully and manually collected. This study also provides readers with the curated datasets that can be integrated into their studies. PMID:27736901
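
    The idea of predicting a 3' adapter without metadata can be illustrated with a very small sketch. This is not DNApi's algorithm or API; it is a simplified heuristic that assumes the adapter is the sequence shared by most reads and that it starts right after inserts of varying length. The function names, the toy inserts, and the choice of k are all hypothetical.

```python
from collections import Counter


def guess_adapter_seed(reads, k=8):
    """Guess the first k bases of a shared 3' adapter.

    Heuristic (an assumption for illustration): pick the k-mer found in the
    most reads, breaking ties by the smallest summed first-occurrence position,
    which favours the k-mer at the very start of the shared sequence.
    """
    n_reads = Counter()   # number of reads containing each k-mer
    pos_sum = Counter()   # summed first-occurrence positions of each k-mer
    for read in reads:
        first_pos = {}
        for i in range(len(read) - k + 1):
            first_pos.setdefault(read[i:i + k], i)
        for kmer, i in first_pos.items():
            n_reads[kmer] += 1
            pos_sum[kmer] += i
    if not n_reads:
        raise ValueError("no read is at least k bases long")
    return min(n_reads, key=lambda km: (-n_reads[km], pos_sum[km]))


def trim_at_seed(read, seed):
    """Cut a read at the first occurrence of the seed; keep it whole if absent."""
    pos = read.find(seed)
    return read if pos == -1 else read[:pos]


# Toy library: three made-up inserts followed by the same example adapter.
inserts = ["ACGTTGCAAGCTAGCTTAC", "GGCATCGATCGTAAGCTAGCA", "TTAGCCGATAGCATCGACTGAT"]
adapter = "TGGAATTCTCGGGTGCCAAGG"
reads = [insert + adapter for insert in inserts]

seed = guess_adapter_seed(reads, k=8)
trimmed = [trim_at_seed(read, seed) for read in reads]
```

    A heuristic this simple would be misled by a highly abundant small RNA or by inserts that all end in the same bases, which is part of why a dedicated tool is useful in practice.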

  20. A Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences

    NASA Astrophysics Data System (ADS)

    Regoli, Massimo

    2009-02-01

    The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for this new algorithm starts from the observation of nature, in particular of RNA behavior and some of its properties. RNA sequences have some sections called introns. Introns, a term derived from "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in the pre-mRNA are not clear and are under extensive research by biologists but, in our case, we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behaviour in the access to the secret key used to code the messages. In the RNA-Crypto System algorithm, the introns are sections of the ciphered message that carry non-coding information, just as in the precursor mRNA.
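
    The intron analogy can be illustrated with a toy symmetric scheme. The abstract does not specify how RCS works, so everything below is an assumption made for illustration: a key-seeded pseudo-random generator decides how many junk bytes ("introns") precede each enciphered byte, and the receiver replays the same decisions to splice them out. The function names are hypothetical, and Python's `random` module is not cryptographically secure, so this sketch should not be used to protect real data.

```python
import hashlib
import os
import random


def _rng_from_key(key: bytes) -> random.Random:
    # Derive a reproducible generator from the shared key (toy construction).
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    return random.Random(seed)


def encrypt(plaintext: bytes, key: bytes) -> bytes:
    rng = _rng_from_key(key)
    out = bytearray()
    for byte in plaintext:
        intron_len = rng.randrange(0, 4)      # 0 to 3 junk bytes before each coding byte
        out += os.urandom(intron_len)         # the "intron": carries no information
        out.append(byte ^ rng.randrange(256)) # the "exon": XOR with a keystream byte
    return bytes(out)


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    rng = _rng_from_key(key)
    out = bytearray()
    pos = 0
    while pos < len(ciphertext):
        pos += rng.randrange(0, 4)            # skip ("splice out") the intron
        out.append(ciphertext[pos] ^ rng.randrange(256))
        pos += 1
    return bytes(out)


message = b"pre-mRNA"
secret = b"shared key"
ciphertext = encrypt(message, secret)
recovered = decrypt(ciphertext, secret)
```

    Because both sides draw the same sequence of intron lengths and keystream bytes from the key, the junk bytes never need to be transmitted separately or marked in the ciphertext.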

  1. Intervening sequences in ribosomal RNA genes and bobbed phenotype in Drosophila hydei.

    PubMed

    Franz, G; Kunz, W

    1981-08-13

    The "bobbed" (bb) mutation in Drosophila is represented phenotypically by shortened and abnormally thin scutellar bristles and by delayed development. There is a direct correlation between bristle size and ribosomal RNA (rRNA) synthesis, and the bb mutation was at first explained as a deficiency of rRNA genes (rDNA). However, the bb phenotype can occur in Drosophila melanogaster and Drosophila hydei with high rDNA content, while phenotypically wild-type flies are known with few rRNA genes, suggesting that what matters is not the number of rRNA genes but their transcriptional activity. In D. melanogaster, it has recently emerged that rRNA genes interrupted by an intervening sequence are not transcribed. We now report that in D. hydei, the length of the scutellar bristle is directly proportional to the number of rRNA genes without this intervening sequence.

  2. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses.

    PubMed

    Yanagisawa, Hironobu; Tomita, Reiko; Katsu, Koji; Uehara, Takuya; Atsumi, Go; Tateda, Chika; Kobayashi, Kappei; Sekine, Ken-Taro

    2016-03-01

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as "DECS-C," is a powerful method for detecting novel plant viruses. PMID:27072419

  3. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses

    PubMed Central

    Yanagisawa, Hironobu; Tomita, Reiko; Katsu, Koji; Uehara, Takuya; Atsumi, Go; Tateda, Chika; Kobayashi, Kappei; Sekine, Ken-Taro

    2016-01-01

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as “DECS-C,” is a powerful method for detecting novel plant viruses. PMID:27072419

  4. Excess of Yra1 RNA-Binding Factor Causes Transcription-Dependent Genome Instability, Replication Impairment and Telomere Shortening

    PubMed Central

    Gavaldá, Sandra; Santos-Pereira, José M.; García-Rubio, María L.; Luna, Rosa; Aguilera, Andrés

    2016-01-01

    Yra1 is an essential nuclear factor of the evolutionarily conserved family of hnRNP-like export factors that when overexpressed impairs mRNA export and cell growth. To investigate further the relevance of proper Yra1 stoichiometry in the cell, we overexpressed Yra1 by transforming yeast cells with YRA1 intron-less constructs and analyzed its effect on gene expression and genome integrity. We found that YRA1 overexpression induces DNA damage and leads to a transcription-associated hyperrecombination phenotype that is mediated by RNA:DNA hybrids. In addition, it confers a genome-wide replication retardation as seen by reduced BrdU incorporation and accumulation of the Rrm3 helicase. In addition, YRA1 overexpression causes a cell senescence-like phenotype and telomere shortening. ChIP-chip analysis shows that overexpressed Yra1 is loaded to transcribed chromatin along the genome and to Y’ telomeric regions, where Rrm3 is also accumulated, suggesting an impairment of telomere replication. Our work not only demonstrates that a proper stoichiometry of the Yra1 mRNA binding and export factor is required to maintain genome integrity and telomere homeostasis, but suggests that the cellular imbalance between transcribed RNA and specific RNA-binding factors may become a major cause of genome instability mediated by co-transcriptional replication impairment. PMID:27035147

  5. Predicting RNA secondary structures from sequence and probing data.

    PubMed

    Lorenz, Ronny; Wolfinger, Michael T; Tanzer, Andrea; Hofacker, Ivo L

    2016-07-01

    RNA secondary structures have proven essential for understanding the regulatory functions performed by RNA such as microRNAs, bacterial small RNAs, or riboswitches. This success is in part due to the availability of efficient computational methods for predicting RNA secondary structures. Recent advances focus on dealing with the inherent uncertainty of prediction by considering the ensemble of possible structures rather than the single most stable one. Moreover, the advent of high-throughput structural probing has spurred the development of computational methods that incorporate such experimental data as auxiliary information.
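
    As a concrete illustration of sequence-based secondary structure prediction, the sketch below implements the classic Nussinov base-pair maximization recurrence. It is a deliberate simplification: it counts base pairs instead of using a thermodynamic energy model, it returns a single optimum, not an ensemble, and it ignores probing data. The function name and the `min_loop` default are assumptions made for this example.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence (Nussinov DP).

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]            # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1] if n else 0


hairpin_pairs = max_base_pairs("GGGAAACCC")  # three G-C pairs closing an AAA loop
```

    The cubic-time recurrence has the same overall shape as energy-based folding algorithms, which replace the "+1 per pair" score with loop-dependent free-energy terms.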

  6. Identification of two proteins that bind to a pyrimidine-rich sequence in the 3'-untranslated region of GAP-43 mRNA.

    PubMed Central

    Irwin, N; Baekelandt, V; Goritchenko, L; Benowitz, L I

    1997-01-01

    GAP-43 is a membrane phosphoprotein that is important for the development and plasticity of neural connections. In undifferentiated PC12 pheochromocytoma cells, GAP-43 mRNA degrades rapidly (t1/2 = 5 h), but becomes stable when cells are treated with nerve growth factor. To identify trans-acting factors that may influence mRNA stability, we combined column chromatography and gel mobility shift assays to isolate GAP-43 mRNA binding proteins from neonatal bovine brain tissue. This resulted in the isolation of two proteins that bind specifically and competitively to a pyrimidine-rich sequence in the 3'-untranslated region of GAP-43 mRNA. Partial amino acid sequencing revealed that one of the RNA binding proteins coincides with FBP (far upstream element binding protein), previously characterized as a protein that resembles hnRNP K and which binds to a single-stranded, pyrimidine-rich DNA sequence upstream of the c-myc gene to activate its expression. The other binding protein shares sequence homology with PTB, a polypyrimidine tract binding protein implicated in RNA splicing and regulation of translation initiation. The two proteins bind to a 26 nt pyrimidine-rich sequence lying 300 nt downstream of the end of the coding region, in an area shown by others to confer instability on a reporter mRNA in transient transfection assays. We therefore propose that FBP and the PTB-like protein may compete for binding at the same site to influence the stability of GAP-43 mRNA. PMID:9092640

  7. RNA sequencing of Sleeping Beauty transposon-induced tumors detects transposon-RNA fusions in forward genetic cancer screens

    PubMed Central

    Temiz, Nuri A.; Moriarity, Branden S.; Wolf, Natalie K.; Riordan, Jesse D.; Dupuy, Adam J.; Largaespada, David A.; Sarver, Aaron L.

    2016-01-01

    Forward genetic screens using Sleeping Beauty (SB)-mobilized T2/Onc transposons have been used to identify common insertion sites (CISs) associated with tumor formation. Recurrent sites of transposon insertion are commonly identified using ligation-mediated PCR (LM-PCR). Here, we use RNA sequencing (RNA-seq) data to directly identify transcriptional events mediated by T2/Onc. Surprisingly, the majority (∼80%) of LM-PCR identified junction fragments do not lead to observable changes in RNA transcripts. However, in CIS regions, direct transcriptional effects of transposon insertions are observed. We developed an automated method to systematically identify T2/Onc-genome RNA fusion sequences in RNA-seq data. RNA fusion-based CISs were identified corresponding to both DNA-based CISs (Cdkn2a, Mycl1, Nf2, Pten, Sema6d, and Rere) and additional regions strongly associated with cancer that were not observed by LM-PCR (Myc, Akt1, Pth, Csf1r, Fgfr2, Wisp1, Map3k5, and Map4k3). In addition to calculating recurrent CISs, we also present complementary methods to identify potential driver events via determination of strongly supported fusions and fusions with large transcript level changes in the absence of multitumor recurrence. These methods independently identify CIS regions and also point to cancer-associated genes like Braf. We anticipate RNA-seq analyses of tumors from forward genetic screens will become an efficient tool to identify causal events. PMID:26553456

  9. RNA sequencing of Sleeping Beauty transposon-induced tumors detects transposon-RNA fusions in forward genetic cancer screens.

    PubMed

    Temiz, Nuri A; Moriarity, Branden S; Wolf, Natalie K; Riordan, Jesse D; Dupuy, Adam J; Largaespada, David A; Sarver, Aaron L

    2016-01-01

    Forward genetic screens using Sleeping Beauty (SB)-mobilized T2/Onc transposons have been used to identify common insertion sites (CISs) associated with tumor formation. Recurrent sites of transposon insertion are commonly identified using ligation-mediated PCR (LM-PCR). Here, we use RNA sequencing (RNA-seq) data to directly identify transcriptional events mediated by T2/Onc. Surprisingly, the majority (∼80%) of LM-PCR identified junction fragments do not lead to observable changes in RNA transcripts. However, in CIS regions, direct transcriptional effects of transposon insertions are observed. We developed an automated method to systematically identify T2/Onc-genome RNA fusion sequences in RNA-seq data. RNA fusion-based CISs were identified corresponding to both DNA-based CISs (Cdkn2a, Mycl1, Nf2, Pten, Sema6d, and Rere) and additional regions strongly associated with cancer that were not observed by LM-PCR (Myc, Akt1, Pth, Csf1r, Fgfr2, Wisp1, Map3k5, and Map4k3). In addition to calculating recurrent CISs, we also present complementary methods to identify potential driver events via determination of strongly supported fusions and fusions with large transcript level changes in the absence of multitumor recurrence. These methods independently identify CIS regions and also point to cancer-associated genes like Braf. We anticipate RNA-seq analyses of tumors from forward genetic screens will become an efficient tool to identify causal events. PMID:26553456

  10. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites, using RNA-seq data from both single cells and a cognate bulk sample, to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  11. miRBase: integrating microRNA annotation and deep-sequencing data.

    PubMed

    Kozomara, Ana; Griffiths-Jones, Sam

    2011-01-01

    miRBase is the primary online repository for all microRNA sequences and annotation. The current release (miRBase 16) contains over 15,000 microRNA gene loci in over 140 species, and over 17,000 distinct mature microRNA sequences. Deep-sequencing technologies have delivered a sharp rise in the rate of novel microRNA discovery. We have mapped reads from short RNA deep-sequencing experiments to microRNAs in miRBase and developed web interfaces to view these mappings. The user can view all read data associated with a given microRNA annotation, filter reads by experiment and count, and search for microRNAs by tissue- and stage-specific expression. These data can be used as a proxy for relative expression levels of microRNA sequences, provide detailed evidence for microRNA annotations and alternative isoforms of mature microRNAs, and allow us to revisit previous annotations. miRBase is available online at: http://www.mirbase.org/.

  12. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.

    PubMed

    Ferreira, Pedro G; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R; Rivas, Manuel A; Esteve-Codina, Anna; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  13. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    NASA Astrophysics Data System (ADS)

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  14. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    PubMed Central

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A.C.T; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  15. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud.

    PubMed

    Griffith, Malachi; Walker, Jason R; Spies, Nicholas C; Ainscough, Benjamin J; Griffith, Obi L

    2015-08-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.

  16. Phylogenetic analysis of oryx species using partial sequences of mitochondrial rRNA genes.

    PubMed

    Khan, H A; Arif, I A; Al Farhan, A H; Al Homaidan, A A

    2008-01-01

    We conducted a comparative evaluation of 12S rRNA and 16S rRNA genes of the mitochondrial genome for molecular differentiation among three oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) with respect to two closely related outgroups, addax and roan. Our findings showed the failure of 12S rRNA gene to differentiate between the genus Oryx and addax, whereas a 342-bp partial sequence of 16S rRNA accurately grouped all five taxa studied, suggesting the utility of 16S rRNA segment for molecular phylogeny of oryx at the genus and possibly species levels. PMID:19048493

  17. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud

    PubMed Central

    Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.

    2015-01-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053

  18. Analysis of a marine picoplankton community by 16S rRNA gene cloning and sequencing.

    PubMed Central

    Schmidt, T M; DeLong, E F; Pace, N R

    1991-01-01

    The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2 × 10^4 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the gamma subdivision of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences from cultivated organisms. Both of these novel phylogenetic clusters are proteobacteria, one group within the alpha subdivision and the other distinct from known proteobacterial subdivisions. The rRNA sequences of the alpha-related group are nearly identical to those of some Sargasso Sea picoplankton, suggesting a global distribution of these organisms. PMID:2066334

  19. New perspectives on the diversification of the RNA interference system: insights from comparative genomics and small RNA sequencing

    PubMed Central

    Burroughs, Alexander Maxwell; Ando, Yoshinari; Aravind, L

    2014-01-01

    Our understanding of the pervasive involvement of small RNAs in regulating diverse biological processes has been greatly augmented by recent application of deep-sequencing technologies to small RNA across diverse eukaryotes. We review the currently-known small RNA classes and place them in context of the reconstructed evolutionary history of the RNAi protein machinery. This synthesis indicates the earliest versions of eukaryotic RNAi systems likely utilized small RNA processed from three types of precursors: 1) sense-antisense transcriptional products, 2) genome-encoded, imperfectly-complementary hairpin sequences, and 3) larger non-coding RNA precursor sequences. Structural dissection of PIWI proteins along with recent discovery of novel families (including Med13 of the Mediator complex) suggest that emergence of a distinct architecture with the N-terminal domains (also occurring separately fused to endoDNases in prokaryotes) formed via duplication of an ancestral unit was key to their recruitment as primary RNAi effectors and use of small RNAs of certain preferred lengths. Prokaryotic PIWI proteins are typically components of several RNA-directed DNA restriction or CRISPR/Cas systems. However, eukaryotic versions appear to have emerged from a subset that evolved RNA-directed RNA interference. They were recruited alongside RNaseIII domains and RdRP domains, also from prokaryotic systems, to form the core eukaryotic RNAi system. Like certain regulatory systems, RNAi diversified into two distinct but linked arms concomitant with eukaryotic nucleo-cytoplasmic compartmentalization. Subsequent elaboration of RNAi proceeded via diversification of the core protein machinery through lineage-specific expansions and recruitment of new components from prokaryotes (nucleases and small RNA-modifying enzymes), allowing for diversification of associating small RNAs. PMID:24311560

  20. Empirical analysis of RNA robustness and evolution using high-throughput sequencing of ribozyme reactions.

    PubMed

    Hayden, Eric J

    2016-08-15

    RNA molecules provide a realistic but tractable model of a genotype to phenotype relationship. This relationship has been extensively investigated computationally using secondary structure prediction algorithms. Enzymatic RNA molecules, or ribozymes, offer access to genotypic and phenotypic information in the laboratory. Advancements in high-throughput sequencing technologies have enabled the analysis of sequences in the lab that now rivals what can be accomplished computationally. This has motivated a resurgence of in vitro selection experiments and opened new doors for the analysis of the distribution of RNA functions in genotype space. A body of computational experiments has investigated the persistence of specific RNA structures despite changes in the primary sequence, and how this mutational robustness can promote adaptations. This article summarizes recent approaches that were designed to investigate the role of mutational robustness during the evolution of RNA molecules in the laboratory, and presents theoretical motivations, experimental methods and approaches to data analysis. PMID:27215494

  1. miRNA Nomenclature: A View Incorporating Genetic Origins, Biosynthetic Pathways, and Sequence Variants.

    PubMed

    Desvignes, T; Batzel, P; Berezikov, E; Eilbeck, K; Eppig, J T; McAndrews, M S; Singer, A; Postlethwait, J H

    2015-11-01

    High-throughput sequencing of miRNAs has revealed the diversity and variability of mature and functional short noncoding RNAs, including their genomic origins, biogenesis pathways, sequence variability, and newly identified products such as miRNA-offset RNAs (moRs). Here we review known cases of alternative mature miRNA-like RNA fragments and propose a revised definition of miRNAs to encompass this diversity. We then review nomenclature guidelines for miRNAs and propose to extend nomenclature conventions to align with those for protein-coding genes established by international consortia. Finally, we suggest a system to encompass the full complexity of sequence variations (i.e., isomiRs) in the analysis of small RNA sequencing experiments.

  2. RNA sequence and transcriptional properties of the 3' end of the Newcastle disease virus genome

    SciTech Connect

    Kurilla, M.G.; Stone, H.O.; Keene, J.D.

    1985-09-01

    The 3' end of the genomic RNA of Newcastle disease virus (NDV) has been sequenced and the leader RNA defined. Using hybridization to a 3'-end-labeled genome, leader RNA species from in vitro transcription reactions and from infected cell extracts were found to be 47 and 53 nucleotides long. In addition, the start site of the 3'-proximal mRNA was determined by sequence analysis of in vitro (beta-32P)GTP-labeled transcription products. The genomic sequence extending beyond the leader region demonstrated an open reading frame for at least 42 amino acids and probably represents the amino terminus of the nucleocapsid protein (NP). The terminal 8 nucleotides of the NDV genome were identical to those of measles virus and Sendai virus while the sequence of the distal half of the leader region was more similar to that of vesicular stomatitis virus. These data argue for strong evolutionary relatedness between the paramyxovirus and rhabdovirus groups.

  3. Nucleotide sequence of the SrRNA gene and phylogenetic analysis of Trichomonas tenax.

    PubMed

    Fukura, K; Yamamoto, A; Hashimoto, T; Goto, N

    1996-01-01

    The small subunit ribosomal RNA (SrRNA) gene of Trichomonas tenax ATCC30207 was amplified by PCR and the 1.55-kb product was cloned into plasmid vector pUC18. Four clones were isolated and sequenced. The insert DNAs were 1,552 bp long and their G+C contents were 48.1%; three of them had exactly the same DNA sequences and one had only one nucleotide change. A representative SrRNA sequence was analyzed and a phylogenetic tree was estimated by the neighbor-joining (NJ) method. Among the protists examined, T. tenax was placed as the closest relative of Tritrichomonas foetus, as expected from the traditional taxonomy. The total homology between the two SrRNA sequences was 89.2%.
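
    The two summary statistics quoted in this abstract (G+C content and pairwise sequence identity) can be illustrated with a short sketch. The function names and the gap-handling rule are assumptions made for illustration; the abstract does not state how the "total homology" figure was computed, so this is not a reconstruction of the authors' procedure.

```python
def gc_content(seq):
    """Return the G+C percentage of a nucleotide sequence (non-bases ignored)."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no nucleotides")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def percent_identity(aligned_a, aligned_b):
    """Percent identical columns between two pre-aligned sequences.

    Assumption (hypothetical rule): columns where both sequences carry a gap
    are skipped, and a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

    Both functions expect plain strings; percent_identity additionally expects the two sequences to have been aligned beforehand (with "-" as the gap character) by a separate alignment tool.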

  4. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.

    PubMed

    Agrawal, Saumya; Ganley, Austen R D

    2016-01-01

    The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA. PMID:27576718

  5. Comprehensive analysis of human small RNA sequencing data provides insights into expression profiles and miRNA editing

    PubMed Central

    Gong, Jing; Wu, Yuliang; Zhang, Xiantong; Liao, Yifang; Sibanda, Vusumuzi Leroy; Liu, Wei; Guo, An-Yuan

    2014-01-01

    MicroRNAs (miRNAs) play key regulatory roles in various biological processes and diseases. A comprehensive analysis of large-scale small RNA sequencing (smRNA-seq) data will be very helpful for exploring tissue- or disease-specific miRNA markers and uncovering miRNA variants. Here, we systematically analyzed 410 human smRNA-seq datasets, whose samples come from 24 tissues, diseases, or cell lines. We tested the mapping strategies and found that it was necessary to perform multiple rounds of mapping with different mismatch parameters. miRNA expression profiles revealed that on average ∼70% of known miRNAs were expressed at a low level or not expressed (RPM < 1) in a sample and only ∼9% of known miRNAs were relatively highly expressed (RPM > 100). About 30% of known miRNAs were not expressed in any of the samples we used. The miRNA expression profiles were compiled into an online database (HMED, http://bioinfo.life.hust.edu.cn/smallRNA/). Dozens of tissue/disease-specific miRNAs, disease/control dysregulated miRNAs and miRNAs with arm-switching events were discovered. Further, we identified some high-confidence editing sites, including 24 A-to-I sites and 23 C-to-U sites. About half of them were widespread miRNA editing sites in different tissues. We found that the two types of editing sites differ in location, editing level and frequency. Our analyses of expression profiles, specific miRNA markers, arm switching, and editing sites may provide valuable information for further studies of miRNA function and biomarker discovery. PMID:25692236
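
    The RPM thresholds quoted in this abstract can be made concrete with a minimal sketch. It assumes that RPM is computed against the total number of miRNA-mapped reads in the sample (the abstract does not specify the denominator), and the function names and class labels are hypothetical.

```python
from collections import Counter


def reads_per_million(counts):
    """Convert raw read counts per miRNA into reads per million (RPM).

    Assumption: the denominator is the sum of all counts passed in.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {name: 1_000_000 * n / total for name, n in counts.items()}


def expression_class(rpm):
    """Bin an RPM value using the thresholds quoted in the abstract."""
    if rpm < 1:
        return "low_or_absent"
    if rpm > 100:
        return "high"
    return "intermediate"


def summarize(counts):
    """Count how many miRNAs in one sample fall into each expression class."""
    rpm = reads_per_million(counts)
    return Counter(expression_class(value) for value in rpm.values())
```

    Calling summarize on a dictionary of per-miRNA read counts for one sample gives the number of miRNAs in each bin; dividing by the number of known miRNAs would yield per-sample percentages comparable in form to those reported.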

  6. Next-generation Sequencing of 16S Ribosomal RNA Gene Amplicons

    PubMed Central

    Sanschagrin, Sylvie; Yergeau, Etienne

    2014-01-01

    One of the major questions in microbial ecology is “who is there?” This question can be answered using various tools, but one of the long-lasting gold standards is to sequence 16S ribosomal RNA (rRNA) gene amplicons generated by domain-level PCR reactions amplifying from genomic DNA. Traditionally, this was performed by cloning and Sanger (capillary electrophoresis) sequencing of PCR amplicons. The advent of next-generation sequencing has tremendously simplified and increased the sequencing depth for 16S rRNA gene sequencing. The introduction of benchtop sequencers now allows small labs to perform their 16S rRNA sequencing in-house in a matter of days. Here, an approach for 16S rRNA gene amplicon sequencing using a benchtop next-generation sequencer is detailed. The environmental DNA is first amplified by PCR using primers that contain sequencing adapters and barcodes. The amplicons are then coupled to spherical particles via emulsion PCR. The particles are loaded on a disposable chip and the chip is inserted in the sequencing machine, after which the sequencing is performed. The sequences are retrieved in fastq format and filtered, and the barcodes are used to establish the sample membership of the reads. The filtered and binned reads are then further analyzed using publicly available tools. An example analysis where the reads were classified with a taxonomy-finding algorithm within the software package Mothur is given. The method outlined here is simple, inexpensive and straightforward and should help smaller labs to take advantage of the ongoing genomic revolution. PMID:25226019
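
    The filtering and barcode-binning step described in this abstract can be sketched as follows. This is an illustration only, not the protocol's actual tooling: it assumes four-line FASTQ records, Phred+33 quality encoding, barcodes located at the very start of each read, exact barcode matching, and a hypothetical mean-quality cutoff of 20.

```python
from collections import defaultdict


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, not needed here
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def demultiplex(handle, barcodes, min_mean_q=20):
    """Drop low-quality reads, then bin the rest by their leading barcode.

    barcodes maps barcode sequence -> sample name; the barcode is trimmed
    from each binned read. Reads with no matching barcode are discarded.
    """
    bins = defaultdict(list)
    for name, seq, qual in read_fastq(handle):
        if not qual or mean_quality(qual) < min_mean_q:
            continue
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                bins[sample].append((name, seq[len(barcode):]))
                break
    return dict(bins)
```

    The handle can be any text stream, for example an opened fastq file or an io.StringIO object; the resulting per-sample read lists would then be passed to a classification tool.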

  7. Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

    PubMed Central

    Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

    2016-01-01

    Because RNA-protein interactions play a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240

  8. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Posner, Richard G.; Marrone, Babetta L.; Hammond, Mark L.; Simpson, Daniel J.

    1995-01-01

    Method for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand.

  9. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Posner, R.G.; Marrone, B.L.; Hammond, M.L.; Simpson, D.J.

    1995-04-11

    A method is described for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand. 4 figures.

  10. ARM-Seq: AlkB-facilitated RNA methylation sequencing reveals a complex landscape of modified tRNA fragments

    PubMed Central

    Cozen, Aaron E.; Quartley, Erin; Holmes, Andrew D.; Robinson, Eva H.; Phizicky, Eric M.; Lowe, Todd M.

    2015-01-01

    High throughput RNA sequencing has accelerated discovery of the complex regulatory roles of small RNAs, but RNAs containing modified nucleosides may escape detection when those modifications interfere with reverse transcription during RNA-seq library preparation. Here we describe AlkB-facilitated RNA Methylation sequencing (ARM-Seq) which uses pre-treatment with Escherichia coli AlkB to demethylate 1-methyladenosine, 3-methylcytidine, and 1-methylguanosine, all commonly found in transfer RNAs. Comparative methylation analysis using ARM-Seq provides the first detailed, transcriptome-scale map of these modifications, and reveals an abundance of previously undetected, methylated small RNAs derived from tRNAs. ARM-Seq demonstrates that tRNA-derived small RNAs accurately recapitulate the m1A modification state for well-characterized yeast tRNAs, and generates new predictions for a large number of human tRNAs, including tRNA precursors and mitochondrial tRNAs. Thus, ARM-Seq provides broad utility for identifying previously overlooked methyl-modified RNAs, can efficiently monitor methylation state, and may reveal new roles for tRNA-derived RNAs as biomarkers or signaling molecules. PMID:26237225

  11. A novel mRNA 3' untranslated region translational control sequence regulates Xenopus Wee1 mRNA translation.

    PubMed

    Wang, Yi Ying; Charlesworth, Amanda; Byrd, Shannon M; Gregerson, Robert; MacNicol, Melanie C; MacNicol, Angus M

    2008-05-15

    Cell cycle progression during oocyte maturation requires the strict temporal regulation of maternal mRNA translation. The intrinsic basis of this temporal control has not been fully elucidated but appears to involve distinct mRNA 3' UTR regulatory elements. In this study, we identify a novel translational control sequence (TCS) that exerts repression of target mRNAs in immature oocytes of the frog, Xenopus laevis, and can direct early cytoplasmic polyadenylation and translational activation during oocyte maturation. The TCS is functionally distinct from the previously characterized Musashi/polyadenylation response element (PRE) and the cytoplasmic polyadenylation element (CPE). We report that TCS elements exert translational repression in both the Wee1 mRNA 3' UTR and the pericentriolar material-1 (Pcm-1) mRNA 3' UTR in immature oocytes. During oocyte maturation, TCS function directs the early translational activation of the Pcm-1 mRNA. By contrast, we demonstrate that CPE sequences flanking the TCS elements in the Wee1 3' UTR suppress the ability of the TCS to direct early translational activation. Our results indicate that a functional hierarchy exists between these distinct 3' UTR regulatory elements to control the timing of maternal mRNA translational activation during oocyte maturation.

  12. A tale of two sequences: microRNA-target chimeric reads.

    PubMed

    Broughton, James P; Pasquinelli, Amy E

    2016-04-04

    In animals, a functional interaction between a microRNA (miRNA) and its target RNA requires only partial base pairing. The limited number of base pair interactions required for miRNA targeting provides miRNAs with broad regulatory potential and also makes target prediction challenging. Computational approaches to target prediction have focused on identifying miRNA target sites based on known sequence features that are important for canonical targeting and may miss non-canonical targets. Current state-of-the-art experimental approaches, such as CLIP-seq (cross-linking immunoprecipitation with sequencing), PAR-CLIP (photoactivatable-ribonucleoside-enhanced CLIP), and iCLIP (individual-nucleotide resolution CLIP), require inference of which miRNA is bound at each site. Recently, the development of methods to ligate miRNAs to their target RNAs during the preparation of sequencing libraries has provided a new tool for the identification of miRNA target sites. The chimeric, or hybrid, miRNA-target reads that are produced by these methods unambiguously identify the miRNA bound at a specific target site. The information provided by these chimeric reads has revealed extensive non-canonical interactions between miRNAs and their target mRNAs, and identified many novel interactions between miRNAs and noncoding RNAs.
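
    The sequence-based target prediction mentioned in this abstract can be illustrated with a minimal seed-match search. The sketch assumes the common convention that the seed is miRNA nucleotides 2-8 and that a canonical site is an exact reverse-complement match in the target; the function names are hypothetical. By construction it cannot find the non-canonical interactions that chimeric reads reveal.

```python
# Translation table for the RNA complement (A<->U, C<->G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def seed_sites(mirna, target, start=2, end=8):
    """Return 0-based positions in target that pair perfectly with the seed.

    The seed is taken as miRNA nucleotides start..end (1-based, inclusive),
    an assumed convention; overlapping matches are all reported.
    """
    seed = mirna.upper()[start - 1:end]
    site = reverse_complement(seed)
    target = target.upper()
    positions = []
    i = target.find(site)
    while i != -1:
        positions.append(i)
        i = target.find(site, i + 1)
    return positions
```

    For a let-7 family miRNA beginning UGAGGUAG, the seed GAGGUAG corresponds to the target motif CUACCUC, so seed_sites reports every position in the target RNA where that motif starts.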

  13. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    PubMed

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-01

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits.

  14. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    PubMed

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-01

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. PMID:27545715

  15. A method for accurate determination of terminal sequences of viral genomic RNA.

    PubMed

    Weng, Z; Xiong, Z

    1995-09-01

    A combination of ligation-anchored PCR and anchored cDNA cloning techniques was used to clone the termini of the saguaro cactus virus (SCV) RNA genome. The terminal sequences of the viral genome were subsequently determined from the clones. The 5' terminus was cloned by ligation-anchored PCR, whereas the 3' terminus was obtained by a technique we term anchored cDNA cloning. In anchored cDNA cloning, an anchor oligonucleotide was prepared by phosphorylation at the 5' end, followed by addition of a dideoxynucleotide at the 3' end to block the free hydroxyl group. The 5' end of the anchor was subsequently ligated to the 3' end of SCV RNA. The anchor-ligated, chimeric viral RNA was then reverse-transcribed into cDNA using a primer complementary to the anchor. The cDNA containing the complete 3'-terminal sequence was converted into ds-cDNA, cloned, and sequenced. Two restriction sites, one within the viral sequence and one within the primer sequence, were used to facilitate cloning. The combination of these techniques proved to be an easy and accurate way to determine the terminal sequences of the SCV RNA genome and should be applicable to any other RNA molecules with unknown terminal sequences. PMID:9132274

  16. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  17. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  18. Adenovirus type 12-specific RNA sequences during productive infection of KB cells.

    PubMed Central

    Smiley, J R; Mak, S

    1976-01-01

    The complementary strands of adenovirus type 12 DNA were separated, and virus-specific RNA was analyzed by saturation hybridization in solution. Late during infection whole cell RNA hybridized to 75% of the light (l) strand and 15% of the heavy (h) strand, whereas cytoplasmic RNA hybridized to 65% of the l strand and 15% of the h strand. Late nuclear RNA hybridized to about 90% of the l strand and at least 36% of the h strand. Double-stranded RNA was isolated from infected cells late after infection, which annealed to greater than 30% of each of the two complementary DNA strands. Early whole cell RNA hybridized to 45 to 50% of the l strand and 15% of the h strand, whereas early cytoplasmic RNA hybridized to about 15% of each of the complementary strands. All early cytoplasmic sequences were present in the cytoplasm at late times. PMID:950688

  19. Visible sensing of nucleic acid sequences using a genetically encodable unmodified mRNA probe.

    PubMed

    Narita, Atsushi; Ogawa, Kazumasa; Sando, Shinsuke; Aoyama, Yasuhiro

    2006-01-01

    We previously reported a molecular beacon-mRNA (MB-mRNA) strategy for nucleic acid detection/sensing in a cell-free translation system using unmodified RNA as a probe. Here in this presentation, we report that a combination with RNase H activity, which induces an additional process of irreversible cleavage of MB-domain, achieves an improved sequence selectivity (one nucleotide selectivity) and an enhanced sensitivity. This improved system finally enabled visible sensing of target nucleic acid sequence at a single nucleotide resolution under isothermal conditions.

  20. Sequence of the PV2 gene of rice hoja blanca tenuivirus RNA-2.

    PubMed

    De Miranda, J R; Hull, R; Espinoza, A M

    1995-01-01

    Comparison of a partial sequence of rice hoja blanca tenuivirus RNA-2 with 40% similarity to rice stripe tenuivirus RNA-2 revealed regions of high local sequence homology at the 5' terminus, within the coding region (the pv2 gene), and in the intergenic region separating this gene from the other protein (pc2) encoded by this ambisense RNA. Analysis of the conserved regions of the pv2 protein identified two motifs found principally in viral membrane glycoproteins and six motifs found each in a wide variety of proteins. The possible significance of these results is discussed. PMID:8560781

  1. Towards next-generation sequencing analytics for foodborne RNA viruses: Examining the effect of RNA input quantity and viral RNA purity.

    PubMed

    Yang, Zhihui; Leonard, Susan R; Mammel, Mark K; Elkins, Christopher A; Kulka, Michael

    2016-10-01

    Detection and identification of viruses in food samples are technically challenging due largely to the low viral copy number in contaminated food items, and the lack of effective culture enrichment methods that are amenable to regulatory applications for many of the common foodborne viruses. Using an Illumina MiSeq platform and two hepatitis A virus (HAV) cell-culture adapted strains as a representative enteric virus species, this study examined the limits of single-stranded RNA (ssRNA) viral detection following next-generation sequencing without pre-amplification of the viral genome. Complete viral genome sequences were obtained from HAV samples of varying purities and with an input as low as 2 ng total RNA containing 1.4×10^5 copies of viral RNA. In addition, single nucleotide variations were reproducibly detected over the range of concentrations examined, and their identity confirmed by alternate sequencing technology. In summary, next-generation sequencing technology has the potential for sensitive detection/identification of a viral genome at a low copy number. This study provides a benchmark for metagenomic sequencing application as is required for virus detection in complex food matrices using a culture-independent diagnostic approach.

  2. Taxonomic Assessment of Rumen Microbiota Using Total RNA and Targeted Amplicon Sequencing Approaches

    PubMed Central

    Li, Fuyong; Henderson, Gemma; Sun, Xu; Cox, Faith; Janssen, Peter H.; Guan, Le Luo

    2016-01-01

    Taxonomic characterization of active gastrointestinal microbiota is essential to detect shifts in microbial communities and functions under various conditions. This study aimed to identify and quantify potentially active rumen microbiota using total RNA sequencing and to compare the outcomes of this approach with the widely used targeted RNA/DNA amplicon sequencing technique. Total RNA isolated from rumen digesta samples from five beef steers was subjected to Illumina paired-end sequencing (RNA-seq), and bacterial and archaeal amplicons of partial 16S rRNA/rDNA were subjected to 454 pyrosequencing (RNA/DNA Amplicon-seq). Taxonomic assessments of the RNA-seq, RNA Amplicon-seq, and DNA Amplicon-seq datasets were performed using a pipeline developed in house. The detected major microbial phylotypes were common among the three datasets, with seven bacterial phyla, fifteen bacterial families, and five archaeal taxa commonly identified across all datasets. There were also unique microbial taxa detected in each dataset. Elusimicrobia and Verrucomicrobia phyla; Desulfovibrionaceae, Elusimicrobiaceae, and Sphaerochaetaceae families; and Methanobrevibacter woesei were only detected in the RNA-Seq and RNA Amplicon-seq datasets, whereas Streptococcaceae was only detected in the DNA Amplicon-seq dataset. In addition, the relative abundances of four bacterial phyla, eight bacterial families and one archaeal taxon were different among the three datasets. This is the first study to compare the outcomes of rumen microbiota profiling between RNA-seq and RNA/DNA Amplicon-seq datasets. Our results illustrate the differences between these methods in characterizing microbiota both qualitatively and quantitatively for the same sample, and so caution must be exercised when comparing data. PMID:27446027

  3. Taxonomic Assessment of Rumen Microbiota Using Total RNA and Targeted Amplicon Sequencing Approaches.

    PubMed

    Li, Fuyong; Henderson, Gemma; Sun, Xu; Cox, Faith; Janssen, Peter H; Guan, Le Luo

    2016-01-01

    Taxonomic characterization of active gastrointestinal microbiota is essential to detect shifts in microbial communities and functions under various conditions. This study aimed to identify and quantify potentially active rumen microbiota using total RNA sequencing and to compare the outcomes of this approach with the widely used targeted RNA/DNA amplicon sequencing technique. Total RNA isolated from rumen digesta samples from five beef steers was subjected to Illumina paired-end sequencing (RNA-seq), and bacterial and archaeal amplicons of partial 16S rRNA/rDNA were subjected to 454 pyrosequencing (RNA/DNA Amplicon-seq). Taxonomic assessments of the RNA-seq, RNA Amplicon-seq, and DNA Amplicon-seq datasets were performed using a pipeline developed in house. The detected major microbial phylotypes were common among the three datasets, with seven bacterial phyla, fifteen bacterial families, and five archaeal taxa commonly identified across all datasets. There were also unique microbial taxa detected in each dataset. Elusimicrobia and Verrucomicrobia phyla; Desulfovibrionaceae, Elusimicrobiaceae, and Sphaerochaetaceae families; and Methanobrevibacter woesei were only detected in the RNA-Seq and RNA Amplicon-seq datasets, whereas Streptococcaceae was only detected in the DNA Amplicon-seq dataset. In addition, the relative abundances of four bacterial phyla, eight bacterial families and one archaeal taxon were different among the three datasets. This is the first study to compare the outcomes of rumen microbiota profiling between RNA-seq and RNA/DNA Amplicon-seq datasets. Our results illustrate the differences between these methods in characterizing microbiota both qualitatively and quantitatively for the same sample, and so caution must be exercised when comparing data. PMID:27446027

  4. Molecular Diagnosis of Actinomadura madurae Infection by 16S rRNA Deep Sequencing

    PubMed Central

    SenGupta, Dhruba J.; Hoogestraat, Daniel R.; Cummings, Lisa A.; Bryant, Bronwyn H.; Natividad, Catherine; Thielges, Stephanie; Monsaas, Peter W.; Chau, Mimosa; Barbee, Lindley A.; Rosenthal, Christopher; Cookson, Brad T.; Hoffman, Noah G.

    2013-01-01

    Next-generation DNA sequencing can be used to catalog individual organisms within complex, polymicrobial specimens. Here, we utilized deep sequencing of 16S rRNA to implicate Actinomadura madurae as the cause of mycetoma in a diabetic patient when culture and conventional molecular methods were overwhelmed by overgrowth of other organisms. PMID:24108607

  5. Genome Sequence of Candida tropicalis no. 121, Used for RNA Production.

    PubMed

    Li, Bingbing; Guo, Ting; Chen, Yong; Xie, Jingjing; Niu, Huanqing; Liu, Dong; Cheng, Jian; Chen, Xiaochun; Wu, Jinglan; Zhuang, Wei; Zhu, Chenjie; Ying, Hanjie

    2014-01-01

    We report here the complete genome sequence of Candida tropicalis no. 121. C. tropicalis no. 121 is a high-RNA-producing strain obtained by mutagenesis in our laboratory. The complete genome sequence was determined using the Illumina HiSeq 2000 and contains 6,415 genes. The genome size of C. tropicalis no. 121 is >15.3 Mb.

  6. Genome Sequence of Saccharomyces cerevisiae Double-Stranded RNA Virus L-A-28

    PubMed Central

    Konovalovas, Aleksandras

    2016-01-01

    We cloned and sequenced the complete genome of the L-A-28 virus from the Saccharomyces cerevisiae K28 killer strain. This sequence completes the set of currently identified L-A helper viruses required for expression of double-stranded RNA-originated killer phenotypes in baking yeast. PMID:27313294

  7. Genome Sequence of Saccharomyces cerevisiae Double-Stranded RNA Virus L-A-28.

    PubMed

    Konovalovas, Aleksandras; Serviené, Elena; Serva, Saulius

    2016-01-01

    We cloned and sequenced the complete genome of the L-A-28 virus from the Saccharomyces cerevisiae K28 killer strain. This sequence completes the set of currently identified L-A helper viruses required for expression of double-stranded RNA-originated killer phenotypes in baking yeast. PMID:27313294

  8. Sequence analysis of rice hoja blanca virus RNA 3.

    PubMed

    de Miranda, J; Hernandez, M; Hull, R; Espinoza, A M

    1994-08-01

    RNA 3 of rice hoja blanca tenuivirus (RHBV) has 2299 nucleotides and resembles RNA 3 of other tenuiviruses such as maize stripe (MStV) and rice stripe (RStV) viruses in potentially coding with an ambisense strategy for two proteins. Both the viral-sense protein of 23K and the complementary-sense protein of 35K have about 46% amino acid identity with the analogous proteins encoded by RNA 3 of MStV and RStV. As the proteins of MStV and RStV have about 65% identity between themselves, RHBV cannot be a South and Central American strain of the Asian RStV. The intergenic region resembles those of other tenuiviruses, being rich in A and U residues, but its predicted folding pattern is unlike those of other tenuiviruses. Instead, the predicted folding of the intergenic region was indistinguishable from that of the coding regions and there was no evidence for a distinct hairpin-loop structure. The significance to the evolution of tenuiviruses of the similarities that the two proteins have with their analogues in other tenuiviruses is discussed. PMID:8046420

  9. Intercistronic as well as terminal sequences are required for efficient amplification of brome mosaic virus RNA3.

    PubMed Central

    French, R; Ahlquist, P

    1987-01-01

    The genome of brome mosaic virus (BMV) is divided among messenger polarity RNA1, RNA2, and RNA3 (3.2, 2.9, and 2.1 kilobases, respectively). cis-Acting sequences required for BMV RNA amplification were investigated with RNA3. By using expressible cDNA clones, deletions were constructed throughout RNA3 and tested in barley protoplasts coinoculated with RNA1 and RNA2. In contrast to requirements for 5'- and 3'-terminal noncoding sequences, either of the two RNA3 coding regions can be deleted individually and both can be simultaneously inactivated by N-terminal frameshift mutations without significantly interfering with amplification of RNA3 or production of its subgenomic mRNA. However, simultaneous major deletions in both coding regions greatly attenuate RNA3 accumulation. RNA3 levels can be largely restored by insertion of a heterologous, nonviral sequence in such mutants, suggesting that RNA3 requires physical separation of its terminal domains or a minimum overall size for normal replication or stability. Unexpectedly, deletions in a 150-base segment of the intercistronic noncoding region drastically reduce RNA3 accumulation. This segment contains a sequence element homologous to sequences found near the 5' ends of BMV RNA1 and RNA2 and in analogous positions in the three genomic RNAs of the related cucumber mosaic virus, suggesting a possible role in plus-strand synthesis. PMID:3573144

  10. Sequence characterization of 5S ribosomal RNA from eight gram positive procaryotes

    NASA Technical Reports Server (NTRS)

    Woese, C. R.; Luehrsen, K. R.; Pribula, C. D.; Fox, G. E.

    1976-01-01

    Complete nucleotide sequences are presented for 5S rRNA from Bacillus subtilis, B. firmus, B. pasteurii, B. brevis, Lactobacillus brevis, and Streptococcus faecalis, and 5S rRNA oligonucleotide catalogs and partial sequence data are given for B. cereus and Sporosarcina ureae. These data demonstrate a striking consistency of 5S rRNA primary and secondary structure within a given bacterial grouping. An exception is B. brevis, in which the 5S rRNA sequence varies significantly from that of other bacilli in the tuned helix and the procaryotic loop. The localization of these variations suggests that B. brevis occupies an ecological niche that selects such changes. It is noted that this organism produces antibiotics which affect ribosome function.

  11. Self-Assembly of Measles Virus Nucleocapsid-like Particles: Kinetics and RNA Sequence Dependence.

    PubMed

    Milles, Sigrid; Jensen, Malene Ringkjøbing; Communie, Guillaume; Maurin, Damien; Schoehn, Guy; Ruigrok, Rob W H; Blackledge, Martin

    2016-08-01

    Measles virus RNA genomes are packaged into helical nucleocapsids (NCs), comprising thousands of nucleoproteins (N) that bind the entire genome. N-RNA provides the template for replication and transcription by the viral polymerase and is a promising target for viral inhibition. Elucidation of mechanisms regulating this process has been severely hampered by the inability to controllably assemble NCs. Here, we demonstrate self-organization of N into NC-like particles in vitro upon addition of RNA, providing a simple and versatile tool for investigating assembly. Real-time NMR and fluorescence spectroscopy reveal biphasic assembly kinetics. Remarkably, assembly depends strongly on the RNA sequence, with the genomic 5' end and poly-Adenine sequences assembling efficiently, while sequences such as poly-Uracil are incompetent for NC formation. This observation has important consequences for understanding the assembly process.

  12. Mapping RNA Structure In Vitro with SHAPE Chemistry and Next-Generation Sequencing (SHAPE-Seq).

    PubMed

    Watters, Kyle E; Lucks, Julius B

    2016-01-01

    Mapping RNA structure with selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry has proven to be a versatile method for characterizing RNA structure in a variety of contexts. SHAPE reagents covalently modify RNAs in a structure-dependent manner to create adducts at the 2'-OH group of the ribose backbone at nucleotides that are structurally flexible. The positions of these adducts are detected using reverse transcriptase (RT) primer extension, which stops one nucleotide before the modification, to create a pool of cDNAs whose lengths reflect the location of SHAPE modification. Quantification of the cDNA pools is used to estimate the "reactivity" of each nucleotide in an RNA molecule to the SHAPE reagent. High reactivities indicate nucleotides that are structurally flexible, while low reactivities indicate nucleotides that are inflexible. These SHAPE reactivities can then be used to infer RNA structures by restraining RNA structure prediction algorithms. Here, we provide a state-of-the-art protocol describing how to perform in vitro RNA structure probing with SHAPE chemistry using next-generation sequencing to quantify cDNA pools and estimate reactivities (SHAPE-Seq). The use of next-generation sequencing allows for higher throughput, more consistent data analysis, and multiplexing capabilities. The technique described herein, SHAPE-Seq v2.0, uses a universal reverse transcription priming site that is ligated to the RNA after SHAPE modification. The introduced priming site allows for the structural analysis of an RNA independent of its sequence. PMID:27665597
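
    The reactivity estimate described in the abstract above can be illustrated with a deliberately simplified sketch. It assumes two lists of RT stop counts, one from the SHAPE-modified (+) sample and one from an unmodified (-) control, each indexed by the last RNA position copied into cDNA (5'-to-3' coordinates). The function name and the background-subtraction formula are illustrative assumptions, not the SHAPE-Seq v2.0 analysis itself, which relies on a more careful statistical treatment of the cDNA pools.

```python
def shape_reactivities(plus_stops, minus_stops):
    """Rough per-nucleotide reactivity from RT stop counts (illustrative only).

    plus_stops[i] and minus_stops[i] are the numbers of cDNAs whose last
    copied RNA position is i, in the modified (+) and control (-) samples.
    Because RT stops one nucleotide before the adduct, a stop at position i
    is attributed to a modification at position i - 1.
    """
    if len(plus_stops) != len(minus_stops):
        raise ValueError("both channels must cover the same positions")
    plus_total = sum(plus_stops)
    minus_total = sum(minus_stops)
    if plus_total == 0 or minus_total == 0:
        raise ValueError("each channel needs at least one read")

    reactivities = []
    for stop in range(1, len(plus_stops)):
        # Convert counts to frequencies so the two libraries are comparable.
        plus_freq = plus_stops[stop] / plus_total
        minus_freq = minus_stops[stop] / minus_total
        # Subtract background stops; negative values are clipped to zero.
        reactivities.append(max(plus_freq - minus_freq, 0.0))
    # reactivities[j] refers to nucleotide j (the stop was at j + 1).
    return reactivities


if __name__ == "__main__":
    print(shape_reactivities([0, 10, 30, 60], [0, 10, 10, 80]))
```

    In this toy input only nucleotide 1 shows more stops in the (+) channel than in the control, so it is the only position with a non-zero value; under the interpretation given in the abstract, it would be read as structurally flexible.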

  14. A plant viral coat protein RNA binding consensus sequence contains a crucial arginine.

    PubMed Central

    Ansel-McKinney, P; Scott, S W; Swanson, M; Ge, X; Gehrke, L

    1996-01-01

    A defining feature of alfalfa mosaic virus (AMV) and ilarviruses [type virus: tobacco streak virus (TSV)] is that, in addition to genomic RNAs, viral coat protein is required to establish infection in plants. AMV and TSV coat proteins, which share little primary amino acid sequence identity, are functionally interchangeable in RNA binding and initiation of infection. The lysine-rich amino-terminal RNA binding domain of the AMV coat protein lacks previously identified RNA binding motifs. Here, the AMV coat protein RNA binding domain is shown to contain a single arginine whose specific side chain and position are crucial for RNA binding. In addition, the putative RNA binding domain of two ilarvirus coat proteins, TSV and citrus variegation virus, is identified and also shown to contain a crucial arginine. AMV and ilarvirus coat protein sequence alignment centering on the key arginine revealed a new RNA binding consensus sequence. This consensus may explain in part why heterologous viral RNA-coat protein mixtures are infectious. PMID:8890181

  15. Identification of differentially expressed genes during development of the zebrafish pineal complex using RNA sequencing.

    PubMed

    Khuansuwan, Sataree; Gamse, Joshua T

    2014-11-01

    We describe a method for isolating RNA suitable for high-throughput RNA sequencing (RNA-seq) from small numbers of fluorescently labeled cells isolated from live zebrafish (Danio rerio) embryos without using costly, commercially available columns. This method ensures high cell viability after dissociation and suspension of cells and gives a very high yield of intact RNA. We demonstrate the utility of our new protocol by isolating RNA from fluorescence activated cell sorted (FAC sorted) pineal complex neurons in wild-type and tbx2b knockdown embryos at 24 hours post-fertilization. Tbx2b is a transcription factor required for pineal complex formation. We describe a bioinformatics pipeline used to analyze differential expression following high-throughput sequencing and demonstrate the validity of our results using in situ hybridization of differentially expressed transcripts. This protocol brings modern transcriptome analysis to the study of small cell populations in zebrafish.

  16. When you can't trust the DNA: RNA editing changes transcript sequences.

    PubMed

    Knoop, Volker

    2011-02-01

    RNA editing describes targeted sequence alterations in RNAs so that the transcript sequences differ from their DNA template. Since the original discovery of RNA editing in trypanosomes nearly 25 years ago, more than a dozen such processes of nucleotide insertions, deletions, and exchanges have been identified in evolutionarily widely separated groups of the living world including plants, animals, fungi, protists, bacteria, and viruses. In many cases gene expression in mitochondria is affected, but RNA editing also takes place in chloroplasts and in nucleocytosolic genetic environments. While some RNA editing systems largely seem to repair defective genes (cryptogenes), others have obvious functions in modulating gene activities. The present review aims to give an overview of the current state of research in the different systems of RNA editing by following a historic timeline along the respective original discoveries.

  17. Primary sequence of the 16S ribosomal RNA of Escherichia coli.

    PubMed Central

    Ehresmann, C; Stiegler, P; Mackie, G A; Zimmermann, R A; Ebel, J P; Fellner, P

    1975-01-01

    Recent progress in the nucleotide sequence analysis of the 16S ribosomal RNA from E. coli is described. The sequence which has been partially or completely determined so far encompasses 1520 nucleotides, i.e. about 95% of the molecule. Possible features of the secondary structure are suggested on the basis of the nucleotide sequence and data on sequence heterogeneities, repetitions and the location of modified nucleotides are presented. In the accompanying paper, the use of the nucleotide sequence data in studies of the ribosomal protein binding sites is described. PMID:1091918

  18. Genomic RNA sequence of feline coronavirus strain FCoV C1Je

    PubMed Central

    Dye, Charlotte; Siddell, Stuart G.

    2007-01-01

    This paper reports the first genomic RNA sequence of a field strain feline coronavirus (FCoV). Viral RNA was isolated at post mortem from the jejunum and liver of a cat with feline infectious peritonitis (FIP). A consensus sequence of the jejunum-derived genomic RNA (FCoV C1Je) was determined from overlapping cDNA fragments produced by reverse transcriptase polymerase chain reaction (RT-PCR) amplification. RT-PCR products were sequenced by a reiterative sequencing strategy and the genomic RNA termini were determined using a rapid amplification of cDNA ends PCR strategy. The FCoV C1Je genome was found to be 29,255 nucleotides in length, excluding the poly(A) tail. Comparison of the FCoV C1Je genomic RNA sequence with that of the laboratory strain FCoV FIP virus (FIPV) 79-1146 showed that both viruses have a similar genome organisation and predictions made for the open reading frames and cis-acting elements of the FIPV 79-1146 genome hold true for FCoV C1Je. In addition, the sequence of the 3′-proximal third of the liver derived genomic RNA (FCoV C1Li), which encompasses the structural and accessory protein genes of the virus, was also determined. Comparisons of the enteric (jejunum) and non-enteric (liver) derived viral RNA sequences revealed 100% nucleotide identity, a finding that questions the well accepted ‘internal mutation theory’ of FIPV pathogenicity. PMID:17363313

  19. Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; Willson, Richard C.; Fox, George E.

    2002-01-01

    MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. The extent to which such signature sequences exist in the 16S rRNA dataset is not clear, however. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
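
    The idea of a "perfect" signature oligonucleotide in the abstract above can be sketched in a few lines. The abstract does not give the formula for Q(s), so the sketch below does not attempt it; it only illustrates the limiting case, assuming a perfect signature means a k-mer found in every sequence of a group and in no sequence outside it. The function names and toy sequences are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def perfect_signatures(group, others, k):
    """k-mers present in every sequence of `group` and absent from `others`.

    This is only the limiting case of a signature; it is not the Q(s)
    measure from the paper, whose definition is not given in the abstract.
    """
    if not group:
        return set()
    # Start from the k-mers shared by all members of the group ...
    shared = set.intersection(*(kmers(s, k) for s in group))
    # ... and discard any that also occur outside the group.
    for seq in others:
        shared -= kmers(seq, k)
    return shared


if __name__ == "__main__":
    group = ["ACGUACGG", "UUACGUAC"]   # toy "cluster" of two sequences
    others = ["GGGACGUU"]              # toy sequences outside the cluster
    print(sorted(perfect_signatures(group, others, 5)))
```

    On real data the candidate set would then be screened against the full sequence collection, in the spirit of the false-positive comparison the abstract describes.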

  20. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    PubMed

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes, enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules with complementarity to human genes in plant tissue from crops that have a long history of safe consumption, such as lettuce, tomato, corn, soy and rice, support the conclusion that long dsRNAs do not present a significant dietary risk.
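
    The 21-nucleotide perfect-complementarity criterion mentioned above can be illustrated with a small search routine. This is a sketch under stated assumptions, not the pipeline used in the study: sequences are taken to be uppercase RNA written 5' to 3', only Watson-Crick pairs count (no G:U wobble), and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an uppercase RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def complementary_sites(strand, target, length=21):
    """Find windows of `strand` perfectly complementary to `target`.

    Returns (strand_start, target_start) pairs, 0-based, for every window
    of `length` nucleotides in `strand` whose reverse complement occurs in
    `target`.
    """
    # Index every window of the target transcript once.
    target_index = {}
    for j in range(len(target) - length + 1):
        target_index.setdefault(target[j:j + length], []).append(j)

    hits = []
    for i in range(len(strand) - length + 1):
        window = strand[i:i + length]
        # A perfectly complementary site appears in the target as the
        # reverse complement of the window.
        for j in target_index.get(reverse_complement(window), []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    target = "AAGCUUGGCAUC"
    strand = reverse_complement(target[2:8]) + "AAA"
    # A short window length is used here only to keep the toy example small.
    print(complementary_sites(strand, target, length=6))
```

    The default window of 21 matches the threshold quoted in the abstract; the demonstration uses 6 only so that the toy sequences stay readable.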

  1. Nucleotide Sequence Analyses and Predicted Coding of Bunyavirus Genome RNA Species

    PubMed Central

    Clerx-van Haaster, Corrie M.; Akashi, Hiroomi; Auperin, David D.; Bishop, David H. L.

    1982-01-01

    We performed 3′ RNA sequence analyses of [32P]pCp-end-labeled La Crosse (LAC) virus, alternate LAC virus isolate L74, and snowshoe hare bunyavirus large (L), medium (M), and small (S) negative-stranded viral RNA species to determine the coding capabilities of these species. These analyses were confirmed by dideoxy primer extension studies in which we used a synthetic oligodeoxynucleotide primer complementary to the conserved 3′-terminal decanucleotide of the three viral RNA species (Clerx-van Haaster and Bishop, Virology 105:564-574, 1980). The deduced sequences predicted translation of two S-RNA gene products that were read in overlapping reading frames. So far, only single contiguous open reading frames have been identified for the viral M- and L-RNA species. For the negative-stranded M-RNA species of all three viruses, the single reading frame developed from the first 3′-proximal UAC triplet. Likewise, for the L-RNA of the alternate LAC isolate, a single open reading frame developed from the first 3′-proximal UAC triplet. The corresponding L-RNA sequences of prototype LAC and snowshoe hare viruses initiated open reading frames; however, for both viral L-RNA species there was a preceding 3′-proximal UAC triplet in another reading frame that was followed shortly afterward by a termination codon. A comparison of the sequence data obtained for snowshoe hare virus, LAC virus, and the alternate LAC virus isolate showed that the identified nucleotide substitutions were sufficient to account for some of the fingerprint differences in the L-, M-, and S-RNA species of the three viruses. Unlike the distribution of the L- and M-RNA substitutions, significantly fewer nucleotide substitutions occurred after the initial UAC triplet of the S-RNA species than before this triplet, implying that the overlapping genes of the S RNA provided a constraint against evolution by point mutation. The comparative sequence analyses predicted amino acid differences among the

  2. Nucleotide sequence and newly formed phosphodiester bond of spontaneously ligated satellite tobacco ringspot virus RNA.

    PubMed Central

    Buzayan, J M; Hampel, A; Bruening, G

    1986-01-01

    The satellite RNA of tobacco ringspot virus (STobRV RNA) replicates and becomes encapsidated in association with tobacco ringspot virus. Previous results show that the infected tissue produces multimeric STobRV RNAs of both polarities. RNA that is complementary to encapsidated STobRV RNA, designated as having the (-) polarity, cleaves autolytically at a specific ApG bond. Purified autolysis products spontaneously join in a non-enzymic reaction. We report characteristics of this RNA ligation reaction: the terminal groups that react, the type of bond in the newly formed junction and the nucleotide sequence of the joined RNA. The nucleotide sequence of the ligated RNA shows that joining of the reacting RNAs restored an ApG bond. The junction ApG has a 3'-to-5' phosphodiester bond. Thus the net ligation reaction of STobRV (-)RNA is the precise reversal of autolysis. We discuss this new type of RNA ligation reaction and its implications for the formation of multimeric STobRV RNAs during replication. PMID:2433680

  3. Nodavirus Coat Protein Imposes Dodecahedral RNA Structure Independent of Nucleotide Sequence and Length†

    PubMed Central

    Tihova, Mariana; Dryden, Kelly A.; Le, Thuc-vy L.; Harvey, Stephen C.; Johnson, John E.; Yeager, Mark; Schneemann, Anette

    2004-01-01

    The nodavirus Flock house virus (FHV) has a bipartite, positive-sense RNA genome that is packaged into an icosahedral particle displaying T=3 symmetry. The high-resolution X-ray structure of FHV has shown that 10 bp of well-ordered, double-stranded RNA are located at each of the 30 twofold axes of the virion, but it is not known which portions of the genome form these duplex regions. The regular distribution of double-stranded RNA in the interior of the virus particle indicates that large regions of the encapsidated genome are engaged in secondary structure interactions. Moreover, the RNA is restricted to a topology that is unlikely to exist during translation or replication. We used electron cryomicroscopy and image reconstruction to determine the structure of four types of FHV particles that differed in RNA and protein content. RNA-capsid interactions were primarily mediated via the N and C termini, which are essential for RNA recognition and particle assembly. A substantial fraction of the packaged nucleic acid, either viral or heterologous, was organized as a dodecahedral cage of duplex RNA. The similarity in tertiary structure suggests that RNA folding is independent of sequence and length. Computational modeling indicated that RNA duplex formation involves both short-range and long-range interactions. We propose that the capsid protein is able to exploit the plasticity of the RNA secondary structures, capturing those that are compatible with the geometry of the dodecahedral cage. PMID:14990708

  4. Identification of conserved and novel microRNAs in Aquilaria sinensis based on small RNA sequencing and transcriptome sequence data.

    PubMed

    Gao, Zhi-Hui; Wei, Jian-He; Yang, Yun; Zhang, Zheng; Xiong, Huan-Ying; Zhao, Wen-Ting

    2012-08-15

    Agarwood is in great demand across Asia, the Middle East, and Europe for its high value in medicine, incense, and perfume. Because agarwood forms only when Aquilaria trees are wounded or infected by certain microbes, overharvesting and habitat loss are threatening some populations of agarwood-producing species. Aquilaria sinensis is one such economically significant tree species. To improve production efficiency and protect A. sinensis as a resource, it is critical to reveal the regulatory mechanisms of stress-induced agarwood formation. MicroRNAs (miRNAs), key regulators of gene expression involved in various plant stress responses and metabolic processes, might function in agarwood formation, but no report concerning miRNAs in Aquilaria is available. In this study, small RNA high-throughput sequencing and 454 transcriptome data were used to identify both conserved and novel miRNAs in A. sinensis. Deep sequencing showed that the small RNA (sRNA) population of A. sinensis was complex and that the length of sRNAs varied. By in silico analysis of the small RNA deep sequencing data and transcriptome data, we discovered 27 novel miRNAs in A. sinensis. Based on mature miRNA sequence conservation, we identified 74 putative conserved miRNAs from A. sinensis, 10 of which were confirmed with a hairpin-forming precursor. Interestingly, a novel miRNA sequence was determined to be the miRNA* of asi-miR408, but with accumulation much higher than that of asi-miR408. The expression levels of ten stress-responsive miRNAs were examined over a time course after wound treatment; eight were shown to be wound-responsive. These results not only show the existence of miRNAs in this economically significant Asian tree species but also indicate their critical role in stress-induced agarwood formation. The high accumulation of the miRNA* of asi-miR408 implies that miRNA*s may be functional in plants, as miRNAs are.

  5. The phylogenetic utility and functional constraint of microRNA flanking sequences

    PubMed Central

    Kenny, Nathan J.; Sin, Yung Wa; Hayward, Alexander; Paps, Jordi; Chu, Ka Hou; Hui, Jerome H. L.

    2015-01-01

    MicroRNAs (miRNAs) have recently risen to prominence as novel factors responsible for post-transcriptional regulation of gene expression. miRNA genes have been posited as highly conserved in the clades in which they exist. Consequently, miRNAs have been used as rare genome change characters to estimate phylogeny by tracking their gain and loss. However, their short length (21–23 bp) has limited their perceived utility in sequence-based phylogenetic inference. Here, using reference taxa with established phylogenetic relationships, we demonstrate that miRNA sequences are of high utility in quantitative, rather than in qualitative, phylogenetic analysis. The clear orthology among miRNA genes from different species makes it straightforward to identify and align these sequences from even fragmentary datasets. We also identify significant sequence conservation in the regions directly flanking miRNA genes, and show that this too is of utility in phylogenetic analysis, as well as highlighting conserved regions that will be of interest to other fields. Employing miRNA sequences from 12 sequenced drosophilid genomes, together with a Tribolium castaneum outgroup, we demonstrate that this approach is robust using Bayesian and maximum-likelihood methods. The utility of these characters is further demonstrated in the rhabditid nematodes and primates. As next-generation sequencing makes it more cost-effective to sequence genomes and small RNA libraries, this methodology provides an alternative data source for phylogenetic analysis. The approach allows rapid resolution of relationships between both closely related and rapidly evolving species, and provides an additional tool for investigation of relationships within the tree of life. PMID:25694624

  6. Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing.

    PubMed

    Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping

    2015-08-01

    Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution.

  7. Deep Sequencing Analysis of Nucleolar Small RNAs: RNA Isolation and Library Preparation.

    PubMed

    Bai, Baoyan; Laiho, Marikki

    2016-01-01

    The nucleolus is a subcellular compartment with an essential function in ribosome biogenesis. The nucleolus is rich in noncoding RNAs, mostly the ribosomal RNAs and small nucleolar RNAs. Surprisingly, several miRNAs have also been detected in the nucleolus, raising the question as to whether other small RNA species are present and functional in the nucleolus. We have developed a strategy for stepwise enrichment of nucleolar small RNAs from the total nucleolar RNA extracts and subsequent construction of nucleolar small RNA libraries which are suitable for deep sequencing. Our method successfully isolates the small RNA population from total RNAs and monitors the RNA quality in each step to ensure that small RNAs recovered represent the actual small RNA population in the nucleolus and not degradation products from larger RNAs. We have further applied this approach to characterize the distribution of small RNAs in different cellular compartments. PMID:27576723

  8. Cell-Free Transcription of Mammalian Chromatin: Transcription of Globin Messenger RNA Sequences from Bone-Marrow Chromatin with Mammalian RNA Polymerase

    PubMed Central

    Steggles, A. W.; Wilson, G. N.; Kantor, J. A.; Picciano, D. J.; Falvey, A. K.; Anderson, W. F.

    1974-01-01

    A mammalian cell-free transcriptional system was developed in which mammalian RNA polymerase synthesizes globin messenger RNA sequences from bone-marrow chromatin. The messenger RNA sequences are detected by measurement of the ability of the transcribed RNA to hybridize with globin complementary DNA. The globin complementary DNA is synthesized by the enzyme from avian myeloblastosis virus, RNA-directed DNA polymerase, with purified globin messenger RNA as template. The specificity of the globin complementary DNA in annealing reactions was verified by preparing DNA complementary to liver messenger RNA and showing that the globin and liver complementary DNAs are specific for their own messenger RNAs. Both DNA-dependent RNA polymerase II from sheep liver and RNA polymerase from Escherichia coli can transcribe globin messenger RNA sequences from rabbit bone-marrow chromatin; however, the mammalian enzyme appears to be more specific in that globin gene sequences represent a higher proportion of the RNA synthesized. Neither polymerase can transcribe globin messenger RNA sequences from rabbit-liver chromatin. This cell-free assay system should be useful in searching for mammalian transcriptional regulatory factors. PMID:4364529

  9. Analyzing the microRNA Transcriptome in Plants Using Deep Sequencing Data

    PubMed Central

    Yang, Xiaozeng; Li, Lei

    2012-01-01

    MicroRNAs (miRNAs) are 20- to 24-nucleotide endogenous small RNA molecules emerging as an important class of sequence-specific, trans-acting regulators for modulating gene expression at the post-transcription level. There has been a surge of interest in the past decade in identifying miRNAs and profiling their expression pattern using various experimental approaches. In particular, ultra-deep sampling of specifically prepared low-molecular-weight RNA libraries based on next-generation sequencing technologies has been used successfully in diverse species. The challenge now is to effectively deconvolute the complex sequencing data to provide comprehensive and reliable information on the miRNAs, miRNA precursors, and expression profile of miRNA genes. Here we review the recently developed computational tools and their applications in profiling the miRNA transcriptomes, with an emphasis on the model plant Arabidopsis thaliana. Also highlighted are progress and insights into miRNA biology derived from analyzing available deep sequencing data. PMID:24832228

  10. New genera of RNA viruses in subtropical seawater, inferred from polymerase gene sequences.

    PubMed

    Culley, Alexander I; Steward, Grieg F

    2007-09-01

    Viruses are an integral component of the marine food web, contributing to the disease and mortality of essentially every type of marine life, yet the diversity of viruses in the sea, especially those with RNA genomes, remains very poorly characterized. Isolates of RNA-containing viruses that infect marine plankton are still rare, and the only cultivation-independent surveys of RNA viral diversity reported so far were conducted for temperate coastal waters of British Columbia. Here, we report on our improvements to a previously used protocol to investigate the diversity of marine picorna-like viruses and our results from applying this protocol in subtropical waters. The original protocol was simplified by using direct filtration, rather than tangential flow filtration, to harvest viruses from seawater, and new degenerate primers were designed to amplify a fragment of the RNA-dependent RNA polymerase gene by reverse transcription-PCR from RNA extracted from the filters. Whereas the original protocol was unsuccessful in a preliminary test, the new protocol resulted in amplification of picorna-like virus sequences in every sample of subtropical and temperate coastal seawater assayed. These polymerase sequences formed a diverse, but monophyletic cluster along with other sequences amplified previously from seawater and sequences from isolates infecting marine protists. Phylogenetic analysis suggested that our sequences represent at least five new genera and 24 new species of RNA viruses. These results contribute to our understanding of RNA virus diversity and suggest that picorna-like viruses are a source of mortality for a wide variety of marine protists.

  11. StarScan: a web server for scanning small RNA targets from degradome sequencing data

    PubMed Central

    Liu, Shun; Li, Jun-Hao; Wu, Jie; Zhou, Ke-Ren; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2015-01-01

    Endogenous small non-coding RNAs (sRNAs), including microRNAs, PIWI-interacting RNAs and small interfering RNAs, play important gene regulatory roles in animals and plants by pairing to the protein-coding and non-coding transcripts. However, computationally assigning these various sRNAs to their regulatory target genes remains technically challenging. Recently, a high-throughput degradome sequencing method was applied to identify biologically relevant sRNA cleavage sites. In this study, an integrated web-based tool, StarScan (sRNA target Scan), was developed for scanning sRNA targets using degradome sequencing data from 20 species. Given a sRNA sequence from plants or animals, our web server performs an ultrafast and exhaustive search for potential sRNA–target interactions in annotated and unannotated genomic regions. The interactions between small RNAs and target transcripts were further evaluated using a novel tool, alignScore. A novel tool, degradomeBinomTest, was developed to quantify the abundance of degradome fragments located at the 9–11th nucleotide from the sRNA 5′ end. This is the first web server for discovering potential sRNA-mediated RNA cleavage events in plants and animals, which affords mechanistic insights into the regulatory roles of sRNAs. The StarScan web server is available at http://mirlab.sysu.edu.cn/starscan/. PMID:25990732
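
The abstract names degradomeBinomTest but does not give its formula. Purely as an illustration of the kind of quantity a binomial test over degradome reads could report, the sketch below computes a one-sided binomial tail probability. The function names, the 21-nt site length, and the uniform null (reads falling in the 3-nt window at positions 9-11 with probability 3/21) are assumptions made for this example, not details taken from StarScan.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def cleavage_enrichment_pvalue(
    reads_in_window: int,
    total_reads: int,
    window_len: int = 3,   # hypothetical: positions 9-11 of the sRNA
    site_len: int = 21,    # hypothetical: length of the paired target site
) -> float:
    """One-sided p-value for excess reads in the window under a uniform null.

    Assumption: under the null, each degradome read 5' end lands at any
    position of the site with equal probability.
    """
    p_null = window_len / site_len
    return binom_upper_tail(reads_in_window, total_reads, p_null)
```

For instance, `cleavage_enrichment_pvalue(18, 20)` asks how surprising it would be for 18 of 20 read ends to fall in the window if read ends were spread evenly over the site; a small value would suggest enrichment at the expected cleavage position.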

  12. MicroRNA-373 induces expression of genes with complementary promoter sequences.

    PubMed

    Place, Robert F; Li, Long-Cheng; Pookot, Deepa; Noonan, Emily J; Dahiya, Rajvir

    2008-02-01

    Recent studies have shown that microRNA (miRNA) regulates gene expression by repressing translation or directing sequence-specific degradation of complementary mRNA. Here, we report new evidence in which miRNA may also function to induce gene expression. By scanning gene promoters in silico for sequences complementary to known miRNAs, we identified a putative miR-373 target site in the promoter of E-cadherin. Transfection of miR-373 and its precursor hairpin RNA (pre-miR-373) into PC-3 cells readily induced E-cadherin expression. Knockdown experiments confirmed that induction of E-cadherin by pre-miR-373 required the miRNA maturation protein Dicer. Further analysis revealed that cold-shock domain-containing protein C2 (CSDC2), which possesses a putative miR-373 target site within its promoter, was also readily induced in response to miR-373 and pre-miR-373. Furthermore, enrichment of RNA polymerase II was detected at both E-cadherin and CSDC2 promoters after miR-373 transfection. Mismatch mutations to miR-373 indicated that gene induction was specific to the miR-373 sequence. Transfection of promoter-specific dsRNAs revealed that the concurrent induction of E-cadherin and CSDC2 by miR-373 required the miRNA target sites in both promoters. In conclusion, we have identified a miRNA that targets promoter sequences and induces gene expression. These findings reveal a new mode by which miRNAs may regulate gene expression.

  13. Uncultivated microbial eukaryotic diversity: a method to link ssu rRNA gene sequences with morphology.

    PubMed

    Hirst, Marissa B; Kita, Kelley N; Dawson, Scott C

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA "phylotypes" from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages

  15. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server

    PubMed Central

    Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles

    2015-01-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960

  16. Sequence and phylogenetic analysis of SSU rRNA gene of five microsporidia.

    PubMed

    Dong, ShiNan; Shen, ZhongYuan; Xu, Li; Zhu, Feng

    2010-01-01

    The complete small subunit rRNA (SSU rRNA) gene sequences of five microsporidia, comprising Nosema heliothidis and four novel microsporidia isolated from Pieris rapae, Phyllobrotica armta, Hemerophila atrilineata, and Bombyx mori, respectively, were obtained by PCR amplification, cloning, and sequencing. Two phylogenetic trees based on SSU rRNA sequences were constructed using the Neighbor-Joining method in Phylip and the UPGMA method in MEGA4.0. The taxonomic status of the four novel microsporidia was determined by analysis of the phylogenetic relationship, length, G+C content, identity, and divergence of the SSU rRNA sequences. The results showed that the microsporidia isolated from Pieris rapae, Phyllobrotica armta, and Hemerophila atrilineata have a close phylogenetic relationship with Nosema, while the microsporidium isolated from Bombyx mori is closely related to Endoreticulatus. We therefore provisionally assign three of the novel microsporidia to the genus Nosema, as Nosema sp. PR, Nosema sp. PA, and Nosema sp. HA, and the fourth to the genus Endoreticulatus, as Endoreticulatus sp. Zhenjiang. The results also indicate that it is feasible and valuable to elucidate phylogenetic relationships and taxonomic status of microsporidian species by analyzing SSU rRNA sequence information. PMID:19768503
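
Two of the sequence statistics mentioned in this abstract, G+C content and pairwise identity, are simple to compute. The following is a minimal sketch with hypothetical function names; it assumes the two sequences passed to `percent_identity` have already been aligned to equal length with `-` as the gap character, and it is not the software used in the study.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T, U)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    return gc / total if total else 0.0


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)
```

Divergence is often reported as 100 minus the percent identity, although distance-corrected measures are also common; the abstract does not say which definition was used.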

  17. RNA editing in plant mitochondria—connecting RNA target sequences and acting proteins.

    PubMed

    Takenaka, Mizuki; Verbitskiy, Daniil; Zehrmann, Anja; Härtel, Barbara; Bayer-Császár, Eszter; Glass, Franziska; Brennicke, Axel

    2014-11-01

    RNA editing changes several hundred cytidines to uridines in the mRNAs of mitochondria in flowering plants. The target cytidines are identified by a subtype of PPR proteins characterized by tandem modules, each of which binds a specific upstream nucleotide. Recent progress in correlating repeat structures with nucleotide identities allows target sites in mitochondrial RNAs to be predicted and identified. Additional proteins have been found to play a role in RNA editing; their precise function still needs to be elucidated. The enzymatic activity performing the C to U reaction may reside in the C-terminal DYW extensions of the PPR proteins; however, this still needs to be proven. Here we update recent progress in understanding RNA editing in flowering plant mitochondria.

  18. Normalization, testing, and false discovery rate estimation for RNA-sequencing data.

    PubMed

    Li, Jun; Witten, Daniela M; Johnstone, Iain M; Tibshirani, Robert

    2012-07-01

    We discuss the identification of genes that are associated with an outcome in RNA sequencing and other sequence-based comparative genomic experiments. RNA-sequencing data take the form of counts, so models based on the Gaussian distribution are unsuitable. Moreover, normalization is challenging because different sequencing experiments may generate quite different total numbers of reads. To overcome these difficulties, we use a log-linear model with a new approach to normalization. We derive a novel procedure to estimate the false discovery rate (FDR). Our method can be applied to data with quantitative, two-class, or multiple-class outcomes, and the computation is fast even for large data sets. We study the accuracy of our approaches for significance calculation and FDR estimation, and we demonstrate that our method has potential advantages over existing methods that are based on a Poisson or negative binomial model. In summary, this work provides a pipeline for the significance analysis of sequencing data.
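
The abstract does not describe the authors' own FDR estimator. For orientation only, the sketch below implements the widely used Benjamini-Hochberg adjustment, which is a different, standard procedure and not the method proposed in this paper; the function name is chosen for this example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a common choice) would be reported as significant at that FDR level.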

  19. The reverse transcription signature of N-1-methyladenosine in RNA-Seq is sequence dependent

    PubMed Central

    Hauenschild, Ralf; Tserovski, Lyudmil; Schmid, Katharina; Thüring, Kathrin; Winz, Marie-Luise; Sharma, Sunny; Entian, Karl-Dieter; Wacheul, Ludivine; Lafontaine, Denis L. J.; Anderson, James; Alfonzo, Juan; Hildebrandt, Andreas; Jäschke, Andres; Motorin, Yuri; Helm, Mark

    2015-01-01

    The combination of Reverse Transcription (RT) and high-throughput sequencing has emerged as a powerful approach for detecting modified nucleotides in RNA via analysis of either abortive RT-products or of the incorporation of mismatched dNTPs into cDNA. Here we simultaneously analyze both parameters in detail with respect to the occurrence of N-1-methyladenosine (m1A) in the template RNA. This naturally occurring modification is associated with structural effects, but it is also known as a mediator of antibiotic resistance in ribosomal RNA. In structural probing experiments with dimethylsulfate, m1A is routinely detected by RT-arrest. A specifically developed RNA-Seq protocol was tailored to the simultaneous analysis of RT-arrest and misincorporation patterns. By application to a variety of native and synthetic RNA preparations, we found a characteristic signature of m1A, which, in addition to an arrest rate, features misincorporation as a significant component. Detailed analysis suggests that the signature depends on RNA structure and on the nature of the nucleotide 3′ of m1A in the template RNA, meaning it is sequence dependent. The RT-signature of m1A was used for inspection and confirmation of suspected modification sites and resulted in the identification of hitherto unknown m1A residues in trypanosomal tRNA. PMID:26365242

  20. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    PubMed

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-01-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic. PMID:26830453

  1. The landscape of fusion transcripts in spitzoid melanoma and biologically indeterminate spitzoid tumors by RNA sequencing

    PubMed Central

    Wu, Gang; Barnhill, Raymond L.; Lee, Seungjae; Li, Yongjin; Shao, Ying; Easton, John; Dalton, James; Zhang, Jinghui; Pappo, Alberto; Bahrami, Armita

    2016-01-01

    Kinase activation by chromosomal translocations is a common mechanism that drives tumorigenesis in spitzoid neoplasms. To explore the landscape of fusion transcripts in these tumors, we performed whole-transcriptome sequencing using formalin-fixed paraffin-embedded tissues in malignant or biologically indeterminate spitzoid tumors from 7 patients (age 2–14 years). RNA sequence libraries enriched for coding regions were prepared and the sequencing was analyzed by a novel assembly-based algorithm designed for detecting complex fusions. In addition, tumor samples were screened for hotspot TERT promoter mutations, and telomerase expression was assessed by TERT mRNA in situ hybridization (ISH). Two patients had widespread metastasis and subsequently died of disease, and 5 patients had a benign clinical course on limited follow-up (mean: 30 months). RNA sequencing and TERT mRNA ISH were successful in 6 tumors and unsuccessful in 1 disseminating tumor due to low RNA quality. RNA sequencing identified a kinase fusion in 5 of the 6 sequenced tumors: TPM3–NTRK1 (2 tumors), complex rearrangements involving TPM3, ALK, and IL6R (1 tumor), BAIAP2L1–BRAF (1 tumor), and EML4–BRAF (1 disseminating tumor). All predicted chimeric transcripts were expressed at high levels and contained the intact kinase domain. In addition, 2 tumors each contained a second fusion gene, ARID1B-SNX9 or PTPRZ1-NFAM1. The detected chimeric genes were validated by home-brew break-apart or fusion fluorescence in situ hybridization. The 2 disseminating tumors each harbored the TERT promoter −124C>T (Chr 5:1,295,228 hg19 coordinate) mutation whereas the remaining 5 tumors retained the wild-type gene. The presence of the −124C>T mutation correlated with telomerase expression by TERT mRNA ISH. In summary, we demonstrated complex fusion transcripts and novel partner genes for BRAF by RNA sequencing of FFPE samples. The diversity of gene fusions demonstrated by RNA sequencing defines the molecular

  2. Integrative Approaches for microRNA Target Prediction: Combining Sequence Information and the Paired mRNA and miRNA Expression Profiles

    PubMed Central

    Naifang, Su; Minping, Qian; Minghua, Deng

    2013-01-01

    Gene regulation is a key factor in gaining a full understanding of molecular biology. microRNA (miRNA), a novel class of non-coding RNA, has recently been found to be a crucial class of post-transcriptional regulators and plays important roles in cancer. One essential step in understanding the regulatory effect of miRNAs is the reliable prediction of their target mRNAs. Typically, the predictions are based solely on sequence information, which unavoidably leads to high false detection rates. Recently, some novel approaches have been developed to predict miRNA targets by integrating the typical algorithm with the paired expression profiles of miRNA and mRNA. Here we review and discuss these integrative approaches and propose a new algorithm called HCTarget. Applying HCTarget to the expression data in multiple myeloma, we predict target genes for ten specific miRNAs. The experimental verification and a loss of function study validate our predictions. Therefore, the integrative approach is a reliable and effective way to predict miRNA targets, and could improve our comprehensive understanding of gene regulation. PMID:23467572

  3. Bioinformatics of Cancer ncRNA in High Throughput Sequencing: Present State and Challenges

    PubMed Central

    Jorge, Natasha Andressa Nogueira; Ferreira, Carlos Gil; Passetti, Fabio

    2012-01-01

    Numerous genome sequencing projects have produced an unprecedented amount of data, providing significant information for the discovery of novel non-coding RNAs (ncRNAs). Several ncRNAs have been described to control gene expression and play important roles during cell differentiation and homeostasis. In the last decade, high throughput methods in conjunction with bioinformatics approaches have been used to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological states, such as cancer. Patient outcomes have already been associated with differential expression of ncRNAs in normal and tumoral tissues, providing new insights for the development of innovative therapeutic strategies in oncology. In this review, we present and discuss bioinformatics advances in the development of computational approaches to analyze and discover ncRNA data in oncology using high throughput sequencing technologies. PMID:23251139

  4. Oligonucleotide probes for Bordetella bronchiseptica based on 16S ribosomal RNA sequences.

    PubMed

    Taneda, A; Futo, S; Mitsuse, S; Seto, Y; Okada, M; Sakano, T

    1994-12-01

    The Bordetella bronchiseptica 16S ribosomal RNA (rRNA) gene was cloned and identified. On the basis of computer-assisted comparison of the B. bronchiseptica 16S rRNA sequence with those of other bacterial species, we constructed B. bronchiseptica-specific oligonucleotide probes complementary to variable regions in the 16S rRNA molecule. The specificity of these 32P-labeled oligonucleotide probes was tested by RNA/DNA hybridization with B. bronchiseptica strains and other bacterial strains. Probe BB4 was more specific than three other oligonucleotide probes and was sensitive enough to detect 10^4 bacterial cells. PMID:9133055

  5. Prediction of Immunomodulatory potential of an RNA sequence for designing non-toxic siRNAs and RNA-based vaccine adjuvants

    PubMed Central

    Chaudhary, Kumardeep; Nagpal, Gandharva; Dhanda, Sandeep Kumar; Raghava, Gajendra P. S.

    2016-01-01

    Our innate immune system recognizes a foreign RNA sequence of a pathogen and activates the immune system to eliminate the pathogen from our body. This immunomodulatory potential of RNA can be used to design RNA-based immunotherapy and vaccine adjuvants. In the case of siRNA-based therapy, the immunomodulatory effect of an RNA sequence is unwanted as it may cause immunotoxicity. Thus, we developed a method for designing a single-stranded RNA (ssRNA) sequence with desired immunomodulatory potential, for use in RNA-based therapeutics, immunotherapy and vaccine adjuvants. The dataset used for training and testing our models consists of 602 experimentally verified immunomodulatory oligoribonucleotides (IMORNs) that are ssRNA sequences of length 17 to 27 nucleotides and 520 circulating miRNAs as non-immunomodulatory sequences. We developed prediction models using various features that include composition-based features, binary profile, selected features, and hybrid features. All models were evaluated using five-fold cross-validation and external validation techniques, achieving a maximum mean Matthews Correlation Coefficient (MCC) of 0.86 with 93% accuracy. We identified motifs using MERCI software and observed the abundance of adenine (A) in motifs. Based on the above study, we developed a web server, imRNA, comprising various modules important for designing RNA-based therapeutics (http://crdd.osdd.net/raghava/imrna/). PMID:26861761
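
    For readers unfamiliar with the Matthews Correlation Coefficient quoted above, the following sketch shows how it is computed from a binary confusion matrix. The toy labels are made up for illustration and are unrelated to the imRNA dataset; the function names are hypothetical.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (positive) / 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy example: 3 positives and 3 negatives, one error in each class.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
score = mcc(*confusion_counts(y_true, y_pred))
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes differ in size (here 602 versus 520 sequences).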

  6. Structator: fast index-based search for RNA sequence-structure patterns

    PubMed Central

    2011-01-01

    Background The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results We present a novel method and readily applicable software for time-efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows base pairing information to be exploited for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several
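
    The inside-out matching order described in this abstract (loop first, then pairing bases outward) can be sketched as follows. This is a plain linear scan written only to show the matching order; it does not use affix arrays and therefore has none of the sublinear behaviour claimed for Structator. The function name and the choice to allow G-U wobble pairs are assumptions.

```python
# Watson-Crick pairs plus G-U wobble pairs (allowing wobble is an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stem_loops(seq, loop, stem_len):
    """Find hairpins whose loop equals `loop` and whose stem has `stem_len`
    consecutive base pairs. Returns half-open (start, end) intervals in `seq`."""
    hits = []
    start = seq.find(loop)
    while start != -1:
        # Step 1: the loop is matched; now extend outward in both directions.
        left, right = start - 1, start + len(loop)
        paired = 0
        while (paired < stem_len and left >= 0 and right < len(seq)
               and (seq[left], seq[right]) in PAIRS):
            paired += 1
            left -= 1
            right += 1
        # Step 2: accept only if every required pair was found.
        if paired == stem_len:
            hits.append((left + 1, right))
        start = seq.find(loop, start + 1)
    return hits


# Toy sequence: AA + GGC (5' stem) + GAAA (loop) + GCC (3' stem) + UU
toy = "AAGGCGAAAGCCUU"
hairpins = find_stem_loops(toy, "GAAA", 3)
```

    Checking the loop before the stem means most positions are rejected after a cheap exact-match test, which is the intuition behind using base pairing information to shrink the search space.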

  7. Nucleotide sequence of a satellite RNA associated with carrot motley dwarf in parsley and carrot.

    PubMed

    Menzel, Wulf; Maiss, Edgar; Vetten, H Josef

    2009-02-01

    Carrot motley dwarf (CMD) is known to result from a mixed infection by two viruses, the polerovirus Carrot red leaf virus and one of the umbraviruses Carrot mottle mimic virus or Carrot mottle virus. Some umbraviruses have been shown to be associated with small satellite (sat) RNAs, but none have been reported for the latter two. A CMD-affected parsley plant was used for sap transmission to test plants, that were used for dsRNA isolation. The presence of a 0.8-kbp dsRNA indicated the occurrence of a hitherto unrecognized satRNA associated with CMD. The satRNAs of the CMD isolate from parsley and an isolate from carrot have been sequenced and showed 94% sequence identity. Nucleotide sequences and putative translation products had no significant similarities to GenBank entries. To our knowledge, this is the first report of satRNAs associated with CMD.

  8. Telomerase RNA stem terminus element affects template boundary element function, telomere sequence, and shelterin binding.

    PubMed

    Webb, Christopher J; Zakian, Virginia A

    2015-09-01

    The stem terminus element (STE), which was discovered 13 y ago in human telomerase RNA, is required for telomerase activity, yet its mode of action is unknown. We report that the Schizosaccharomyces pombe telomerase RNA, TER1 (telomerase RNA 1), also contains a STE, which is essential for telomere maintenance. Cells expressing a partial loss-of-function TER1 STE allele maintained short stable telomeres by a recombination-independent mechanism. Remarkably, the mutant telomere sequence was different from that of wild-type cells. Generation of the altered sequence is explained by reverse transcription into the template boundary element, demonstrating that the STE helps maintain template boundary element function. The altered telomeres bound less Pot1 (protection of telomeres 1) and Taz1 (telomere-associated in Schizosaccharomyces pombe 1) in vivo. Thus, the S. pombe STE, although distant from the template, ensures proper telomere sequence, which in turn promotes proper assembly of the shelterin complex.

  9. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    SciTech Connect

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
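
    One way to use a control-derived error rate to limit false positive variant calls is sketched below. The abstract does not state which statistical model was used; the binomial test, the significance level `alpha`, and all function names here are assumptions made for illustration.

```python
from math import comb


def estimate_error_rate(mismatches, total_bases):
    """Per-base error rate from a control of known sequence."""
    return mismatches / total_bases


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation.
    Suitable for small k; large k at ultra-deep coverage would need a
    numerically more careful method."""
    return 1.0 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))


def is_variant(alt_count, depth, error_rate, alpha=1e-3):
    """Call a variant when the number of non-reference bases at a site is
    unlikely to arise from sequencing error alone (alpha is an assumed cutoff)."""
    return binom_sf(alt_count, depth, error_rate) < alpha


# Made-up numbers: 20 mismatches in 10,000 control bases.
rate = estimate_error_rate(20, 10_000)
noise_site = is_variant(alt_count=2, depth=1000, error_rate=rate)
real_site = is_variant(alt_count=30, depth=1000, error_rate=rate)
```

    At depth 1000 with this error rate about two erroneous bases are expected per site, so two alternate bases are treated as noise while thirty are called as a variant.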

  10. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence

    PubMed Central

    Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species because some biochemical reactions are unstable and some species, e.g. Yersinia frederiksenii and Yersinia intermedia, present the same biochemical profile; these need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain, in which the phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, through analysis of integrated information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species. PMID:26808495

  11. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence.

    PubMed

    Hao, Huijing; Liang, Junrong; Duan, Ran; Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species because some biochemical reactions are unstable and some species, e.g. Yersinia frederiksenii and Yersinia intermedia, present the same biochemical profile; these need a variety of molecular biology methods as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, we randomly selected five 16S rRNA gene clones from each of 768 Yersinia strains and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of the 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain, in which the phylogenetic classifications are consistent with biochemical tests, and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, through analysis of integrated information from multiple 16S rRNA gene sequences, our method improves the discrimination of Yersinia species. PMID:26808495

  13. Deep sequencing of microRNA precursors reveals extensive 3′ end modification

    PubMed Central

    Newman, Martin A.; Mani, Vidya; Hammond, Scott M.

    2011-01-01

    MicroRNAs (miRNAs) are small, noncoding RNAs that post-transcriptionally regulate gene expression. An emerging mechanism to control miRNA production is the addition of an oligo-uridine tail to the 3′ end of the precursor miRNA. This has been demonstrated for the Let-7 family of miRNAs in embryonic cells. Additionally, nontemplated nucleotides have been found on mature miRNA species, though in most cases it is not known if nucleotide addition occurs at the precursor step or at the mature miRNA. To examine the diversity of nucleotide addition we have developed a high-throughput sequencing method specific for miRNA precursors. Here we report that nontemplated addition is a widespread phenomenon occurring in many miRNA families. As previously reported, Let-7 family members are oligo-uridylated in embryonic cells in a Lin28-dependent manner. However, we find that the fraction of uridylated precursors increases with differentiation, independent of Lin28, and is highest in adult mouse tissues, exceeding 30% of all sequence reads for some Let-7 family members. A similar fraction of sequence reads are modified for many other miRNA families. Mono-uridylation is most common, with cytidine and adenosine modification less frequent but occurring above the expected error rate for Illumina sequencing. Nucleotide addition in cell lines is associated with 3′ end degradation, in contrast to adult tissues, where modification occurs predominantly on full-length precursors. This work provides an unprecedented view of the complexity of 3′ modification and trimming of miRNA precursors. PMID:21849429
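
    The classification of 3′ end modifications reported above (mono-uridylation, oligo-uridylation, other additions) can be illustrated with a small sketch. It is not the authors' pipeline: it assumes reads are given as RNA strings starting at the precursor 5′ end, and it treats everything after the longest templated prefix as nontemplated. A nontemplated base that happens to equal the next templated base cannot be detected this way. All names are hypothetical.

```python
def nontemplated_tail(read, template):
    """Return the part of `read` after its longest prefix shared with `template`."""
    k = 0
    while k < len(read) and k < len(template) and read[k] == template[k]:
        k += 1
    return read[k:]


def classify(read, template):
    """Label a read by the kind of nontemplated 3' addition it carries."""
    tail = nontemplated_tail(read, template)
    if not tail:
        return "no addition"  # full-length or 3'-trimmed, nothing added
    if set(tail) == {"U"}:
        return "mono-U" if len(tail) == 1 else "oligo-U"
    return "other"


def uridylated_fraction(reads, template):
    """Fraction of reads carrying a U-only nontemplated tail."""
    labels = [classify(r, template) for r in reads]
    return sum(1 for lab in labels if lab in ("mono-U", "oligo-U")) / len(labels)


# Toy precursor 3' region and four made-up reads.
template = "ACGGAC"
reads = ["ACGGACU", "ACGGUUU", "ACGG", "ACGGACA"]
labels = [classify(r, template) for r in reads]
fraction = uridylated_fraction(reads, template)
```

    The second toy read shows the case the abstract associates with cell lines: the 3′ end is trimmed back and a U tail is added to the shortened molecule.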

  14. Deep sequencing of microRNA precursors reveals extensive 3' end modification.

    PubMed

    Newman, Martin A; Mani, Vidya; Hammond, Scott M

    2011-10-01

    MicroRNAs (miRNAs) are small, noncoding RNAs that post-transcriptionally regulate gene expression. An emerging mechanism to control miRNA production is the addition of an oligo-uridine tail to the 3' end of the precursor miRNA. This has been demonstrated for the Let-7 family of miRNAs in embryonic cells. Additionally, nontemplated nucleotides have been found on mature miRNA species, though in most cases it is not known if nucleotide addition occurs at the precursor step or at the mature miRNA. To examine the diversity of nucleotide addition we have developed a high-throughput sequencing method specific for miRNA precursors. Here we report that nontemplated addition is a widespread phenomenon occurring in many miRNA families. As previously reported, Let-7 family members are oligo-uridylated in embryonic cells in a Lin28-dependent manner. However, we find that the fraction of uridylated precursors increases with differentiation, independent of Lin28, and is highest in adult mouse tissues, exceeding 30% of all sequence reads for some Let-7 family members. A similar fraction of sequence reads are modified for many other miRNA families. Mono-uridylation is most common, with cytidine and adenosine modification less frequent but occurring above the expected error rate for Illumina sequencing. Nucleotide addition in cell lines is associated with 3' end degradation, in contrast to adult tissues, where modification occurs predominantly on full-length precursors. This work provides an unprecedented view of the complexity of 3' modification and trimming of miRNA precursors.

  15. 5′-Terminal Sequence of Vesicular Stomatitis Virus mRNA's Synthesized In Vitro

    PubMed Central

    Rhodes, Dennis P.; Banerjee, Amiya K.

    1976-01-01

    Unmethylated or methylated 12 to 18S mRNA's synthesized in vitro by the virion-associated RNA polymerase of vesicular stomatitis virus contain the 5′-terminal hexanucleotide sequence G(5′)ppp(5′)ApApCpApGp... or m7G(5′)ppp(5′)ApmApCpApGp..., respectively. The implication of these results in relation to the regulation of transcription in vesicular stomatitis virus is discussed. PMID:173891

  16. Sequencing and characterisation of an extensive Atlantic salmon (Salmo salar L.) microRNA repertoire.

    PubMed

    Bekaert, Michaël; Lowe, Natalie R; Bishop, Stephen C; Bron, James E; Taggart, John B; Houston, Ross D

    2013-01-01

    Atlantic salmon (Salmo salar L.), a member of the family Salmonidae, is a totemic species of ecological and cultural significance that is also economically important in terms of both sports fisheries and aquaculture. These factors have promoted the continuous development of genomic resources for this species, furthering both fundamental and applied research. MicroRNAs (miRNAs) are small endogenous non-coding RNA molecules that control spatial and temporal expression of targeted genes through post-transcriptional regulation. While miRNAs have been characterised in detail for many other species, this is not yet the case for Atlantic salmon. To identify miRNAs from Atlantic salmon, we constructed whole fish miRNA libraries for 18 individual juveniles (fry, four months post hatch) and characterised them by Illumina high-throughput sequencing (a total of 354,505,167 paired-end reads). We report an extensive and partly novel repertoire of miRNA sequences, comprising 888 miRNA genes (547 unique mature miRNA sequences), quantify their expression levels in basal conditions, examine their homology to miRNAs from other species and identify their predicted target genes. We also identify the location and putative copy number of the miRNA genes in the draft Atlantic salmon reference genome sequence. The Atlantic salmon miRNAs experimentally identified in this study provide a robust large-scale resource for functional genome research in salmonids. There is an opportunity to explore the evolution of salmonid miRNAs following the relatively recent whole genome duplication event in salmonid species and to investigate the role of miRNAs in the regulation of gene expression, in particular their contribution to variation in economically and ecologically important traits.

  17. Deep sequencing of pigeonpea sterility mosaic virus discloses five RNA segments related to emaraviruses.

    PubMed

    Elbeaino, Toufic; Digiaro, Michele; Uppala, Mangala; Sudini, Harikishan

    2014-08-01

    The sequences of five viral RNA segments of pigeonpea sterility mosaic virus (PPSMV), the agent of sterility mosaic disease (SMD) of pigeonpea (Cajanus cajan, Fabaceae), were determined using deep sequencing technology. Each of the five RNAs encodes a single protein on the negative-sense strand with an open reading frame (ORF) of 6,885, 1,947, 927, 1,086, and 1,422 nt, respectively. In order, from RNA1 to RNA5, these ORFs encode the RNA-dependent RNA polymerase (p1, 267.9 kDa), a putative glycoprotein precursor (p2, 74.3 kDa), a putative nucleocapsid protein (p3, 34.6 kDa), a putative movement protein (p4, 40.8 kDa), while p5 (55 kDa) has an unknown function. All RNA segments of PPSMV showed the highest identity with orthologs of fig mosaic virus (FMV) and Rose rosette virus (RRV). In phylogenetic trees constructed with the amino acid sequences of p1, p2 and p3, PPSMV clustered consistently with other emaraviruses, close to clades comprising members of other genera of the family Bunyaviridae. Based on the molecular characteristics unveiled in this study and the morphological and epidemiological features similar to other emaraviruses, PPSMV seems to be the seventh species to join the list of emaraviruses known to date and accordingly, its classification in the genus Emaravirus seems now legitimate. PMID:24685674

  18. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations.

    PubMed

    Fu, Glenn K; Xu, Weihong; Wilhelmy, Julie; Mindrinos, Michael N; Davis, Ronald W; Xiao, Wenzhong; Fodor, Stephen P A

    2014-02-01

    We present a simple molecular indexing method for quantitative targeted RNA sequencing, in which mRNAs of interest are selectively captured from complex cDNA libraries and sequenced to determine their absolute concentrations. cDNA fragments are individually labeled so that each molecule can be tracked from the original sample through the library preparation and sequencing process. Multiple copies of cDNA fragments of identical sequence become distinct through labeling, and replicate clones created during PCR amplification steps can be identified and assigned to their distinct parent molecules. Selective capture enables efficient use of sequencing for deep sampling and for the absolute quantitation of rare or transient transcripts that would otherwise escape detection by standard sequencing methods. We have also constructed a set of synthetic barcoded RNA molecules, which can be introduced as controls into the sample preparation mix and used to monitor the efficiency of library construction. The quantitative targeted sequencing revealed extremely low efficiency in standard library preparations, which was further confirmed by using synthetic barcoded RNA molecules. This finding shows that standard library preparation methods result in the loss of rare transcripts and highlights the need for monitoring library efficiency and for developing more efficient sample preparation methods.
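
    The counting principle behind molecular indexing, where PCR copies share a label and therefore collapse to one parent molecule, can be sketched as below. The input format (gene, label) and the function names are assumptions for illustration; real data would also need handling of sequencing errors in the labels, which is omitted here.

```python
from collections import defaultdict


def count_molecules(reads):
    """Given (gene, label) pairs, return {gene: (unique_molecules, total_reads)}.
    Reads with the same gene and label are treated as PCR copies of one molecule."""
    labels = defaultdict(set)
    n_reads = defaultdict(int)
    for gene, label in reads:
        labels[gene].add(label)
        n_reads[gene] += 1
    return {gene: (len(labels[gene]), n_reads[gene]) for gene in labels}


def library_efficiency(recovered_molecules, input_molecules):
    """Fraction of spiked-in control molecules that were recovered after
    library preparation and sequencing."""
    return recovered_molecules / input_molecules


# Toy reads: GENE_A has three reads but only two distinct labels.
reads = [("GENE_A", "AAC"), ("GENE_A", "AAC"), ("GENE_A", "GTT"), ("GENE_B", "CCA")]
counts = count_molecules(reads)
efficiency = library_efficiency(recovered_molecules=50, input_molecules=1000)
```

    Counting distinct labels instead of reads removes amplification bias, and comparing recovered control molecules with the known input gives the library efficiency that the abstract reports to be very low.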

  19. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations

    PubMed Central

    Fu, Glenn K.; Xu, Weihong; Wilhelmy, Julie; Mindrinos, Michael N.; Davis, Ronald W.; Xiao, Wenzhong; Fodor, Stephen P. A.

    2014-01-01

    We present a simple molecular indexing method for quantitative targeted RNA sequencing, in which mRNAs of interest are selectively captured from complex cDNA libraries and sequenced to determine their absolute concentrations. cDNA fragments are individually labeled so that each molecule can be tracked from the original sample through the library preparation and sequencing process. Multiple copies of cDNA fragments of identical sequence become distinct through labeling, and replicate clones created during PCR amplification steps can be identified and assigned to their distinct parent molecules. Selective capture enables efficient use of sequencing for deep sampling and for the absolute quantitation of rare or transient transcripts that would otherwise escape detection by standard sequencing methods. We have also constructed a set of synthetic barcoded RNA molecules, which can be introduced as controls into the sample preparation mix and used to monitor the efficiency of library construction. The quantitative targeted sequencing revealed extremely low efficiency in standard library preparations, which was further confirmed by using synthetic barcoded RNA molecules. This finding shows that standard library preparation methods result in the loss of rare transcripts and highlights the need for monitoring library efficiency and for developing more efficient sample preparation methods. PMID:24449890

  1. incaRNAfbinv: a web server for the fragment-based design of RNA sequences.

    PubMed

    Drory Retwitzer, Matan; Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme; Barash, Danny

    2016-07-01

    In recent years, new methods for computational RNA design have been developed and applied to various problems in synthetic biology and nanotechnology. Lately, there is considerable interest in incorporating essential biological information when solving the inverse RNA folding problem. Correspondingly, RNAfbinv aims at including biologically meaningful constraints and is the only program to-date that performs a fragment-based design of RNA sequences. In doing so it allows the design of sequences that do not necessarily exactly fold into the target, as long as the overall coarse-grained tree graph shape is preserved. Augmented by the weighted sampling algorithm of incaRNAtion, our web server called incaRNAfbinv implements the method devised in RNAfbinv and offers an interactive environment for the inverse folding of RNA using a fragment-based design approach. It takes as input: a target RNA secondary structure; optional sequence and motif constraints; optional target minimum free energy, neutrality and GC content. In addition to the design of synthetic regulatory sequences, it can be used as a pre-processing step for the detection of novel natural occurring RNAs. The two complementary methodologies RNAfbinv and incaRNAtion are merged together and fully implemented in our web server incaRNAfbinv, available at http://www.cs.bgu.ac.il/incaRNAfbinv. PMID:27185893

  2. Secretory pancreatic stone protein messenger RNA. Nucleotide sequence and expression in chronic calcifying pancreatitis.

    PubMed Central

    Giorgi, D; Bernard, J P; Rouquier, S; Iovanna, J; Sarles, H; Dagorn, J C

    1989-01-01

    The pancreatic stone protein and its secretory form (PSP-S) are inhibitors of CaCO3 crystal growth, possibly involved in the stabilization of pancreatic juice. We have established the structure of PSP-S mRNA and monitored its expression in chronic calcifying pancreatitis (CCP). A cDNA encoding pre-PSP-S has been cloned from a human pancreatic cDNA library. Its nucleotide sequence revealed that it comprised all but the 5' end of PSP-S mRNA, which was obtained by sequencing the first exon of the PSP-S gene. The complete mRNA sequence is 775 nucleotides long, including 5'- and 3'- noncoding regions of 80 and 197 nucleotides, respectively, attached to a poly(A) tail of approximately 125 nucleotides. It encodes a preprotein of 166 amino acids, including a prepeptide of 22 amino acids. No overall sequence homology was found between PSP-S and other pancreatic proteins. Some homology with several serine proteases was observed in the COOH-terminal region, however. The mRNA levels of PSP-S, trypsinogen, chymotrypsinogen, and colipase in CCP and control pancreas were compared. PSP-S mRNA was three times lower in CCP than in control, whereas the others were not altered. It was concluded that PSP-S gene expression is specifically reduced in CCP patients. PMID:2525567

  3. Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

    2016-03-01

    Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.

  4. AIB1 gene amplification and the instability of polyQ encoding sequence in breast cancer cell lines

    PubMed Central

    Wong, Lee-Jun C; Dai, Pu; Lu, Jyh-Feng; Lou, Mary Ann; Clarke, Robert; Nazarov, Viktor

    2006-01-01

    Background The poly Q polymorphism in AIB1 (amplified in breast cancer) gene is usually assessed by fragment length analysis which does not reveal the actual sequence variation. The purpose of this study is to investigate the sequence variation of poly Q encoding region in breast cancer cell lines at single molecule level, and to determine if the sequence variation is related to AIB1 gene amplification. Methods The polymorphic poly Q encoding region of AIB1 gene was investigated at the single molecule level by PCR cloning/sequencing. The amplification of AIB1 gene in various breast cancer cell lines were studied by real-time quantitative PCR. Results Significant amplifications (5–23 folds) of AIB1 gene were found in 2 out of 9 (22%) ER positive cell lines (in BT-474 and MCF-7 but not in BT-20, ZR-75-1, T47D, BT483, MDA-MB-361, MDA-MB-468 and MDA-MB-330). The AIB1 gene was not amplified in any of the ER negative cell lines. Different passages of MCF-7 cell lines and their derivatives maintained the feature of AIB1 amplification. When the cells were selected for hormone independence (LCC1) and resistance to 4-hydroxy tamoxifen (4-OH TAM) (LCC2 and R27), ICI 182,780 (LCC9) or 4-OH TAM, KEO and LY 117018 (LY-2), AIB1 copy number decreased but still remained highly amplified. Sequencing analysis of poly Q encoding region of AIB1 gene did not reveal specific patterns that could be correlated with AIB1 gene amplification. However, about 72% of the breast cancer cell lines had at least one under represented (<20%) extra poly Q encoding sequence patterns that were derived from the original allele, presumably due to somatic instability. Although all MCF-7 cells and their variants had the same predominant poly Q encoding sequence pattern of (CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA of the original cell line, a number of altered poly Q encoding sequences were found in the derivatives of MCF-7 cell lines. 
Conclusion These data suggest that poly Q encoding region of AIB1 gene is
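
    The compact repeat notation quoted in this abstract can be expanded mechanically. The following is a minimal sketch, not the authors' method: it expands the predominant MCF-7 pattern into a plain nucleotide string and translates it, relying only on the fact that CAG and CAA both encode glutamine (Q) in the standard genetic code. The function names are hypothetical.

```python
import re

# Compact repeat notation copied from the abstract.
PATTERN = "(CAG)3CAA(CAG)9(CAACAG)3(CAACAGCAG)2CAA"

# Both codons encode glutamine (Q) in the standard genetic code.
GLN_CODONS = {"CAG": "Q", "CAA": "Q"}


def expand_repeats(pattern: str) -> str:
    """Expand '(UNIT)n' groups and bare runs of bases into one sequence."""
    parts = []
    for unit, count, bare in re.findall(r"\(([ACGT]+)\)(\d+)|([ACGT]+)", pattern):
        # A '(UNIT)n' match fills unit/count; a bare run fills only 'bare'.
        parts.append(unit * int(count) if unit else bare)
    return "".join(parts)


def translate_polyq(seq: str) -> str:
    """Translate codon by codon; anything other than CAG/CAA becomes 'X'."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(GLN_CODONS.get(codon, "X") for codon in codons)


sequence = expand_repeats(PATTERN)
peptide = translate_polyq(sequence)
print(len(sequence), "nt ->", len(peptide), "residues:", peptide)
```

    Comparing expanded strings from different clones, rather than fragment lengths, is one way to distinguish patterns that have the same length but a different arrangement of CAA interruptions.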

  5. Identification of extracellular miRNA in archived serum samples by next-generation sequencing from RNA extracted using multiple methods.

    PubMed

    Gautam, Aarti; Kumar, Raina; Dimitrov, George; Hoke, Allison; Hammamieh, Rasha; Jett, Marti

    2016-10-01

    miRNAs act as important regulators of gene expression by promoting mRNA degradation or by attenuating protein translation. Since miRNAs are stably expressed in bodily fluids, there is growing interest in profiling these miRNAs, as it is minimally invasive and cost-effective as a diagnostic matrix. A technical hurdle in studying miRNA dynamics is reliably extracting miRNA, as small sample volumes and low RNA abundance create challenges for extraction and downstream applications. The purpose of this study was to develop a pipeline for the recovery of miRNA using small volumes of archived serum samples. The RNA was extracted employing several widely utilized RNA isolation kits/methods with and without addition of a carrier. The small RNA library preparation was carried out using the Illumina TruSeq small RNA kit and sequencing was carried out using the Illumina platform. A fraction of five microliters of total RNA was used for library preparation as quantification is below the detection limit. We were able to profile miRNA levels in serum from all the methods tested. We found that adding nucleic acid-based carrier molecules yielded higher numbers of processed reads but did not enhance the mapping of any miRBase-annotated sequences. However, some of the extraction procedures offer certain advantages: RNA extracted by TRIzol seemed to align to miRBase best; extractions using TRIzol with carrier yielded higher miRNA-to-small RNA ratios. Nuclease-free glycogen can be the carrier of choice for miRNA sequencing. Our findings illustrate that miRNA extraction and quantification are influenced by the choice of methodologies. Addition of nucleic acid-based carrier molecules during the extraction procedure is not a good choice when assaying miRNA using sequencing. The careful selection of an extraction method permits the archived serum samples to become valuable resources for high-throughput applications. PMID:27510798

  7. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  8. Phylogeny of protostome worms derived from 18S rRNA sequences.

    PubMed

    Winnepenninckx, B; Backeljau, T; De Wachter, R

    1995-07-01

    The phylogenetic relationships of protostome worms were studied by comparing new complete 18S rRNA sequences of Vestimentifera, Pogonophora, Sipuncula, Echiura, Nemertea, and Annelida with existing 18S rRNA sequences of Mollusca, Arthropoda, Chordata, and Platyhelminthes. Phylogenetic trees were inferred via neighbor-joining and maximum parsimony analyses. These suggest that (1) Sipuncula and Echiura are not sister groups; (2) Nemertea are protostomes; (3) Vestimentifera and Pogonophora are protostomes that have a common ancestor with Echiura; and (4) Vestimentifera and Pogonophora are a monophyletic clade.

  9. rRNA sequence comparison of Beauveria bassiana, Tolypocladium cylindrosporum, and Tolypocladium extinguens.

    PubMed

    Rakotonirainy, M S; Dutertre, M; Brygoo, Y; Riba, G

    1991-01-01

    Five strains of Tolypocladium cylindrosporum, one strain of Tolypocladium extinguens, and nine strains of Beauveria bassiana were analyzed using a rapid rRNA sequencing technique. The sequences of two highly variable domains (D1 and D2) located at the 5' end of the 28S-like rRNA molecule were determined. The phylogenetic tree computed from the absolute number of nucleotide differences shows the separation between the genus Beauveria and the genus Tolypocladium and points out that T. cylindrosporum and T. extinguens probably do not belong to the same genus.
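
    The distance measure named in this abstract, the absolute number of nucleotide differences between aligned sequences, is simple to compute. The sketch below is illustrative only: the four aligned fragments are invented placeholders, not data from the study, and the function names are hypothetical. A tree-building step (for example neighbor-joining) would take the resulting table as input and is not shown.

```python
from itertools import combinations

# Hypothetical aligned fragments, invented for illustration only.
ALIGNED = {
    "B_bassiana_1": "ACGTTGCAAGTC",
    "B_bassiana_2": "ACGTTGCAAGTT",
    "T_cylindrosporum": "ACGATGCTAGTC",
    "T_extinguens": "TCGATCCTAGAC",
}


def count_differences(a: str, b: str) -> int:
    """Absolute number of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_table(seqs: dict) -> dict:
    """Pairwise difference counts keyed by (name_a, name_b)."""
    return {
        (m, n): count_differences(seqs[m], seqs[n])
        for m, n in combinations(seqs, 2)
    }


for pair, diff in distance_table(ALIGNED).items():
    print(pair, diff)
```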

  10. Genome Sequence of a Novel Iflavirus from mRNA Sequencing of the Butterfly Heliconius erato

    PubMed Central

    Macias-Muñoz, Aide; Briscoe, Adriana D.

    2014-01-01

    Here, we report the genome sequence of a novel iflavirus strain recovered from the neotropical butterfly Heliconius erato. The coding DNA sequence (CDS) of the iflavirus genome was 8,895 nucleotides in length, encoding a polyprotein that was 2,965 amino acids long. PMID:24831145

  11. Sequence selective recognition of double-stranded RNA using triple helix-forming peptide nucleic acids.

    PubMed

    Zengeya, Thomas; Gupta, Pankaj; Rozners, Eriks

    2014-01-01

    Noncoding RNAs are attractive targets for molecular recognition because of the central role they play in gene expression. Since most noncoding RNAs are in a double-helical conformation, recognition of such structures is a formidable problem. Herein, we describe a method for sequence-selective recognition of biologically relevant double-helical RNA (illustrated on ribosomal A-site RNA) using peptide nucleic acids (PNA) that form a triple helix in the major groove of RNA under physiologically relevant conditions. Protocols for PNA preparation and binding studies using isothermal titration calorimetry are described in detail.

  12. Barcoded cDNA library preparation for small RNA profiling by next-generation sequencing.

    PubMed

    Hafner, Markus; Renwick, Neil; Farazi, Thalia A; Mihailović, Aleksandra; Pena, John T G; Tuschl, Thomas

    2012-10-01

    The characterization of post-transcriptional gene regulation by small regulatory (20-30 nt) RNAs, particularly miRNAs and piRNAs, has become a major focus of research in recent years. A prerequisite for characterizing small RNAs is their identification and quantification across different developmental stages, and in normal and disease tissues, as well as model cell lines. Here we present a step-by-step protocol for generating barcoded small RNA cDNA libraries compatible with Illumina HiSeq sequencing, thereby facilitating miRNA and other small RNA profiling of large sample collections.

  13. Targeted Mutagenesis in Plant Cells through Transformation of Sequence-Specific Nuclease mRNA

    PubMed Central

    Stoddard, Thomas J.; Clasen, Benjamin M.; Baltes, Nicholas J.; Demorest, Zachary L.; Voytas, Daniel F.; Zhang, Feng; Luo, Song

    2016-01-01

    Plant genome engineering using sequence-specific nucleases (SSNs) promises to advance basic and applied plant research by enabling precise modification of endogenous genes. Whereas DNA is an effective means for delivering SSNs, DNA can integrate randomly into the plant genome, leading to unintentional gene inactivation. Further, prolonged expression of SSNs from DNA constructs can lead to the accumulation of off-target mutations. Here, we tested a new approach for SSN delivery to plant cells, namely transformation of messenger RNA (mRNA) encoding TAL effector nucleases (TALENs). mRNA delivery of a TALEN pair targeting the Nicotiana benthamiana ALS gene resulted in mutation frequencies of approximately 6% in comparison to DNA delivery, which resulted in mutation frequencies of 70.5%. mRNA delivery resulted in three-fold fewer insertions, and 76% were <10bp; in contrast, 88% of insertions generated through DNA delivery were >10bp. In an effort to increase mutation frequencies using mRNA, we fused several different 5’ and 3’ untranslated regions (UTRs) from Arabidopsis thaliana genes to the TALEN coding sequence. UTRs from an A. thaliana adenine nucleotide α hydrolases-like gene (At1G09740) enhanced mutation frequencies approximately two-fold, relative to a no-UTR control. These results indicate that mRNA can be used as a delivery vehicle for SSNs, and that manipulation of mRNA UTRs can influence efficiencies of genome editing. PMID:27176769

  14. Research progress on mechanisms of male sterility in plants based on high-throughput RNA sequencing.

    PubMed

    Yongming, Liu; Ling, Zhang; Tao, Qiu; Zhuofan, Zhao; Moju, Cao

    2016-08-01

    Male sterility is defined as failing to produce functional pollen during stamen development in plants, and it plays a crucial role in plant reproductive research and hybrid seed production in utilization of crop heterosis. High throughput RNA sequencing (RNA-seq) has been used widely in the study of different fields of life science, as it readily detects all the mRNA and non-coding RNA in cells. Recently, RNA-seq has been reported to be applied in different species and kinds of pollen abortion types in plants, which has contributed to the understanding of the molecular mechanism and metabolic networks of male sterility at the transcription level. In this review, we summarize research progress on the mechanisms of male sterility in plants, focusing on RNA-seq analysis encompassing strategies of RNA library construction, differentially expressed genes and functional characteristics of noncoding RNAs involved in stamen abortion. Furthermore, we also discuss application of transcriptome sequencing technology to elucidate pollen abortion mechanisms and map fertility-related genes. We hope to provide references to the study of male sterility in plants. PMID:27531606

  15. High-Throughput Sequencing of RNA Silencing-Associated Small RNAs in Olive (Olea europaea L.)

    PubMed Central

    Donaire, Livia; Pedrola, Laia; de la Rosa, Raúl; Llave, César

    2011-01-01

    Small RNAs (sRNAs) of 20 to 25 nucleotides (nt) in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Although RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.). sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA) regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive. PMID:22140484

  16. Combined sequencing of mRNA and DNA from human embryonic stem cells.

    PubMed

    Mertes, Florian; Kuhl, Heiner; Wruck, Wasco; Lehrach, Hans; Adjaye, James

    2016-06-01

    Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells. Sequencing data are available in the Gene Expression Omnibus (GEO) database under accession number GSE69471. PMID:27275414

  18. Complete Genome Sequence of a Reference Stock of Simian Immunodeficiency Virus RNA (SIVmac251/32H/L28) Determined by Deep Sequencing.

    PubMed

    Jenkins, Adrian; Ham, Claire; Almond, Neil; Berry, Neil

    2016-01-01

    A reference preparation for simian immunodeficiency virus (SIV) RNA nucleic acid assays was characterized by complete genome deep sequencing. The entire coding sequence and flanking long terminal repeats, including minority species, were determined. This information will inform SIV research investigations and aid evaluation and development of amplification assays for SIV RNA quantification. PMID:27231355

  20. Delta sequences in the 5' non-coding region of yeast tRNA genes

    PubMed Central

    Gafner, Jürg; De Robertis, Eddy M.; Philippsen, Peter

    1983-01-01

    Two so far undetected tRNA genes were found close to delta (δ) sequences at the sup4 locus on chromosome X in the genome of Saccharomyces cerevisiae. The two genes were identified from their abundant transcription products in frog oocytes. Hybridisation experiments allowed the mapping of the transcripts in cloned DNA and DNA sequence analysis revealed the presence of one AGGtRNAArg and one GACtRNAAsp gene. tRNAAsp genes with sequences similar or identical to GACtRNAAsp exist in 14-16 copies per haploid yeast genome, whereas only one copy was detected for AGGtRNAArg. In vivo labelling of total yeast tRNA with 32P followed by hybridisation revealed that the unique AGGtRNAArg gene is transcribed in S. cerevisiae. δ sequences are present 120 bp upstream from the first coding nucleotide in the case of AGGtRNAArg, 80 bp in the case of GACtRNAAsp and 405 bp in the case of the known UACtRNATyr (sup4) gene. δ sequences, as part of Ty elements or alone, were also found by other investigators at similar distances upstream of the mRNA start in mutant alleles of protein-coding yeast genes. Although protein-coding genes are transcribed by RNA polymerase II and tRNA genes by RNA polymerase III, the 5' non-coding region of both types of genes could conceivably have a peculiar DNA or chromatin structure used as preferred landing sites by transposable elements. PMID:16453444

  1. mirTools: microRNA profiling and discovery based on high-throughput sequencing.

    PubMed

    Zhu, Erle; Zhao, Fangqing; Xu, Gang; Hou, Huabin; Zhou, Linglin; Li, Xiaokun; Sun, Zhongsheng; Wu, Jinyu

    2010-07-01

    miRNAs are small, non-coding RNAs that negatively regulate gene expression at the post-transcriptional level and play crucial roles in various physiological and pathological processes, such as development and tumorigenesis. Although deep sequencing technologies have been applied to investigate various small RNA transcriptomes, their computational methods are far from mature compared with microarray-based approaches. In this study, a comprehensive web server, mirTools, was developed to allow researchers to comprehensively characterize the small RNA transcriptome. With the aid of mirTools, users can: (i) filter low-quality reads and 3'/5' adapters from raw sequenced data; (ii) align large-scale short reads to the reference genome and explore their length distribution; (iii) classify small RNA candidates into known categories, such as known miRNAs, non-coding RNA, genomic repeats and coding sequences; (iv) provide detailed annotation information for known miRNAs, such as miRNA/miRNA*, absolute/relative reads count and the most abundant tag; (v) predict novel miRNAs that have not been characterized before; and (vi) identify differentially expressed miRNAs between samples based on two different counting strategies: total read tag counts and the most abundant tag counts. We believe that the integration of multiple computational approaches in mirTools will greatly facilitate current microRNA research in multiple ways. mirTools can be accessed at http://centre.bioinformatics.zj.cn/mirtools/ and http://59.79.168.90/mirtools.
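
    Steps (i) and (ii) of the workflow listed above can be illustrated with a short, self-contained sketch. This is not mirTools code: the adapter string, the quality and length thresholds, and all function names are assumptions chosen for illustration, and a real pipeline would typically use a dedicated trimming tool and tolerate adapter mismatches.

```python
from collections import Counter

# Placeholder 3' adapter prefix; an assumption, not taken from mirTools.
ADAPTER_3P = "TGGAATTCTCGG"


def trim_3p_adapter(read: str, adapter: str = ADAPTER_3P) -> str:
    """Cut the read at the first exact adapter match; leave it intact otherwise."""
    idx = read.find(adapter)
    return read if idx == -1 else read[:idx]


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def length_distribution(records, min_mean_q: float = 20, min_len: int = 15) -> Counter:
    """Filter low-quality reads, trim the adapter, and tally insert lengths.

    `records` is an iterable of (sequence, quality_string) pairs.
    """
    lengths = Counter()
    for seq, qual in records:
        if not qual or mean_phred(qual) < min_mean_q:
            continue  # drop low-quality reads
        insert = trim_3p_adapter(seq)
        if len(insert) >= min_len:
            lengths[len(insert)] += 1
    return lengths


reads = [
    ("TGAGGTAGTAGGTTGTATAGTT" + ADAPTER_3P + "AAAA", "I" * 38),  # 22-nt insert
    ("ACGT" * 6, "#" * 24),                                      # low quality
    ("ACGTACGT" + ADAPTER_3P, "I" * 20),                         # insert too short
]
print(length_distribution(reads))
```

    A length histogram of this kind is a common sanity check for small RNA libraries, since miRNA-sized inserts are expected to cluster in a narrow range.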

  2. PETcofold: predicting conserved interactions and structures of two multiple alignments of RNA sequences

    PubMed Central

    Seemann, Stefan E.; Richter, Andreas S.; Gesell, Tanja; Backofen, Rolf; Gorodkin, Jan

    2011-01-01

    Motivation: Predicting RNA–RNA interactions is essential for determining the function of putative non-coding RNAs. Existing methods for the prediction of interactions are all based on single sequences. Since comparative methods have already been useful in RNA structure determination, we assume that conserved RNA–RNA interactions also imply conserved function. Of these, we further assume that a non-negligible amount of the existing RNA–RNA interactions have also acquired compensating base changes throughout evolution. We implement a method, PETcofold, that can take covariance information in intra-molecular and inter-molecular base pairs into account to predict interactions and secondary structures of two multiple alignments of RNA sequences. Results: PETcofold's ability to predict RNA–RNA interactions was evaluated on a carefully curated dataset of 32 bacterial small RNAs and their targets, which was manually extracted from the literature. For evaluation of both RNA–RNA interaction and structure prediction, we were able to extract only a few high-quality examples: one vertebrate small nucleolar RNA and four bacterial small RNAs. For these we show that the prediction can be improved by our comparative approach. Furthermore, PETcofold was evaluated on controlled data with phylogenetically simulated sequences enriched for covariance patterns at the interaction sites. We observed increased performance with increased amounts of covariance. Availability: The program PETcofold is available as source code and can be downloaded from http://rth.dk/resources/petcofold. Contact: gorodkin@rth.dk; backofen@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21088024

  3. Identification and sequence determination of a novel double-stranded RNA mycovirus from the entomopathogenic fungus Beauveria bassiana.

    PubMed

    Kotta-Loizou, Ioly; Sipkova, Jana; Coutts, Robert H A

    2015-03-01

    An isolate of the entomopathogenic fungus Beauveria bassiana was found to contain five double-stranded (ds) RNA elements ranging from 1.5 to more than 3 kbp. The complete sequence of the largest dsRNA element is described here. Analysis of the RdRp nucleotide sequence reveals its similarity to unclassified dsRNA elements, such as Alternaria longipes dsRNA virus 1, and its distant relationship to the RNA-dependent RNA polymerases of members of the family Partitiviridae. PMID:25577168

  4. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences.

  5. Phylogenetic analysis of Mexican Babesia bovis isolates using msa and ssrRNA gene sequences.

    PubMed

    Genis, Alma D; Mosqueda, Juan J; Borgonio, Verónica M; Falcón, Alfonso; Alvarez, Antonio; Camacho, Minerva; de Lourdes Muñoz, Maria; Figueroa, Julio V

    2008-12-01

    Variable merozoite surface antigens of Babesia bovis are exposed glycoproteins having a role in erythrocyte invasion. Members of this gene family include msa-1 and msa-2 (msa-2c, msa-2a(1), msa-2a(2), and msa-2b). The small subunit ribosomal (ssr)RNA gene is subject to evolutionary pressure and has been used in phylogenetic studies. To determine the phylogenetic relationship among B. bovis Mexican isolates using different genetic markers, PCR amplicons, corresponding to msa-1, msa-2c, msa-2b, and ssrRNA genes, were cloned and plasmids carrying the corresponding inserts were sequenced. Comparative analysis of nucleotide and deduced amino acid sequences revealed distinct degrees of variability and identity among the coding gene sequences obtained from 12 geographically different B. bovis isolates and a reference strain. Overall sequence identities of 47.7%, 72.3%, 87.7%, and 94% were determined for msa-1, msa-2b, msa-2c, and ssrRNA, respectively. A robust phylogenetic tree was obtained with msa-2b sequences. The phylogenetic analysis suggests that Mexican B. bovis isolates group in clades not concordant with the Mexican geography. However, the Mexican isolates group together in an American clade separated from the Australian clade. Sequence heterogeneity in msa-1, msa-2b, and msa-2c coding regions of Mexican B. bovis isolates present in different geographical regions can be a result of either differential evolutionary pressure or cattle movement from commercial trade.

  6. Molecular identification of nanoplanktonic protists based on small subunit ribosomal RNA gene sequences for ecological studies.

    PubMed

    Lim, E L

    1996-01-01

    Nanoplanktonic protists are comprised of a diverse assemblage of species which are responsible for a variety of trophic processes in marine and freshwater ecosystems. Current methods for identifying small protists by electron microscopy do not readily permit both identification and enumeration of nanoplanktonic protists in field samples. Thus, one major goal in the application of molecular approaches in protistan ecology has been the detection and quantification of individual species in natural water samples. Sequences of small subunit ribosomal RNA (SSU rRNA) genes have proven to be useful towards achieving this goal. Comparison of sequences from clone libraries of protistan SSU rRNA genes amplified from natural assemblages of protists by the polymerase chain reaction (PCR) can be used to examine protistan diversity. Furthermore, oligonucleotide probes complementary to short sequence regions unique to species of small protists can be designed by comparative analysis of rRNA gene sequences. These probes may be used to either detect the RNA of particular species of protists in total nucleic acid extracts immobilized on membranes, or the presence of target species in water samples via in situ hybridization of whole cells. Oligonucleotide probes may also serve as primers for the selective amplification of target sequences from total population DNA by PCR. Thus, molecular sequence information is becoming increasingly useful for identifying and enumerating protists, and for studying their spatial and temporal distribution in nature. Knowledge of protistan species composition, abundance and variability in an environment can ultimately be used to relate community structure to various aspects of community function and biogeochemical activity.

  7. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    NASA Technical Reports Server (NTRS)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and a not-yet-identified black cyanobacterium, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran polymerase chain reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, different bacterial species can be identified from the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  8. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    SciTech Connect

    Wang, Ruiying; Zheng, Han; Preamplume, Gan; Shao, Yaming; Li, Hong

    2012-03-15

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

  9. Structure and sequence of the gene for the largest subunit of trypanosomal RNA polymerase III.

    PubMed Central

    Köck, J; Evers, R; Cornelissen, A W

    1988-01-01

    As the first step in the analysis of the transcription process in the African trypanosome, Trypanosoma brucei, we have started to characterise the trypanosomal RNA polymerases. We have previously described the gene encoding the largest subunit of RNA polymerase II and found that two almost identical RNA polymerase II genes are encoded within the genome of T. brucei. Here we present the identification, cloning and sequence analysis of the gene encoding the largest subunit of RNA polymerase III. This gene contains a single open reading frame encoding a polypeptide with a Mr of 170 kD. In total, eight highly conserved regions with significant homology to those previously reported in other eukaryotic RNA polymerase largest subunits were identified. Some of these domains contain functional sites, which are conserved among all eukaryotic largest subunit genes analysed thus far. Since these domains make up a large part of each polypeptide, independent of the RNA polymerase class, these data strongly support the hypothesis that these domains provide a major part of the transcription machinery of the RNA polymerase complex. The additional domains which are uniquely present in the largest subunit of RNA polymerase I and II, respectively, two large hydrophilic insertions and a C-terminal extension, might be a determining factor in specific transcription of the gene classes. PMID:3174432

  10. RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data

    PubMed Central

    Sun, Wen-Ju; Li, Jun-Hao; Liu, Shun; Wu, Jie; Zhou, Hui; Qu, Liang-Hu; Yang, Jian-Hua

    2016-01-01

    Although more than 100 different types of RNA modifications have been characterized across all living organisms, surprisingly little is known about the modified positions and their functions. Recently, various high-throughput modification sequencing methods have been developed to identify diverse post-transcriptional modifications of RNA molecules. In this study, we developed a novel resource, RMBase (RNA Modification Base, http://mirlab.sysu.edu.cn/rmbase/), to decode the genome-wide landscape of RNA modifications identified from high-throughput modification data generated by 18 independent studies. The current release of RMBase includes ∼9500 pseudouridine (Ψ) modifications generated from Pseudo-seq and CeU-seq sequencing data, ∼1000 5-methylcytosines (m5C) predicted from Aza-IP data, ∼124 200 N6-Methyladenosine (m6A) modifications discovered from m6A-seq and ∼1210 2′-O-methylations (2′-O-Me) identified from RiboMeth-seq data and public resources. Moreover, RMBase provides a comprehensive listing of other experimentally supported types of RNA modifications by integrating various resources. It provides web interfaces to show thousands of relationships between RNA modification sites and microRNA target sites. It can also be used to illustrate the disease-related SNPs residing in the modification sites/regions. RMBase provides a genome browser and a web-based modTool to query, annotate and visualize various RNA modifications. This database will help expand our understanding of potential functions of RNA modifications. PMID:26464443

  11. Examining the Gm18 and m1G Modification Positions in tRNA Sequences

    PubMed Central

    Subramanian, Mayavan; Srinivasan, Thangavelu

    2014-01-01

    The tRNA structure contains conserved modifications that are responsible for its stability and are involved in the initiation and accuracy of the translation process. tRNA modification enzymes are prevalent in bacteria, archaea, and eukaryotes. tRNA Gm18 methyltransferase (TrmH) and tRNA m1G37 methyltransferase (TrmD) are prevalent and essential enzymes in bacterial populations. TrmH methylates the 2'-OH group of the ribose of guanosine (G) at position 18 in tRNAs. TrmD methylates the G residue next to the anticodon in selected tRNA subsets. Initially, the m1G37 modification was reported to take place on three conserved tRNA subsets (tRNAArg, tRNALeu, tRNAPro); later, a few archaeal and eukaryotic organisms were found to carry the m1G37 modification on other tRNAs as well. The present study examines the Gm18 and m1G37 modifications and the positions of m1G next to the anticodon in tRNA sequences. We selected extremophile organisms and attempted to retrieve the m1G and Gm18 modification bases in tRNA sequences. Results showed that the Gm18 modification G residue occurs in all tRNA subsets except three tRNAs (tRNAMet, tRNAPro, tRNAVal). Whereas the m1G37 modification base G is formed only on tRNAArg, tRNALeu, tRNAPro, and tRNAHis, the rest of the tRNAs contain adenine (A) next to the anticodon. Thus, we hypothesize that Gm18 modification and m1G modification occur irrespective of a G residue in tRNAs. PMID:25031570

  12. LNCipedia: a database for annotated human lncRNA transcript sequences and structures

    PubMed Central

    Volders, Pieter-Jan; Helsens, Kenny; Wang, Xiaowei; Menten, Björn; Martens, Lennart; Gevaert, Kris; Vandesompele, Jo; Mestdagh, Pieter

    2013-01-01

    Here, we present LNCipedia (http://www.lncipedia.org), a novel database for human long non-coding RNA (lncRNA) transcripts and genes. LncRNAs constitute a large and diverse class of non-coding RNA genes. Although several lncRNAs have been functionally annotated, the majority remains to be characterized. Different high-throughput methods to identify new lncRNAs (including RNA sequencing and annotation of chromatin-state maps) have been applied in various studies resulting in multiple unrelated lncRNA data sets. LNCipedia offers 21 488 annotated human lncRNA transcripts obtained from different sources. In addition to basic transcript information and gene structure, several statistics are determined for each entry in the database, such as secondary structure information, protein coding potential and microRNA binding sites. Our analyses suggest that, much like microRNAs, many lncRNAs have a significant secondary structure, in line with their presumed association with proteins or protein complexes. Available literature on specific lncRNAs is linked, and users or authors can submit articles through a web interface. Protein coding potential is assessed by two different prediction algorithms: Coding Potential Calculator and HMMER. In addition, a novel strategy has been integrated for detecting potentially coding lncRNAs by automatically re-analysing the large body of publicly available mass spectrometry data in the PRIDE database. LNCipedia is publicly available and allows users to query and download lncRNA sequences and structures based on different search criteria. The database may serve as a resource to initiate small- and large-scale lncRNA studies. As an example, the LNCipedia content was used to develop a custom microarray for expression profiling of all available lncRNAs. PMID:23042674

  13. 3'READS+, a sensitive and accurate method for 3' end sequencing of polyadenylated RNA.

    PubMed

    Zheng, Dinghai; Liu, Xiaochuan; Tian, Bin

    2016-10-01

    Sequencing of the 3' end of poly(A)(+) RNA identifies cleavage and polyadenylation sites (pAs) and measures transcript expression. We previously developed a method, 3' region extraction and deep sequencing (3'READS), to address mispriming issues that often plague 3' end sequencing. Here we report a new version, named 3'READS+, which has vastly improved accuracy and sensitivity. Using a special locked nucleic acid oligo to capture poly(A)(+) RNA and to remove the bulk of the poly(A) tail, 3'READS+ generates RNA fragments with an optimal number of terminal A's that balance data quality and detection of genuine pAs. With improved RNA ligation steps for efficiency, the method shows much higher sensitivity (over two orders of magnitude) compared to the previous version. Using 3'READS+, we have uncovered a sizable fraction of previously overlooked pAs located next to or within a stretch of adenylate residues in human genes and more accurately assessed the frequency of alternative cleavage and polyadenylation (APA) in HeLa cells (∼50%). 3'READS+ will be a useful tool to accurately study APA and to analyze gene expression by 3' end counting, especially when the amount of input total RNA is limited. PMID:27512124

  14. Small RNA Sequencing Based Identification of MiRNAs in Daphnia magna.

    PubMed

    Ünlü, Ercan Selçuk; Gordon, Donna M; Telli, Murat

    2015-01-01

    Small RNA molecules are short, non-coding RNAs identified for their crucial role in post-transcriptional regulation. A well-studied example is the miRNAs (microRNAs), which have been identified in several model organisms including the freshwater flea and planktonic crustacean Daphnia. Although Daphnia is a model for epigenetic studies with an available genome database, the identification of miRNAs and their potential role in regulating Daphnia gene expression has only recently garnered interest. Computational work using Daphnia pulex has indicated the existence of 45 miRNAs, 14 of which have been experimentally verified. To extend this study, we took a sequencing approach towards identifying miRNAs present in a small RNA library isolated from Daphnia magna. Using Perl scripts designed for comparative genomic analysis, 815,699 reads were obtained from 4 million raw reads and compared against a database file of known miRNA sequences. Using this approach, we have identified 205 putative mature miRNA sequences belonging to 188 distinct miRNA families. Data from this study provide critical information necessary to begin an investigation into a role for these transcripts in the epigenetic regulation of Daphnia magna.

  16. High-resolution transcriptome analysis with long-read RNA sequencing.

    PubMed

    Cho, Hyunghoon; Davis, Joe; Li, Xin; Smith, Kevin S; Battle, Alexis; Montgomery, Stephen B

    2014-01-01

    RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2×75 bp and 2×262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcripts. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals.

  17. Messenger RNA sequence and the translation process --a particle transport perspective

    NASA Astrophysics Data System (ADS)

    Dong, Jiajia; Schmittmann, Beate; Zia, Royce K. P.

    2008-03-01

    The translation process in bacteria has been under intensive study. A key question concerns the quantitative effect of different elongation rates, associated with different codons, on the overall translation efficiency. Starting with a simple particle transport model, the totally asymmetric simple exclusion process (TASEP), we incorporate the essential components of the translation process: Ribosomes, cognate tRNA concentrations, and messenger RNA (mRNA) templates correspond to particles, hopping rates, and the underlying lattice, respectively. Using simulations and mean-field approximations to obtain the stationary currents (the protein production rates) associated with different mRNA sequences, we are especially interested in the effect of slow codons, i.e., codons which are associated with rare tRNAs and are therefore translated very slowly. As the first step, we look at a "designed sequence" with one and two slow codons and quantify the marked impact of their spatial distribution on the currents. Extending the results to several mRNA sequences taken from real genes, we argue that an effective translation rate including the information from the vicinity of each codon needs to be taken into consideration when seeking an efficient strategy to optimize the protein production.
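
    The mapping described in this abstract (ribosomes as particles, codon-dependent hopping rates, the mRNA as a lattice) can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: it assumes point-sized ribosomes, random-sequential updates, and made-up rates, and the names tasep_current, alpha (entry probability) and beta (exit probability) are hypothetical.

```python
import random


def tasep_current(rates, alpha, beta, sweeps=2000, burn_in=500, seed=0):
    """Estimate the stationary current of an open-boundary TASEP.

    rates[i] is the probability that a particle on site i hops to site
    i + 1 when selected (a stand-in for the codon elongation rate).
    alpha and beta are the entry and exit probabilities. All values are
    illustrative assumptions, not measured quantities.
    """
    rng = random.Random(seed)
    n = len(rates)
    lattice = [0] * n
    exits = 0
    for sweep in range(sweeps):
        # One sweep = n + 1 random update attempts (n sites plus the entry move).
        for _ in range(n + 1):
            i = rng.randrange(-1, n)  # -1 denotes the entry move
            if i == -1:
                if lattice[0] == 0 and rng.random() < alpha:
                    lattice[0] = 1
            elif lattice[i] == 1:
                if i == n - 1:
                    # Particle on the last site leaves the lattice.
                    if rng.random() < beta:
                        lattice[i] = 0
                        if sweep >= burn_in:
                            exits += 1
                elif lattice[i + 1] == 0 and rng.random() < rates[i]:
                    # Exclusion: hop only if the next site is empty.
                    lattice[i] = 0
                    lattice[i + 1] = 1
    # Exits per sweep after burn-in approximate the protein production rate.
    return exits / (sweeps - burn_in)


uniform_rates = [1.0] * 50
slow_rates = list(uniform_rates)
slow_rates[25] = 0.05  # a single hypothetical slow codon

uniform_current = tasep_current(uniform_rates, alpha=0.3, beta=1.0)
slow_current = tasep_current(slow_rates, alpha=0.3, beta=1.0)
```

    In this toy setting a single slow site is enough to lower the current relative to the uniform lattice; the position and number of slow sites can be varied by editing slow_rates.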

  18. Small RNA Sequencing Based Identification of MiRNAs in Daphnia magna

    PubMed Central

    2015-01-01

    Small RNA molecules are short, non-coding RNAs identified for their crucial role in post-transcriptional regulation. A well-studied example is the miRNAs (microRNAs), which have been identified in several model organisms including the freshwater flea and planktonic crustacean Daphnia. Although Daphnia is a model for epigenetic studies with an available genome database, the identification of miRNAs and their potential role in regulating Daphnia gene expression has only recently garnered interest. Computational work using Daphnia pulex has indicated the existence of 45 miRNAs, 14 of which have been experimentally verified. To extend this study, we took a sequencing approach towards identifying miRNAs present in a small RNA library isolated from Daphnia magna. Using Perl scripts designed for comparative genomic analysis, 815,699 reads were obtained from 4 million raw reads and compared against a database file of known miRNA sequences. Using this approach, we have identified 205 putative mature miRNA sequences belonging to 188 distinct miRNA families. Data from this study provide critical information necessary to begin an investigation into a role for these transcripts in the epigenetic regulation of Daphnia magna. PMID:26367422

  19. High-Resolution Transcriptome Analysis with Long-Read RNA Sequencing

    PubMed Central

    Cho, Hyunghoon; Davis, Joe; Li, Xin; Smith, Kevin S.; Battle, Alexis; Montgomery, Stephen B.

    2014-01-01

    RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2×75 bp and 2×262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcripts. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals. PMID:25251678

  20. Complete nucleotide sequence and coding strategy of rice hoja blanca virus RNA4.

    PubMed

    Ramirez, B C; Lozano, I; Constantino, L M; Haenni, A L; Calvert, L A

    1993-11-01

    The complete sequence of rice hoja blanca virus (RHBV) RNA4 has been determined, based on the sequence of the corresponding cDNA clones. RNA4 consists of 1991 nucleotides with two open reading frames (ORFs). One putative ORF is located in the 5'-proximal region of the viral RNA4; it encodes a protein of predicted M(r) 20,076 which corresponds to the major non-structural protein that accumulates in RHBV-infected rice plants, and which bears limited sequence identity with the helper component of tobacco vein mottling potyvirus. The other ORF is located in the 5'-proximal region of the viral complementary RNA4 and encodes a protein of predicted M(r) 32,469. Between the two ORFs is an intergenic region of 524 nucleotides, part of which can theoretically adopt a stable stem-loop structure; the 5' and 3' ends can potentially base-pair over 16 nucleotides, producing a pan-handle configuration. These characteristics are in favour of an ambisense coding strategy for RHBV RNA4. PMID:8245863

  1. Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.

    PubMed

    Mostafavi, Sara; Battle, Alexis; Zhu, Xiaowei; Urban, Alexander E; Levinson, Douglas; Montgomery, Stephen B; Koller, Daphne

    2013-01-01

    Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches, and using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well as or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks. PMID:23874524

  2. Molecular phylogeny of Stentor (Ciliophora: Heterotrichea) based on small subunit ribosomal RNA sequences.

    PubMed

    Gong, Ying-Chun; Yu, Yu-He; Zhu, Fei-Yun; Feng, Wei-Song

    2007-01-01

    To determine the phylogenetic position of Stentor within the Class Heterotrichea, the complete small subunit rRNA genes of three Stentor species, namely Stentor polymorphus, Stentor coeruleus, and Stentor roeseli, were sequenced and used to construct phylogenetic trees using the maximum parsimony, neighbor joining, and Bayesian analysis. With all phylogenetic methods, the genus Stentor was monophyletic, with S. roeseli branching basally. PMID:17300519

  3. Reproducible Analysis of Sequencing-Based RNA Structure Probing Data with User-Friendly Tools.

    PubMed

    Kielpinski, Lukasz Jan; Sidiropoulos, Nikolaos; Vinther, Jeppe

    2015-01-01

    RNA structure-probing data can improve the prediction of RNA secondary and tertiary structure and allow structural changes to be identified and investigated. In recent years, massive parallel sequencing has dramatically improved the throughput of RNA structure probing experiments, but at the same time also made analysis of the data challenging for scientists without formal training in computational biology. Here, we discuss different strategies for data analysis of massive parallel sequencing-based structure-probing data. To facilitate reproducible and standardized analysis of this type of data, we have made a collection of tools, which allow raw sequencing reads to be converted to normalized probing values using different published strategies. In addition, we also provide tools for visualization of the probing data in the UCSC Genome Browser and for converting RNA coordinates to genomic coordinates and vice versa. The collection is implemented as functions in the R statistical environment and as tools in the Galaxy platform, making them easily accessible for the scientific community. We demonstrate the usefulness of the collection by applying it to the analysis of sequencing-based hydroxyl radical probing data and comparing different normalization strategies.
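
    The conversion of RNA coordinates to genomic coordinates mentioned in this abstract can be sketched as follows. This is an illustration only and not the published R or Galaxy tools: the function name transcript_to_genome, the 1-based inclusive exon intervals, and the example coordinates are all assumptions made for the sketch.

```python
def transcript_to_genome(pos, exons, strand="+"):
    """Map a 1-based transcript position to a 1-based genomic position.

    exons is a list of (start, end) genomic intervals, 1-based and
    inclusive (an assumed convention). On the minus strand the
    transcript starts at the end of the rightmost exon.
    """
    if pos < 1:
        raise ValueError("transcript positions are 1-based")
    # Walk the exons in transcript order.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


# Hypothetical two-exon transcript.
example_exons = [(100, 109), (200, 209)]
plus_first = transcript_to_genome(1, example_exons, "+")
plus_second_exon = transcript_to_genome(11, example_exons, "+")
minus_first = transcript_to_genome(1, example_exons, "-")
minus_second_exon = transcript_to_genome(11, example_exons, "-")
```

    The reverse mapping (genome to transcript) follows the same exon walk, accumulating exon lengths until the interval containing the genomic position is reached.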

  4. Bacterial metabarcoding by 16S rRNA gene ion torrent amplicon sequencing.

    PubMed

    Fantini, Elio; Gianese, Giulio; Giuliano, Giovanni; Fiore, Alessia

    2015-01-01

    Ion Torrent is a next generation sequencing technology based on the detection of hydrogen ions produced during DNA chain elongation; this technology allows analyzing and characterizing genomes, genes, and species. Here, we describe an Ion Torrent procedure applied to the metagenomic analysis of 16S rRNA gene amplicons to study the bacterial diversity in food and environmental samples. PMID:25343859

  6. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

    PubMed

    Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

    2014-09-10

    Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. PMID:25014137

  7. Prosthetic joint infection due to Lysobacter thermophilus diagnosed by 16S rRNA gene sequencing.

    PubMed

    Dhawan, B; Sebastian, S; Malhotra, R; Kapil, A; Gautam, D

    2016-01-01

    We report the first case of prosthetic joint infection caused by Lysobacter thermophilus which was identified by 16S rRNA gene sequencing. Removal of prosthesis followed by antibiotic treatment resulted in good clinical outcome. This case illustrates the use of molecular diagnostics to detect uncommon organisms in suspected prosthetic infections.

  8. Sequences more than 500 base pairs upstream of the human U3 small nuclear RNA gene stimulate the synthesis of U3 RNA in frog oocytes

    SciTech Connect

    Suh, D.; Reddy, R.; Wright, D.

    1991-06-04

    Small nuclear RNA (snRNA) genes contain strong promoters capable of initiating transcription once every 4 s. Studies on the human U1 snRNA gene, carried out in other laboratories, showed that sequences within 400 bp of the 5' flanking region are sufficient for maximal levels of transcription both in vivo and in frog oocytes (reviewed in Dahlberg and Lund (1988)). The authors studied the expression of a human U3 snRNA gene by injecting 5' deletion mutants into frog oocytes. The results show that sequences more than 500 bp upstream of the U3 snRNA gene have a 2-3-fold stimulatory effect on the U3 snRNA synthesis. These results indicate that the human U3 snRNA gene is different from human U1 snRNA gene in containing regulatory elements more than 500 bp upstream. The U3 snRNA gene upstream sequences contain an AluI homologous sequence in the -1,200 region; these AluI sequences were transcribed in vitro and in frog oocytes but were not detectable in HeLa cells.

  9. SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes

    PubMed Central

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-01-01

    Motivation: In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. Results: In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Availability: Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license. Contact: epruesse@mpi-bremen.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22556368
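
    The k-mer searching step named in this abstract can be illustrated in general terms: candidate reference sequences are ranked by how many k-mers they share with the query before any alignment is attempted. The sketch below is a generic illustration and not SINA's implementation; the function names, the choice k=4, and the toy sequences are assumptions.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=4):
    """Rank reference sequences by the number of k-mers shared with the query.

    references maps a name to a sequence. Results are sorted by
    descending shared k-mer count, with ties broken by name.
    """
    query_kmers = kmers(query, k)
    scored = [
        (len(query_kmers & kmers(ref_seq, k)), name)
        for name, ref_seq in references.items()
    ]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Toy sequences, invented for this example.
toy_references = {
    "refA": "ACGUACGUAC",
    "refB": "GGGGCCCCAA",
}
ranking = rank_references("ACGUACGA", toy_references, k=4)
```

    Here refA shares four 4-mers with the query and refB shares none, so refA would be the preferred reference for a subsequent alignment step.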

  10. RNA-directed DNA methylation efficiency depends on trigger and target sequence identity.

    PubMed

    Dalakouras, Athanasios; Dadami, Elena; Wassenegger, Michèle; Krczal, Gabi; Wassenegger, Michael

    2016-07-01

    RNA-directed DNA methylation (RdDM) in plants has been extensively studied, but the RNA molecules guiding the RdDM machinery to their targets are still to be characterized. It is unclear whether these molecules require full complementarity with their target. In this study, we have generated Nicotiana tabacum (Nt) plants carrying an infectious tomato apical stunt viroid (TASVd) transgene (Nt-TASVd) and a non-infectious potato spindle tuber viroid (PSTVd) transgene (Nt-SB2). The two viroid sequences exhibit 81% sequence identity. Nt-TASVd and Nt-SB2 plants were genetically crossed. In the progeny plants (Nt-SB2/TASVd), deep sequencing of small RNAs (sRNAs) showed that TASVd infection was associated with the accumulation of abundant small interfering RNAs (siRNAs) that mapped along the entire TASVd but only partially matched the SB2 transgene. TASVd siRNAs efficiently targeted SB2 RNA for degradation, but no transitivity was detectable. Bisulfite sequencing in the Nt-SB2/TASVd plants revealed that the TASVd transgene was targeted for dense cis-RdDM along its entire sequence. In the same plants, the SB2 transgene was targeted for trans-RdDM. The SB2 methylation pattern, however, was weak and heterogeneous, pointing to a positive correlation between trigger-target sequence identity and RdDM efficiency. Importantly, trans-RdDM on SB2 was also detected at sites where no homologous siRNAs were detected. Our data indicate that RdDM efficiency depends on the trigger-target sequence identity, and is not restricted to siRNA occupancy. These findings support recent data suggesting that RNAs with sizes longer than 24 nt (>24-nt RNAs) trigger RdDM. PMID:27121647

  11. Computational Approaches for the Analysis of ncRNA through Deep Sequencing Techniques

    PubMed Central

    Veneziano, Dario; Nigita, Giovanni; Ferro, Alfredo

    2015-01-01

    The majority of the human transcriptome is defined as non-coding RNA (ncRNA), since only a small fraction of human DNA encodes for proteins, as reported by the ENCODE project. Several distinct classes of ncRNAs, such as transfer RNA, microRNA, and long non-coding RNA, have been classified, each with its own three-dimensional folding and specific function. As ncRNAs are highly abundant in living organisms and have been discovered to play important roles in many biological processes, there has been an ever increasing need to investigate the entire ncRNAome in further unbiased detail. Recently, the advent of next-generation sequencing (NGS) technologies has substantially increased the throughput of transcriptome studies, allowing an unprecedented investigation of ncRNAs, as regulatory pathways and novel functions involving ncRNAs are now also emerging. The huge amount of transcript data produced by NGS has progressively required the development and implementation of suitable bioinformatics workflows, complemented by knowledge-based approaches, to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological conditions, such as cancer. In this mini-review, we present and discuss current bioinformatics advances in the development of such computational approaches to analyze and classify the ncRNA component of human transcriptome sequence data obtained from NGS technologies. PMID:26090362

  12. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes

    PubMed Central

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-01-01

    Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 of the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233

  13. Computational Approaches for the Analysis of ncRNA through Deep Sequencing Techniques.

    PubMed

    Veneziano, Dario; Nigita, Giovanni; Ferro, Alfredo

    2015-01-01

    The majority of the human transcriptome is defined as non-coding RNA (ncRNA), since only a small fraction of human DNA encodes for proteins, as reported by the ENCODE project. Several distinct classes of ncRNAs, such as transfer RNA, microRNA, and long non-coding RNA, have been classified, each with its own three-dimensional folding and specific function. As ncRNAs are highly abundant in living organisms and have been discovered to play important roles in many biological processes, there has been an ever increasing need to investigate the entire ncRNAome in further unbiased detail. Recently, the advent of next-generation sequencing (NGS) technologies has substantially increased the throughput of transcriptome studies, allowing an unprecedented investigation of ncRNAs, as regulatory pathways and novel functions involving ncRNAs are now also emerging. The huge amount of transcript data produced by NGS has progressively required the development and implementation of suitable bioinformatics workflows, complemented by knowledge-based approaches, to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological conditions, such as cancer. In this mini-review, we present and discuss current bioinformatics advances in the development of such computational approaches to analyze and classify the ncRNA component of human transcriptome sequence data obtained from NGS technologies. PMID:26090362

  14. hnRNP G: sequence and characterization of a glycosylated RNA-binding protein.

    PubMed Central

    Soulard, M; Della Valle, V; Siomi, M C; Piñol-Roma, S; Codogno, P; Bauvy, C; Bellini, M; Lacroix, J C; Monod, G; Dreyfuss, G

    1993-01-01

    The autoantigen p43 is a nuclear protein initially identified with autoantibodies from dogs with a lupus-like syndrome. Here we show that p43 is an RNA-binding protein, and identify it as hnRNP G, a previously described component of heterogeneous nuclear ribonucleoprotein complexes. We demonstrate that p43/hnRNP G is glycosylated, and identify the modification as O-linked N-acetylglucosamine. A full-length cDNA clone for hnRNP G has been isolated and sequenced, and the predicted amino acid sequence for hnRNP G shows that it contains one RNP-consensus RNA binding domain (RBD) at the amino terminus and a carboxyl domain rich in serines, arginines and glycines. The RBD of human hnRNP G shows striking similarities with the RBDs of several plant RNA-binding proteins. PMID:7692398

  15. Discovery and Validation of Barrett's Esophagus MicroRNA Transcriptome by Next Generation Sequencing

    PubMed Central

    Bansal, Ajay; Mathur, Sharad C.; Tawfik, Ossama; Rastogi, Amit; Buttar, Navtej; Visvanathan, Mahesh; Sharma, Prateek; Christenson, Lane K.

    2013-01-01

    Objective: Barrett's esophagus (BE) is the transition from squamous to columnar mucosa as a result of gastroesophageal reflux disease (GERD). The role of microRNA during this transition has not been systematically studied. Design: For initial screening, total RNA from 5 GERD and 6 BE patients was size fractionated. RNA <70 nucleotides was subjected to SOLiD 3 library preparation and next generation sequencing (NGS). Bioinformatics analysis was performed using the R package "DEseq". A p value < 0.05 adjusted for a false discovery rate of 5% was considered significant. NGS-identified miRNA were validated using qRT-PCR in an independent group of 40 GERD and 27 BE patients. MicroRNA expression of human BE tissues was also compared with three BE cell lines. Results: NGS detected 19.6 million raw reads per sample. 53.1% of filtered reads mapped to miRBase version 18. NGS analysis followed by qRT-PCR validation found 10 differentially expressed miRNA; several are novel (-708-5p, -944, -224-5p and -3065-5p). Up- or down-regulation predicted by NGS was matched by qRT-PCR in every case. Human BE tissues and BE cell lines showed a high degree of concordance (70–80%) in miRNA expression. Prediction analysis identified targets that mapped to developmental signaling pathways such as TGFβ and Notch and inflammatory pathways such as toll-like receptor signaling and TGFβ. Cluster analysis found similarly regulated (up or down) miRNA to share common targets, suggesting coordination between miRNA. Conclusion: Using highly sensitive next-generation sequencing, we have performed a comprehensive genome-wide analysis of microRNA in BE and GERD patients. Differentially expressed miRNA between BE and GERD have been further validated. Expression of miRNA between BE human tissues and BE cell lines is highly correlated. These miRNA should be studied in biological models to further understand BE development. PMID:23372692
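
    The significance rule in this abstract (a p value below 0.05 after adjustment for a 5% false discovery rate) can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which the abstract does not name; the function names and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract only says "adjusted for a false discovery
    rate of 5%"; BH is used here as a common choice, not as the
    authors' confirmed method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_at(p_values, alpha=0.05):
    """Flag each test whose adjusted p-value is below alpha."""
    return [q < alpha for q in benjamini_hochberg(p_values)]


if __name__ == "__main__":
    # Hypothetical raw p-values for four miRNAs.
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
    print(significant_at(raw))
```

    Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values grow.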

  16. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB.

    PubMed

    Pruesse, Elmar; Quast, Christian; Knittel, Katrin; Fuchs, Bernhard M; Ludwig, Wolfgang; Peplies, Jörg; Glöckner, Frank Oliver

    2007-01-01

    Sequencing ribosomal RNA (rRNA) genes is currently the method of choice for phylogenetic reconstruction, nucleic acid based detection and quantification of microbial diversity. The ARB software suite with its corresponding rRNA datasets has been accepted by researchers worldwide as a standard tool for large scale rRNA analysis. However, the rapid increase of publicly available rRNA sequence data has recently hampered the maintenance of comprehensive and curated rRNA knowledge databases. A new system, SILVA (from Latin silva, forest), was implemented to provide a central comprehensive web resource for up to date, quality controlled databases of aligned rRNA sequences from the Bacteria, Archaea and Eukarya domains. All sequences are checked for anomalies, carry a rich set of sequence associated contextual information, have multiple taxonomic classifications, and the latest validly described nomenclature. Furthermore, two precompiled sequence datasets compatible with ARB are offered for download on the SILVA website: (i) the reference (Ref) datasets, comprising only high quality, nearly full length sequences suitable for in-depth phylogenetic analysis and probe design and (ii) the comprehensive Parc datasets with all publicly available rRNA sequences longer than 300 nucleotides suitable for biodiversity analyses. The latest publicly available database release 91 (August 2007) hosts 547 521 sequences split into 461 823 small subunit and 85 689 large subunit rRNAs.

  17. Long Noncoding RNA and mRNA Expression Profiles in the Thyroid Gland of Two Phenotypically Extreme Pig Breeds Using Ribo-Zero RNA Sequencing.

    PubMed

    Shen, Yifei; Mao, Haiguang; Huang, Minjie; Chen, Lixing; Chen, Jiucheng; Cai, Zhaowei; Wang, Ying; Xu, Ningying

    2016-01-01

    The thyroid gland is an important endocrine organ modulating development, growth, and metabolism, mainly by controlling the synthesis and secretion of thyroid hormones (THs). However, little is known about the pig thyroid transcriptome. Long non-coding RNAs (lncRNAs) regulate gene expression and play critical roles in many cellular processes. Yorkshire pigs have a higher growth rate but lower fat deposition than Jinhua pigs, and thus, these breeds are ideal models for studying growth and lipid metabolism. This study revealed higher levels of THs in the serum of Yorkshire pigs than in the serum of Jinhua pigs. By using Ribo-Zero RNA sequencing, which can capture both polyA and non-polyA transcripts, the thyroid transcriptomes of both breeds were analyzed and 22,435 known mRNAs were found to be expressed in the pig thyroid. In addition, 1189 novel mRNAs and 1018 candidate lncRNA transcripts were detected. Multiple TH-synthesis-related genes were identified among the 455 differentially expressed known mRNAs, 37 novel mRNAs, and 52 lncRNA transcripts. Bioinformatics analysis revealed that differentially expressed genes were enriched in the microtubule-based process, which contributes to TH secretion. Moreover, integrated analysis predicted 13 potential lncRNA-mRNA gene pairs. These data expand the repertoire of porcine lncRNAs and mRNAs and contribute to understanding the possible molecular mechanisms involved in animal growth and lipid metabolism.

  18. Assessment of phylogenetic relationships among ciliated protists using partial ribosomal RNA sequences derived from reverse transcripts.

    PubMed

    Lynn, D H; Sogin, M L

    1988-01-01

    Partial sequences were derived for the 16S-like ribosomal RNA of Halteria grandinella, Glaucoma chattoni, Colpidium campylum, Opisthonecta henneguyi and Colpoda inflata. Using isolated bulk nucleic acid preparations, partial rRNA sequences were determined using a series of oligonucleotide primers complementary to conserved sequence elements of the molecule and a dideoxynucleotide sequencing reaction using reverse transcriptase. Sequences were aligned and an evolutionary tree was generated by a distance-matrix method, comparing these partial sequences to the complete sequences of species of Tetrahymena, Paramecium, Euplotes, Oxytricha and Stylonychia. Halteria shows greater similarity to the stichotrichs Oxytricha and Stylonychia than to the hypotrich Euplotes, while these four genera share the same branch. Glaucoma and Colpidium appear closely related to Tetrahymena, with all three being more closely related to Opisthonecta than to Paramecium. Colpoda diverges near the base of the branch shared by Paramecium and the four oligohymenophorean genera. These results are discussed with reference to published revisions of the systematics of the phylum Ciliophora.
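
    The distance-matrix step described in this abstract starts from pairwise distances between aligned sequences. The sketch below shows one way such a matrix could be computed. It assumes the uncorrected proportion of differing sites followed by a Jukes-Cantor correction; the abstract does not state which distance or tree-building method was used, and the taxon labels and sequences in the example are invented for illustration.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; infinite once p reaches 0.75."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(alignment):
    """Build a symmetric matrix of corrected distances from {name: aligned sequence}."""
    names = sorted(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for s, t in combinations(names, 2):
        d = jukes_cantor(p_distance(alignment[s], alignment[t]))
        matrix[s][t] = matrix[t][s] = d
    return matrix


if __name__ == "__main__":
    # Hypothetical aligned rRNA fragments, not real data.
    toy = {"taxon_a": "ACGUACGU", "taxon_b": "ACGUACGA", "taxon_c": "AC-UUCGA"}
    for name, row in distance_matrix(toy).items():
        print(name, {k: round(v, 3) for k, v in sorted(row.items())})
```

    A matrix of this kind would then be passed to a clustering or tree-fitting procedure to produce the tree itself.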

  19. Full Genome Sequence and sfRNA Interferon Antagonist Activity of Zika Virus from Recife, Brazil

    PubMed Central

    Rezelj, Veronica V.; Clark, Jordan J.; Cordeiro, Marli T.; Freitas de Oliveira França, Rafael; Pena, Lindomar J.; Wilkie, Gavin S.; Da Silva Filipe, Ana; Davis, Christopher; Hughes, Joseph; Varjak, Margus; Selinger, Martin; Zuvanov, Luíza; Owsianka, Ania M.; Patel, Arvind H.; McLauchlan, John; Lindenbach, Brett D.; Fall, Gamou; Sall, Amadou A.; Biek, Roman; Rehwinkel, Jan; Schnettler, Esther; Kohl, Alain

    2016-01-01

    Background: The outbreak of Zika virus (ZIKV) in the Americas has transformed a previously obscure mosquito-transmitted arbovirus of the Flaviviridae family into a major public health concern. Little is currently known about the evolution and biology of ZIKV and the factors that contribute to the associated pathogenesis. Determining genomic sequences of clinical viral isolates and characterization of elements within these are an important prerequisite to advance our understanding of viral replicative processes and virus-host interactions. Methodology/Principal findings: We obtained a ZIKV isolate from a patient who presented with classical ZIKV-associated symptoms, and used high throughput sequencing and other molecular biology approaches to determine its full genome sequence, including non-coding regions. Genome regions were characterized and compared to the sequences of other isolates where available. Furthermore, we identified a subgenomic flavivirus RNA (sfRNA) in ZIKV-infected cells that has antagonist activity against RIG-I induced type I interferon induction, with a lesser effect on MDA-5 mediated action. Conclusions/Significance: The full-length genome sequence including non-coding regions of a South American ZIKV isolate from a patient with classical symptoms will support efforts to develop genetic tools for this virus. Detection of sfRNA that counteracts interferon responses is likely to be important for further understanding of pathogenesis and virus-host interactions. PMID:27706161

  20. An rRNA variable region has an evolutionarily conserved essential role despite sequence divergence.

    PubMed Central

    Sweeney, R; Chen, L; Yao, M C

    1994-01-01

    Regions extremely variable in size and sequence occur at conserved locations in eukaryotic rRNAs. The functional importance of one such region was determined by gene reconstruction and replacement in Tetrahymena thermophila. Deletion of the D8 region of the large-subunit rRNA inactivates T. thermophila rRNA genes (rDNA): transformants containing only this type of rDNA are unable to grow. Replacement with an unrelated sequence of similar size or a variable region from a different position in the rRNA also inactivated the rDNA. Mutant rRNAs resulting from such constructs were present only in precursor forms, suggesting that these rRNAs are deficient in either processing or stabilization of the mature form. Replacement with D8 regions from three other organisms restored function, even though the sequences are very different. Thus, these D8 regions share an essential functional feature that is not reflected in their primary sequences. Similar tertiary structures may be the quality these sequences share that allows them to function interchangeably. PMID:8196658

  1. Automatic identification of large collections of protein-coding or rRNA sequences.

    PubMed

    Arigon, Anne-Muriel; Perrière, Guy; Gouy, Manolo

    2008-04-01

    The number of available genomic sequences is growing very fast, due to the development of massive sequencing techniques. Sequence identification is needed and contributes to the assessment of gene and species evolutionary relationships. Automated bioinformatics tools are thus necessary to carry out these identification operations accurately and quickly. We developed HoSeqI (Homologous Sequence Identification), a software environment that performs this kind of automated sequence identification using homologous gene family databases. HoSeqI is accessible through a Web interface (http://pbil.univ-lyon1.fr/software/HoSeqI/) that allows users to identify one or several sequences and to visualize the resulting alignments and phylogenetic trees. We also implemented another application, MultiHoSeqI, to quickly add a large set of sequences to a family database in order to identify them, to update the database, or to help automatic genome annotation. More recently, we developed an application, ChiSeqI (Chimeric Sequence Identification), to automate the identification of bacterial 16S ribosomal RNA sequences and the detection of chimeric sequences.

  2. An RNA-based approach to sequence the mitogenome of Hypoptopoma incognitum (Siluriformes: Loricariidae).

    PubMed

    Moreira, Daniel Andrade; Magalhães, Maithê G P; de Andrade, Paula C C; Furtado, Carolina; Val, Adalberto L; Parente, Thiago Estevam

    2016-09-01

    Hypoptopoma incognitum is a fish of the fifth most species-rich family of vertebrates and is abundant in rivers of the Brazilian Amazon. Only two species of Loricariidae fish have complete mitogenome sequences deposited in GenBank. An innovative RNA-based approach was used to assemble the complete mitogenome of H. incognitum with an average coverage depth of 5292×. The typical vertebrate mitochondrial features were found: 22 tRNA genes, two rRNA genes, 13 protein-coding genes, and a non-coding control region. Moreover, the use of this approach allowed the measurement of mtRNA expression levels, the punctuation pattern of editing, and the detection of heteroplasmies. PMID:26370305

  3. Deep sequencing reveals global patterns of mRNA recruitment during translation initiation

    PubMed Central

    Gao, Rong; Yu, Kai; Nie, Jukui; Lian, Tengfei; Jin, Jianshi; Liljas, Anders; Su, Xiao-Dong

    2016-01-01

    In this work, we developed a method to systematically study the sequence preference of mRNAs during translation initiation. Traditionally, the dynamic process of translation initiation has been studied at the single molecule level with limited sequencing possibility. Using deep sequencing techniques, we identified the sequence preference at different stages of the initiation complexes. Our results provide a comprehensive and dynamic view of the initiation elements in the translation initiation region (TIR), including the S1 binding sequence, the Shine-Dalgarno (SD)/anti-SD interaction and the second codon, at the equilibrium of different initiation complexes. Moreover, our experiments reveal the conformational changes and regional dynamics throughout the dynamic process of mRNA recruitment. PMID:27460773

  4. Practicability of detecting somatic point mutation from RNA high throughput sequencing data.

    PubMed

    Sheng, Quanhu; Zhao, Shilin; Li, Chung-I; Shyr, Yu; Guo, Yan

    2016-05-01

    Traditionally, somatic mutations are detected by examining DNA sequence. The maturity of sequencing technology has allowed researchers to screen for somatic mutations in the whole genome. Increasingly, researchers have become interested in identifying somatic mutations through RNAseq data. With this motivation, we evaluated the practicability of detecting somatic mutations from RNAseq data. Current somatic mutation calling tools were designed for DNA sequencing data. To increase performance on RNAseq data, we developed a somatic mutation caller, GLMVC, based on a bias-reduced generalized linear model for both DNA and RNA sequencing data. Through comparison with MuTect and VarScan, we showed that GLMVC performed better for somatic mutation detection using exome sequencing or RNAseq data. GLMVC is freely available for download at the following website: https://github.com/shengqh/GLMVC/wiki.

  5. Deep sequencing reveals global patterns of mRNA recruitment during translation initiation.

    PubMed

    Gao, Rong; Yu, Kai; Nie, Jukui; Lian, Tengfei; Jin, Jianshi; Liljas, Anders; Su, Xiao-Dong

    2016-07-27

    In this work, we developed a method to systematically study the sequence preference of mRNAs during translation initiation. Traditionally, the dynamic process of translation initiation has been studied at the single molecule level with limited sequencing possibility. Using deep sequencing techniques, we identified the sequence preference at different stages of the initiation complexes. Our results provide a comprehensive and dynamic view of the initiation elements in the translation initiation region (TIR), including the S1 binding sequence, the Shine-Dalgarno (SD)/anti-SD interaction and the second codon, at the equilibrium of different initiation complexes. Moreover, our experiments reveal the conformational changes and regional dynamics throughout the dynamic process of mRNA recruitment.

  6. Profiling status epilepticus-induced changes in hippocampal RNA expression using high-throughput RNA sequencing

    PubMed Central

    Hansen, Katelin F.; Sakamoto, Kensuke; Pelz, Carl; Impey, Soren; Obrietan, Karl

    2014-01-01

    Status epilepticus (SE) is a life-threatening condition that can give rise to a number of neurological disorders, including learning deficits, depression, and epilepsy. Many of the effects of SE appear to be mediated by alterations in gene expression. To gain deeper insight into how SE affects the transcriptome, we employed the pilocarpine SE model in mice and Illumina-based high-throughput sequencing to characterize alterations in gene expression from the induction of SE to the development of spontaneous seizure activity. While some genes were upregulated over the entire course of the pathological progression, each of the three sequenced time points (12 hours, 10 days, and 6 weeks post-SE) had a largely unique transcriptional profile. Hence, genes that regulate synaptic physiology and transcription were most prominently altered at 12 hours post-SE; at 10 days post-SE, marked changes in metabolic and homeostatic gene expression were detected; at 6 weeks, substantial changes in the expression of cell excitability and morphogenesis genes were detected. At the level of cell signaling, KEGG analysis revealed dynamic changes within the MAPK pathways, as well as in CREB-associated gene expression. Notably, the inducible expression of several noncoding transcripts was also detected. These findings offer potential new insights into the cellular events that shape SE-evoked pathology. PMID:25373493

  7. Profiling status epilepticus-induced changes in hippocampal RNA expression using high-throughput RNA sequencing.

    PubMed

    Hansen, Katelin F; Sakamoto, Kensuke; Pelz, Carl; Impey, Soren; Obrietan, Karl

    2014-01-01

    Status epilepticus (SE) is a life-threatening condition that can give rise to a number of neurological disorders, including learning deficits, depression, and epilepsy. Many of the effects of SE appear to be mediated by alterations in gene expression. To gain deeper insight into how SE affects the transcriptome, we employed the pilocarpine SE model in mice and Illumina-based high-throughput sequencing to characterize alterations in gene expression from the induction of SE to the development of spontaneous seizure activity. While some genes were upregulated over the entire course of the pathological progression, each of the three sequenced time points (12 hours, 10 days, and 6 weeks post-SE) had a largely unique transcriptional profile. Hence, genes that regulate synaptic physiology and transcription were most prominently altered at 12 hours post-SE; at 10 days post-SE, marked changes in metabolic and homeostatic gene expression were detected; at 6 weeks, substantial changes in the expression of cell excitability and morphogenesis genes were detected. At the level of cell signaling, KEGG analysis revealed dynamic changes within the MAPK pathways, as well as in CREB-associated gene expression. Notably, the inducible expression of several noncoding transcripts was also detected. These findings offer potential new insights into the cellular events that shape SE-evoked pathology. PMID:25373493

  8. RNA shotgun metagenomic sequencing of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi.

    PubMed

    Chandler, James Angus; Liu, Rachel M; Bennett, Shannon N

    2015-01-01

    Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. 
    This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host.

  9. RNA shotgun metagenomic sequencing of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi

    PubMed Central

    Chandler, James Angus; Liu, Rachel M.; Bennett, Shannon N.

    2015-01-01

    Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. 
    This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host.

  10. Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli.

    PubMed

    Thomason, Maureen K; Bischler, Thorsten; Eisenbart, Sara K; Förstner, Konrad U; Zhang, Aixia; Herbig, Alexander; Nieselt, Kay; Sharma, Cynthia M; Storz, Gisela

    2015-01-01

    While the model organism Escherichia coli has been the subject of intense study for decades, the full complement of its RNAs is only now being examined. Here we describe a survey of the E. coli transcriptome carried out using a differential RNA sequencing (dRNA-seq) approach, which can distinguish between primary and processed transcripts, and an automated prediction algorithm for transcriptional start sites (TSS). With the criterion of expression under at least one of three growth conditions examined, we predicted 14,868 TSS candidates, including 5,574 internal to annotated genes (iTSS) and 5,495 TSS corresponding to potential antisense RNAs (asRNAs). We examined expression of 14 candidate asRNAs by Northern analysis using RNA from wild-type E. coli and from strains defective for RNases III and E, two RNases reported to be involved in asRNA processing. Interestingly, nine asRNAs detected as distinct bands by Northern analysis were differentially affected by the rnc and rne mutations. We also compared our asRNA candidates with previously published asRNA annotations from RNA-seq data and discuss the challenges associated with these cross-comparisons. Our global transcriptional start site map represents a valuable resource for identification of transcription start sites, promoters, and novel transcripts in E. coli and is easily accessible, together with the cDNA coverage plots, in an online genome browser.

  11. Sequence-specific binding of a hormonally regulated mRNA binding protein to cytidine-rich sequences in the lutropin receptor open reading frame.

    PubMed

    Kash, J C; Menon, K M

    1999-12-21

    In previous studies, a lutropin receptor mRNA binding protein implicated in the hormonal regulation of lutropin receptor mRNA stability was identified. This protein, termed LRBP-1, was shown by RNA gel electrophoretic mobility shift assay to specifically interact with lutropin receptor RNA sequences. The present studies have examined the specificity of lutropin receptor mRNA recognition by LRBP-1 and mapped the contact site by RNA footprinting and by site-directed mutagenesis. LRBP-1 was partially purified by cation-exchange chromatography, and the mRNA binding properties of the partially purified LRBP-1 were examined by RNA gel electrophoretic mobility shift assay and hydroxyl-radical RNA footprinting. These data showed that the LRBP-1 binding site is located between nucleotides 203 and 220 of the receptor open reading frame, and consists of the bipartite polypyrimidine sequence 5'-UCUC-X(7)-UCUCCCU-3'. Competition RNA gel electrophoretic mobility shift assays demonstrated that homoribopolymers of poly(rC) were effective RNA binding competitors, while poly(rA), poly(rG), and poly(rU) showed no effect. Mutagenesis of the cytidine residues contained within the LRBP-1 binding site demonstrated that all the cytidines in the bipartite sequence contribute to LRBP-1 binding specificity. Additionally, RNA gel electrophoretic mobility supershift analysis showed that LRBP-1 was not recognized by antibodies against two well-characterized poly(rC) RNA binding proteins, alphaCP-1 and alphaCP-2, implicated in the regulation of RNA stability of alpha-globin and tyrosine hydroxylase mRNAs. In summary, we show that partially purified LRBP-1 binds to a polypyrimidine sequence within nucleotides 203 and 220 of lutropin receptor mRNA with a high degree of specificity which is indicative of its role in posttranscriptional control of lutropin receptor expression.
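
    The bipartite site reported in this abstract, 5'-UCUC-X(7)-UCUCCCU-3', spans 18 nucleotides, which matches the stated interval of nucleotides 203 to 220. A minimal sketch of how such a pattern could be searched for in a sequence follows. It assumes that X(7) stands for any seven nucleotides; the function name and the test sequence are hypothetical, and the real receptor sequence is not reproduced here.

```python
import re

# UCUC, then any 7 nucleotides (assumed meaning of X(7)), then UCUCCCU.
# The lookahead lets overlapping candidate sites be reported as well.
MOTIF = re.compile(r"(?=(UCUC[ACGU]{7}UCUCCCU))")


def find_bipartite_sites(rna):
    """Return (start, end, site) tuples with 1-based inclusive coordinates."""
    # Accept DNA-style input by converting T to U.
    rna = rna.upper().replace("T", "U")
    return [(m.start(1) + 1, m.end(1), m.group(1)) for m in MOTIF.finditer(rna)]


if __name__ == "__main__":
    # Invented sequence containing one site with a 7-nucleotide spacer.
    example = "GG" + "UCUC" + "AAAAAAA" + "UCUCCCU" + "GG"
    print(find_bipartite_sites(example))
```

    Coordinates are reported 1-based and inclusive so that they can be compared directly with nucleotide positions as they are quoted in the abstract.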

  12. An evolutionary conserved pattern of 18S rRNA sequence complementarity to mRNA 5' UTRs and its implications for eukaryotic gene translation regulation.

    PubMed

    Pánek, Josef; Kolár, Michal; Vohradský, Jirí; Shivaya Valásek, Leos

    2013-09-01

    There are several key mechanisms regulating eukaryotic gene expression at the level of protein synthesis. Interestingly, the least explored mechanisms of translational control are those that involve the translating ribosome per se, mediated for example via predicted interactions between the ribosomal RNAs (rRNAs) and mRNAs. Here, we took advantage of robustly growing large-scale data sets of mRNA sequences for numerous organisms, solved ribosomal structures and computational power to computationally explore the mRNA-rRNA complementarity that is statistically significant across the species. Our predictions reveal highly specific sequence complementarity of 18S rRNA sequences with mRNA 5' untranslated regions (UTRs) forming a well-defined 3D pattern on the rRNA sequence of the 40S subunit. Broader evolutionary conservation of this pattern may imply that 5' UTRs of eukaryotic mRNAs, which have already emerged from the mRNA-binding channel, may contact several complementary spots on 18S rRNA situated near the exit of the mRNA binding channel and on the middle-to-lower body of the solvent-exposed 40S ribosome including its left foot. We discuss physiological significance of this structurally conserved pattern and, in the context of previously published experimental results, propose that it modulates scanning of the 40S subunit through 5' UTRs of mRNAs.

  13. Comparison of five different RNA sources to examine the lactating bovine mammary gland transcriptome using RNA-Sequencing.

    PubMed

    Cánovas, Angela; Rincón, Gonzalo; Bevilacqua, Claudia; Islas-Trejo, Alma; Brenaut, Pauline; Hovey, Russell C; Boutinaud, Marion; Morgenthaler, Caroline; VanKlompenberg, Monica K; Martin, Patrice; Medrano, Juan F

    2014-07-08

    The objective of this study was to examine five different sources of RNA, namely mammary gland tissue (MGT), milk somatic cells (SC), laser microdissected mammary epithelial cells (LCMEC), milk fat globules (MFG) and antibody-captured milk mammary epithelial cells (mMEC) to analyze the bovine mammary gland transcriptome using RNA-Sequencing. Our results provide a comparison between different sampling methods (invasive and non-invasive) to define the transcriptome of mammary gland tissue and milk cells. This information will be of value to investigators in choosing the most appropriate sampling method for different research applications to study specific physiological states during lactation. One of the simplest procedures to study the transcriptome associated with milk appears to be the isolation of total RNA directly from SC or MFG released into milk during lactation. Our results indicate that the SC and MFG transcriptome are representative of MGT and LCMEC and can be used as effective and alternative samples to study mammary gland expression without the need to perform a tissue biopsy.

  14. Sensitive, multiplex and direct quantification of RNA sequences using a modified RASL assay

    PubMed Central

    Larman, H. Benjamin; Scott, Erick R.; Wogan, Megan; Oliveira, Glenn; Torkamani, Ali; Schultz, Peter G.

    2014-01-01

    A sensitive and highly multiplex method to directly measure RNA sequence abundance without requiring reverse transcription would be of value for a number of biomedical applications, including high throughput small molecule screening, pathogen transcript detection and quantification of short/degraded RNAs. RNA Annealing, Selection and Ligation (RASL) assays, which are based on RNA template-dependent oligonucleotide probe ligation, have been developed to meet this need, but technical limitations have impeded their adoption. Whereas DNA ligase-based RASL assays suffer from extremely low and sequence-dependent ligation efficiencies that compromise assay robustness, Rnl2 can join a fully DNA donor probe to a 3′-diribonucleotide-terminated acceptor probe with high efficiency on an RNA template strand. Rnl2-based RASL exhibits sub-femtomolar transcript detection sensitivity, and permits the rational tuning of probe signals for optimal analysis by massively parallel DNA sequencing (RASL-seq). A streamlined Rnl2-based RASL-seq protocol was assessed in a small molecule screen using 77 probe sets designed to monitor complex human B cell phenotypes during antibody class switch recombination. Our data demonstrate the robustness, cost-efficiency and broad applicability of Rnl2-based RASL assays. PMID:25063296

  15. Rare variant phasing and haplotypic expression from RNA sequencing with phASER.

    PubMed

    Castel, Stephane E; Mohammadi, Pejman; Chung, Wendy K; Shen, Yufeng; Lappalainen, Tuuli

    2016-01-01

    Haplotype phasing of genetic variants is important for clinical interpretation of the genome, population genetic analysis and functional genomic analysis of allelic activity. Here we present phASER, an accurate approach for phasing variants that are overlapped by sequencing reads, including those from RNA sequencing (RNA-seq), which often span multiple exons due to splicing. Using diverse RNA-seq data we demonstrate that this provides more accurate phasing of rare variants compared with population-based phasing and allows phasing of variants in the same gene up to hundreds of kilobases away that cannot be obtained from DNA sequencing (DNA-seq) reads. We show that in the context of medical genetic studies this improves the resolution of compound heterozygotes. Additionally, phASER provides measures of haplotypic expression that increase power and accuracy in studies of allelic expression. In summary, phasing using RNA-seq and phASER is accurate and improves studies where rare variant haplotypes or allelic expression is needed. PMID:27605262

  16. Rare variant phasing and haplotypic expression from RNA sequencing with phASER

    PubMed Central

    Castel, Stephane E.; Mohammadi, Pejman; Chung, Wendy K.; Shen, Yufeng; Lappalainen, Tuuli

    2016-01-01

    Haplotype phasing of genetic variants is important for clinical interpretation of the genome, population genetic analysis and functional genomic analysis of allelic activity. Here we present phASER, an accurate approach for phasing variants that are overlapped by sequencing reads, including those from RNA sequencing (RNA-seq), which often span multiple exons due to splicing. Using diverse RNA-seq data we demonstrate that this provides more accurate phasing of rare variants compared with population-based phasing and allows phasing of variants in the same gene up to hundreds of kilobases away that cannot be obtained from DNA sequencing (DNA-seq) reads. We show that in the context of medical genetic studies this improves the resolution of compound heterozygotes. Additionally, phASER provides measures of haplotypic expression that increase power and accuracy in studies of allelic expression. In summary, phasing using RNA-seq and phASER is accurate and improves studies where rare variant haplotypes or allelic expression is needed. PMID:27605262

  17. Strategy for microbiome analysis using 16S rRNA gene sequence analysis on the Illumina sequencing platform.

    PubMed

    Ram, Jeffrey L; Karim, Aos S; Sendler, Edward D; Kato, Ikuko

    2011-06-01

    Understanding the identity and changes of organisms in the urogenital and other microbiomes of the human body may be key to discovering causes and new treatments of many ailments, such as vaginosis. High-throughput sequencing technologies have recently enabled discovery of the great diversity of the human microbiome. The cost per base of many of these sequencing platforms remains high (thousands of dollars per sample); however, the Illumina Genome Analyzer (IGA) is estimated to have a cost per base less than one-fifth of its nearest competitor. The main disadvantage of the IGA for sequencing PCR-amplified 16S rRNA genes is that the maximum read-length of the IGA is only 100 bases, whereas at least 300 bases are needed to obtain phylogenetically informative data down to the genus and species level. In this paper we describe and conduct a pilot test of a multiplex sequencing strategy suitable for achieving total reads of > 300 bases per extracted DNA molecule on the IGA. Results show that all proposed primers produce products of the expected size and that correct sequences can be obtained with all proposed forward primers. Various bioinformatic optimizations of the Illumina Bustard analysis pipeline proved necessary to extract the correct sequence from IGA image data, and these modifications of the data files indicate that further optimization of the analysis pipeline may improve the quality rankings of the data and enable more sequence to be correctly analyzed. The successful application of this method could result in an unprecedentedly deep description (800,000 taxonomic identifications per sample) of the urogenital and other microbiomes in a large number of samples at a reasonable cost per sample. PMID:21361774

  18. Determination of genes, restriction sites, and DNA sequences surrounding the 6S RNA template of bacteriophage lambda.

    PubMed Central

    Sklar, J; Yot, P; Weissman, S M

    1975-01-01

    A major product of the transcription of bacteriophage lambda DNA in vitro is the 6S RNA. This article presents a detailed mapping of restriction endonuclease cleavage sites about the region of the 6S RNA template within the lambda genome. Restriction fragments defined by these sites have been used to localize the 6S RNA template within the physical and genetic maps of the lambda genome. Nucleotide sequence analysis of one of these fragments has largely confirmed the nucleotide sequence of the 6S RNA reported previously and has indicated the sequence of DNA that immediately follows the 6S RNA template. This article reports the nucleotide sequence following a known site of transcription termination by RNA polymerase of Escherichia coli. PMID:1098044

  19. Structural Analysis of Single-Point Mutations Given an RNA Sequence: A Case Study with RNAMute

    NASA Astrophysics Data System (ADS)

    Churkin, Alexander; Barash, Danny

    2006-12-01

    We introduce here for the first time the RNAMute package, a pattern-recognition-based utility to perform mutational analysis and detect vulnerable spots within an RNA sequence that affect structure. Mutations in these spots may lead to a structural change that directly relates to a change in functionality. Previously, the concept was tried on RNA genetic control elements called "riboswitches" and other known RNA switches, without an organized utility that analyzes all single-point mutations and can be further expanded. The RNAMute package allows a comprehensive categorization, given an RNA sequence that has functional relevance, by exploring the patterns of all single-point mutants. For illustration, we apply the RNAMute package on an RNA transcript for which individual point mutations were shown experimentally to inactivate spectinomycin resistance in Escherichia coli. Functional analysis of mutations on this case study was performed experimentally by creating a library of point mutations using PCR and screening to locate those mutations. With the availability of RNAMute, preanalysis can be performed computationally before conducting an experiment.
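
    The first step of such a mutational analysis, enumerating every single-point mutant of a sequence, can be sketched as follows. This is an illustration only and is not taken from the RNAMute package: the function name and label format are hypothetical, and the structure prediction and pattern comparison that would follow are not shown.

```python
def single_point_mutants(rna, alphabet="ACGU"):
    """Yield (label, mutant) for every single-nucleotide substitution.

    Labels use the hypothetical form <wild-type base><1-based position><new base>,
    e.g. "G1A" for G -> A at position 1.
    """
    for i, wild in enumerate(rna):
        for base in alphabet:
            if base != wild:
                yield f"{wild}{i + 1}{base}", rna[:i] + base + rna[i + 1:]


# A sequence of length n has 3n single-point mutants.
mutants = list(single_point_mutants("GCAU"))
print(len(mutants), mutants[0])
```

    In a full analysis each mutant would then be folded and its predicted structure compared with that of the wild type.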

  20. Spatially Enhanced Differential RNA Methylation Analysis from Affinity-Based Sequencing Data with Hidden Markov Model

    PubMed Central

    Zhang, Yu-Chen; Zhang, Shao-Wu; Liu, Lian; Liu, Hui; Zhang, Lin; Cui, Xiaodong; Huang, Yufei; Meng, Jia

    2015-01-01

    With the development of new sequencing technology, the entire N6-methyl-adenosine (m6A) RNA methylome can now be profiled in an unbiased manner with the methylated RNA immunoprecipitation sequencing technique (MeRIP-Seq), making it possible to detect differential methylation states of RNA between two conditions, for example, between normal and cancerous tissue. However, as an affinity-based method, MeRIP-Seq has yet to provide base-pair resolution; that is, a single methylation site determined from MeRIP-Seq data can in practice contain multiple RNA methylation residues, some of which can be regulated by different enzymes and thus differentially methylated between two conditions. Since existing peak-based methods could not effectively differentiate multiple methylation residues located within a single methylation site, we propose a hidden Markov model (HMM) based approach to address this issue. Specifically, the detected RNA methylation site is further divided into multiple adjacent small bins and then scanned with higher resolution using a hidden Markov model to model the dependency between spatially adjacent bins for improved accuracy. We tested the proposed algorithm on both simulated data and real data. Results suggest that the proposed algorithm clearly outperforms the existing peak-based approach on simulated systems and detects differential methylation regions with higher statistical significance on the real dataset. PMID:26301253
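
    The idea of modeling dependency between adjacent bins can be illustrated with a generic two-state HMM decoded by the Viterbi algorithm. This is a toy sketch under stated assumptions, not the model from the cited record: the state names, the discretized "lo"/"hi" observations, and all probabilities are hypothetical.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete-observation HMM (log space)."""
    # Log-probability of the best path ending in each state at the first bin.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = []  # back[t][s] = best predecessor of state s at bin t + 1
    for o in obs[1:]:
        row, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            row[s] = scores[-1][prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][o])
            ptr[s] = prev
        scores.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical parameters: "same" = not differentially methylated, "diff" = differential.
STATES = ("same", "diff")
START = {"same": 0.5, "diff": 0.5}
TRANS = {"same": {"same": 0.9, "diff": 0.1}, "diff": {"same": 0.1, "diff": 0.9}}
EMIT = {"same": {"lo": 0.8, "hi": 0.2}, "diff": {"lo": 0.2, "hi": 0.8}}

bins = ["lo", "lo", "lo", "hi", "lo", "lo", "lo"]
print(viterbi(bins, STATES, START, TRANS, EMIT))
```

    With these toy parameters a single "hi" bin surrounded by "lo" bins is smoothed over, which is the kind of benefit that neighbor dependency provides over calling each bin independently.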

  1. Whole Transcriptome Sequencing Reveals Extensive Unspliced mRNA in Metastatic Castration-Resistant Prostate Cancer

    PubMed Central

    Sowalsky, Adam G.; Xia, Zheng; Wang, Liguo; Zhao, Hao; Chen, Shaoyong; Bubley, Glenn J.; Balk, Steven P.; Li, Wei

    2014-01-01

    Men with metastatic prostate cancer (PCa) who are treated with androgen deprivation therapies (ADT) usually relapse within 2–3 years with disease that is termed castration-resistant prostate cancer (CRPC). To identify the mechanism that drives these advanced tumors, paired-end RNA-sequencing (RNA-seq) was performed on a panel of CRPC bone marrow biopsy specimens. From this genome-wide approach, mutations were found in a series of genes with PCa relevance including: AR, NCOR1, KDM3A, KDM4A, CHD1, SETD5, SETD7, INPP4B, RASGRP3, RASA1, TP53BP1 and CDH1, and a novel SND1:BRAF gene fusion. Amongst the most highly-expressed transcripts were ten non-coding RNAs (ncRNAs), including MALAT1 and PABPC1, which are involved in RNA processing. Notably, a high percentage of sequence reads mapped to introns, which were determined to be the result of incomplete splicing at canonical splice junctions. Using quantitative PCR (qPCR) a series of genes (AR, KLK2, KLK3, STEAP2, CPSF6, and CDK19) were confirmed to have a greater proportion of unspliced RNA in CRPC specimens than in normal prostate epithelium, untreated primary PCa, and cultured PCa cells. This inefficient coupling of transcription and mRNA splicing suggests an overall increase in transcription or defect in splicing. PMID:25189356

  2. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data.

    PubMed

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R; Krogh, Anders; Vinther, Jeppe

    2015-05-01

    Selective 2' Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA-RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing.

  3. 5 S and 5.8 S ribosomal RNA sequences and protist phylogenetics.

    PubMed

    Walker, W F

    1985-01-01

    More than 100 5 S and 5.8 S rRNA sequences from protists, including fungi, are known. Through a combination of quantitative treeing and special consideration of "signature" nucleotide combinations, the most significant phylogenetic implications of these data are emphasized. Also, limitations of the data for phylogenetic inferences are discussed and other significant data are brought to bear on the inferences obtained. 5 S sequences from red algae are seen as the most isolated among eukaryotes. A 5 S sequence lineage consisting of oomycetes, euglenoids, most protozoa, most slime molds and perhaps dinoflagellates and mesozoa is defined. Such a lineage is not evident from 5.8 S rRNA or cytochrome c sequence data. 5 S sequences from Ascomycota and Basidiomycota are consistent with the proposal that each is derived from a mycelial form with a haploid yeast phase and simple septal pores, probably most resembling present Taphrinales. 5 S sequences from Chytridiomycota and Zygomycota are not clearly distinct from each other and suggest that a major lineage radiation occurred in the early history of each. Qualitative biochemical data clearly support a dichotomy between an Ascomycota-Basidiomycota lineage and a Zygomycota-Chytridiomycota lineage.

  4. Annotation of primate miRNAs by high throughput sequencing of small RNA libraries

    PubMed Central

    2012-01-01

    Background: In addition to genome sequencing, accurate functional annotation of genomes is required in order to carry out comparative and evolutionary analyses between species. Among primates, the human genome is the most extensively annotated. Human miRNA gene annotation is based on multiple lines of evidence including evidence for expression as well as prediction of the characteristic hairpin structure. In contrast, most miRNA genes in non-human primates are annotated based on homology without any expression evidence. We have sequenced small-RNA libraries from chimpanzee, gorilla, orangutan and rhesus macaque from multiple individuals and tissues. Using patterns of miRNA expression in conjunction with a model of miRNA biogenesis we used these high-throughput sequencing data to identify novel miRNAs in non-human primates. Results: We predicted 47 new miRNAs in chimpanzee, 240 in gorilla, 55 in orangutan and 47 in rhesus macaque. The algorithm we used was able to predict 64% of the previously known miRNAs in chimpanzee, 94% in gorilla, 61% in orangutan and 71% in rhesus macaque. We therefore added evidence for expression in between one and five tissues to miRNAs that were previously annotated based only on homology to human miRNAs. We increased from 60 to 175 the number of miRNAs that are located in orthologous regions in humans and the four non-human primate species studied here. Conclusions: In this study we provide expression evidence for homology-based annotated miRNAs and predict de novo miRNAs in four non-human primate species. We increased the number of annotated miRNA genes and provided evidence for their expression in four non-human primates. Similar approaches using different individuals and tissues would improve annotation in non-human primates and allow for further comparative studies in the future. PMID:22453055

  5. Enhancing potency of siRNA targeting fusion genes by optimization outside of target sequence

    PubMed Central

    Gavrilov, Kseniya; Seo, Young-Eun; Tietjen, Gregory T.; Cui, Jiajia; Cheng, Christopher J.; Saltzman, W. Mark

    2015-01-01

    Canonical siRNA design algorithms have become remarkably effective at predicting favorable binding regions within a target mRNA, but in some cases (e.g., a fusion junction site) region choice is restricted. In these instances, alternative approaches are necessary to obtain a highly potent silencing molecule. Here we focus on strategies for rational optimization of two siRNAs that target the junction sites of fusion oncogenes BCR-ABL and TMPRSS2-ERG. We demonstrate that modifying the termini of these siRNAs with a terminal G-U wobble pair or a carefully selected pair of terminal asymmetry-enhancing mismatches can result in an increase in potency at low doses. Importantly, we observed that improvements in silencing at the mRNA level do not necessarily translate to reductions in protein level and/or cell death. Decline in protein level is also heavily influenced by targeted protein half-life, and delivery vehicle toxicity can confound measures of cell death due to silencing. Therefore, for BCR-ABL, which has a long protein half-life that is difficult to overcome using siRNA, we also developed a nontoxic transfection vector: poly(lactic-co-glycolic acid) nanoparticles that release siRNA over many days. We show that this system can achieve effective killing of leukemic cells. These findings provide insights into the implications of siRNA sequence for potency and suggest strategies for the design of more effective therapeutic siRNA molecules. Furthermore, this work points to the importance of integrating studies of siRNA design and delivery, while heeding and addressing potential limitations such as restricted targetable mRNA regions, long protein half-lives, and nonspecific toxicities. PMID:26627251

  6. A genome-wide view of microsatellite instability: old stories of cancer mutations revisited with new sequencing technologies

    PubMed Central

    Kim, Tae-Min; Park, Peter J

    2014-01-01

    Microsatellites are simple tandem repeats that are present at millions of loci in the human genome. Microsatellite instability (MSI) refers to DNA slippage events on microsatellites that occur frequently in cancer genomes when there is a defect in the DNA mismatch repair system. These somatic mutations can result in inactivation of tumor suppressor genes or disrupt other non-coding regulatory sequences, thereby playing a role in carcinogenesis. Here, we will discuss the ways in which high-throughput sequencing data can facilitate a genome- or exome-wide discovery and more detailed investigation of MSI events in microsatellite-unstable cancer genomes. We will address the methodological aspects of this approach and highlight insights from recent analyses of colorectal and endometrial cancer genomes from The Cancer Genome Atlas project. These include identification of novel MSI targets within and across tumor types and the relationship between the likelihood of MSI events and chromatin structure. Given the increasing popularity of exome and genome sequencing of cancer genomes, a comprehensive characterization of MSI may serve as a valuable marker of cancer evolution and aid in a search for therapeutic targets. PMID:25371413
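
    Locating simple tandem repeats in a sequence is the starting point for any MSI analysis. The sketch below is illustrative only and is not the pipeline discussed in the record: the function name and the thresholds (unit length up to 6 bases, at least 5 copies) are assumptions, and detecting instability itself would further require comparing repeat lengths between tumor and matched normal reads.

```python
import re


def find_microsatellites(dna, max_unit=6, min_repeats=5):
    """Return sorted (0-based start, repeat unit, copy number) for perfect tandem repeats."""
    hits = []
    for k in range(1, max_unit + 1):
        # A k-base unit followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(dna):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit (e.g. "AA", "CACA").
            if any(unit == unit[:j] * (k // j) for j in range(1, k) if k % j == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence with a (CA)6 dinucleotide repeat and an A8 homopolymer run.
dna = "GG" + "CA" * 6 + "TT" + "A" * 8 + "G"
print(find_microsatellites(dna))
```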

  7. New Hosts of Simplicimonas similis and Trichomitus batrachorum Identified by 18S Ribosomal RNA Gene Sequences

    PubMed Central

    Dimasuay, Kris Genelyn B.; Lavilla, Orlie John Y.; Rivera, Windell L.

    2013-01-01

    Trichomonads are obligate anaerobes generally found in the digestive and genitourinary tract of domestic animals. In this study, four trichomonad isolates were obtained from carabao, dog, and pig hosts using rectal swabs. Genomic DNA was extracted using the Chelex method, and the 18S rRNA gene was successfully amplified with novel sets of primers and subjected to DNA sequencing. Aligned isolate sequences together with retrieved 18S rRNA gene sequences of known trichomonads were utilized to generate phylogenetic trees using maximum likelihood and neighbor-joining analyses. Two isolates from carabao were identified as Simplicimonas similis while each isolate from dog and pig was identified as Pentatrichomonas hominis and Trichomitus batrachorum, respectively. This is the first report of S. similis in carabao and the identification of T. batrachorum in pig using 18S rRNA gene sequence analysis. The generated phylogenetic tree yielded three distinct groups mostly with relatively moderate to high bootstrap support and in agreement with the most recent classification. Pathogenic potential of the trichomonads in these hosts still needs further investigation. PMID:23936631

  8. Genomic RNA sequence of Feline coronavirus strain FIPV WSU-79/1146

    PubMed Central

    Dye, Charlotte; Siddell, Stuart G.

    2008-01-01

    A consensus sequence of the Feline coronavirus (FCoV) (strain FIPV WSU-79/1146) genome was determined from overlapping cDNA fragments produced by RT-PCR amplification of viral RNA. The genome was found to be 29 125 nt in length, excluding the poly(A) tail. Analysis of the sequence identified conserved open reading frames and revealed an overall genome organization similar to that of other coronaviruses. The genomic RNA was analysed for putative cis-acting elements and the pattern of subgenomic mRNA synthesis was analysed by Northern blotting. Comparative sequence analysis of the predicted FCoV proteins identified 16 replicase proteins (nsp1–nsp16) and four structural proteins (spike, membrane, envelope and nucleocapsid). Two mRNAs encoding putative accessory proteins were also detected. Phylogenetic analyses confirmed that FIPV WSU-79/1146 belongs to the coronavirus subgroup G1-1. These results confirm and extend previous findings from partial sequence analysis of FCoV genomes. PMID:16033972

  9. Computational identification of riboswitches based on RNA conserved functional sequences and conformations.

    PubMed

    Chang, Tzu-Hao; Huang, Hsien-Da; Wu, Li-Ching; Yeh, Chi-Ta; Liu, Baw-Jhiune; Horng, Jorng-Tzong

    2009-07-01

    Riboswitches are cis-acting genetic regulatory elements within a specific mRNA that can regulate both transcription and translation by interacting with their corresponding metabolites. Recently, an increasing number of riboswitches have been identified in different species and investigated for their roles in regulatory functions. Both the sequence contexts and structural conformations are important characteristics of riboswitches. None of the previously developed tools, such as covariance models (CMs), Riboswitch finder, and RibEx, provides a web server for efficiently searching homologous instances of known riboswitches or considers two crucial characteristics of each riboswitch, namely the structural conformations and sequence contexts of functional regions. Therefore, we developed a systematic method for identifying 12 kinds of riboswitches. The method is implemented and provided as a web server, RiboSW, to efficiently and conveniently identify riboswitches within messenger RNA sequences. The predictive accuracy of the proposed method is comparable with other previous tools. The efficiency of the proposed method for identifying riboswitches was improved in order to achieve a reasonable computational time required for the prediction, which makes it possible to have an accurate and convenient web server for biologists to obtain the results of their analysis of a given mRNA sequence. RiboSW is now available on the web at http://RiboSW.mbc.nctu.edu.tw/. PMID:19460868

  10. Mutation Detection in an Antibody-Producing Chinese Hamster Ovary Cell Line by Targeted RNA Sequencing

    PubMed Central

    Zhang, Siyan; Hughes, Jason D.; Murgolo, Nicholas; Levitan, Diane; Chen, Janice; Liu, Zhong

    2016-01-01

    Chinese hamster ovary (CHO) cells have been used widely in the pharmaceutical industry for production of biological therapeutics including monoclonal antibodies (mAb). The integrity of the gene of interest and the accuracy of the relay of genetic information impact product quality and patient safety. Here we employed next-generation sequencing, particularly RNA-seq, and developed a method to systematically analyze the mutation rate of the mRNA of CHO cell lines producing a mAb. The effect of an extended culturing period to mimic the scale of cell expansion in a manufacturing process and varying selection pressure in the cell culture were also closely examined. PMID:27088091

  11. Diagnostic assay for Helicobacter hepaticus based on nucleotide sequence of its 16S rRNA gene.

    PubMed Central

    Battles, J K; Williamson, J C; Pike, K M; Gorelick, P L; Ward, J M; Gonda, M A

    1995-01-01

    Conserved primers were used to PCR amplify 95% of the Helicobacter hepaticus 16S rRNA gene. Its sequence was determined and aligned to those of related bacteria, enabling the selection of primers to highly diverged regions of the 16S rRNA gene and an oligonucleotide probe for the development of a PCR-liquid hybridization assay. This assay was shown to be both sensitive and specific for H. hepaticus 16S rRNA gene sequences. PMID:7542270

  12. Detection and sequence analysis of two novel co-infecting double-strand RNA mycoviruses in Ustilaginoidea virens.

    PubMed

    Zhong, Jie; Lei, Xiang Hua; Zhu, Jun Zi; Song, Ge; Zhang, Ya Dong; Chen, Yi; Gao, Bi Da

    2014-11-01

    Four novel double-stranded RNA molecules, named dsRNA 1 (5124 bp), dsRNA 2 (1711 bp), dsRNA 3 (1423 bp) and dsRNA 4 (855 bp), were detected in strain HNHS-1 of Ustilaginoidea virens, the causal agent of rice false smut disease. Sequence analysis showed that the dsRNA1 contains two overlapping open reading frames (ORF) potentially encoding proteins with modest levels of sequence similarity to the coat protein (CP) and putative RNA-dependent RNA polymerase (RdRp), respectively, of viruses of the family Totiviridae. The deduced gene product of the ORF encoded by dsRNA2 is homologous to putative RdRp of viruses in the family Partitiviridae; the ORF encoded by dsRNA3 shares some similarity to a hypothetical protein with unknown function. It is noteworthy that the dsRNA4 lacked integrated ORFs. Isometric viral particles of about 40 nm in diameter were observed by transmission electron microscopy in a mycelium tissue preparation of strain HNHS-1-R1, a single-spore subculture of strain HNHS-1 containing only the dsRNA1 segment. Phylogenetic analysis and examination of the organization of the two putative RdRp sequences both indicated that there are at least two novel virus species present in strain HNHS-1. We named the two novel viruses Ustilaginoidea virens RNA virus 2 and Ustilaginoidea virens partitivirus 4, respectively.

  13. Evidence for multiple sequences and factors involved in c-myc RNA stability during amphibian oogenesis.

    PubMed

    Lefresne, J; Lemaitre, J M; Selo, M; Goussard, J; Mouton, C; Andeol, Y

    2001-04-01

    To investigate the molecular mechanisms regulating c-myc RNA stability during late amphibian oogenesis, a heterologous system was used in which synthetic Xenopus laevis c-myc transcripts, progressively deleted from their 3' end, were injected into the cytoplasm of two different host axolotl (Ambystoma mexicanum) cells: stage VI oocytes and progesterone-matured oocytes (unfertilized eggs; UFE). This in vivo strategy allowed the behavior of the exogenous c-myc transcripts to be followed and different regions involved in the stability of each intermediate deleted molecule to be identified. Interestingly, these specific regions differ in the two cellular contexts. In oocytes, two stabilizing regions are located in the 3' untranslated region (UTR) and two in the coding sequence (exons II and III) of the RNA. In UFE, the stabilizing regions correspond to the first part of the 3' UTR and to the first part of exon II. However, in UFE, the majority of synthetic transcripts are degraded. This degradation is a consequence of nuclear factors delivered after germinal vesicle breakdown and specifically acting on targeted regions of the RNA. To test the direct implication of these nuclear factors in c-myc RNA degradation, an in vitro system was set up using axolotl germinal vesicle extracts that mimic the in vivo results and confirm the existence of specific destabilizing factors. In vitro analysis revealed that two populations of nuclear molecules are implicated: one of 4.4-5S (50-65 kDa) and the second of 5.4-6S (90-110 kDa). These degrading nuclear factors act preferentially on the coding region of the c-myc RNA and appear to be conserved between axolotl and Xenopus. Thus, this experimental approach has allowed the identification of specific stabilizing sequences in c-myc RNA and the temporal identification of the different factors (cytoplasmic and/or nuclear) involved in post-transcriptional regulation of this RNA during oogenesis. PMID:11284969

  14. Joint modeling of RNase footprint sequencing profiles for genome-wide inference of RNA structure

    PubMed Central

    Zou, Chenchen; Ouyang, Zhengqing

    2015-01-01

    Recent studies have revealed significant roles of RNA structure in almost every step of RNA processing, including transcription, splicing, transport and translation. RNase footprint sequencing (RNase-seq) has emerged to dissect RNA structures at the genome scale. However, it remains challenging to analyze RNase-seq data because of the issues of signal sparsity, variability and correlations among various RNases. We present a probabilistic framework, joint Poisson-gamma mixture (JPGM), for integrative modeling of multiple RNase-seq profiles. Combining JPGM with a hidden Markov model allows genome-wide inference of RNA structures. We apply the joint modeling approach for inferring base pairing states on simulated data sets and RNase-seq profiles of the double-strand specific RNase V1 and single-strand specific RNase S1 in yeast. We demonstrate that joint analysis of V1 and S1 profiles outputs interpretable RNA structure states, while approaches that analyze each profile separately do not. The joint modeling approach predicts the structure states of all nucleotides in 3196 transcripts of yeast without compromising accuracy, while the simple thresholding approach misses 43% of the nucleotides. Furthermore, the posterior probabilities output by our model are able to resolve the structural ambiguity of ≈300 000 nucleotides with overlapping V1 and S1 cleavage sites. Our model also generates RNA accessibilities, which are associated with three-dimensional conformations. PMID:26400167

  15. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis.

    PubMed

    Spangler, Jacob B; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates, while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression.

  16. Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

    NASA Technical Reports Server (NTRS)

    Fox, G. E.

    1985-01-01

    Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.

  17. Transcriptome analysis of rosette and folding leaves in Chinese cabbage using high-throughput RNA sequencing.

    PubMed

    Wang, Fengde; Li, Libin; Li, Huayin; Liu, Lifeng; Zhang, Yihui; Gao, Jianwei; Wang, Xiaowu

    2012-05-01

    In this study, we report the first use of RNA-sequencing to gain insight into the wide range of transcriptional events that are associated with leafy head development in Chinese cabbage. We generated 53.5 million sequence reads (90 bp in length) from the rosette and heading leaves. The sequence reads were aligned to the recently sequenced Chiifu genome and were analyzed to measure the gene expression levels, to detect alternative splicing events and novel transcripts, to determine the expression of single nucleotide polymorphisms, and to refine the annotated gene structures. The analysis of the global gene expression pattern suggests two important concepts that govern leafy head formation. Firstly, some stimuli, such as carbohydrate levels, light intensity and endogenous hormones, might play a critical role in regulating leafy head formation. Secondly, the regulation of transcription factors, protein kinases and calcium may also be involved in this developmental process.

  18. The nucleotide sequence of Beneckea harveyi 5S rRNA. [bioluminescent marine bacterium]

    NASA Technical Reports Server (NTRS)

    Luehrsen, K. R.; Fox, G. E.

    1981-01-01

    The primary sequence of the 5S ribosomal RNA isolated from the free-living bioluminescent marine bacterium Beneckea harveyi is reported and discussed in regard to indications of phylogenetic relationships with the bacteria Escherichia coli and Photobacterium phosphoreum. Sequences were determined for oligonucleotide products generated by digestion with ribonuclease T1, pancreatic ribonuclease and ribonuclease T2. The presence of heterogeneity is indicated for two sites. The B. harveyi sequence can be arranged into the same four-helix secondary structure as E. coli and other prokaryotic 5S rRNAs. Examination of the 5S RNA sequences of the three bacteria indicates that B. harveyi and P. phosphoreum are specifically related and share a common ancestor which diverged from an ancestor of E. coli at a somewhat earlier time, consistent with previous studies.

  19. Naturally occurring variations in sequence length creates microRNA isoforms that differ in argonaute effector complex specificity

    PubMed Central

    2010-01-01

    Background: Micro(mi)RNAs are short RNA sequences, ranging from 16 to 35 nucleotides (miRBase; http://www.mirbase.org). The majority of the identified sequences are 21 or 22 nucleotides in length. Despite the range of sequence lengths for different miRNAs, individual miRNAs were thought to have a specific sequence of a particular length. A recent report describing a longer variant of a previously identified miRNA in Arabidopsis thaliana prompted this investigation for variations in the length of other miRNAs. Results: In this paper, we demonstrate that a fifth of annotated A. thaliana miRNAs recorded in miRBase V.14 have stable miRNA isoforms that are one or two nucleotides longer than their respective recorded miRNA. Further, we demonstrate that miRNA isoforms are co-expressed and often show differential argonaute complex association. We postulate that these extensions are caused by differential cleavage of the parent precursor miRNA. Conclusions: Our systematic analysis of A. thaliana miRNAs reveals that miRNA length isoforms are relatively common. This finding not only has implications for miRBase and miRNA annotation, but also extends to miRNA validation experiments and miRNA localization studies. Further, we predict that miRNA isoforms are present in other plant species also. PMID:20534119

  20. Herpes simplex virus virion stimulatory protein mRNA leader contains sequence elements which increase both virus-induced transcription and mRNA stability.

    PubMed

    Blair, E D; Blair, C C; Wagner, E K

    1987-08-01

    To investigate the role of 5' noncoding leader sequence of herpes simplex virus type 1 (HSV-1) mRNA in infected cells, the promoter for the 65,000-dalton virion stimulatory protein (VSP), a beta-gamma polypeptide, was introduced into plasmids bearing the chloramphenicol acetyltransferase (cat) gene together with various lengths of adjacent viral leader sequences. Plasmids containing longer lengths of leader sequence gave rise to significantly higher levels of CAT enzyme in transfected cells superinfected with HSV-1. RNase T2 protection assays of CAT mRNA showed that transcription was initiated from an authentic viral cap site in all VSP-CAT constructs and that CAT mRNA levels corresponded to CAT enzyme levels. Use of cis-linked simian virus 40 enhancer sequences demonstrated that the effect was virus specific. Constructs containing 12 and 48 base pairs of the VSP mRNA leader gave HSV infection-induced CAT activities intermediate between those of the leaderless construct and the VSP-(+77)-CAT construct. Actinomycin D chase experiments demonstrated that the longest leader sequences increased hybrid CAT mRNA stability at least twofold in infected cells. Cotransfection experiments with a cosmid bearing four virus-specified transcription factors (ICP4, ICP0, ICP27, and VSP-65K) showed that sequences from -3 to +77, with respect to the viral mRNA cap site, also contained signals responsive to transcriptional activation. PMID:3037112

  1. LocARNAscan: Incorporating thermodynamic stability in sequence and structure-based RNA homology search

    PubMed Central

    2013-01-01

    Background: The search for distant homologs has become an important issue in genome annotation. A particular difficulty is posed by divergent homologs that have lost recognizable sequence similarity. This same problem also arises in the recognition of novel members of large classes of RNAs such as snoRNAs or microRNAs that consist of families unrelated by common descent. Current homology search tools for structured RNAs are either based entirely on sequence similarity (such as blast or hmmer) or combine sequence and secondary structure. The most prominent example of the latter class of tools is Infernal. Alternatives are descriptor-based methods. In most practical applications published to date, however, the information contained in covariance models or manually prescribed search patterns is dominated by sequence information. Here we ask two related questions: (1) Is secondary structure alone informative for homology search and the detection of novel members of RNA classes? (2) To what extent is the thermodynamic propensity of the target sequence to fold into the correct secondary structure helpful for this task? Results: Sequence-structure alignment can be used as an alternative search strategy. In this scenario, the query consists of a base pairing probability matrix, which can be derived either from a single sequence or from a multiple alignment representing a set of known representatives. Sequence information can be optionally added to the query. The target sequence is pre-processed to obtain local base pairing probabilities. As a search engine we devised a semi-global scanning variant of LocARNA’s algorithm for sequence-structure alignment. The LocARNAscan tool is optimized for speed and low memory consumption. In benchmarking experiments on artificial data we observe that the inclusion of thermodynamic stability is helpful, albeit only in a regime of extremely low sequence information in the query. We observe, furthermore, that the sensitivity is bounded in

  2. Identification of RNA sequence isomer by isotope labeling and LC-MS/MS.

    PubMed

    Li, Siwei; Limbach, Patrick A

    2014-11-01

    Recently, we developed a method for modified ribonucleic acid (RNA) analysis based on the comparative analysis of RNA digests (CARD). Within this CARD approach, sequence or modification differences between two samples are identified through differential isotopic labeling of two samples. Components present in both samples will each be labeled, yielding doublets in the CARD mass spectrum. Components unique to only one sample should be detected as singlets. A limitation of the prior singlet identification strategy occurs when the two samples contain components of unique sequence but identical base composition. At the first stage of mass spectrometry, these sequence isomers cannot be differentiated and would appear as doublets rather than singlets. However, underlying sequence differences should be detectable by collision-induced dissociation tandem mass spectrometry (CID MS/MS), as y-type product ions will retain the original enzymatically incorporated isotope label. Here, we determine appropriate instrumental conditions that enable CID MS/MS of isotopically labeled ribonuclease T1 (RNase T1) digestion products such that the original isotope label is maintained in the product ion mass spectrum. Next, we demonstrate how y-type product ions can be used to differentiate singlets and doublets from isomer sequences. We were then able to extend the utility of this approach by using CID MS/MS for the confirmation of an expected RNase T1 digestion product within the CARD analysis of an Escherichia coli mutant strain even in the presence of interfering and overlapping digestion products from other transfer RNAs.
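
    The sequence-isomer problem described above can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: it compares base compositions instead of computing real masses, it ignores the isotope label itself, and it treats a y-type product ion simply as a fragment that contains the 3' end of the digestion product. The function names and example sequences are hypothetical.

```python
from collections import Counter

def base_composition(seq):
    """Count each base; identical compositions imply identical intact mass."""
    return Counter(seq.upper())

def are_sequence_isomers(a, b):
    """True if a and b differ in sequence but share the same base composition."""
    return a.upper() != b.upper() and base_composition(a) == base_composition(b)

def distinguishing_fragment_lengths(a, b):
    """Lengths of 3'-terminal fragments whose composition differs between a and b.

    Assumption: a fragment holding the 3' end stands in for a y-type ion,
    and a difference in composition stands in for a difference in mass.
    """
    lengths = []
    for n in range(1, min(len(a), len(b))):
        if base_composition(a[-n:]) != base_composition(b[-n:]):
            lengths.append(n)
    return lengths

# Two hypothetical RNase T1 products (both end in G) with the same composition.
print(are_sequence_isomers("ACUG", "CAUG"))             # same bases, different order
print(distinguishing_fragment_lengths("ACUG", "CAUG"))  # fragment sizes that tell them apart
```

    The intact molecules are indistinguishable by composition, but the three-nucleotide 3' fragments (CUG versus AUG) differ, which mirrors why a second stage of mass spectrometry can separate such isomers.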

  3. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    PubMed

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we developed an algorithm and software program, GeneTack, for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).

  4. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    SciTech Connect

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S.; Arbuthnot, Patrick

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  5. An analysis of vertebrate mRNA sequences: intimations of translational control

    PubMed Central

    1991-01-01

    Five structural features in mRNAs have been found to contribute to the fidelity and efficiency of initiation by eukaryotic ribosomes. Scrutiny of vertebrate cDNA sequences in light of these criteria reveals a set of transcripts--encoding oncoproteins, growth factors, transcription factors, and other regulatory proteins--that seem designed to be translated poorly. Thus, throttling at the level of translation may be a critical component of gene regulation in vertebrates. An alternative interpretation is that some (perhaps many) cDNAs with encumbered 5' noncoding sequences represent mRNA precursors, which would imply extensive regulation at a posttranscriptional step that precedes translation. PMID:1955461

  6. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies

    NASA Technical Reports Server (NTRS)

    Rossler, D.; Ludwig, W.; Schleifer, K. H.; Lin, C.; McGill, T. J.; Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives, at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa, and a "B. cycloheptanicus branch".

  7. A tool kit for quantifying eukaryotic rRNA gene sequences from human microbiome samples.

    PubMed

    Dollive, Serena; Peterfreund, Gregory L; Sherrill-Mix, Scott; Bittinger, Kyle; Sinha, Rohini; Hoffmann, Christian; Nabel, Christopher S; Hill, David A; Artis, David; Bachman, Michael A; Custers-Allen, Rebecca; Grunberg, Stephanie; Wu, Gary D; Lewis, James D; Bushman, Frederic D

    2012-07-03

    Eukaryotic microorganisms are important but understudied components of the human microbiome. Here we present a pipeline for analysis of deep sequencing data on single-cell eukaryotes. We designed a new 18S rRNA gene-specific PCR primer set and compared it with a published rRNA gene internal transcribed spacer (ITS) primer set. Amplicons were tested against 24 specimens from defined eukaryotes and eight well-characterized human stool samples. A software pipeline (https://sourceforge.net/projects/brocc/) was developed for taxonomic attribution, validated against simulated data, and tested on pyrosequence data. This study provides a well-characterized tool kit for sequence-based enumeration of eukaryotic organisms in human microbiome samples.

  8. Reconstruction and applications of consensus yeast metabolic network based on RNA sequencing.

    PubMed

    Zhao, Yuqi; Wang, Yanjie; Zou, Lei; Huang, Jingfei

    2016-04-01

    One practical application of genome-scale metabolic reconstructions is to interrogate multispecies relationships. Here, we report a consensus metabolic model in four yeast species (Saccharomyces cerevisiae, S. paradoxus, S. mikatae, and S. bayanus) by integrating metabolic network simulations with RNA sequencing (RNA-seq) datasets. We generated high-resolution transcriptome maps of four yeast species through de novo assembly and genome-guided approaches. The transcriptomes were annotated and applied to build the consensus metabolic network, which was verified using independent RNA-seq experiments. The expression profiles reveal that the genes involved in amino acid and lipid metabolism are highly coexpressed. The diverse phenotypic characteristics, such as cellular growth and gene deletions, can be simulated using the metabolic model. We also explored the applications of the consensus model in metabolic engineering using yeast-specific reactions and biofuel production as examples. Similar strategies will benefit communities studying genome-scale metabolic networks of other organisms. PMID:27239440

  9. Nucleotide Sequences and Modifications That Determine RIG-I/RNA Binding and Signaling Activities

    PubMed Central

    Uzri, Dina; Gehrke, Lee

    2009-01-01

    Cytoplasmic viral RNAs with 5′ triphosphates (5′ppp) are detected by the RNA helicase RIG-I, initiating downstream signaling and alpha/beta interferon (IFN-α/β) expression that establish an antiviral state. We demonstrate here that the hepatitis C virus (HCV) 3′ untranslated region (UTR) RNA has greater activity as an immune stimulator than several flavivirus UTR RNAs. We confirmed that the HCV 3′-UTR poly(U/UC) region is the determinant for robust activation of RIG-I-mediated innate immune signaling and that its antisense sequence, poly(AG/A), is an equivalent RIG-I activator. The poly(U/UC) region of the fulminant HCV JFH-1 strain was a relatively weak activator, while the antisense JFH-1 strain poly(AG/A) RNA was very potent. Poly(U/UC) activity does not require primary nucleotide sequence adjacency to the 5′ppp, suggesting that RIG-I recognizes two independent RNA domains. Whereas poly(U) 50-nt or poly(A) 50-nt sequences were minimally active, inserting a single C or G nucleotide, respectively, into these RNAs increased IFN-β expression. Poly(U/UC) RNAs transcribed in vitro using modified uridine 2′ fluoro or pseudouridine ribonucleotides lacked signaling activity while functioning as competitive inhibitors of RIG-I binding and IFN-β expression. Nucleotide base and ribose modifications that convert activator RNAs into competitive inhibitors of RIG-I signaling may be useful as modulators of RIG-I-mediated innate immune responses and as tools to dissect the RNA binding and conformational events associated with signaling. PMID:19224987

  10. Pleiotropic constraints, expression level, and the evolution of miRNA sequences.

    PubMed

    Jovelin, Richard

    2013-12-01

    Post-transcriptional gene regulation mediated by microRNAs (miRNAs) plays critical roles during development by modulating gene expression and conferring robustness to stochastic errors. Phylogenetic analyses suggest that miRNA acquisition could play a role in phenotypic innovation. Moreover, miRNA-induced regulation strongly impacts genome evolution, increasing selective constraints on 3'UTRs, protein sequences, and expression level divergence. Thus, it is essential to understand the factors governing sequence evolution for this important class of regulatory molecules. Investigation of the patterns of molecular evolution at miRNA loci has been limited in Caenorhabditis elegans because of the lack of a close outgroup. Instead, I used Caenorhabditis briggsae as the focus point of this study because of its close relationship to Caenorhabditis sp. 9. I also corroborated the patterns of sequence evolution in Caenorhabditis using published orthologous relationships among miRNAs in Drosophila. In nematodes and in flies, miRNA sequence divergence is not influenced by the genomic neighborhood (i.e., intronic or intergenic) but is nevertheless affected by the genomic context because X-linked miRNAs evolve faster than autosomal miRNAs. However, this effect of chromosomal linkage can be explained by differential expression levels rather than a fast-X effect. The results presented here support a universal negative relationship between rates of molecular evolution and expression level, and suggest that mutations in highly expressed miRNAs are more likely to be deleterious because they potentially affect a larger number of target genes. Finally, I show that many single family member miRNAs evolve faster than miRNAs from multigene families and have limited functional scope, suggesting that they are not strongly integrated in gene regulatory networks.

  11. Metatranscriptome of marine bacterioplankton during winter time in the North Sea assessed by total RNA sequencing.

    PubMed

    Kopf, Anna; Kostadinov, Ivaylo; Wichels, Antje; Quast, Christian; Glöckner, Frank Oliver

    2015-02-01

    Marine metatranscriptome data was generated as part of a study investigating the bacterioplankton communities towards the end of a diatom-dominated spring phytoplankton bloom. This genomic resource article reports a metatranscriptomic dataset from the winter period prior to the occurrence of the spring diatom bloom. Up to 58% of all sequences could be assigned to predicted genes. Taxonomic analysis based on expressed 16S ribosomal RNA genes identified Alphaproteobacteria and Gammaproteobacteria as the most active community members.

  12. Long Noncoding RNA and mRNA Expression Profiles in the Thyroid Gland of Two Phenotypically Extreme Pig Breeds Using Ribo-Zero RNA Sequencing.

    PubMed

    Shen, Yifei; Mao, Haiguang; Huang, Minjie; Chen, Lixing; Chen, Jiucheng; Cai, Zhaowei; Wang, Ying; Xu, Ningying

    2016-01-01

    The thyroid gland is an important endocrine organ modulating development, growth, and metabolism, mainly by controlling the synthesis and secretion of thyroid hormones (THs). However, little is known about the pig thyroid transcriptome. Long non-coding RNAs (lncRNAs) regulate gene expression and play critical roles in many cellular processes. Yorkshire pigs have a higher growth rate but lower fat deposition than Jinhua pigs, and thus these breeds are ideal models for studying growth and lipid metabolism. This study revealed higher levels of THs in the serum of Yorkshire pigs than in the serum of Jinhua pigs. By using Ribo-zero RNA sequencing (which can capture both polyA and non-polyA transcripts), the thyroid transcriptomes of both breeds were analyzed and 22,435 known mRNAs were found to be expressed in the pig thyroid. In addition, 1189 novel mRNAs and 1018 candidate lncRNA transcripts were detected. Multiple TH-synthesis-related genes were identified among the 455 differentially-expressed known mRNAs, 37 novel mRNAs, and 52 lncRNA transcripts. Bioinformatics analysis revealed that differentially-expressed genes were enriched in the microtubule-based process, which contributes to TH secretion. Moreover, integrative analysis predicted 13 potential lncRNA-mRNA gene pairs. These data expand the repertoire of porcine lncRNAs and mRNAs and contribute to understanding the possible molecular mechanisms involved in animal growth and lipid metabolism. PMID:27409639

  13. Exploration of sequence space as the basis of viral RNA genome segmentation.

    PubMed

    Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

    2014-05-01

    The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. More generally, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration.

  14. The four ingredients of single-sequence RNA secondary structure prediction. A unifying perspective

    PubMed Central

    Rivas, Elena

    2013-01-01

    Any method for RNA secondary structure prediction is determined by four ingredients. The architecture is the choice of features implemented by the model (such as stacked basepairs, loop length distributions, etc.). The architecture determines the number of parameters in the model. The scoring scheme is the nature of those parameters (whether thermodynamic, probabilistic, or weights). The parameterization stands for the specific values assigned to the parameters. These three ingredients are referred to as “the model.” The fourth ingredient is the folding algorithms used to predict plausible secondary structures given the model and the sequence of a structural RNA. Here, I make several unifying observations drawn from looking at more than 40 years of methods for RNA secondary structure prediction in the light of this classification. As a final observation, there seems to be a performance ceiling that affects all methods with complex architectures, a ceiling that impacts all scoring schemes with remarkable similarity. This suggests that modeling RNA secondary structure by using intrinsic sequence-based plausible “foldability” will require the incorporation of other forms of information in order to constrain the folding space and to improve prediction accuracy. This could give an advantage to probabilistic scoring systems since a probabilistic framework is a natural platform to incorporate different sources of information into one single inference problem. PMID:23695796
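
    The four ingredients can be seen in miniature in a Nussinov-style base-pair maximization, sketched below in Python. This is a classic textbook scheme used here purely as an illustration, not a method proposed in the abstract; the pair weights and the minimum loop length are assumed values. The architecture is "nested base pairs only", the scoring scheme is a set of weights, the parameterization is the table of values, and the folding algorithm is the dynamic program.

```python
# Parameterization (assumed values): one point per canonical or wobble pair.
PAIR_SCORE = {
    ("A", "U"): 1, ("U", "A"): 1,
    ("G", "C"): 1, ("C", "G"): 1,
    ("G", "U"): 1, ("U", "G"): 1,
}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair

def max_pairs(seq):
    """Folding algorithm: best total pair score over nested structures."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - MIN_LOOP):  # case 2: j pairs with k
                pair = PAIR_SCORE.get((seq[k], seq[j]), 0)
                if pair:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + pair + inner)
            best[i][j] = score
    return best[0][n - 1] if n else 0

print(max_pairs("GGGAAAUCC"))  # a small hairpin-like toy sequence
```

    Swapping the weight table for thermodynamic or probabilistic parameters changes the scoring scheme and parameterization while leaving the architecture and the algorithm untouched, which is the separation the classification above describes.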

  15. The four ingredients of single-sequence RNA secondary structure prediction. A unifying perspective.

    PubMed

    Rivas, Elena

    2013-07-01

    Any method for RNA secondary structure prediction is determined by four ingredients. The architecture is the choice of features implemented by the model (such as stacked basepairs, loop length distributions, etc.). The architecture determines the number of parameters in the model. The scoring scheme is the nature of those parameters (whether thermodynamic, probabilistic, or weights). The parameterization stands for the specific values assigned to the parameters. These three ingredients are referred to as "the model." The fourth ingredient is the folding algorithms used to predict plausible secondary structures given the model and the sequence of a structural RNA. Here, I make several unifying observations drawn from looking at more than 40 years of methods for RNA secondary structure prediction in the light of this classification. As a final observation, there seems to be a performance ceiling that affects all methods with complex architectures, a ceiling that impacts all scoring schemes with remarkable similarity. This suggests that modeling RNA secondary structure by using intrinsic sequence-based plausible "foldability" will require the incorporation of other forms of information in order to constrain the folding space and to improve prediction accuracy. This could give an advantage to probabilistic scoring systems since a probabilistic framework is a natural platform to incorporate different sources of information into one single inference problem.

  16. Rtools: a web server for various secondary structural analyses on single RNA sequences.

    PubMed

    Hamada, Michiaki; Ono, Yukiteru; Kiryu, Hisanori; Sato, Kengo; Kato, Yuki; Fukunaga, Tsukasa; Mori, Ryota; Asai, Kiyoshi

    2016-07-01

    The secondary structures, as well as the nucleotide sequences, are important features of RNA molecules for characterizing their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools to compensate for the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp and integrates the software tools CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD. PMID:27131356

  17. Rtools: a web server for various secondary structural analyses on single RNA sequences

    PubMed Central

    Hamada, Michiaki; Ono, Yukiteru; Kiryu, Hisanori; Sato, Kengo; Kato, Yuki; Fukunaga, Tsukasa; Mori, Ryota; Asai, Kiyoshi

    2016-01-01

    The secondary structures, as well as the nucleotide sequences, are important features of RNA molecules for characterizing their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools to compensate for the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-pairing probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp and integrates the software tools CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD. PMID:27131356

  18. Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach.

    PubMed

    Li, Yang; Podlevsky, Joshua D; Marz, Manja; Qi, Xiaodong; Hoffmann, Steve; Stadler, Peter F; Chen, Julian J-L

    2013-06-01

    Telomerase is a ribonucleoprotein (RNP) enzyme essential for telomere maintenance and chromosome stability. While the catalytic telomerase reverse transcriptase (TERT) protein is well conserved across eukaryotes, telomerase RNA (TR) is extensively divergent in size, sequence, and structure. This diversity prohibits TR identification from many important organisms. Here we report a novel approach for TR discovery that combines in vitro TR enrichment from total RNA, next-generation sequencing, and a computational screening pipeline. With this approach, we have successfully identified TR from Strongylocentrotus purpuratus (purple sea urchin) from the phylum Echinodermata. Reconstitution of activity in vitro confirmed that this RNA is an integral component of sea urchin telomerase. Comparative phylogenetic analysis against vertebrate TR sequences revealed that the purple sea urchin TR contains vertebrate-like template-pseudoknot and H/ACA domains. While lacking a vertebrate-like CR4/5 domain, sea urchin TR has a unique central domain critical for telomerase activity. This is the first TR identified from the previously unexplored invertebrate clade and provides the first glimpse of TR evolution in the deuterostome lineage. Moreover, our TR discovery approach is a significant step toward the comprehensive understanding of telomerase RNP evolution.

  19. Direct RNA motif definition and identification from multiple sequence alignments using secondary structure profiles.

    PubMed

    Gautheret, D; Lambert, A

    2001-11-01

    We present here a new approach to the problem of defining RNA signatures and finding their occurrences in sequence databases. The proposed method is based on "secondary structure profiles". An RNA sequence alignment with secondary structure information is used as an input. Two types of weight matrices/profiles are constructed from this alignment: single strands are represented by a classical lod-scores profile while helical regions are represented by an extended "helical profile" comprising 16 lod-scores per position, one for each of the 16 possible base-pairs. Database searches are then conducted using a simultaneous search for helical profiles and dynamic programming alignment of single strand profiles. The algorithm has been implemented into a new software, ERPIN, that performs both profile construction and database search. Applications are presented for several RNA motifs. The automated use of sequence information in both single-stranded and helical regions yields better sensitivity/specificity ratios than descriptor-based programs. Furthermore, since the translation of alignments into profiles is straightforward with ERPIN, iterative searches can easily be conducted to enrich collections of homologous RNAs.
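
    The "helical profile" idea described above can be illustrated with a minimal sketch. It is not ERPIN's implementation: the uniform background frequency, the pseudocount, and the function names `helical_profile` and `score_helix` are assumptions made for illustration, and the toy alignment is invented.

```python
import math
from itertools import product

BASES = "ACGU"
# The 16 possible ordered base pairs (AA, AC, ..., UU).
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]


def helical_profile(alignment, paired_columns, pseudocount=1.0):
    """Build one dict of 16 lod-scores for each paired column (i, j).

    alignment      -- list of equal-length aligned RNA strings
    paired_columns -- list of (i, j) column indices that form base pairs
    Assumption: uniform background frequency of 1/16 per base pair.
    """
    background = 1.0 / len(PAIRS)
    profile = []
    for i, j in paired_columns:
        counts = {p: pseudocount for p in PAIRS}
        for seq in alignment:
            pair = seq[i] + seq[j]
            if pair in counts:  # gaps and ambiguous symbols are skipped
                counts[pair] += 1
        total = sum(counts.values())
        profile.append(
            {p: math.log2((c / total) / background) for p, c in counts.items()}
        )
    return profile


def score_helix(profile, candidate, paired_columns):
    """Sum the lod-scores of the base pairs a candidate places in the helix."""
    return sum(
        column[candidate[i] + candidate[j]]
        for column, (i, j) in zip(profile, paired_columns)
    )


# Toy alignment with a two-base-pair stem: columns 0-6 and 1-5 are paired.
toy_alignment = ["GCAAAGC", "GGAAACC", "GCAAAGC"]
toy_pairs = [(0, 6), (1, 5)]
toy_profile = helical_profile(toy_alignment, toy_pairs)
```

    A candidate that preserves the observed pairs (for example `GCAAAGC`) scores higher under `score_helix` than one that does not (for example `AAAAAAA`), which is the property a database search over helical regions relies on.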

  20. Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences.

    PubMed

    Yarza, Pablo; Yilmaz, Pelin; Pruesse, Elmar; Glöckner, Frank Oliver; Ludwig, Wolfgang; Schleifer, Karl-Heinz; Whitman, William B; Euzéby, Jean; Amann, Rudolf; Rosselló-Móra, Ramon

    2014-09-01

    Publicly available sequence databases of the small subunit ribosomal RNA gene, also known as 16S rRNA in bacteria and archaea, are growing rapidly, and the number of entries currently exceeds 4 million. However, a unified classification and nomenclature framework for all bacteria and archaea does not yet exist. In this Analysis article, we propose rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggest a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea. Our analyses show that only nearly complete 16S rRNA sequences give accurate measures of taxonomic diversity. In addition, our analyses suggest that most of the 16S rRNA sequences of the high taxa will be discovered in environmental surveys by the end of the current decade.

  1. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGES

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  2. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.
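
    The abstract does not say which beta diversity metrics were used. As a hedged illustration of what such a metric computes, the sketch below implements Bray-Curtis dissimilarity, one commonly used choice; the function name and the toy taxon counts are assumptions, not data from the study.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each sample is a dict mapping taxon name -> read count.
    Returns 0.0 for identical profiles and 1.0 for profiles sharing no taxa.
    """
    taxa = set(sample_a) | set(sample_b)
    difference = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total


# Invented counts for the same community profiled with two protocols.
protocol_1 = {"taxon_A": 6, "taxon_B": 4}
protocol_2 = {"taxon_A": 4, "taxon_B": 6}
dissimilarity = bray_curtis(protocol_1, protocol_2)
```

    Computing such a value for every pair of samples gives the distance matrix on which protocol comparisons of the kind described above are usually based.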

  3. miRMOD: a tool for identification and analysis of 5' and 3' miRNA modifications in Next Generation Sequencing small RNA data.

    PubMed

    Kaushik, Abhinav; Saraf, Shradha; Mukherjee, Sunil K; Gupta, Dinesh

    2015-01-01

    In the past decade, the microRNAs (miRNAs) have emerged to be important regulators of gene expression across various species. Several studies have confirmed different types of post-transcriptional modifications at terminal ends of miRNAs. The reports indicate that miRNA modifications are conserved and functionally significant as it may affect miRNA stability and ability to bind mRNA targets, hence affecting target gene repression. Next Generation Sequencing (NGS) of the small RNA (sRNA) provides an efficient and reliable method to explore miRNA modifications. The need for dedicated software, especially for users with little knowledge of computers, to determine and analyze miRNA modifications in sRNA NGS data, motivated us to develop miRMOD. miRMOD is a user-friendly, Microsoft Windows and Graphical User Interface (GUI) based tool for identification and analysis of 5' and 3' miRNA modifications (non-templated nucleotide additions and trimming) in sRNA NGS data. In addition to identification of miRNA modifications, the tool also predicts and compares the targets of query and modified miRNAs. In order to compare binding affinities for the same target, miRMOD utilizes minimum free energies of the miRNA:target and modified-miRNA:target interactions. Comparisons of the binding energies may guide experimental exploration of miRNA post-transcriptional modifications. The tool is available as a stand-alone package to overcome large data transfer problems commonly faced in web-based high-throughput (HT) sequencing data analysis tools. miRMOD package is freely available at http://bioinfo.icgeb.res.in/miRMOD. PMID:26623179
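
    The distinction between trimming and non-templated nucleotide addition at the 3' end can be sketched as follows. This is an illustrative toy, not miRMOD's algorithm: the function name `classify_3p_end`, the category labels, and the example sequences (including the downstream flank) are assumptions.

```python
def classify_3p_end(read, mature, downstream=""):
    """Classify the 3' end of a small-RNA read against a reference miRNA.

    read       -- observed read sequence
    mature     -- annotated mature miRNA sequence
    downstream -- precursor sequence immediately 3' of the mature miRNA,
                  used to tell templated from non-templated extensions
    Returns (category, affected_nucleotides).
    """
    if read == mature:
        return ("unmodified", "")
    if mature.startswith(read):
        # The read is shorter: nucleotides were removed from the 3' end.
        return ("trimmed", mature[len(read):])
    if read.startswith(mature):
        tail = read[len(mature):]
        if downstream.startswith(tail):
            # The extra bases match the precursor, so they may be templated.
            return ("templated_extension", tail)
        return ("non_templated_addition", tail)
    return ("other", "")


# Toy reference and flank, chosen only for illustration.
toy_mature = "UGAGGUAGUAGGUUGUAUAGUU"
toy_flank = "GG"
adenylated = classify_3p_end(toy_mature + "A", toy_mature, toy_flank)
trimmed = classify_3p_end(toy_mature[:-2], toy_mature, toy_flank)
```

    A real tool would also handle 5' modifications, sequencing errors and multi-mapping reads, none of which this sketch attempts.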

  4. Utility of next-generation RNA-sequencing in identifying chimeric transcription involving human endogenous retroviruses.

    PubMed

    Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou

    2016-01-01

    Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from the Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region.
Taken together, this article highlights

  5. Human ribosomal RNA gene cluster: Identification of the proximal end containing a novel tandem repeat sequence

    SciTech Connect

    Sakai, K.; Ohta, T.; Minoshima, S.

    1995-04-10

    Human ribosomal RNA genes (rDNA) are arranged as tandem repeat clusters on the short arms of five pairs of acrocentric chromosomes. We have demonstrated that a majority of the rDNA clusters are detected as 3-Mb DNA fragments when released from human genomic DNA by EcoRV digestion. This indicated the absence of the EcoRV restriction site within the rDNA clusters. We then screened for rDNA-positive cosmid clones using a chromosome 22-specific cosmid library that was constructed from MboI partial digests of the flow-sorted chromosomes. Three hundred twenty rDNA-positive clones negative for the previously reported distal flanking sequence (pACR1) were chosen and subjected to EcoRV digestion. Seven clones susceptible to EcoRV were further characterized as candidate clones that might have been derived from the junctions of the 3-Mb rDNA cluster. We identified one clone containing part of the rDNA unit sequence and a novel flanking sequence. Detailed analysis of this unique clone revealed that the coding region of the last rRNA gene located at the proximal end of the cluster is interrupted with a novel sequence of ~147 bp that is tandemly repeated and is connected with an intervening 68-bp unique sequence. This junction sequence was readily amplified from chromosomes 21 and 15 as well as 22 using the polymerase chain reaction. Fluorescence in situ hybridization further indicated that the ~147-bp sequence repeat is commonly distributed among all the acrocentric short arms. 23 refs., 5 figs.

  6. Reconstructing evolution from eukaryotic small-ribosomal-subunit RNA sequences: calibration of the molecular clock.

    PubMed

    Van de Peer, Y; Neefs, J M; De Rijk, P; De Wachter, R

    1993-08-01

    The detailed descriptions now available for the secondary structure of small-ribosomal-subunit RNA, including areas of highly variable primary structure, facilitate the alignment of nucleotide sequences. However, for optimal exploitation of the information contained in the alignment, a method must be available that takes into account the local sequence variability in the computation of evolutionary distance. A quantitative definition for the variability of an alignment position is proposed in this study. It is a parameter in an equation which expresses the probability that the alignment position contains a different nucleotide in two sequences, as a function of the distance separating these sequences, i.e., the number of substitutions per nucleotide that occurred during their divergence. This parameter can be estimated from the distance matrix resulting from the conversion of pairwise sequence dissimilarities into pairwise distances. Alignment positions can then be subdivided into a number of sets of matching variability, and the average variability of each set can be derived. Next, the conversion of dissimilarity into distance can be recalculated for each set of alignment positions separately, using a modified version of the equation that corrects for multiple substitutions and changing for each set the parameter that reflects its average variability. The distances computed for each set are finally averaged, giving a more precise distance estimation. Trees constructed by the algorithm based on variability calibration have a topology markedly different from that of trees constructed from the same alignments in the absence of calibration. This is illustrated by means of trees constructed from small-ribosomal-subunit RNA sequences of Metazoa. A reconstruction of vertebrate evolution based on calibrated alignments matches the consensus view of paleontologists, contrary to trees based on uncalibrated alignments. 
In trees derived from sequences covering several metazoan
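
    The idea of converting dissimilarity into distance separately for sets of alignment positions, then averaging, can be sketched with a stand-in correction. The authors' variability-calibrated equation is not given in the abstract, so the code below substitutes the standard Jukes-Cantor correction and a position-weighted average; both choices, the function names and the toy counts are assumptions.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance (substitutions per site) from dissimilarity p.

    Used here only as a stand-in for the paper's own equation.
    """
    if not 0 <= p < 0.75:
        raise ValueError("dissimilarity must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def per_set_average_distance(position_sets):
    """Correct each set of positions separately, then average.

    position_sets -- list of (n_positions, n_differences), one tuple per
                     set of alignment positions of similar variability.
    Assumption: sets are weighted by their number of positions.
    """
    total_positions = sum(n for n, _ in position_sets)
    return (
        sum(n * jc_distance(d / n) for n, d in position_sets) / total_positions
    )


# Toy pair of sequences: a conserved set and a highly variable set.
toy_sets = [(100, 2), (100, 40)]
pooled = jc_distance(sum(d for _, d in toy_sets) / sum(n for n, _ in toy_sets))
calibrated = per_set_average_distance(toy_sets)
```

    In this toy case the per-set estimate is larger than the pooled one, because a single correction applied to all positions underestimates the substitutions hidden in the fast-evolving set.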

  7. Comparison of hepatocellular carcinoma miRNA expression profiling as evaluated by next generation sequencing and microarray.

    PubMed

    Murakami, Yoshiki; Tanahashi, Toshihito; Okada, Rina; Toyoda, Hidenori; Kumada, Takashi; Enomoto, Masaru; Tamori, Akihiro; Kawada, Norifumi; Taguchi, Y-h; Azuma, Takeshi

    2014-01-01

    MicroRNA (miRNA) expression profiling has proven useful in diagnosing and understanding the development and progression of several diseases. Microarray is the standard method for analyzing miRNA expression profiles; however, it has several disadvantages, including its limited detection of miRNAs. In recent years, advances in genome sequencing have led to the development of next-generation sequencing (NGS) technologies, which significantly advance genome sequencing speed and discovery. In this study, we compared the expression profiles obtained by NGS with the profiles created using microarray to assess whether NGS could produce a more accurate and complete miRNA profile. Total RNA from 14 hepatocellular carcinoma (HCC) tumors and 6 matched non-tumor control tissues was sequenced with Illumina MiSeq 50-bp single-end reads. MicroRNA expression profiles were estimated using miRDeep2 software. As a comparison, miRNA expression profiles for 11 out of 14 HCCs were also established by microarray (Agilent human microRNA microarray). The average total sequencing exceeded 2.2 million reads per sample and of those reads, approximately 57% mapped to the human genome. The average correlation for miRNA expression between microarray and NGS and subtraction were 0.613 and 0.587, respectively, while miRNA expression between technical replicates was 0.976. The diagnostic accuracy of HCC, p-value, and AUC were 90.0%, 7.22×10^-4, and 0.92, respectively. In summary, NGS created an miRNA expression profile that was reproducible and comparable to that produced by microarray. Moreover, NGS discovered novel miRNAs that were otherwise undetectable by microarray. We believe that miRNA expression profiling by NGS can be a useful diagnostic tool applicable to multiple fields of medicine.

  8. Molecular cloning of five individual stage- and tissue-specific mRNA sequences from sea urchin pluteus embryos.

    PubMed

    Fregien, N; Dolecki, G J; Mandel, M; Humphreys, T

    1983-06-01

    Five developmentally regulated sea urchin mRNA sequences which increase in abundance between the blastula and pluteus stages of development were isolated by molecular cloning of cDNA. The regulated sequences all appeared in moderately abundant mRNA molecules of pluteus cells and represented 4% of the clones tested. There were no regulated sequences detected in the 40% of the clones which hybridized to the most abundant mRNA, and the screening procedures were inadequate to detect possible regulation in the 20 to 30% of the clones presumably derived from rare-class mRNA. The reaction of 32P[cDNA] from blastula and pluteus mRNA to dots of the cloned DNAs on nitrocellulose filters indicated that the mRNAs complementary to the different cloned pluteus-specific sequences were between 3- and 47-fold more prevalent at the pluteus stage than at the blastula stage. Polyadenylated RNA from different developmental stages was transferred from electrophoretic gels to nitrocellulose filters and reacted to the different cloned sequences. The regulated mRNAs were undetectable in the RNA of 3-h embryos, became evident at the hatching blastula stage, and reached a maximum in abundance by the gastrula or pluteus stage. Certain of the clones reacted to two sizes of mRNA which did not vary coordinately with development. Transfers of RNA isolated from each of the three cell layers of pluteus embryos that were reacted to the cloned sequences revealed that two of the sequences were found in the mRNA of all three layers, two were ectoderm specific, and one was endoderm specific. Four of the regulated sequences were complementary to one or two major bands and one to at least 50 bands on Southern transfers of restriction endonuclease-digested total sea urchin DNA. PMID:6688291

  9. High quality RNA extraction from Maqui berry for its application in next-generation sequencing.

    PubMed

    Sánchez, Carolina; Villacreses, Javier; Blanc, Noelle; Espinoza, Loreto; Martinez, Camila; Pastor, Gabriela; Manque, Patricio; Undurraga, Soledad F; Polanco, Victor

    2016-01-01

    Maqui berry (Aristotelia chilensis) is a native Chilean species that produces berries that are exceptionally rich in anthocyanins and natural antioxidants. These natural compounds provide an array of health benefits for humans, making them very desirable in a fruit. At the same time, these substances also interfere with nucleic acid preparations, making RNA extraction from Maqui berry a major challenge. Our group established a method for RNA extraction from Maqui berry that yields high-quality RNA (good purity, good integrity and higher yield). This procedure is based on the adapted CTAB method using high concentrations of PVP (4 %) and β-mercaptoethanol (4 %) and spermidine in the extraction buffer. These reagents help to remove contaminants such as polysaccharides, proteins and phenols, and also prevent the oxidation of phenolic compounds. The high quality of the RNA isolated through this method allowed its successful use in molecular applications for this endemic Chilean fruit, such as differential expression analysis of RNA-Seq data using next generation sequencing (NGS). Furthermore, we consider that our method could potentially be used for other plant species with extremely high levels of antioxidants and anthocyanins. PMID:27536526

  10. Gene Profiling of Bone around Orthodontic Mini-Implants by RNA-Sequencing Analysis

    PubMed Central

    Nahm, Kyung-Yen; Heo, Jung Sun; Lee, Jae-Hyung; Lee, Dong-Yeol; Chung, Kyu-Rhim; Ahn, Hyo-Won; Kim, Seong-Hun

    2015-01-01

    This study aimed to evaluate the genes that were expressed in the healing bones around SLA-treated titanium orthodontic mini-implants in a beagle at early (1-week) and late (4-week) stages with RNA-sequencing (RNA-Seq). Samples from sites of surgical defects were used as controls. Total RNA was extracted from the tissue around the implants, and an RNA-Seq analysis was performed with Illumina TruSeq. In the 1-week group, genes in the gene ontology (GO) categories of cell growth and the extracellular matrix (ECM) were upregulated, while genes in the categories of the oxidation-reduction process, intermediate filaments, and structural molecule activity were downregulated. In the 4-week group, the genes upregulated included ECM binding, stem cell fate specification, and intramembranous ossification, while genes in the oxidation-reduction process category were downregulated. GO analysis revealed an upregulation of genes that were related to significant mechanisms, including those with roles in cell proliferation, the ECM, growth factors, and osteogenic-related pathways, which are associated with bone formation. From these results, implant-induced bone formation progressed considerably during the times examined in this study. The upregulation or downregulation of selected genes was confirmed with real-time reverse transcription polymerase chain reaction. The RNA-Seq strategy was useful for defining the biological responses to orthodontic mini-implants and identifying the specific genetic networks for targeted evaluations of successful peri-implant bone remodeling. PMID:25759820

  11. High quality RNA extraction from Maqui berry for its application in next-generation sequencing.

    PubMed

    Sánchez, Carolina; Villacreses, Javier; Blanc, Noelle; Espinoza, Loreto; Martinez, Camila; Pastor, Gabriela; Manque, Patricio; Undurraga, Soledad F; Polanco, Victor

    2016-01-01

    Maqui berry (Aristotelia chilensis) is a native Chilean species that produces berries that are exceptionally rich in anthocyanins and natural antioxidants. These natural compounds provide an array of health benefits for humans, making them very desirable in a fruit. At the same time, these substances also interfere with nucleic acid preparations, making RNA extraction from Maqui berry a major challenge. Our group established a method for RNA extraction from Maqui berry that yields high-quality RNA (good purity, good integrity and higher yield). This procedure is based on the adapted CTAB method using high concentrations of PVP (4 %) and β-mercaptoethanol (4 %) and spermidine in the extraction buffer. These reagents help to remove contaminants such as polysaccharides, proteins and phenols, and also prevent the oxidation of phenolic compounds. The high quality of the RNA isolated through this method allowed its successful use in molecular applications for this endemic Chilean fruit, such as differential expression analysis of RNA-Seq data using next generation sequencing (NGS). Furthermore, we consider that our method could potentially be used for other plant species with extremely high levels of antioxidants and anthocyanins.

  12. DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

    DOEpatents

    Lee, James W.; Thundat, Thomas G.

    2005-06-14

    An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.

  13. Analyses of Long Non-Coding RNA and mRNA profiling using RNA sequencing during the pre-implantation phases in pig endometrium.

    PubMed

    Wang, Yueying; Xue, Songyi; Liu, Xiaoran; Liu, Huan; Hu, Tao; Qiu, Xiaotian; Zhang, Jinlong; Lei, Minggang

    2016-01-01

    Establishment of implantation in pig is accompanied by a coordinated interaction between the maternal uterine endometrium and conceptus development. We investigated the expression profiles of endometrial tissue on Days 9, 12 and 15 of pregnancy and on Day 12 of non-pregnancy in Yorkshire, and performed a comprehensive analysis of long non-coding RNAs (lncRNAs) in endometrial tissue samples by using RNA sequencing. As a result, 2805 novel lncRNAs, 2,376 (301 lncRNA and 2075 mRNA) differentially expressed genes (DEGs) and 2149 novel transcripts were obtained by pairwise comparison. In agreement with previous reports, lncRNAs shared similar characteristics, such as shorter in length, lower in exon number, lower at expression level and less conserved than protein coding transcripts. Bioinformatics analysis showed that DEGs were involved in protein binding, cellular process, immune system process and enriched in focal adhesion, Jak-STAT, FoxO and MAPK signaling pathway. We also found that lncRNAs TCONS_01729386 and TCONS_01325501 may play a vital role in embryo pre-implantation. Furthermore, the expression of FGF7, NMB, COL5A3, S100A8 and PPP1R3D genes were significantly up-regulated at the time of maternal recognition of pregnancy (Day 12 of pregnancy). Our results first identified the characterization and expression profile of lncRNAs in pig endometrium during pre-implantation phases. PMID:26822553

  14. Sequence and developmental expression of mRNA coding for a gap junction protein in Xenopus

    PubMed Central

    1988-01-01

    Cloned complementary DNAs representing the complete coding sequence for an embryonic gap junction protein in the frog Xenopus laevis have been isolated and sequenced. The cDNAs hybridize with an RNA of 1.5 kb that is first detected in gastrulating embryos and accumulates throughout gastrulation and neurulation. By the tailbud stage, the highest abundance of the transcript is found in the region containing ventroposterior endoderm and the rudiment of the liver. In the adult, transcripts are present in the lungs, alimentary tract organs, and kidneys, but are not detected in the brain, heart, body wall and skeletal muscles, spleen, or ovary. The gene encoding this embryonic gap junction protein is present in only one or a few copies in the frog genome. In vitro translation of RNA synthesized from the cDNA template produces a 30-kD protein, as predicted by the coding sequence. This product has extensive sequence similarity to mammalian gap junction proteins in its putative transmembrane and extracellular domains, but has diverged substantially in two of its intracellular domains. PMID:2843548

  15. The Signal Sequence Coding Region Promotes Nuclear Export of mRNA

    PubMed Central

    Palazzo, Alexander F; Springer, Michael; Shibata, Yoko; Lee, Chung-Sheng; Dias, Anusha P; Rapoport, Tom A

    2007-01-01

    In eukaryotic cells, most mRNAs are exported from the nucleus by the transcription export (TREX) complex, which is loaded onto mRNAs after their splicing and capping. We have studied in mammalian cells the nuclear export of mRNAs that code for secretory proteins, which are targeted to the endoplasmic reticulum membrane by hydrophobic signal sequences. The mRNAs were injected into the nucleus or synthesized from injected or transfected DNA, and their export was followed by fluorescent in situ hybridization. We made the surprising observation that the signal sequence coding region (SSCR) can serve as a nuclear export signal of an mRNA that lacks an intron or functional cap. Even the export of an intron-containing natural mRNA was enhanced by its SSCR. Like conventional export, the SSCR-dependent pathway required the factor TAP, but depletion of the TREX components had only moderate effects. The SSCR export signal appears to be characterized in vertebrates by a low content of adenines, as demonstrated by genome-wide sequence analysis and by the inhibitory effect of silent adenine mutations in SSCRs. The discovery of an SSCR-mediated pathway explains the previously noted amino acid bias in signal sequences and suggests a link between nuclear export and membrane targeting of mRNAs. PMID:18052610
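
    The sequence feature reported above, a low adenine content in the signal sequence coding region, is simple to measure. The sketch below is only illustrative: the window of 23 codons, the function name and the two toy coding sequences are assumptions, not values or data from the study.

```python
def adenine_fraction(cds, window_codons=23):
    """Fraction of adenines in the first `window_codons` codons of a CDS.

    cds -- coding sequence as a DNA string starting at the start codon.
    Assumption: the window length is a hypothetical stand-in for the
    length of the signal sequence coding region.
    """
    region = cds.upper()[: 3 * window_codons]
    if not region:
        raise ValueError("empty coding sequence")
    return region.count("A") / len(region)


# Invented toy sequences: one adenine-poor, one adenine-rich 5' region.
adenine_poor = adenine_fraction("ATGCTGCTGCTG")
adenine_rich = adenine_fraction("ATGAAAAAGAAA")
```

    Applied to many annotated coding sequences, a comparison of this fraction between genes with and without signal sequences is the kind of genome-wide analysis the abstract refers to.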

  16. Comparison of two approaches for the classification of 16S rRNA gene sequences.

    PubMed

    Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan

    2014-10-01

    The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80 % of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5 %. Up to 1.4 % of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods.
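
    The lowest common ancestor (lca) step can be sketched in a few lines. This is a generic illustration rather than the RDB-LCA pipeline: how database hits are selected is not described in the abstract, and the function name and example lineages are assumptions made for illustration.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxonomic path shared by all lineages.

    lineages -- list of lineages, each a list of taxon names ordered from
                the highest rank (domain) to the lowest (species).
    """
    if not lineages:
        raise ValueError("at least one lineage is required")
    shared = []
    for taxa_at_rank in zip(*lineages):
        if len(set(taxa_at_rank)) != 1:
            break  # the hits disagree at this rank, so stop here
        shared.append(taxa_at_rank[0])
    return shared


# Illustrative lineages of two equally good database hits.
hits = [
    ["Bacteria", "Enterobacteriaceae", "Escherichia", "Escherichia coli"],
    ["Bacteria", "Enterobacteriaceae", "Shigella", "Shigella flexneri"],
]
assignment = lowest_common_ancestor(hits)
```

    When the best hits disagree at the species or genus level, the isolate is reported at the last rank on which they agree, which is consistent with the genus-level identifications described above.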

  17. Sequence and translation of the murine coronavirus 5'-end genomic RNA reveals the N-terminal structure of the putative RNA polymerase.

    PubMed Central

    Soe, L H; Shieh, C K; Baker, S C; Chang, M F; Lai, M M

    1987-01-01

    A 28-kilodalton protein has been suggested to be the amino-terminal protein cleavage product of the putative coronavirus RNA polymerase (gene A) (M.R. Denison and S. Perlman, Virology 157:565-568, 1987). To elucidate the structure and mechanism of synthesis of this protein, the nucleotide sequence of the 5' 2.0 kilobases of the coronavirus mouse hepatitis virus strain JHM genome was determined. This sequence contains a single, long open reading frame and predicts a highly basic amino-terminal region. Cell-free translation of RNAs transcribed in vitro from DNAs containing gene A sequences in pT7 vectors yielded proteins initiated from the 5'-most optimal initiation codon at position 215 from the 5' end of the genome. The sequence preceding this initiation codon predicts the presence of a stable hairpin loop structure. The presence of an RNA secondary structure at the 5' end of the RNA genome is supported by the observation that gene A sequences were more efficiently translated in vitro when upstream noncoding sequences were removed. By comparing the translation products of virion genomic RNA and in vitro transcribed RNAs, we established that our clones encompassing the 5'-end mouse hepatitis virus genomic RNA encode the 28-kilodalton N-terminal cleavage product of the gene A protein. Possible cleavage sites for this protein are proposed. PMID:2824826

  18. The Rhinovirus Subviral A-Particle Exposes 3′-Terminal Sequences of Its Genomic RNA

    PubMed Central

    Harutyunyan, Shushan; Kowalski, Heinrich

    2014-01-01

    ABSTRACT Enteroviruses, which represent a large genus within the family Picornaviridae, undergo important conformational modifications during infection of the host cell. Once internalized by receptor-mediated endocytosis, receptor binding and/or the acidic endosomal environment triggers the native virion to expand and convert into the subviral (altered) A-particle. The A-particle is lacking the internal capsid protein VP4 and exposes N-terminal amphipathic sequences of VP1, allowing for its direct interaction with a lipid bilayer. The genomic single-stranded (+)RNA then exits through a hole close to a 2-fold axis of icosahedral symmetry and passes through a pore in the endosomal membrane into the cytosol, leaving behind the empty shell. We demonstrate that in vitro acidification of a prototype of the minor receptor group of common cold viruses, human rhinovirus A2 (HRV-A2), also results in egress of the poly(A) tail of the RNA from the A-particle, along with adjacent nucleotides totaling ∼700 bases. However, even after hours of incubation at pH 5.2, 5′-proximal sequences remain inside the capsid. In contrast, the entire RNA genome is released within minutes of exposure to the acidic endosomal environment in vivo. This finding suggests that the exposed 3′-poly(A) tail facilitates the positioning of the RNA exit site onto the putative channel in the lipid bilayer, thereby preventing the egress of viral RNA into the endosomal lumen, where it may be degraded. IMPORTANCE For host cell infection, a virus transfers its genome from within the protective capsid into the cytosol; this requires modifications of the viral shell. In common cold viruses, exit of the RNA genome is prepared by the acidic environment in endosomes converting the native virion into the subviral A-particle. We demonstrate that acidification in vitro results in RNA exit starting from the 3′-terminal poly(A). However, the process halts as soon as about 700 bases have left the viral shell

  19. Virtual metagenome reconstruction from 16S rRNA gene sequences.

    PubMed

    Okuda, Shujiro; Tsuchiya, Yuki; Kiriyama, Chiho; Itoh, Masumi; Morisaki, Hisao

    2012-01-01

    Microbial ecologists have investigated the roles of species richness and diversity in a wide variety of ecosystems. Recently, metagenomics has been developed to measure functions in ecosystems, but this approach is cost-intensive. Here we describe a novel method for the rapid and efficient reconstruction of a virtual metagenome in environmental microbial communities without using large-scale genomic sequencing. We demonstrate this approach using 16S rRNA gene sequences obtained from denaturing gradient gel electrophoresis analysis, mapped to fully sequenced genomes, to reconstruct virtual metagenome-like organizations. Furthermore, we validate a virtual metagenome using a published metagenome for cocoa bean fermentation samples, and show that metagenomes reconstructed from biofilm formation samples allow for the study of the gene pool dynamics that are necessary for biofilm growth.

  20. Sequence and secondary structure of the mitochondrial 16S ribosomal RNA gene of Ixodes scapularis.

    PubMed

    Krakowetz, Chantel N; Chilton, Neil B

    2015-02-01

    The complete DNA sequences and secondary structure of the mitochondrial (mt) 16S ribosomal (r) RNA gene were determined for six Ixodes scapularis adults. There were 44 variable nucleotide positions in the 1252 bp sequence alignment. Most (95%) nucleotide alterations did not affect the integrity of the secondary structure of the gene because they either occurred at unpaired positions or represented compensatory changes that maintained the base pairing in helices. A large proportion (75%) of the intraspecific variation in DNA sequence occurred within Domains I, II and VI of the 16S gene. Therefore, several regions within this gene may be highly informative for studies of the population genetics and phylogeography of I. scapularis, a major vector of pathogens of humans and domestic animals in North America.
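
    Counting variable positions in an alignment, as reported in this abstract, can be sketched in a few lines. The function name and the toy sequences are assumptions for illustration and are not data from the study; gaps, if present, are simply treated as another character state.

```python
from typing import List, Sequence


def variable_positions(alignment: Sequence[str]) -> List[int]:
    """Return 0-based column indices at which the aligned sequences differ."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # A column is variable when it contains more than one distinct character.
    return [i for i, column in enumerate(zip(*alignment)) if len(set(column)) > 1]


# Toy alignment (made-up sequences, not I. scapularis data).
toy = ["ACGTAC", "ACGTTC", "ACCTAC"]
print(variable_positions(toy))  # [2, 4]

# Proportion of variable sites implied by the figures in the abstract.
print(round(100 * 44 / 1252, 1))  # 3.5
```

    With the numbers given in the abstract, 44 variable positions in a 1252 bp alignment correspond to roughly 3.5% of sites.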

  1. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

    SciTech Connect

    Larsen, P. E.; Trivedi, G.; Sreedasyam, A.; Lu, V.; Podila, G. K.; Collart, F. R.; Biosciences Division; Univ. of Alabama

    2010-07-06

    Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided that there

  2. Using Deep RNA Sequencing for the Structural Annotation of the Laccaria Bicolor Mycorrhizal Transcriptome

    PubMed Central

    Larsen, Peter E.; Trivedi, Geetika; Sreedasyam, Avinash; Lu, Vincent; Podila, Gopi K.; Collart, Frank R.

    2010-01-01

    Background Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. Methodology We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. Conclusions 69% of expressed mycorrhizal JGI “best” gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural

  3. MicroRNA-Sequence Profiling Reveals Novel Osmoregulatory MicroRNA Expression Patterns in Catadromous Eel Anguilla marmorata.

    PubMed

    Wang, Xiaolu; Yin, Danqing; Li, Peng; Yin, Shaowu; Wang, Li; Jia, Yihe; Shu, Xinhua

    2015-01-01

    MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs that regulate gene expression by post-transcriptional repression of mRNAs. Recently, several miRNAs have been confirmed to execute osmoregulatory functions in fish, directly or indirectly, via translational control. In order to clarify whether miRNAs play relevant roles in the osmoregulation of Anguilla marmorata, three sRNA libraries from A. marmorata adjusting to three different salinities were sequenced by Illumina sRNA deep sequencing methods. In total, 11,339,168, 11,958,406 and 12,568,964 clear reads were obtained from the 3 libraries, respectively. Meanwhile, 34 conserved miRNAs and 613 novel miRNAs were identified using the sequence data. MiR-10b-5p, miR-181a, miR-26a-5p, miR-30d and miR-99a-5p were dominantly expressed in eels at all three salinities. In total, 29 mature miRNAs were significantly up-regulated, while 72 mature miRNAs were significantly down-regulated in brackish water (10‰ salinity) compared with fresh water (0‰ salinity); 24 mature miRNAs were significantly up-regulated, while 54 mature miRNAs were significantly down-regulated in sea water (25‰ salinity) compared with fresh water. Similarly, 24 mature miRNAs were significantly up-regulated, while 45 mature miRNAs were significantly down-regulated in sea water compared with brackish water. The expression patterns of 12 dominantly expressed miRNAs were analyzed at different time points when the eels were transferred from fresh water to brackish water or to sea water. These miRNAs showed differential expression patterns in eels at distinct salinities. Interestingly, miR-122, miR-140-3p and miR-10b-5p demonstrated osmoregulatory effects in certain salinities. In addition, the identification and characterization of differentially expressed miRNAs at different salinities can clarify the osmoregulatory roles of miRNAs, which will shed light on future studies of osmoregulation in fish.

  4. MicroRNA-Sequence Profiling Reveals Novel Osmoregulatory MicroRNA Expression Patterns in Catadromous Eel Anguilla marmorata

    PubMed Central

    Li, Peng; Yin, Shaowu; Wang, Li; Jia, Yihe; Shu, Xinhua

    2015-01-01

    MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs that regulate gene expression by post-transcriptional repression of mRNAs. Recently, several miRNAs have been confirmed to execute osmoregulatory functions in fish, directly or indirectly, via translational control. In order to clarify whether miRNAs play relevant roles in the osmoregulation of Anguilla marmorata, three sRNA libraries from A. marmorata adjusting to three different salinities were sequenced by Illumina sRNA deep sequencing methods. In total, 11,339,168, 11,958,406 and 12,568,964 clear reads were obtained from the 3 libraries, respectively. Meanwhile, 34 conserved miRNAs and 613 novel miRNAs were identified using the sequence data. MiR-10b-5p, miR-181a, miR-26a-5p, miR-30d and miR-99a-5p were dominantly expressed in eels at all three salinities. In total, 29 mature miRNAs were significantly up-regulated, while 72 mature miRNAs were significantly down-regulated in brackish water (10‰ salinity) compared with fresh water (0‰ salinity); 24 mature miRNAs were significantly up-regulated, while 54 mature miRNAs were significantly down-regulated in sea water (25‰ salinity) compared with fresh water. Similarly, 24 mature miRNAs were significantly up-regulated, while 45 mature miRNAs were significantly down-regulated in sea water compared with brackish water. The expression patterns of 12 dominantly expressed miRNAs were analyzed at different time points when the eels were transferred from fresh water to brackish water or to sea water. These miRNAs showed differential expression patterns in eels at distinct salinities. Interestingly, miR-122, miR-140-3p and miR-10b-5p demonstrated osmoregulatory effects in certain salinities. In addition, the identification and characterization of differentially expressed miRNAs at different salinities can clarify the osmoregulatory roles of miRNAs, which will shed light on future studies of osmoregulation in fish. PMID:26301415

  5. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus.

    PubMed

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-05-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic "virion factory," we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various times post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5' and 3' extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus "virophage." These results, validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent), will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system. PMID:20360389
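
    Locating a short promoter element such as the AAAATTGA motif named in this abstract amounts to an exact substring search. The sketch below is a generic illustration, not the authors' pipeline; the function name and the toy upstream sequence are assumptions, and only the given strand is scanned.

```python
from typing import List


def find_motif(sequence: str, motif: str = "AAAATTGA") -> List[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    sequence = sequence.upper()
    motif = motif.upper()
    positions: List[int] = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one base further on so that overlapping matches are not missed.
        start = sequence.find(motif, start + 1)
    return positions


# Toy upstream region (made-up sequence, not Mimivirus data).
upstream = "ggcAAAATTGAttcc"
print(find_motif(upstream))  # [3]
```

    A real analysis would also scan the reverse complement and restrict the search to a window upstream of each annotated start site; those steps are left out here to keep the sketch short.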

  6. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus

    PubMed Central

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-01-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic “virion factory,” we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various times post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5′ and 3′ extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus “virophage.” These results, validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent), will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system. PMID:20360389

  7. Delivery of siRNA using ternary complexes containing branched cationic peptides: the role of peptide sequence, branching and targeting.

    PubMed

    Kudsiova, Laila; Welser, Katharina; Campbell, Frederick; Mohammadi, Atefeh; Dawson, Natalie; Cui, Lili; Hailes, Helen C; Lawrence, M Jayne; Tabor, Alethea B

    2016-03-01

    Ternary nanocomplexes, composed of bifunctional cationic peptides, lipids and siRNA, as delivery vehicles for siRNA have been investigated. The study is the first to determine the optimal sequence and architecture of the bifunctional cationic peptide used for siRNA packaging and delivery using lipopolyplexes. Specifically three series of cationic peptides of differing sequence, degrees of branching and cell-targeting sequences were co-formulated with siRNA and vesicles prepared from a 1 : 1 molar ratio of the cationic lipid DOTMA and the helper lipid, DOPE. The level of siRNA knockdown achieved in the human alveolar cell line, A549-luc cells, in both reduced serum and in serum supplemented media was evaluated, and the results correlated to the nanocomplex structure (established using a range of physico-chemical tools, namely small angle neutron scattering, transmission electron microscopy, dynamic light scattering and zeta potential measurement); the conformational properties of each component (circular dichroism); the degree of protection of the siRNA in the lipopolyplex (using gel shift assays) and to the cellular uptake, localisation and toxicity of the nanocomplexes (confocal microscopy). Although the size, charge, structure and stability of the various lipopolyplexes were broadly similar, it was clear that lipopolyplexes formulated from branched peptides containing His-Lys sequences perform best as siRNA delivery agents in serum, with protection of the siRNA in serum balanced against efficient release of the siRNA into the cytoplasm of the cell.

  8. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments

    PubMed Central

    Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

    2014-01-01

    This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D-structure-based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831

  9. 16S rRNA sequences of uncultivated hot spring cyanobacterial mat inhabitants retrieved as randomly primed cDNA.

    PubMed Central

    Weller, R; Weller, J W; Ward, D M

    1991-01-01

    Cloning and analysis of cDNAs synthesized from rRNAs is one approach to assess the species composition of natural microbial communities. In some earlier attempts to synthesize cDNA from 16S rRNA (16S rcDNA) from the Octopus Spring cyanobacterial mat, a dominance of short 16S rcDNAs was observed, which appear to have originated only from certain organisms. Priming of cDNA synthesis from small ribosomal subunit RNA with random deoxyhexanucleotides can retrieve longer sequences, more suitable for phylogenetic analysis. Here we report the retrieval of 16S rRNA sequences from three formerly uncultured community members. One sequence type, which was retrieved three times from a total of five sequences analyzed, can be placed in the cyanobacterial phylum. A second sequence type is related to 16S rRNAs from green nonsulfur bacteria. The third sequence type may represent a novel phylogenetic type. PMID:1711832

  10. 16S rRNA sequences of uncultivated hot spring cyanobacterial mat inhabitants retrieved as randomly primed cDNA

    SciTech Connect

    Weller, R.; Ward, D.M.; Weller, J.W.

    1991-04-01

    Cloning and analysis of cDNAs synthesized from rRNAs is one approach to assess the species composition of natural microbial communities. In some earlier attempts to synthesize cDNA from 16S rRNA (16S rcDNA) from the Octopus Spring cyanobacterial mat, a dominance of short 16S rcDNAs was observed, which appear to have originated only from certain organisms. Priming of cDNA synthesis from small ribosomal subunit RNA with random deoxyhexanucleotides can retrieve longer sequences, more suitable for phylogenetic analysis. Here we report the retrieval of 16S rRNA sequences from three formerly uncultured community members. One sequence type, which was retrieved three times from a total of five sequences analyzed, can be placed in the cyanobacterial phylum. A second sequence type is related to 16S rRNAs from green nonsulfur bacteria. The third sequence type may represent a novel phylogenetic type.

  11. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments.

    PubMed

    Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

    2014-07-01

    This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D-structure-based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee.

  12. Deep RNA Sequencing of the Skeletal Muscle Transcriptome in Swimming Fish

    PubMed Central

    Palstra, Arjan P.; Beltran, Sergi; Burgerhout, Erik; Brittijn, Sebastiaan A.; Magnoni, Leonardo J.; Henkel, Christiaan V.; Jansen, Hans J.; van den Thillart, Guido E. E. J. M.; Spaink, Herman P.; Planas, Josep V.

    2013-01-01

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming-induced exercise. Pubertal autumn-spawning seawater-raised female rainbow trout were rested (n = 10) or swum (n = 10) for 1176 km at 0.75 body-lengths per second in a 6,000-L swim-flume under reproductive conditions for 40 days. Red and white muscle RNA of exercised and non-exercised fish (4 lanes) was sequenced and resulted in 15–17 million reads per lane that, after de novo assembly, yielded 149,159 red and 118,572 white muscle contigs. Most contigs were annotated using an iterative homology search strategy against salmonid ESTs, the zebrafish Danio rerio genome and general Metazoan genes. When selecting for large contigs (>500 nucleotides), a number of novel rainbow trout gene sequences were identified in this study: 1,085 and 1,228 novel gene sequences for red and white muscle, respectively, which included a number of important molecules for skeletal muscle function. Transcriptomic analysis revealed that sustained swimming increased transcriptional activity in skeletal muscle and specifically an up-regulation of genes involved in muscle growth and developmental processes in white muscle. The unique collection of transcripts will contribute to our understanding of red and white muscle physiology, specifically during the long-term reproductive migration of salmonids. PMID:23308156

  13. sIR: siRNA Information Resource, a web-based tool for siRNA sequence design and analysis and an open access siRNA database

    PubMed Central

    Shah, Jyoti K; Garner, Harold R; White, Michael A; Shames, David S; Minna, John D

    2007-01-01

    Background RNA interference has revolutionized our ability to study the effects of altering the expression of single genes in mammalian (and other) cells through targeted knockdown of gene expression. In this report we describe a web-based computational tool, siRNA Information Resource (sIR), which consists of a new open source database that contains validation information about published siRNA sequences and also provides a user-friendly interface to design and analyze siRNA sequences against a chosen target sequence. Results The siRNA design tool described in this paper employs empirically determined rules derived from a meta-analysis of the published data; it uses a weighted scoring system that determines the optimal sequence within a target mRNA and thus aids in the rational selection of siRNA sequences. This scoring system shows a non-linear correlation with the knockdown efficiency of siRNAs. sIR provides a fast, customized BLAST output for all selected siRNA sequences against a variety of databases so that the user can verify the uniqueness of the design. We have pre-designed siRNAs for all the known human genes (24,502) in the Refseq database. These siRNAs were pre-BLASTed against the human Unigene database to estimate the target specificity and all results are available online. Conclusion Although most of the rules for this scoring system were influenced by previously published rules, the weighted scoring system provides better flexibility in designing an appropriate siRNA when compared to the un-weighted scoring system. sIR is not only a comprehensive tool used to design siRNA sequences and look up pre-designed siRNAs, but it is also a platform where researchers can share information on siRNA design and use. PMID:17540034
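
    The general shape of a weighted, rule-based scoring scheme like the one this abstract describes can be sketched as follows. The abstract does not list sIR's rules or weights, so every rule, weight, window length and name below is a hypothetical placeholder chosen only to show the mechanism: each candidate window in the target earns the weights of the rules it satisfies, and the highest-scoring window is reported.

```python
from typing import Callable, List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical rules as (description, predicate, weight); NOT the sIR rule set.
RULES: List[Tuple[str, Callable[[str], bool], float]] = [
    ("moderate GC content", lambda s: 0.30 <= gc_fraction(s) <= 0.52, 2.0),
    ("A or U at the last position", lambda s: s[-1] in "AU", 1.0),
    ("no run of four identical bases",
     lambda s: not any(base * 4 in s for base in "ACGU"), 1.5),
]


def score(window: str) -> float:
    """Sum the weights of all rules that the window satisfies."""
    return sum(weight for _, rule, weight in RULES if rule(window))


def best_window(target: str, length: int = 19) -> Tuple[int, str, float]:
    """Return (start, window, score) of the first highest-scoring window."""
    target = target.upper().replace("T", "U")
    if len(target) < length:
        raise ValueError("target is shorter than the window length")
    scored = [(score(target[i:i + length]), i)
              for i in range(len(target) - length + 1)]
    top = max(s for s, _ in scored)
    start = min(i for s, i in scored if s == top)
    return start, target[start:start + length], top


print(score("AUGCAUGCAUGCAUGCAUA"))  # 4.5
```

    Changing a weight alters the ranking without touching the rules themselves, which is the kind of flexibility the abstract attributes to weighted over un-weighted scoring.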

  14. Identification of Genes Potentially Associated with the Fertility Instability of S-Type Cytoplasmic Male Sterility in Maize via Bulked Segregant RNA-Seq

    PubMed Central

    Xing, Jinfeng; Zhao, Yanxin; Zhang, Ruyang; Li, Chunhui; Duan, Minxiao; Luo, Meijie; Shi, Zi; Zhao, Jiuran

    2016-01-01

    S-type cytoplasmic male sterility (CMS-S) is the largest group among the three major types of CMS in maize. CMS-S exhibits fertility instability as a partial fertility restoration in a specific nuclear genetic background, which impedes its commercial application in hybrid breeding programs. The fertility instability phenomenon of CMS-S is controlled by several minor quantitative trait loci (QTLs), but not the major nuclear fertility restorer (Rf3). However, the gene mapping of these minor QTLs and the molecular mechanism of the genetic modifications are still unclear. Using completely sterile and partially rescued plants of fertility instable line (FIL)-B, we performed bulk segregant RNA-Seq and identified six potential associated genes in minor effect QTLs contributing to fertility instability. Analyses demonstrate that these potential associated