Sample records for a-rich rna sequences

  1. The nonamer UUAUUUAUU is the key AU-rich sequence motif that mediates mRNA degradation.

    PubMed Central

    Zubiaga, A M; Belasco, J G; Greenberg, M E

    1995-01-01

    Labile mRNAs that encode cytokine and immediate-early gene products often contain AU-rich sequences within their 3' untranslated region (UTR). These AU-rich sequences appear to be key determinants of the short half-lives of these mRNAs, although the sequence features of these elements and the mechanism by which they target mRNAs for rapid decay have not been fully defined. We have examined the features of AU-rich elements (AREs) that are crucial for their function as determinants of mRNA instability in mammalian cells by testing the ability of various mutant c-fos AREs and synthetic AREs to direct rapid mRNA deadenylation and decay when inserted within the 3' UTR of the normally stable beta-globin mRNA. Evidence is presented that the pentamer AUUUA, which previously was suggested to be the minimal determinant of instability present in mammalian AREs, cannot direct rapid mRNA deadenylation and decay. Instead, the nonomer UUAUUUAUU is the elemental AU-rich sequence motif that destabilizes mRNA. Removal of one uridine residue from either end of the nonamer (UUAUUUAU or UAUUUAUU) results in a decrease of potency of the element, while removal of a uridine residue from both ends of the nonamer (UAUUUAU) eliminates detectable destabilizing activity. The inclusion of an additional uridine residue at both ends of the nonamer (UUUAUUUAUUU) does not further increase the efficacy of the element. Taken together, these findings suggest that the nonamer UUAUUUAUU is the minimal AU-rich motif that effectively destabilizes mRNA. Additional ARE potency is achieved by combining multiple copies of this nonamer in a single mRNA 3' UTR. Furthermore, analysis of poly(A) shortening rates for ARE-containing mRNAs reveals that the UUAUUUAUU sequence also accelerates mRNA deadenylation and suggests that the UUAUUUAUU motif targets mRNA for rapid deadenylation as an early step in the mRNA decay process. PMID:7891716

  2. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain.

    PubMed

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to bind RNA as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein-nucleic acids interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, that are abundant at the 5' TOPs (5' terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations.

  3. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain

    PubMed Central

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to bind RNA as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein–nucleic acids interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, that are abundant at the 5′ TOPs (5′ terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations. PMID:24824036

  4. RNA processing in Neurospora crassa mitochondria: use of transfer RNA sequences as signals.

    PubMed Central

    Breitenberger, C A; Browning, K S; Alzner-DeWeerd, B; RajBhandary, U L

    1985-01-01

    We have used RNA gel transfer hybridization, S1 nuclease mapping and primer extension to analyze transcripts derived from several genes in Neurospora crassa mitochondria. The transcripts studied include those for cytochrome oxidase subunit III, 17S rRNA and an unidentified open reading frame. In all three cases, initial transcripts are long, include tRNA sequences, and are subsequently processed to generate the mature RNAs. We find that endpoints of the most abundant transcripts generally coincide with those of tRNA sequences. We therefore conclude that tRNA sequences in long transcripts act as primary signals for RNA processing in N. crassa mitochondria. The situation is somewhat analogous to that observed in mammalian mitochondrial systems. The difference, however, is that in mammalian mitochondria, noncoding spacers between tRNA, rRNA and protein genes are very short and in many cases non-existent, allowing no room for intergenic RNA processing signals whereas, in N. crassa mtDNA, intergenic non-coding sequences are usually several hundred nucleotides long and contain highly conserved GC-rich palindromic sequences. Since these GC-rich palindromic sequences are retained in the processed mature RNAs, we conclude that they do not serve as signals for RNA processing. Images Fig. 2. Fig. 3. Fig. 4. Fig. 5. Fig. 6. Fig. 7. PMID:2990893

  5. A family of long intergenic non-coding RNA genes in human chromosomal region 22q11.2 carry a DNA translocation breakpoint/AT-rich sequence

    PubMed Central

    2018-01-01

    FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13) is a member of lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence duplicated in chr22, and are specifically in regions of low copy repeats (LCR22)s, in or close to the 22q.11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes. PMID:29668722

  6. mirRICH, a simple method to enrich the small RNA fraction from over-dried RNA pellets.

    PubMed

    Choi, Cheolwon; Yoon, Seulgi; Moon, Hyesu; Bae, Yun-Ui; Kim, Chae-Bin; Diskul-Na-Ayudthaya, Penchatr; Ngu, Trinh Van; Munir, Javaria; Han, JaeWook; Park, Se Bin; Moon, Jong-Seok; Song, Sujung; Ryu, Seongho

    2018-04-11

    Techniques to isolate the small RNA fraction (<200nt) by column-based methods are commercially available. However, their use is limited because of the relatively high cost. We found that large RNA molecules, including mRNAs and rRNAs, are aggregated together in the presence of salts when RNA pellets are over-dried. Moreover, once RNA pellets are over-dried, large RNA molecules are barely soluble again during the elution process, whereas small RNA molecules (<100nt) can be eluted. We therefore modified the acid guanidinium thiocyanate-phenol-chloroform (AGPC)-based RNA extraction protocol by skipping the 70% ethanol washing step and over-drying the RNA pellet for 1 hour at room temperature. We named this novel small RNA isolation method "mirRICH." The quality of the small RNA sequences was validated by electrophoresis, next-generation sequencing, and quantitative PCR, and the findings support that our newly developed column-free method can successfully and efficiently isolate small RNAs from over-dried RNA pellets.

  7. Human Fip1 is a subunit of CPSF that binds to U-rich RNA elements and stimulates poly(A) polymerase.

    PubMed

    Kaufmann, Isabelle; Martin, Georges; Friedlein, Arno; Langen, Hanno; Keller, Walter

    2004-02-11

    In mammals, polyadenylation of mRNA precursors (pre-mRNAs) by poly(A) polymerase (PAP) depends on cleavage and polyadenylation specificity factor (CPSF). CPSF is a multisubunit complex that binds to the canonical AAUAAA hexamer and to U-rich upstream sequence elements on the pre-mRNA, thereby stimulating the otherwise weakly active and nonspecific polymerase to elongate efficiently RNAs containing a poly(A) signal. Based on sequence similarity to the Saccharomyces cerevisiae polyadenylation factor Fip1p, we have identified human Fip1 (hFip1) and found that the protein is an integral subunit of CPSF. hFip1 interacts with PAP and has an arginine-rich RNA-binding motif that preferentially binds to U-rich sequence elements on the pre-mRNA. Recombinant hFip1 is sufficient to stimulate the in vitro polyadenylation activity of PAP in a U-rich element-dependent manner. hFip1, CPSF160 and PAP form a ternary complex in vitro, suggesting that hFip1 and CPSF160 act together in poly(A) site recognition and in cooperative recruitment of PAP to the RNA. These results show that hFip1 significantly contributes to CPSF-mediated stimulation of PAP activity.

  8. Human Fip1 is a subunit of CPSF that binds to U-rich RNA elements and stimulates poly(A) polymerase

    PubMed Central

    Kaufmann, Isabelle; Martin, Georges; Friedlein, Arno; Langen, Hanno; Keller, Walter

    2004-01-01

    In mammals, polyadenylation of mRNA precursors (pre-mRNAs) by poly(A) polymerase (PAP) depends on cleavage and polyadenylation specificity factor (CPSF). CPSF is a multisubunit complex that binds to the canonical AAUAAA hexamer and to U-rich upstream sequence elements on the pre-mRNA, thereby stimulating the otherwise weakly active and nonspecific polymerase to elongate efficiently RNAs containing a poly(A) signal. Based on sequence similarity to the Saccharomyces cerevisiae polyadenylation factor Fip1p, we have identified human Fip1 (hFip1) and found that the protein is an integral subunit of CPSF. hFip1 interacts with PAP and has an arginine-rich RNA-binding motif that preferentially binds to U-rich sequence elements on the pre-mRNA. Recombinant hFip1 is sufficient to stimulate the in vitro polyadenylation activity of PAP in a U-rich element-dependent manner. hFip1, CPSF160 and PAP form a ternary complex in vitro, suggesting that hFip1 and CPSF160 act together in poly(A) site recognition and in cooperative recruitment of PAP to the RNA. These results show that hFip1 significantly contributes to CPSF-mediated stimulation of PAP activity. PMID:14749727

  9. Strong transcription blockage mediated by R-loop formation within a G-rich homopurine-homopyrimidine sequence localized in the vicinity of the promoter.

    PubMed

    Belotserkovskii, Boris P; Soo Shin, Jane Hae; Hanawalt, Philip C

    2017-06-20

    Guanine-rich (G-rich) homopurine-homopyrimidine nucleotide sequences can block transcription with an efficiency that depends upon their orientation, composition and length, as well as the presence of negative supercoiling or breaks in the non-template DNA strand. We report that a G-rich sequence in the non-template strand reduces the yield of T7 RNA polymerase transcription by more than an order of magnitude when positioned close (9 bp) to the promoter, in comparison to that for a distal (∼250 bp) location of the same sequence. This transcription blockage is much less pronounced for a C-rich sequence, and is not significant for an A-rich sequence. Remarkably, the blockage is not pronounced if transcription is performed in the presence of RNase H, which specifically digests the RNA strands within RNA-DNA hybrids. The blockage also becomes less pronounced upon reduced RNA polymerase concentration. Based upon these observations and those from control experiments, we conclude that the blockage is primarily due to the formation of stable RNA-DNA hybrids (R-loops), which inhibit successive rounds of transcription. Our results could be relevant to transcription dynamics in vivo (e.g. transcription 'bursting') and may also have practical implications for the design of expression vectors. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. RNA isolation from mouse pancreas: a ribonuclease-rich tissue.

    PubMed

    Azevedo-Pouly, Ana Clara P; Elgamal, Ola A; Schmittgen, Thomas D

    2014-08-02

    Isolation of high-quality RNA from ribonuclease-rich tissue such as mouse pancreas presents a challenge. As a primary function of the pancreas is to aid in digestion, mouse pancreas may contain as much a 75 mg of ribonuclease. We report modifications of standard phenol/guanidine thiocyanate lysis reagent protocols to isolate RNA from mouse pancreas. Guanidine thiocyanate is a strong protein denaturant and will effectively disrupt the activity of ribonuclease under most conditions. However, critical modifications to standard protocols are necessary to successfully isolate RNA from ribonuclease-rich tissues. Key steps include a high lysis reagent to tissue ratio, removal of undigested tissue prior to phase separation and inclusion of a ribonuclease inhibitor to the RNA solution. Using these and other modifications, we routinely isolate RNA with RNA Integrity Number (RIN) greater than 7. The isolated RNA is of suitable quality for routine gene expression analysis. Adaptation of this protocol to isolate RNA from ribonuclease rich tissues besides the pancreas should be readily achievable.

  11. Strong transcription blockage mediated by R-loop formation within a G-rich homopurine–homopyrimidine sequence localized in the vicinity of the promoter

    PubMed Central

    Soo Shin, Jane Hae

    2017-01-01

    Abstract Guanine-rich (G-rich) homopurine–homopyrimidine nucleotide sequences can block transcription with an efficiency that depends upon their orientation, composition and length, as well as the presence of negative supercoiling or breaks in the non-template DNA strand. We report that a G-rich sequence in the non-template strand reduces the yield of T7 RNA polymerase transcription by more than an order of magnitude when positioned close (9 bp) to the promoter, in comparison to that for a distal (∼250 bp) location of the same sequence. This transcription blockage is much less pronounced for a C-rich sequence, and is not significant for an A-rich sequence. Remarkably, the blockage is not pronounced if transcription is performed in the presence of RNase H, which specifically digests the RNA strands within RNA–DNA hybrids. The blockage also becomes less pronounced upon reduced RNA polymerase concentration. Based upon these observations and those from control experiments, we conclude that the blockage is primarily due to the formation of stable RNA–DNA hybrids (R-loops), which inhibit successive rounds of transcription. Our results could be relevant to transcription dynamics in vivo (e.g. transcription ‘bursting’) and may also have practical implications for the design of expression vectors. PMID:28498974

  12. Influence of 5'-flanking sequence on 4.5SI RNA gene transcription by RNA polymerase III.

    PubMed

    Gogolevskaya, Irina K; Stasenko, Danil V; Tatosyan, Karina A; Kramerov, Dmitri A

    2018-05-01

    Short nuclear 4.5SI RNA can be found in three related rodent families. Its function remains unknown. The genes of 4.5SI RNA contain an internal promoter of RNA polymerase III composed of the boxes A and B. Here, the effect of the sequence immediately upstream of the mouse 4.5SI RNA gene on its transcription was studied. The gene with deletions and substitutions in the 5'-flanking sequence was used to transfect HeLa cells and its transcriptional activity was evaluated from the cellular level of 4.5SI RNA. Single-nucleotide substitutions in the region adjacent to the transcription start site (positions -2 to -8) decreased the expression activity of the gene down to 40%-60% of the control. The substitution of the conserved pentanucleotide AGAAT (positions -14 to -18) could either decrease (43%-56%) or increase (134%) the gene expression. A TATA-like box (TACATGA) was found at positions -24 to -30 of the 4.5SI RNA gene. Its replacement with a polylinker fragment of the vector did not decrease the transcription level, while its replacement with a GC-rich sequence almost completely (down to 2%-5%) suppressed the transcription of the 4.5SI RNA gene. The effect of plasmid sequences bordering the gene on its transcription by RNA polymerase III is discussed.

  13. antaRNA: ant colony-based RNA sequence design.

    PubMed

    Kleinkauf, Robert; Mann, Martin; Backofen, Rolf

    2015-10-01

    RNA sequence design is studied at least as long as the classical folding problem. Although for the latter the functional fold of an RNA molecule is to be found ,: inverse folding tries to identify RNA sequences that fold into a function-specific target structure. In combination with RNA-based biotechnology and synthetic biology ,: reliable RNA sequence design becomes a crucial step to generate novel biochemical components. In this article ,: the computational tool antaRNA is presented. It is capable of compiling RNA sequences for a given structure that comply in addition with an adjustable full range objective GC-content distribution ,: specific sequence constraints and additional fuzzy structure constraints. antaRNA applies ant colony optimization meta-heuristics and its superior performance is shown on a biological datasets. http://www.bioinf.uni-freiburg.de/Software/antaRNA CONTACT: backofen@informatik.uni-freiburg.de Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  14. GC-rich coding sequences reduce transposon-like, small RNA-mediated transgene silencing.

    PubMed

    Sidorenko, Lyudmila V; Lee, Tzuu-Fen; Woosley, Aaron; Moskal, William A; Bevan, Scott A; Merlo, P Ann Owens; Walsh, Terence A; Wang, Xiujuan; Weaver, Staci; Glancy, Todd P; Wang, PoHao; Yang, Xiaozeng; Sriram, Shreedharan; Meyers, Blake C

    2017-11-01

    The molecular basis of transgene susceptibility to silencing is poorly characterized in plants; thus, we evaluated several transgene design parameters as means to reduce heritable transgene silencing. Analyses of Arabidopsis plants with transgenes encoding a microalgal polyunsaturated fatty acid (PUFA) synthase revealed that small RNA (sRNA)-mediated silencing, combined with the use of repetitive regulatory elements, led to aggressive transposon-like silencing of canola-biased PUFA synthase transgenes. Diversifying regulatory sequences and using native microalgal coding sequences (CDSs) with higher GC content improved transgene expression and resulted in a remarkable trans-generational stability via reduced accumulation of sRNAs and DNA methylation. Further experiments in maize with transgenes individually expressing three crystal (Cry) proteins from Bacillus thuringiensis (Bt) tested the impact of CDS recoding using different codon bias tables. Transgenes with higher GC content exhibited increased transcript and protein accumulation. These results demonstrate that the sequence composition of transgene CDSs can directly impact silencing, providing design strategies for increasing transgene expression levels and reducing risks of heritable loss of transgene expression.

  15. cAMP-Mediated Stimulation of Tyrosine Hydroxylase mRNA Translation Is Mediated by Polypyrimidine-Rich Sequences within Its 3′-Untranslated Region and Poly(C)-Binding Protein 2

    PubMed Central

    Xu, Lu; Sterling, Carol R.

    2009-01-01

    Tyrosine hydroxylase (TH) plays a critical role in maintaining the appropriate concentrations of catecholamine neurotransmitters in brain and periphery, particularly during long-term stress, long-term drug treatment, or neurodegenerative diseases. Its expression is controlled by both transcriptional and post-transcriptional mechanisms. In a previous report, we showed that treatment of rat midbrain slice explant cultures or mouse MN9D cells with cAMP analog or forskolin leads to induction of TH protein without concomitant induction of TH mRNA. We further showed that cAMP activates mechanisms that regulate TH mRNA translation via cis-acting sequences within its 3′-untranslated region (UTR). In the present report, we extend these studies to show that MN9D cytoplasmic proteins bind to the same TH mRNA 3′-UTR domain that is required for the cAMP response. RNase T1 mapping demonstrates binding of proteins to a 27-nucleotide polypyrimidine-rich sequence within this domain. A specific mutation within the polypyrimidine-rich sequence inhibits protein binding and cAMP-mediated translational activation. UV-cross-linking studies identify a ∼44-kDa protein as a major TH mRNA 3′-UTR binding factor, and cAMP induces the 40- to 42-kDa poly(C)-binding protein-2 (PCBP2) in MN9D cells. We show that PCBP2 binds to the TH mRNA 3′-UTR domain that participates in the cAMP response. Overexpression of PCBP2 induces TH protein without concomitant induction of TH mRNA. These results support a model in which cAMP induces PCBP2, leading to increased interaction with its cognate polypyrimidine binding site in the TH mRNA 3′-UTR. This increased interaction presumably plays a role in the activation of TH mRNA translation by cAMP in dopaminergic neurons. PMID:19620256

  16. Rice MEL2, the RNA recognition motif (RRM) protein, binds in vitro to meiosis-expressed genes containing U-rich RNA consensus sequences in the 3'-UTR.

    PubMed

    Miyazaki, Saori; Sato, Yutaka; Asano, Tomoya; Nagamura, Yoshiaki; Nonomura, Ken-Ichi

    2015-10-01

    Post-transcriptional gene regulation by RNA recognition motif (RRM) proteins through binding to cis-elements in the 3'-untranslated region (3'-UTR) is widely used in eukaryotes to complete various biological processes. Rice MEIOSIS ARRESTED AT LEPTOTENE2 (MEL2) is the RRM protein that functions in the transition to meiosis in proper timing. The MEL2 RRM preferentially associated with the U-rich RNA consensus, UUAGUU[U/A][U/G][A/U/G]U, dependently on sequences and proportionally to MEL2 protein amounts in vitro. The consensus sequences were located in the putative looped structures of the RNA ligand. A genome-wide survey revealed a tendency of MEL2-binding consensus appearing in 3'-UTR of rice genes. Of 249 genes that conserved the consensus in their 3'-UTR, 13 genes spatiotemporally co-expressed with MEL2 in meiotic flowers, and included several genes whose function was supposed in meiosis; such as Replication protein A and OsMADS3. The proteome analysis revealed that the amounts of small ubiquitin-related modifier-like protein and eukaryotic translation initiation factor3-like protein were dramatically altered in mel2 mutant anthers. Taken together with transcriptome and gene ontology results, we propose that the rice MEL2 is involved in the translational regulation of key meiotic genes on 3'-UTRs to achieve the faithful transition of germ cells to meiosis.

  17. INFO-RNA--a server for fast inverse RNA folding satisfying sequence constraints.

    PubMed

    Busch, Anke; Backofen, Rolf

    2007-07-01

    INFO-RNA is a new web server for designing RNA sequences that fold into a user given secondary structure. Furthermore, constraints on the sequence can be specified, e.g. one can restrict sequence positions to a fixed nucleotide or to a set of nucleotides. Moreover, the user can allow violations of the constraints at some positions, which can be advantageous in complicated cases. The INFO-RNA web server allows biologists to design RNA sequences in an automatic manner. It is clearly and intuitively arranged and easy to use. The procedure is fast, as most applications are completed within seconds and it proceeds better and faster than other existing tools. The INFO-RNA web server is freely available at http://www.bioinf.uni-freiburg.de/Software/INFO-RNA/

  18. AT-rich sequence elements promote nascent transcript cleavage leading to RNA polymerase II termination

    PubMed Central

    White, Eleanor; Kamieniarz-Gdula, Kinga; Dye, Michael J.; Proudfoot, Nick J.

    2013-01-01

    RNA Polymerase II (Pol II) termination is dependent on RNA processing signals as well as specific terminator elements located downstream of the poly(A) site. One of the two major terminator classes described so far is the Co-Transcriptional Cleavage (CoTC) element. We show that homopolymer A/T tracts within the human β-globin CoTC-mediated terminator element play a critical role in Pol II termination. These short A/T tracts, dispersed within seemingly random sequences, are strong terminator elements, and bioinformatics analysis confirms the presence of such sequences in 70% of the putative terminator regions (PTRs) genome-wide. PMID:23258704

  19. A MicroRNA Superfamily Regulates Nucleotide Binding Site–Leucine-Rich Repeats and Other mRNAs[W][OA

    PubMed Central

    Shivaprasad, Padubidri V.; Chen, Ho-Ming; Patel, Kanu; Bond, Donna M.; Santos, Bruno A.C.M.; Baulcombe, David C.

    2012-01-01

    Analysis of tomato (Solanum lycopersicum) small RNA data sets revealed the presence of a regulatory cascade affecting disease resistance. The initiators of the cascade are microRNA members of an unusually diverse superfamily in which miR482 and miR2118 are prominent members. Members of this superfamily are variable in sequence and abundance in different species, but all variants target the coding sequence for the P-loop motif in the mRNA sequences for disease resistance proteins with nucleotide binding site (NBS) and leucine-rich repeat (LRR) motifs. We confirm, using transient expression in Nicotiana benthamiana, that miR482 targets mRNAs for NBS-LRR disease resistance proteins with coiled-coil domains at their N terminus. The targeting causes mRNA decay and production of secondary siRNAs in a manner that depends on RNA-dependent RNA polymerase 6. At least one of these secondary siRNAs targets other mRNAs of a defense-related protein. The miR482-mediated silencing cascade is suppressed in plants infected with viruses or bacteria so that expression of mRNAs with miR482 or secondary siRNA target sequences is increased. We propose that this process allows pathogen-inducible expression of NBS-LRR proteins and that it contributes to a novel layer of defense against pathogen attack. PMID:22408077

  20. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed Central

    Schnare, M N; Gray, M W

    1982-01-01

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. Images PMID:7079176

  1. RNAcentral: A comprehensive database of non-coding RNA sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Kelly Porter; Lau, Britney Yan

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating database to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similaritymore » searches as well as genome browsing functionality.« less

  2. RNAcentral: A comprehensive database of non-coding RNA sequences

    DOE PAGES

    Williams, Kelly Porter; Lau, Britney Yan

    2016-10-28

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating database to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similaritymore » searches as well as genome browsing functionality.« less

  3. repRNA: a web server for generating various feature vectors of RNA sequences.

    PubMed

    Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen

    2016-02-01

    With the rapid growth of RNA sequences generated in the postgenomic age, it is highly desired to develop a flexible method that can generate various kinds of vectors to represent these sequences by focusing on their different features. This is because nearly all the existing machine-learning methods, such as SVM (support vector machine) and KNN (k-nearest neighbor), can only handle vectors but not sequences. To meet the increasing demands and speed up the genome analyses, we have developed a new web server, called "representations of RNA sequences" (repRNA). Compared with the existing methods, repRNA is much more comprehensive, flexible and powerful, as reflected by the following facts: (1) it can generate 11 different modes of feature vectors for users to choose according to their investigation purposes; (2) it allows users to select the features from 22 built-in physicochemical properties and even those defined by users' own; (3) the resultant feature vectors and the secondary structures of the corresponding RNA sequences can be visualized. The repRNA web server is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repRNA/ .

  4. YM500: a small RNA sequencing (smRNA-seq) database for microRNA research

    PubMed Central

    Cheng, Wei-Chung; Chung, I-Fang; Huang, Tse-Shun; Chang, Shih-Ting; Sun, Hsing-Jen; Tsai, Cheng-Fong; Liang, Muh-Lii; Wong, Tai-Tong; Wang, Hsei-Wei

    2013-01-01

    MicroRNAs (miRNAs) are small RNAs ∼22 nt in length that are involved in the regulation of a variety of physiological and pathological processes. Advances in high-throughput small RNA sequencing (smRNA-seq), one of the next-generation sequencing applications, have reshaped the miRNA research landscape. In this study, we established an integrative database, the YM500 (http://ngs.ym.edu.tw/ym500/), containing analysis pipelines and analysis results for 609 human and mice smRNA-seq results, including public data from the Gene Expression Omnibus (GEO) and some private sources. YM500 collects analysis results for miRNA quantification, for isomiR identification (incl. RNA editing), for arm switching discovery, and, more importantly, for novel miRNA predictions. Wetlab validation on >100 miRNAs confirmed high correlation between miRNA profiling and RT-qPCR results (R = 0.84). This database allows researchers to search these four different types of analysis results via our interactive web interface. YM500 allows researchers to define the criteria of isomiRs, and also integrates the information of dbSNP to help researchers distinguish isomiRs from SNPs. A user-friendly interface is provided to integrate miRNA-related information and existing evidence from hundreds of sequencing datasets. The identified novel miRNAs and isomiRs hold the potential for both basic research and biotech applications. PMID:23203880

  5. RNA sequencing and pathway analysis identify tumor necrosis factor alpha driven small proline-rich protein dysregulation in chronic rhinosinusitis.

    PubMed

    Ramakrishnan, Vijay R; Gonzalez, Joseph R; Cooper, Sarah E; Barham, Henry P; Anderson, Catherine B; Larson, Eric D; Cool, Carlyne D; Diller, John D; Jones, Kenneth; Kinnamon, Sue C

    2017-09-01

    Chronic rhinosinusitis (CRS) is a heterogeneous inflammatory disorder in which many pathways contribute to end-organ disease. Small proline-rich proteins (SPRR) are polypeptides that have recently been shown to contribute to epithelial biomechanical properties relevant in T-helper type 2 inflammation. There is evidence that genetic polymorphism in SPRR genes may predict the development of asthma in children with atopy and, correlatively, that expression of SPRRs is increased under allergic conditions, which leads to epithelial barrier dysfunction in atopic disease. RNAs from uncinate tissue specimens from patients with CRS and control subjects were compared by RNA sequencing by using Ingenuity Pathway Analysis (n = 4 each), and quantitative polymerase chain reaction (PCR) (n = 15). A separate cohort of archived sinus tissue was examined by immunohistochemistry (n = 19). A statistically significant increase of SPRR expression in CRS sinus tissue was identified that was not a result of atopic presence. SPRR1 and SPRR2A expressions were markedly increased in patients with CRS (p < 0.01) on RNA sequencing, with confirmation by using real-time PCR. Immunohistochemistry of archived surgical samples demonstrated staining of SPRR proteins within squamous epithelium of both groups. Pathway analysis indicated tumor necrosis factor (TNF) alpha as a master regulator of the SPRR gene products. Expression of SPRR1 and of SPRR2A is increased in mucosal samples from patients with CRS and appeared as a downstream result of TNF alpha modulation, which possibly resulted in epithelial barrier dysfunction.

  6. zUMIs - A fast and flexible pipeline to process RNA sequencing data with UMIs.

    PubMed

    Parekh, Swati; Ziegenhain, Christoph; Vieth, Beate; Enard, Wolfgang; Hellmann, Ines

    2018-06-01

    Single-cell RNA-sequencing (scRNA-seq) experiments typically analyze hundreds or thousands of cells after amplification of the cDNA. The high throughput is made possible by the early introduction of sample-specific bar codes (BCs), and the amplification bias is alleviated by unique molecular identifiers (UMIs). Thus, the ideal analysis pipeline for scRNA-seq data needs to efficiently tabulate reads according to both BC and UMI. zUMIs is a pipeline that can handle both known and random BCs and also efficiently collapse UMIs, either just for exon mapping reads or for both exon and intron mapping reads. If BC annotation is missing, zUMIs can accurately detect intact cells from the distribution of sequencing reads. Another unique feature of zUMIs is the adaptive downsampling function that facilitates dealing with hugely varying library sizes but also allows the user to evaluate whether the library has been sequenced to saturation. To illustrate the utility of zUMIs, we analyzed a single-nucleus RNA-seq dataset and show that more than 35% of all reads map to introns. Also, we show that these intronic reads are informative about expression levels, significantly increasing the number of detected genes and improving the cluster resolution. zUMIs flexibility makes if possible to accommodate data generated with any of the major scRNA-seq protocols that use BCs and UMIs and is the most feature-rich, fast, and user-friendly pipeline to process such scRNA-seq data.

  7. A U-Rich Element in the 5′ Untranslated Region Is Necessary for the Translation of p27 mRNA

    PubMed Central

    Millard, S. Sean; Vidal, Anxo; Markus, Maurice; Koff, Andrew

    2000-01-01

    Increased translation of p27 mRNA correlates with withdrawal of cells from the cell cycle. This raised the possibility that antimitogenic signals might mediate their effects on p27 expression by altering complexes that formed on p27 mRNA, regulating its translation. In this report, we identify a U-rich sequence in the 5′ untranslated region (5′UTR) of p27 mRNA that is necessary for efficient translation in proliferating and nonproliferating cells. We show that a number of factors bind to the 5′UTR in vitro in a manner dependent on the U-rich element, and their availability in the cytosol is controlled in a growth- and cell cycle-dependent fashion. One of these factors is HuR, a protein previously implicated in mRNA stability, transport, and translation. Another is hnRNP C1 and C2, proteins implicated in mRNA processing and the translation of a specific subset of mRNAs expressed in differentiated cells. In lovastatin-treated MDA468 cells, the mobility of the associated hnRNP C1 and C2 proteins changed, and this correlated with increased p27 expression. Together, these data suggest that the U-rich dependent RNP complex on the 5′UTR may regulate the translation of p27 mRNA and may be a target of antimitogenic signals. PMID:10913178

  8. RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.

    PubMed

    Widmer, G

    1993-03-01

    Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.

  9. AMPLIFICATION OF RIBOSOMAL RNA SEQUENCES

    EPA Science Inventory

    This book chapter offers an overview of the use of ribosomal RNA sequences. A history of the technology traces the evolution of techniques to measure bacterial phylogenetic relationships and recent advances in obtaining rRNA sequence information. The manual also describes procedu...

  10. Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

    PubMed

    Bauer, Markus; Klau, Gunnar W; Reinert, Knut

    2007-07-27

    The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.

  11. Rtools: a web server for various secondary structural analyses on single RNA sequences.

    PubMed

    Hamada, Michiaki; Ono, Yukiteru; Kiryu, Hisanori; Sato, Kengo; Kato, Yuki; Fukunaga, Tsukasa; Mori, Ryota; Asai, Kiyoshi

    2016-07-08

    The secondary structures, as well as the nucleotide sequences, are the important features of RNA molecules to characterize their functions. According to the thermodynamic model, however, the probability of any secondary structure is very small. As a consequence, any tool to predict the secondary structures of RNAs has limited accuracy. On the other hand, there are a few tools to compensate the imperfect predictions by calculating and visualizing the secondary structural information from RNA sequences. It is desirable to obtain the rich information from those tools through a friendly interface. We implemented a web server of the tools to predict secondary structures and to calculate various structural features based on the energy models of secondary structures. By just giving an RNA sequence to the web server, the user can get the different types of solutions of the secondary structures, the marginal probabilities such as base-paring probabilities, loop probabilities and accessibilities of the local bases, the energy changes by arbitrary base mutations as well as the measures for validations of the predicted secondary structures. The web server is available at http://rtools.cbrc.jp, which integrates software tools, CentroidFold, CentroidHomfold, IPKnot, CapR, Raccess, Rchange and RintD. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polyA+ selection versus rRNA depletion.

    PubMed

    Zhao, Shanrong; Zhang, Ying; Gamini, Ramya; Zhang, Baohong; von Schack, David

    2018-03-19

    To allow efficient transcript/gene detection, highly abundant ribosomal RNAs (rRNA) are generally removed from total RNA either by positive polyA+ selection or by rRNA depletion (negative selection) before sequencing. Comparisons between the two methods have been carried out by various groups, but the assessments have relied largely on non-clinical samples. In this study, we evaluated these two RNA sequencing approaches using human blood and colon tissue samples. Our analyses showed that rRNA depletion captured more unique transcriptome features, whereas polyA+ selection outperformed rRNA depletion with higher exonic coverage and better accuracy of gene quantification. For blood- and colon-derived RNAs, we found that 220% and 50% more reads, respectively, would have to be sequenced to achieve the same level of exonic coverage in the rRNA depletion method compared with the polyA+ selection method. Therefore, in most cases we strongly recommend polyA+ selection over rRNA depletion for gene quantification in clinical RNA sequencing. Our evaluation revealed that a small number of lncRNAs and small RNAs made up a large fraction of the reads in the rRNA depletion RNA sequencing data. Thus, we recommend that these RNAs are specifically depleted to improve the sequencing depth of the remaining RNAs.

  13. RNA Helicase Associated with AU-rich Element (RHAU/DHX36) Interacts with the 3′-Tail of the Long Non-coding RNA BC200 (BCYRN1)*

    PubMed Central

    Booy, Evan P.; McRae, Ewan K. S.; Howard, Ryan; Deo, Soumya R.; Ariyo, Emmanuel O.; Dzananovic, Edis; Meier, Markus; Stetefeld, Jörg; McKenna, Sean A.

    2016-01-01

    RNA helicase associated with AU-rich element (RHAU) is an ATP-dependent RNA helicase that demonstrates high affinity for quadruplex structures in DNA and RNA. To elucidate the significance of these quadruplex-RHAU interactions, we have performed RNA co-immunoprecipitation screens to identify novel RNAs bound to RHAU and characterize their function. In the course of this study, we have identified the non-coding RNA BC200 (BCYRN1) as specifically enriched upon RHAU immunoprecipitation. Although BC200 does not adopt a quadruplex structure and does not bind the quadruplex-interacting motif of RHAU, it has direct affinity for RHAU in vitro. Specifically designed BC200 truncations and RNase footprinting assays demonstrate that RHAU binds to an adenosine-rich region near the 3′-end of the RNA. RHAU truncations support binding that is dependent upon a region within the C terminus and is specific to RHAU isoform 1. Tests performed to assess whether BC200 interferes with RHAU helicase activity have demonstrated the ability of BC200 to act as an acceptor of unwound quadruplexes via a cytosine-rich region near the 3′-end of the RNA. Furthermore, an interaction between BC200 and the quadruplex-containing telomerase RNA was confirmed by pull-down assays of the endogenous RNAs. This leads to the possibility that RHAU may direct BC200 to bind and exert regulatory functions at quadruplex-containing RNA or DNA sequences. PMID:26740632

  14. Terminator oligo blocking efficiently eliminates rRNA from Drosophila small RNA sequencing libraries.

    PubMed

    Wickersheim, Michelle L; Blumenstiel, Justin P

    2013-11-01

    A large number of methods are available to deplete ribosomal RNA reads from high-throughput RNA sequencing experiments. Such methods are critical for sequencing Drosophila small RNAs between 20 and 30 nucleotides because size selection is not typically sufficient to exclude the highly abundant class of 30 nucleotide 2S rRNA. Here we demonstrate that pre-annealing terminator oligos complimentary to Drosophila 2S rRNA prior to 5' adapter ligation and reverse transcription efficiently depletes 2S rRNA sequences from the sequencing reaction in a simple and inexpensive way. This depletion is highly specific and is achieved with minimal perturbation of miRNA and piRNA profiles.

  15. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.

  16. High-Throughput, Data-Rich Cellular RNA Device Engineering

    PubMed Central

    Townshend, Brent; Kennedy, Andrew B.; Xiang, Joy S.; Smolke, Christina D.

    2015-01-01

    Methods for rapidly assessing sequence-structure-function landscapes and developing conditional gene-regulatory devices are critical to our ability to manipulate and interface with biology. We describe a framework for engineering RNA devices from preexisting aptamers that exhibit ligand-responsive ribozyme tertiary interactions. Our methodology utilizes cell sorting, high-throughput sequencing, and statistical data analyses to enable parallel measurements of the activities of hundreds of thousands of sequences from RNA device libraries in the absence and presence of ligands. Our tertiary interaction RNA devices exhibit improved performance in terms of gene silencing, activation ratio, and ligand sensitivity as compared to optimized RNA devices that rely on secondary structure changes. We apply our method to building biosensors for diverse ligands and determine consensus sequences that enable ligand-responsive tertiary interactions. These methods advance our ability to develop broadly applicable genetic tools and to elucidate understanding of the underlying sequence-structure-function relationships that empower rational design of complex biomolecules. PMID:26258292

  17. Structural polymorphism of a cytosine-rich DNA sequence forming i-motif structure: Exploring pH based biosensors.

    PubMed

    Ahmed, Saami; Kaushik, Mahima; Chaudhary, Swati; Kukreti, Shrikant

    2018-05-01

    Sequence recognition and conformational polymorphism enable DNA to emerge out as a substantial tool in fabricating the devices within nano-dimensions. These DNA associated nano devices work on the principle of conformational switches, which can be facilitated by many factors like sequence of DNA/RNA strand, change in pH or temperature, enzyme or ligand interactions etc. Thus, controlling these DNA conformational changes to acquire the desired function is significant for evolving DNA hybridization biosensor, used in genetic screening and molecular diagnosis. For exploring this conformational switching ability of cytosine-rich DNA oligonucleotides as a function of pH for their potential usage as biosensors, this study has been designed. A C-rich stretch of DNA sequence (5'-TCCCCCAATTAATTCCCCCA-3'; SG20c) has been investigated using UV-Thermal denaturation, poly-acrylamide gel electrophoresis and CD spectroscopy. The SG20c sequence is shown to adopt various topologies of i-motif structure at low pH. This pH dependent transition of SG20c from unstructured single strand to unimolecular and bimolecular i-motif structures can further be exploited for its utilization as switching on/off pH-based biosensors. Copyright © 2018. Published by Elsevier B.V.

  18. Primer-independent RNA sequencing with bacteriophage phi6 RNA polymerase and chain terminators.

    PubMed

    Makeyev, E V; Bamford, D H

    2001-05-01

    Here we propose a new general method for directly determining RNA sequence based on the use of the RNA-dependent RNA polymerase from bacteriophage phi6 and the chain terminators (RdRP sequencing). The following properties of the polymerase render it appropriate for this application: (1) the phi6 polymerase can replicate a number of single-stranded RNA templates in vitro. (2) In contrast to the primer-dependent DNA polymerases utilized in the sequencing procedure by Sanger et al. (Proc Natl Acad Sci USA, 1977, 74:5463-5467), it initiates nascent strand synthesis without a primer, starting the polymerization on the very 3'-terminus of the template. (3) The polymerase can incorporate chain-terminating nucleotide analogs into the nascent RNA chain to produce a set of base-specific termination products. Consequently, 3' proximal or even complete sequence of many target RNA molecules can be rapidly deduced without prior sequence information. The new technique proved useful for sequencing several synthetic ssRNA templates. Furthermore, using genomic segments of the bluetongue virus we show that RdRP sequencing can also be applied to naturally occurring dsRNA templates. This suggests possible uses of the method in the RNA virus research and diagnostics.

  19. RNase H-assisted RNA-primed rolling circle amplification for targeted RNA sequence detection.

    PubMed

    Takahashi, Hirokazu; Ohkawachi, Masahiko; Horio, Kyohei; Kobori, Toshiro; Aki, Tsunehiro; Matsumura, Yukihiko; Nakashimada, Yutaka; Okamura, Yoshiko

    2018-05-17

    RNA-primed rolling circle amplification (RPRCA) is a useful laboratory method for RNA detection; however, the detection of RNA is limited by the lack of information on 3'-terminal sequences. We uncovered that conventional RPRCA using pre-circularized probes could potentially detect the internal sequence of target RNA molecules in combination with RNase H. However, the specificity for mRNA detection was low, presumably due to non-specific hybridization of non-target RNA with the circular probe. To overcome this technical problem, we developed a method for detecting a sequence of interest in target RNA molecules via RNase H-assisted RPRCA using padlocked probes. When padlock probes are hybridized to the target RNA molecule, they are converted to the circular form by SplintR ligase. Subsequently, RNase H creates nick sites only in the hybridized RNA sequence, and single-stranded DNA is finally synthesized from the nick site by phi29 DNA polymerase. This method could specifically detect at least 10 fmol of the target RNA molecule without reverse transcription. Moreover, this method detected GFP mRNA present in 10 ng of total RNA isolated from Escherichia coli without background DNA amplification. Therefore, this method can potentially detect almost all types of RNA molecules without reverse transcription and reveal full-length sequence information.

  20. Introduction to Single-Cell RNA Sequencing.

    PubMed

    Olsen, Thale Kristin; Baryawno, Ninib

    2018-04-01

    During the last decade, high-throughput sequencing methods have revolutionized the entire field of biology. The opportunity to study entire transcriptomes in great detail using RNA sequencing (RNA-seq) has fueled many important discoveries and is now a routine method in biomedical research. However, RNA-seq is typically performed in "bulk," and the data represent an average of gene expression patterns across thousands to millions of cells; this might obscure biologically relevant differences between cells. Single-cell RNA-seq (scRNA-seq) represents an approach to overcome this problem. By isolating single cells, capturing their transcripts, and generating sequencing libraries in which the transcripts are mapped to individual cells, scRNA-seq allows assessment of fundamental biological properties of cell populations and biological systems at unprecedented resolution. Here, we present the most common scRNA-seq protocols in use today and the basics of data analysis and discuss factors that are important to consider before planning and designing an scRNA-seq project. © 2018 by John Wiley & Sons, Inc. Copyright © 2018 John Wiley & Sons, Inc.

  1. Polyadenylation of RNA transcribed from mammalian SINEs by RNA polymerase III: Complex requirements for nucleotide sequences.

    PubMed

    Borodulina, Olga R; Golubchikova, Julia S; Ustyantsev, Ilia G; Kramerov, Dmitri A

    2016-02-01

    It is generally accepted that only transcripts synthesized by RNA polymerase II (e.g., mRNA) were subject to AAUAAA-dependent polyadenylation. However, we previously showed that RNA transcribed by RNA polymerase III (pol III) from mouse B2 SINE could be polyadenylated in an AAUAAA-dependent manner. Many species of mammalian SINEs end with the pol III transcriptional terminator (TTTTT) and contain hexamers AATAAA in their A-rich tail. Such SINEs were united into Class T(+), whereas SINEs lacking the terminator and AATAAA sequences were classified as T(-). Here we studied the structural features of SINE pol III transcripts that are necessary for their polyadenylation. Eight and six SINE families from classes T(+) and T(-), respectively, were analyzed. The replacement of AATAAA with AACAAA in T(+) SINEs abolished the RNA polyadenylation. Interestingly, insertion of the polyadenylation signal (AATAAA) and pol III transcription terminator in T(-) SINEs did not result in polyadenylation. The detailed analysis of three T(+) SINEs (B2, DIP, and VES) revealed areas important for the polyadenylation of their pol III transcripts: the polyadenylation signal and terminator in A-rich tail, β region positioned immediately downstream of the box B of pol III promoter, and τ region located upstream of the tail. In DIP and VES (but not in B2), the τ region is a polypyrimidine motif which is also characteristic of many other T(+) SINEs. Most likely, SINEs of different mammals acquired these structural features independently as a result of parallel evolution. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Discriminative Prediction of A-To-I RNA Editing Events from DNA Sequence

    PubMed Central

    Sun, Jiangming; Singh, Pratibha; Bagge, Annika; Valtat, Bérengère; Vikman, Petter; Spégel, Peter; Mulder, Hindrik

    2016-01-01

    RNA editing is a post-transcriptional alteration of RNA sequences that, via insertions, deletions or base substitutions, can affect protein structure as well as RNA and protein expression. Recently, it has been suggested that RNA editing may be more frequent than previously thought. A great impediment, however, to a deeper understanding of this process is the paramount sequencing effort that needs to be undertaken to identify RNA editing events. Here, we describe an in silico approach, based on machine learning, that ameliorates this problem. Using 41 nucleotide long DNA sequences, we show that novel A-to-I RNA editing events can be predicted from known A-to-I RNA editing events intra- and interspecies. The validity of the proposed method was verified in an independent experimental dataset. Using our approach, 203 202 putative A-to-I RNA editing events were predicted in the whole human genome. Out of these, 9% were previously reported. The remaining sites require further validation, e.g., by targeted deep sequencing. In conclusion, the approach described here is a useful tool to identify potential A-to-I RNA editing events without the requirement of extensive RNA sequencing. PMID:27764195

  3. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g.more » Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.« less

  4. A tale of two sequences: microRNA-target chimeric reads.

    PubMed

    Broughton, James P; Pasquinelli, Amy E

    2016-04-04

    In animals, a functional interaction between a microRNA (miRNA) and its target RNA requires only partial base pairing. The limited number of base pair interactions required for miRNA targeting provides miRNAs with broad regulatory potential and also makes target prediction challenging. Computational approaches to target prediction have focused on identifying miRNA target sites based on known sequence features that are important for canonical targeting and may miss non-canonical targets. Current state-of-the-art experimental approaches, such as CLIP-seq (cross-linking immunoprecipitation with sequencing), PAR-CLIP (photoactivatable-ribonucleoside-enhanced CLIP), and iCLIP (individual-nucleotide resolution CLIP), require inference of which miRNA is bound at each site. Recently, the development of methods to ligate miRNAs to their target RNAs during the preparation of sequencing libraries has provided a new tool for the identification of miRNA target sites. The chimeric, or hybrid, miRNA-target reads that are produced by these methods unambiguously identify the miRNA bound at a specific target site. The information provided by these chimeric reads has revealed extensive non-canonical interactions between miRNAs and their target mRNAs, and identified many novel interactions between miRNAs and noncoding RNAs.

  5. Phylogenetically Structured Differences in rRNA Gene Sequence Variation among Species of Arbuscular Mycorrhizal Fungi and Their Implications for Sequence Clustering

    PubMed Central

    Ekanayake, Saliya; Ruan, Yang; Schütte, Ursel M. E.; Kaonongbua, Wittaya; Fox, Geoffrey; Ye, Yuzhen; Bever, James D.

    2016-01-01

    ABSTRACT Arbuscular mycorrhizal (AM) fungi form mutualisms with plant roots that increase plant growth and shape plant communities. Each AM fungal cell contains a large amount of genetic diversity, but it is unclear if this diversity varies across evolutionary lineages. We found that sequence variation in the nuclear large-subunit (LSU) rRNA gene from 29 isolates representing 21 AM fungal species generally assorted into genus- and species-level clades, with the exception of species of the genera Claroideoglomus and Entrophospora. However, there were significant differences in the levels of sequence variation across the phylogeny and between genera, indicating that it is an evolutionarily constrained trait in AM fungi. These consistent patterns of sequence variation across both phylogenetic and taxonomic groups pose challenges to interpreting operational taxonomic units (OTUs) as approximations of species-level groups of AM fungi. We demonstrate that the OTUs produced by five sequence clustering methods using 97% or equivalent sequence similarity thresholds failed to match the expected species of AM fungi, although OTUs from AbundantOTU, CD-HIT-OTU, and CROP corresponded better to species than did OTUs from mothur or UPARSE. This lack of OTU-to-species correspondence resulted both from sequences of one species being split into multiple OTUs and from sequences of multiple species being lumped into the same OTU. The OTU richness therefore will not reliably correspond to the AM fungal species richness in environmental samples. Conservatively, this error can overestimate species richness by 4-fold or underestimate richness by one-half, and the direction of this error will depend on the genera represented in the sample. IMPORTANCE Arbuscular mycorrhizal (AM) fungi form important mutualisms with the roots of most plant species. Individual AM fungi are genetically diverse, but it is unclear whether the level of this diversity differs among evolutionary lineages. We found

  6. INFO-RNA—a server for fast inverse RNA folding satisfying sequence constraints

    PubMed Central

    Busch, Anke; Backofen, Rolf

    2007-01-01

    INFO-RNA is a new web server for designing RNA sequences that fold into a user given secondary structure. Furthermore, constraints on the sequence can be specified, e.g. one can restrict sequence positions to a fixed nucleotide or to a set of nucleotides. Moreover, the user can allow violations of the constraints at some positions, which can be advantageous in complicated cases. The INFO-RNA web server allows biologists to design RNA sequences in an automatic manner. It is clearly and intuitively arranged and easy to use. The procedure is fast, as most applications are completed within seconds and it proceeds better and faster than other existing tools. The INFO-RNA web server is freely available at http://www.bioinf.uni-freiburg.de/Software/INFO-RNA/ PMID:17452349

  7. Exploration of RNA Sequence Space in the Absence of a Replicase.

    PubMed

    Tirumalai, Madhan R; Tran, Quyen; Paci, Maxim; Chavan, Dimple; Marathe, Anuradha; Fox, George E

    2018-05-11

    It is generally considered that if an RNA World ever existed that it would be driven by an RNA capable of RNA replication. Whether such a catalytic RNA could emerge in an RNA World or not, there would need to be prior routes to increasing complexity in order to produce it. It is hypothesized here that increasing sequence variety, if not complexity, can in fact readily emerge in response to a dynamic equilibrium between synthesis and degradation. A model system in which T4 RNA ligase catalyzes synthesis and Benzonase catalyzes degradation was constructed. An initial 20-mer served as a seed and was subjected to 180 min of simultaneous ligation and degradation. The seed RNA rapidly disappeared and was replaced by an increasing number and variety of both larger and smaller variants. Variants of 40-80 residues were consistently seen, typically representing 2-4% of the unique sequences. In a second experiment with four individual 9-mers, numerous variants were again produced. These included variants of the individual 9-mers as well as sequences that contained sequence segments from two or more 9-mers. In both cases, the RNA products lack large numbers of point mutations but instead incorporate additions and subtractions of fragments of the original RNAs. The system demonstrates that if such equilibrium were established in a prebiotic world it would result in significant exploration of RNA sequence space and likely increased complexity. It remains to be seen if the variety of products produced is affected by the presence of small peptide oligomers.

  8. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    PubMed

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  9. Microbial Diversity in Deep-sea Methane Seep Sediments Presented by SSU rRNA Gene Tag Sequencing

    PubMed Central

    Nunoura, Takuro; Takaki, Yoshihiro; Kazama, Hiromi; Hirai, Miho; Ashi, Juichiro; Imachi, Hiroyuki; Takai, Ken

    2012-01-01

    Microbial community structures in methane seep sediments in the Nankai Trough were analyzed by tag-sequencing analysis for the small subunit (SSU) rRNA gene using a newly developed primer set. The dominant members of Archaea were Deep-sea Hydrothermal Vent Euryarchaeotic Group 6 (DHVEG 6), Marine Group I (MGI) and Deep Sea Archaeal Group (DSAG), and those in Bacteria were Alpha-, Gamma-, Delta- and Epsilonproteobacteria, Chloroflexi, Bacteroidetes, Planctomycetes and Acidobacteria. Diversity and richness were examined by 8,709 and 7,690 tag-sequences from sediments at 5 and 25 cm below the seafloor (cmbsf), respectively. The estimated diversity and richness in the methane seep sediment are as high as those in soil and deep-sea hydrothermal environments, although the tag-sequences obtained in this study were not sufficient to show whole microbial diversity in this analysis. We also compared the diversity and richness of each taxon/division between the sediments from the two depths, and found that the diversity and richness of some taxa/divisions varied significantly along with the depth. PMID:22510646

  10. Experimental investigation of an RNA sequence space

    NASA Technical Reports Server (NTRS)

    Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

    1993-01-01

    Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.

  11. Characterization of intronic uridine-rich sequence elements acting as possible targets for nuclear proteins during pre-mRNA splicing in Nicotiana plumbaginifolia.

    PubMed

    Gniadkowski, M; Hemmings-Mieszczak, M; Klahre, U; Liu, H X; Filipowicz, W

    1996-02-15

    Introns of nuclear pre-mRNAs in dicotyledonous plants, unlike introns in vertebrates or yeast, are distinctly rich in A+U nucleotides and this feature is essential for their processing. In order to define more precisely sequence elements important for intron recognition in plants, we investigated the effects of short insertions, either U-rich or A-rich, on splicing of synthetic introns in transfected protoplast of Nicotiana plumbaginifolia. It was found that insertions of U-rich (sequence UUUUUAU) but not A-rich (AUAAAAA) segments can activate splicing of a GC-rich synthetic infron, and that U-rich segments, or multimers thereof, can function irrespective of the site of insertion within the intron. Insertions of multiple U-rich segments, either at the same or different locations, generally had an additive, stimulatory effect on splicing. Mutational analysis showed that replacement of one or two U residues in the UUUUUAU sequence with A or C residues had only a small effect on splicing, but replacement with G residues was strongly inhibitory. Proteins that interact with fragments of natural and synthetic pre-mRNAs in vitro were identified in nuclear extracts of N.plumbaginifolia by UV cross- linking. The profile of cross-linked plant proteins was considerably less complex than that obtained with a HeLa cell nuclear extract. Two major cross-linkable plant proteins had apparent molecular mass of 50 and 54 kDa and showed affinity for oligouridilates present in synGC introns or for poly(U).

  12. Characterization of intronic uridine-rich sequence elements acting as possible targets for nuclear proteins during pre-mRNA splicing in Nicotiana plumbaginifolia.

    PubMed Central

    Gniadkowski, M; Hemmings-Mieszczak, M; Klahre, U; Liu, H X; Filipowicz, W

    1996-01-01

    Introns of nuclear pre-mRNAs in dicotyledonous plants, unlike introns in vertebrates or yeast, are distinctly rich in A+U nucleotides and this feature is essential for their processing. In order to define more precisely sequence elements important for intron recognition in plants, we investigated the effects of short insertions, either U-rich or A-rich, on splicing of synthetic introns in transfected protoplast of Nicotiana plumbaginifolia. It was found that insertions of U-rich (sequence UUUUUAU) but not A-rich (AUAAAAA) segments can activate splicing of a GC-rich synthetic infron, and that U-rich segments, or multimers thereof, can function irrespective of the site of insertion within the intron. Insertions of multiple U-rich segments, either at the same or different locations, generally had an additive, stimulatory effect on splicing. Mutational analysis showed that replacement of one or two U residues in the UUUUUAU sequence with A or C residues had only a small effect on splicing, but replacement with G residues was strongly inhibitory. Proteins that interact with fragments of natural and synthetic pre-mRNAs in vitro were identified in nuclear extracts of N.plumbaginifolia by UV cross- linking. The profile of cross-linked plant proteins was considerably less complex than that obtained with a HeLa cell nuclear extract. Two major cross-linkable plant proteins had apparent molecular mass of 50 and 54 kDa and showed affinity for oligouridilates present in synGC introns or for poly(U). PMID:8604302

  13. Sequence of the chloroplast 16S rRNA gene and its surrounding regions of Chlamydomonas reinhardii.

    PubMed Central

    Dron, M; Rahire, M; Rochaix, J D

    1982-01-01

    The sequence of a 2 kb DNA fragment containing the chloroplast 16S ribosomal RNA gene from Chlamydomonas reinhardii and its flanking regions has been determined. The algal 16S rRNA sequence (1475 nucleotides) and secondary structure are highly related to those found in bacteria and in the chloroplasts of higher plants. In contrast, the flanking regions are very different. In C. reinhardii the 16S rRNA gene is surrounded by AT rich segments of about 180 bases, which are followed by a long stretch of complementary bases separated from each other by 1833 nucleotides. It is likely that these structures play an important role in the folding and processing of the precursor of 16S rRNA. The primary and secondary structures of the binding sites of two ribosomal proteins in the 16SrRNAs of E. coli and C. reinhardii are considerably related. Images PMID:6296784

  14. eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing.

    PubMed

    Yuan, Tiezheng; Huang, Xiaoyi; Dittmar, Rachel L; Du, Meijun; Kohli, Manish; Boardman, Lisa; Thibodeau, Stephen N; Wang, Liang

    2014-03-05

    RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module miRNA identification" includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module "mRNA identification" includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module "Target screening" provides expression profiling analyses and graphic visualization. The module "Self-testing" offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extend the program's functionality. eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory.

  15. Properties of a U1 RNA enhancer-like sequence.

    PubMed Central

    Ciliberto, G; Palla, F; Tebb, G; Mattaj, I W; Philipson, L

    1987-01-01

    The properties of a X.laevis U1B snRNA gene enhancer have been studied by microinjection in Xenopus oocytes. The enhancer-like sequence, defined as a short DNA stretch that is able to activate transcription in an orientation independent manner, is interchangeable between different U snRNA genes. The enhancer sequence alone does not, however, efficiently activate transcription from an SV40 pol II promoter but regains its activity when combined with the U-gene specific proximal sequence element. DNase I protection experiments show that the X.laevis U1B enhancer can interact specifically with a nuclear factor present in mammalian cells. Images PMID:3031597

  16. RISE: a database of RNA interactome from sequencing experiments

    PubMed Central

    Gong, Jing; Shao, Di; Xu, Kui

    2018-01-01

    Abstract We present RISE (http://rise.zhanglab.net), a database of RNA Interactome from Sequencing Experiments. RNA-RNA interactions (RRIs) are essential for RNA regulation and function. RISE provides a comprehensive collection of RRIs that mainly come from recent transcriptome-wide sequencing-based experiments like PARIS, SPLASH, LIGR-seq, and MARIO, as well as targeted studies like RIA-seq, RAP-RNA and CLASH. It also includes interactions aggregated from other primary databases and publications. The RISE database currently contains 328,811 RNA-RNA interactions mainly in human, mouse and yeast. While most existing RNA databases mainly contain interactions of miRNA targeting, notably, more than half of the RRIs in RISE are among mRNA and long non-coding RNAs. We compared different RRI datasets in RISE and found limited overlaps in interactions resolved by different techniques and in different cell lines. It may suggest technology preference and also dynamic natures of RRIs. We also analyzed the basic features of the human and mouse RRI networks and found that they tend to be scale-free, small-world, hierarchical and modular. The analysis may nominate important RNAs or RRIs for further investigation. Finally, RISE provides a Circos plot and several table views for integrative visualization, with extensive molecular and functional annotations to facilitate exploration of biological functions for any RRI of interest. PMID:29040625

  17. Ribosomal RNA sequence suggest microsporidia are extremely ancient eukaryotes

    NASA Technical Reports Server (NTRS)

    Vossbrinck, C. R.; Maddox, J. V.; Friedman, S.; Debrunner-Vossbrinck, B. A.; Woese, C. R.

    1987-01-01

    A comparative sequence analysis of the 18S small subunit ribosomal RNA (rRNA) of the microsporidium Vairimorpha necatrix is presented. The results show that this rRNA sequence is more unlike those of other eukaryotes than any known eukaryote rRNA sequence. It is concluded that the lineage leading to microsporidia branched very early from that leading to other eukaryotes.

  18. RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

    PubMed

    Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan

    2016-10-07

    RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises of 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features like, multiple structure-based sequence alignment (MSSA) of the query with all others members of that family is provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. Further, all other essential

  19. eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing

    PubMed Central

    2014-01-01

    Background RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. Results We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module miRNA identification” includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module “mRNA identification” includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module “Target screening” provides expression profiling analyses and graphic visualization. The module “Self-testing” offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extend the program’s functionality. Conclusions eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory. PMID:24593312

  20. Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes

    PubMed Central

    Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney

    2012-01-01

    RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amendable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676

  1. Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy

    PubMed Central

    Matkovich, Scot J.; Dorn, Gerald W.

    2018-01-01

    Summary MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicates purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses. PMID:25836573

  2. Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy.

    PubMed

    Matkovich, Scot J; Dorn, Gerald W

    2015-01-01

    MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses.

  3. Single-Cell RNA-Sequencing in Glioma.

    PubMed

    Johnson, Eli; Dickerson, Katherine L; Connolly, Ian D; Hayden Gephart, Melanie

    2018-04-10

    In this review, we seek to summarize the literature concerning the use of single-cell RNA-sequencing for CNS gliomas. Single-cell analysis has revealed complex tumor heterogeneity, subpopulations of proliferating stem-like cells and expanded our view of tumor microenvironment influence in the disease process. Although bulk RNA-sequencing has guided our initial understanding of glioma genetics, this method does not accurately define the heterogeneous subpopulations found within these tumors. Single-cell techniques have appealing applications in cancer research, as diverse cell types and the tumor microenvironment have important implications in therapy. High cost and difficult protocols prevent widespread use of single-cell RNA-sequencing; however, continued innovation will improve accessibility and expand our of knowledge gliomas.

  4. The RNA-binding complex ESCRT-II in Xenopus laevis eggs recognizes purine-rich sequences through its subunit Vps25.

    PubMed

    Emerman, Amy B; Blower, Michael

    2018-06-14

    RNA-binding proteins (RBPs) are critical regulators of gene expression. Recent studies have uncovered hundreds of mRNA-binding proteins that do not contain annotated RNA-binding domains and have well-established roles in other cellular processes. Investigation of these nonconventional RBPs is critical for revealing novel RNA-binding domains and may disclose connections between RNA regulation and other aspects of cell biology. Endosomal sorting complex required for transport II (ESCRT-II) is a nonconventional RNA-binding complex that has a canonical role in multivesicular body formation. ESCRT-II previously has been identified as an RNA-binding complex in Drosophila oocytes, but whether its RNA-binding properties extend beyond Drosophila is unknown. In this study, we found that the RNA-binding properties of ESCRT-II are conserved in Xenopus eggs, where ESCRT-II interacted with hundreds of mRNAs. Using a UV-crosslinking approach, we demonstrated that ESCRT-II binds directly to RNA through its subunit Vps25. UV-crosslinking and immunoprecipitation (CLIP)-Seq revealed that Vps25 specifically recognizes a polypurine (i.e. GA-rich) motif in RNA. Using purified components, we could reconstitute the selective Vps25-mediated binding of the polypurine motif in vitro. Our results provide insight into the mechanism by which ESCRT-II selectively binds to mRNAs and also suggest an unexpected link between endosome biology and RNA regulation. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.

  5. Human ribosomal RNA gene: nucleotide sequence of the transcription initiation region and comparison of three mammalian genes.

    PubMed Central

    Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M

    1982-01-01

    The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. Images PMID:6954460

  6. The Nucleotide Sequence and Spliced pol mRNA Levels of the Nonprimate Spumavirus Bovine Foamy Virus

    PubMed Central

    Holzschu, Donald L.; Delaney, Mari A.; Renshaw, Randall W.; Casey, James W.

    1998-01-01

    We have determined the complete nucleotide sequence of a replication-competent clone of bovine foamy virus (BFV) and have quantitated the amount of splice pol mRNA processed early in infection. The 544-amino-acid Gag protein precursor has little sequence similarity with its primate foamy virus homologs, but the putative nucleocapsid (NC) protein, like the primate NCs, contains the three glycine-arginine-rich regions that are postulated to bind genomic RNA during virion assembly. The BFV gag and pol open reading frames overlap, with pro and pol in the same translational frame. As with the human foamy virus (HFV) and feline foamy virus, we have detected a spliced pol mRNA by PCR. Quantitatively, this mRNA approximates the level of full-length genomic RNA early in infection. The integrase (IN) domain of reverse transcriptase does not contain the canonical HH-CC zinc finger motif present in all characterized retroviral INs, but it does contain a nearby histidine residue that could conceivably participate as a member of the zinc finger. The env gene encodes a protein that is over 40% identical in sequence to the HFV Env. By comparison, the Gag precursor of BFV is predicted to be only 28% identical to the HFV protein. PMID:9499074

  7. SSMART: Sequence-structure motif identification for RNA-binding proteins.

    PubMed

    Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

    2018-06-11

    RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognize its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA targets sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.

  8. SPAR: small RNA-seq portal for analysis of sequencing experiments.

    PubMed

    Kuksa, Pavel P; Amlie-Wolf, Alexandre; Katanic, Živadin; Valladares, Otto; Wang, Li-San; Leung, Yuk Yee

    2018-05-04

    The introduction of new high-throughput small RNA sequencing protocols that generate large-scale genomics datasets along with increasing evidence of the significant regulatory roles of small non-coding RNAs (sncRNAs) have highlighted the urgent need for tools to analyze and interpret large amounts of small RNA sequencing data. However, it remains challenging to systematically and comprehensively discover and characterize sncRNA genes and specifically-processed sncRNA products from these datasets. To fill this gap, we present Small RNA-seq Portal for Analysis of sequencing expeRiments (SPAR), a user-friendly web server for interactive processing, analysis, annotation and visualization of small RNA sequencing data. SPAR supports sequencing data generated from various experimental protocols, including smRNA-seq, short total RNA sequencing, microRNA-seq, and single-cell small RNA-seq. Additionally, SPAR includes publicly available reference sncRNA datasets from our DASHR database and from ENCODE across 185 human tissues and cell types to produce highly informative small RNA annotations across all major small RNA types and other features such as co-localization with various genomic features, precursor transcript cleavage patterns, and conservation. SPAR allows the user to compare the input experiment against reference ENCODE/DASHR datasets. SPAR currently supports analyses of human (hg19, hg38) and mouse (mm10) sequencing data. SPAR is freely available at https://www.lisanwanglab.org/SPAR.

  9. A comparison of RNA with DNA in template-directed synthesis

    NASA Technical Reports Server (NTRS)

    Zielinski, M.; Kozlov, I. A.; Orgel, L. E.; Bada, J. L. (Principal Investigator)

    2000-01-01

    Nonenzymatic template-directed copying of RNA sequences rich in cytidylic acid using nucleoside 5'-(2-methylimidazol-1-yl phosphates) as substrates is substantially more efficient than the copying of corresponding DNA sequences. However, many sequences cannot be copied, and the prospect of replication in this system is remote, even for RNA. Surprisingly, wobble-pairing leads to much more efficient incorporation of G opposite U on RNA templates than of G opposite T on DNA templates.

  10. Computational Analysis of Mouse piRNA Sequence and Biogenesis

    PubMed Central

    Betel, Doron; Sheridan, Robert; Marks, Debora S; Sander, Chris

    2007-01-01

    The recent discovery of a new class of 30-nucleotide long RNAs in mammalian testes, called PIWI-interacting RNA (piRNA), with similarities to microRNAs and repeat-associated small interfering RNAs (rasiRNAs), has raised puzzling questions regarding their biogenesis and function. We report a comparative analysis of currently available piRNA sequence data from the pachytene stage of mouse spermatogenesis that sheds light on their sequence diversity and mechanism of biogenesis. We conclude that (i) there are at least four times as many piRNAs in mouse testes than currently known; (ii) piRNAs, which originate from long precursor transcripts, are generated by quasi-random enzymatic processing that is guided by a weak sequence signature at the piRNA 5′ends resulting in a large number of distinct sequences; and (iii) many of the piRNA clusters contain inverted repeats segments capable of forming double-strand RNA fold-back segments that may initiate piRNA processing analogous to transposon silencing. PMID:17997596

  11. RNAcentral: an international database of ncRNA sequences

    DOE PAGES

    Williams, Kelly Porter

    2014-10-28

    The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.

  12. Complete Genome Sequence of a Double-Stranded RNA Virus from Avocado

    PubMed Central

    Villanueva, Francisco; Sabanadzovic, Sead; Valverde, Rodrigo A.

    2012-01-01

    A number of avocado (Persea americana) cultivars are known to contain high-molecular-weight double-stranded RNA (dsRNA) molecules for which a viral nature has been suggested, although sequence data are not available. Here we report the cloning and complete sequencing of a 13.5-kbp dsRNA virus isolated from avocado and show that it corresponds to the genome of a new species of the genus Endornavirus (family Endornaviridae), tentatively named Persea americana endornavirus (PaEV). PMID:22205720

  13. HYBRIDIZATION PROPERTIES OF DNA SEQUENCES DIRECTING THE SYNTHESIS OF MESSENGER RNA AND HETEROGENEOUS NUCLEAR RNA

    PubMed Central

    Greenberg, Jay R.; Perry, Robert P.

    1971-01-01

    The relationship of the DNA sequences from which polyribosomal messenger RNA (mRNA) and heterogeneous nuclear RNA (NRNA) of mouse L cells are transcribed was investigated by means of hybridization kinetics and thermal denaturation of the hybrids. Hybridization was performed in formamide solutions at DNA excess. Under these conditions most of the hybridizing mRNA and NRNA react at values of Dot (DNA concentration multiplied by time) expected for RNA transcribed from the nonrepeated or rarely repeated fraction of the genome. However, a fraction of both mRNA and NRNA hybridize at values of Dot about 10,000 times lower, and therefore must be transcribed from highly redundant DNA sequences. The fraction of NRNA hybridizing to highly repeated sequences is about 1.7 times greater than the corresponding fraction of mRNA. The hybrids formed by the rapidly reacting fractions of both NRNA and mRNA melt over a narrow temperature range with a midpoint about 11°C below that of native L cell DNA. This indicates that these hybrids consist of partially complementary sequences with approximately 11% mismatching of bases. Hybrids formed by the slowly reacting fraction of NRNA melt within 4°–6°C of native DNA, indicating very little, if any, mismatching of bases. Hybrids of the slowly reacting components of mRNA, formed under conditions of sufficiently low RNA input, have a high thermal stability, similar to that observed for hybrids of the slowly reacting NRNA component. However, when higher inputs of mRNA are used, hybrids are formed which have a strikingly lower thermal stability. This observation can be explained by assuming that there is sufficient similarity among the relatively rare DNA sequences coding for mRNA so that under hybridization conditions, in which these DNA sequences are not truly in excess, reversible hybrids exhibiting a considerable amount of mispairing are formed. The fact that a comparable phenomenon has not been observed for NRNA may mean that there is

  14. miRBase: integrating microRNA annotation and deep-sequencing data.

    PubMed

    Kozomara, Ana; Griffiths-Jones, Sam

    2011-01-01

    miRBase is the primary online repository for all microRNA sequences and annotation. The current release (miRBase 16) contains over 15,000 microRNA gene loci in over 140 species, and over 17,000 distinct mature microRNA sequences. Deep-sequencing technologies have delivered a sharp rise in the rate of novel microRNA discovery. We have mapped reads from short RNA deep-sequencing experiments to microRNAs in miRBase and developed web interfaces to view these mappings. The user can view all read data associated with a given microRNA annotation, filter reads by experiment and count, and search for microRNAs by tissue- and stage-specific expression. These data can be used as a proxy for relative expression levels of microRNA sequences, provide detailed evidence for microRNA annotations and alternative isoforms of mature microRNAs, and allow us to revisit previous annotations. miRBase is available online at: http://www.mirbase.org/.

  15. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.

  16. Size, Shape, and Sequence-Dependent Immunogenicity of RNA Nanoparticles.

    PubMed

    Guo, Sijin; Li, Hui; Ma, Mengshi; Fu, Jian; Dong, Yizhou; Guo, Peixuan

    2017-12-15

    RNA molecules have emerged as promising therapeutics. Like all other drugs, the safety profile and immune response are important criteria for drug evaluation. However, the literature on RNA immunogenicity has been controversial. Here, we used the approach of RNA nanotechnology to demonstrate that the immune response of RNA nanoparticles is size, shape, and sequence dependent. RNA triangle, square, pentagon, and tetrahedron with same shape but different sizes, or same size but different shapes were used as models to investigate the immune response. The levels of pro-inflammatory cytokines induced by these RNA nanoarchitectures were assessed in macrophage-like cells and animals. It was found that RNA polygons without extension at the vertexes were immune inert. However, when single-stranded RNA with a specific sequence was extended from the vertexes of RNA polygons, strong immune responses were detected. These immunostimulations are sequence specific, because some other extended sequences induced little or no immune response. Additionally, larger-size RNA square induced stronger cytokine secretion. 3D RNA tetrahedron showed stronger immunostimulation than planar RNA triangle. These results suggest that the immunogenicity of RNA nanoparticles is tunable to produce either a minimal immune response that can serve as safe therapeutic vectors, or a strong immune response for cancer immunotherapy or vaccine adjuvants. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  17. An RRM–ZnF RNA recognition module targets RBM10 to exonic sequences to promote exon exclusion

    PubMed Central

    Collins, Katherine M.; Kainov, Yaroslav A.; Christodolou, Evangelos; Ray, Debashish; Morris, Quaid; Hughes, Timothy; Taylor, Ian A.

    2017-01-01

    Abstract RBM10 is an RNA-binding protein that plays an essential role in development and is frequently mutated in the context of human disease. RBM10 recognizes a diverse set of RNA motifs in introns and exons and regulates alternative splicing. However, the molecular mechanisms underlying this seemingly relaxed sequence specificity are not understood and functional studies have focused on 3΄ intronic sites only. Here, we dissect the RNA code recognized by RBM10 and relate it to the splicing regulatory function of this protein. We show that a two-domain RRM1–ZnF unit recognizes a GGA-centered motif enriched in RBM10 exonic sites with high affinity and specificity and test that the interaction with these exonic sequences promotes exon skipping. Importantly, a second RRM domain (RRM2) of RBM10 recognizes a C-rich sequence, which explains its known interaction with the intronic 3΄ site of NUMB exon 9 contributing to regulation of the Notch pathway in cancer. Together, these findings explain RBM10's broad RNA specificity and suggest that RBM10 functions as a splicing regulator using two RNA-binding units with different specificities to promote exon skipping. PMID:28379442

  18. An RRM-ZnF RNA recognition module targets RBM10 to exonic sequences to promote exon exclusion.

    PubMed

    Collins, Katherine M; Kainov, Yaroslav A; Christodolou, Evangelos; Ray, Debashish; Morris, Quaid; Hughes, Timothy; Taylor, Ian A; Makeyev, Eugene V; Ramos, Andres

    2017-06-20

    RBM10 is an RNA-binding protein that plays an essential role in development and is frequently mutated in the context of human disease. RBM10 recognizes a diverse set of RNA motifs in introns and exons and regulates alternative splicing. However, the molecular mechanisms underlying this seemingly relaxed sequence specificity are not understood and functional studies have focused on 3΄ intronic sites only. Here, we dissect the RNA code recognized by RBM10 and relate it to the splicing regulatory function of this protein. We show that a two-domain RRM1-ZnF unit recognizes a GGA-centered motif enriched in RBM10 exonic sites with high affinity and specificity and test that the interaction with these exonic sequences promotes exon skipping. Importantly, a second RRM domain (RRM2) of RBM10 recognizes a C-rich sequence, which explains its known interaction with the intronic 3΄ site of NUMB exon 9 contributing to regulation of the Notch pathway in cancer. Together, these findings explain RBM10's broad RNA specificity and suggest that RBM10 functions as a splicing regulator using two RNA-binding units with different specificities to promote exon skipping. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Evolutionary relationships in the ilarviruses: nucleotide sequence of prunus necrotic ringspot virus RNA 3.

    PubMed

    Sánchez-Navarro, J A; Pallás, V

    1997-01-01

    The complete nucleotide sequence of an isolate of prunus necrotic ringspot virus (PNRSV) RNA 3 has been determined. Elucidation of the amino acid sequence of the proteins encoded by the two large open reading frames (ORFs) allowed us to carry out comparative and phylogenetic studies on the movement (MP) and coat (CP) proteins in the ilarvirus group. Amino acid sequence comparison of the MP revealed a highly conserved basic sequence motif with an amphipathic alpha-helical structure preceding the conserved motif of the '30K superfamily' proposed by Mushegian and Koonin [26] for MP's. Within this '30K' motif a strictly conserved transmembrane domain is present in all ilarviruses sequenced so far. At the amino-terminal end, prune dwarf virus (PDV) has an extension not present in other ilarviruses but which is observed in all bromo- and cucumoviruses, suggesting a common ancestor or a recombinational event in the Bromoviridae family. Examination of the N-terminus of the CP's of all ilarviruses revealed a highly basic region, part of which resembles the Arg-rich motif that has been characterized in the RNA-binding protein family. This motif has also been found in the other members of the Bromoviridae family, suggesting its involvement in a structural function. Furthermore this region is required for infectivity in ilarviruses. The similarities found in this Arg-rich motif are discussed in terms of this process known as genome activation. Finally, phylogenetic analysis of both the MP and CP proteins revealed a higher relationship of A1MV to PNRSV, apple mosaic virus (ApMV) and PDV than any other member of the ilarvirus group. In that sense, A1MV should be considered as a true ilarvirus instead of forming a distinct group of viruses.

  20. Computational Prediction of the Immunomodulatory Potential of RNA Sequences.

    PubMed

    Nagpal, Gandharva; Chaudhary, Kumardeep; Dhanda, Sandeep Kumar; Raghava, Gajendra Pal Singh

    2017-01-01

    Advances in the knowledge of various roles played by non-coding RNAs have stimulated the application of RNA molecules as therapeutics. Among these molecules, miRNA, siRNA, and CRISPR-Cas9 associated gRNA have been identified as the most potent RNA molecule classes with diverse therapeutic applications. One of the major limitations of RNA-based therapeutics is immunotoxicity of RNA molecules as it may induce the innate immune system. In contrast, RNA molecules that are potent immunostimulators are strong candidates for use in vaccine adjuvants. Thus, it is important to understand the immunotoxic or immunostimulatory potential of these RNA molecules. The experimental techniques for determining immunostimulatory potential of siRNAs are time- and resource-consuming. To overcome this limitation, recently our group has developed a web-based server "imRNA" for predicting the immunomodulatory potential of RNA sequences. This server integrates a number of modules that allow users to perform various tasks including (1) generation of RNA analogs with reduced immunotoxicity, (2) identification of highly immunostimulatory regions in RNA sequence, and (3) virtual screening. This server may also assist users in the identification of minimum mutations required in a given RNA sequence to minimize its immunomodulatory potential that is required for designing RNA-based therapeutics. Besides, the server can be used for designing RNA-based vaccine adjuvants as it may assist users in the identification of mutations required for increasing immunomodulatory potential of a given RNA sequence. In summary, this chapter describes major applications of the "imRNA" server in designing RNA-based therapeutics and vaccine adjuvants (http://www.imtech.res.in/raghava/imrna/).

  1. Highly multiplexed subcellular RNA sequencing in situ

    PubMed Central

    Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Yang, Joyce L.; Terry, Richard; Jeanty, Sauveur S. F.; Li, Chao; Amamoto, Ryoji; Peters, Derek T.; Turczyk, Brian M.; Marblestone, Adam H.; Inverso, Samuel A.; Bernard, Amy; Mali, Prashant; Rios, Xavier; Aach, John; Church, George M.

    2014-01-01

    Understanding the spatial organization of gene expression with single nucleotide resolution requires localizing the sequences of expressed RNA transcripts within a cell in situ. Here we describe fluorescent in situ RNA sequencing (FISSEQ), in which stably cross-linked cDNA amplicons are sequenced within a biological sample. Using 30-base reads from 8,742 genes in situ, we examined RNA expression and localization in human primary fibroblasts using a simulated wound healing assay. FISSEQ is compatible with tissue sections and whole mount embryos, and reduces the limitations of optical resolution and noisy signals on single molecule detection. Our platform enables massively parallel detection of genetic elements, including gene transcripts and molecular barcodes, and can be used to investigate cellular phenotype, gene regulation, and environment in situ. PMID:24578530

  2. RNA sequencing: current and prospective uses in metabolic research.

    PubMed

    Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay

    2014-10-01

    Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes with an exon level resolution from any tissue in any species without any a priori knowledge of which genes that are being expressed, splice patterns or their nucleotide sequence. In addition, RNA sequencing is a more sensitive technique compared with microarrays with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done for a cost that is able to compete with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regards to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large data amounts produced, which together with the complexity of the data can make a researcher spend far more time on analysis than performing the actual experiment. © 2014 Society for Endocrinology.

  3. Computational Prediction of miRNA Genes from Small RNA Sequencing Data

    PubMed Central

    Kang, Wenjing; Friedländer, Marc R.

    2015-01-01

    Next-generation sequencing now for the first time allows researchers to gage the depth and variation of entire transcriptomes. However, now as rare transcripts can be detected that are present in cells at single copies, more advanced computational tools are needed to accurately annotate and profile them. microRNAs (miRNAs) are 22 nucleotide small RNAs (sRNAs) that post-transcriptionally reduce the output of protein coding genes. They have established roles in numerous biological processes, including cancers and other diseases. During miRNA biogenesis, the sRNAs are sequentially cleaved from precursor molecules that have a characteristic hairpin RNA structure. The vast majority of new miRNA genes that are discovered are mined from small RNA sequencing (sRNA-seq), which can detect more than a billion RNAs in a single run. However, given that many of the detected RNAs are degradation products from all types of transcripts, the accurate identification of miRNAs remain a non-trivial computational problem. Here, we review the tools available to predict animal miRNAs from sRNA sequencing data. We present tools for generalist and specialist use cases, including prediction from massively pooled data or in species without reference genome. We also present wet-lab methods used to validate predicted miRNAs, and approaches to computationally benchmark prediction accuracy. For each tool, we reference validation experiments and benchmarking efforts. Last, we discuss the future of the field. PMID:25674563

  4. RNAcentral: A vision for an international database of RNA sequences

    PubMed Central

    Bateman, Alex; Agrawal, Shipra; Birney, Ewan; Bruford, Elspeth A.; Bujnicki, Janusz M.; Cochrane, Guy; Cole, James R.; Dinger, Marcel E.; Enright, Anton J.; Gardner, Paul P.; Gautheret, Daniel; Griffiths-Jones, Sam; Harrow, Jen; Herrero, Javier; Holmes, Ian H.; Huang, Hsien-Da; Kelly, Krystyna A.; Kersey, Paul; Kozomara, Ana; Lowe, Todd M.; Marz, Manja; Moxon, Simon; Pruitt, Kim D.; Samuelsson, Tore; Stadler, Peter F.; Vilella, Albert J.; Vogel, Jan-Hinnerk; Williams, Kelly P.; Wright, Mathew W.; Zwieb, Christian

    2011-01-01

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor. PMID:21940779

  5. How Messenger RNA and Nascent Chain Sequences Regulate Translation Elongation.

    PubMed

    Choi, Junhong; Grosely, Rosslyn; Prabhakar, Arjun; Lapointe, Christopher P; Wang, Jinfan; Puglisi, Joseph D

    2018-06-20

    Translation elongation is a highly coordinated, multistep, multifactor process that ensures accurate and efficient addition of amino acids to a growing nascent-peptide chain encoded in the sequence of translated messenger RNA (mRNA). Although translation elongation is heavily regulated by external factors, there is clear evidence that mRNA and nascent-peptide sequences control elongation dynamics, determining both the sequence and structure of synthesized proteins. Advances in methods have driven experiments that revealed the basic mechanisms of elongation as well as the mechanisms of regulation by mRNA and nascent-peptide sequences. In this review, we highlight how mRNA and nascent-peptide elements manipulate the translation machinery to alter the dynamics and pathway of elongation.

  6. Transcriptome-Wide Identification of RNA Targets of Arabidopsis SERINE/ARGININE-RICH45 Uncovers the Unexpected Roles of This RNA Binding Protein in RNA Processing[OPEN

    PubMed Central

    Wang, Yajun; Hamilton, Michael; Ben-Hur, Asa; Reddy, Anireddy S.N.

    2015-01-01

    Plant SR45 and its metazoan ortholog RNPS1 are serine/arginine-rich (SR)-like RNA binding proteins that function in splicing/postsplicing events and regulate diverse processes in eukaryotes. Interactions of SR45 with both RNAs and proteins are crucial for regulating RNA processing. However, in vivo RNA targets of SR45 are currently unclear. Using RNA immunoprecipitation followed by high-throughput sequencing, we identified over 4000 Arabidopsis thaliana RNAs that directly or indirectly associate with SR45, designated as SR45-associated RNAs (SARs). Comprehensive analyses of these SARs revealed several roles for SR45. First, SR45 associates with and regulates the expression of 30% of abscisic acid (ABA) signaling genes at the postsplicing level. Second, although most SARs are derived from intron-containing genes, surprisingly, 340 SARs are derived from intronless genes. Expression analysis of the SARs suggests that SR45 differentially regulates intronless and intron-containing SARs. Finally, we identified four overrepresented RNA motifs in SARs that likely mediate SR45’s recognition of its targets. Therefore, SR45 plays an unexpected role in mRNA processing of intronless genes, and numerous ABA signaling genes are targeted for regulation at the posttranscriptional level. The diverse molecular functions of SR45 uncovered in this study are likely applicable to other species in view of its conservation across eukaryotes. PMID:26603559

  7. Lyophilisation and concentration of chitosan/siRNA polyplexes: Influence of buffer composition, oligonucleotide sequence, and hyaluronic acid coating.

    PubMed

    Veilleux, Daniel; Gopalakrishna Panicker, Rajesh Krishnan; Chevrier, Anik; Biniecki, Kristof; Lavertu, Marc; Buschmann, Michael D

    2018-02-15

    Chitosan (CS)/siRNA polyplexes have great therapeutic potential for treating multiple diseases by gene silencing. However, clinical application of this technology requires the development of concentrated, hemocompatible, pH neutral formulations for safe and efficient administration. In this study we evaluate physicochemical properties of chitosan polyplexes in various buffers at increasing ionic strengths, to identify conditions for freeze-drying and rehydration at higher doses of uncoated or hyaluronic acid (HA)-coated polyplexes while maintaining physiological compatibility. Optimized formulations are used to evaluate the impact of the siRNA/oligonucleotide sequence on polyplex physicochemical properties, and to measure their in vitro silencing efficiency, cytotoxicity, and hemocompatibility. Specific oligonucleotide sequences influence polyplex physical properties at low N:P ratios, as well as their stability during freeze-drying. Nanoparticles display greater stability for oligodeoxynucleotides ODN vs siRNA; AT-rich vs GC-rich; and overhangs vs blunt ends. Using this knowledge, various CS/siRNA polyplexes are prepared with and without HA coating, freeze-dried and rehydrated at increased concentrations using reduced rehydration volumes. These polyplexes are non-cytotoxic and preserve silencing activity even after rehydration to 20-fold their initial concentration, while HA-coated polyplexes at pH∼7 also displayed increased hemocompatibility. These concentrated formulations represent a critical step towards clinical development of chitosan-based oligonucleotide intravenous delivery systems. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Structurally complex and highly active RNA ligases derived from random RNA sequences

    NASA Technical Reports Server (NTRS)

    Ekland, E. H.; Szostak, J. W.; Bartel, D. P.

    1995-01-01

    Seven families of RNA ligases, previously isolated from random RNA sequences, fall into three classes on the basis of secondary structure and regiospecificity of ligation. Two of the three classes of ribozymes have been engineered to act as true enzymes, catalyzing the multiple-turnover transformation of substrates into products. The most complex of these ribozymes has a minimal catalytic domain of 93 nucleotides. An optimized version of this ribozyme has a kcat exceeding one per second, a value far greater than that of most natural RNA catalysts and approaching that of comparable protein enzymes. The fact that such a large and complex ligase emerged from a very limited sampling of sequence space implies the existence of a large number of distinct RNA structures of equivalent complexity and activity.

  9. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    PubMed

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulations, where the identification of protein-RNA binding is a major and fast-developing sub-area, which is in turn benefited by the sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alter the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulations. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  10. Sequence analysis of RNase MRP RNA reveals its origination from eukaryotic RNase P RNA

    PubMed Central

    Zhu, Yanglong; Stribinskis, Vilius; Ramos, Kenneth S.; Li, Yong

    2006-01-01

    RNase MRP is a eukaryote-specific endoribonuclease that generates RNA primers for mitochondrial DNA replication and processes precursor rRNA. RNase P is a ubiquitous endoribonuclease that cleaves precursor tRNA transcripts to produce their mature 5′ termini. We found extensive sequence homology of catalytic domains and specificity domains between their RNA subunits in many organisms. In Candida glabrata, the internal loop of helix P3 is 100% conserved between MRP and P RNAs. The helix P8 of MRP RNA from microsporidia Encephalitozoon cuniculi is identical to that of P RNA. Sequence homology can be widely spread over the whole molecule of MRP RNA and P RNA, such as those from Dictyostelium discoideum. These conserved nucleotides between the MRP and P RNAs strongly support the hypothesis that the MRP RNA is derived from the P RNA molecule in early eukaryote evolution. PMID:16540690

  11. Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling.

    PubMed

    Tome, Jacob M; Ozer, Abdullah; Pagano, John M; Gheba, Dan; Schroth, Gary P; Lis, John T

    2014-06-01

    RNA-protein interactions play critical roles in gene regulation, but methods to quantitatively analyze these interactions at a large scale are lacking. We have developed a high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay by adapting a high-throughput DNA sequencer to quantify the binding of fluorescently labeled protein to millions of RNAs anchored to sequenced cDNA templates. Using HiTS-RAP, we measured the affinity of mutagenized libraries of GFP-binding and NELF-E-binding aptamers to their respective targets and identified critical regions of interaction. Mutations additively affected the affinity of the NELF-E-binding aptamer, whose interaction depended mainly on a single-stranded RNA motif, but not that of the GFP aptamer, whose interaction depended primarily on secondary structure.

  12. a Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences

    NASA Astrophysics Data System (ADS)

    Regoli, Massimo

    2009-02-01

    The RNA-Crypto System (shortly RCS) is a symmetric key algorithm to cipher data. The idea for this new algorithm starts from the observation of nature. In particular from the observation of RNA behavior and some of its properties. The RNA sequences has some sections called Introns. Introns, derived from the term "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs, that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and the role of Introns in the pre-mRNA is not clear and it is under ponderous researches by Biologists but, in our case, we will use the presence of Introns in the RNA-Crypto System output as a strong method to add chaotic non coding information and an unnecessary behaviour in the access to the secret key to code the messages. In the RNA-Crypto System algoritnm the introns are sections of the ciphered message with non-coding information as well as in the precursor mRNA.

  13. Novel Approach to Analyzing MFE of Noncoding RNA Sequences

    PubMed Central

    George, Tina P.; Thomas, Tessamma

    2016-01-01

    Genomic studies have become noncoding RNA (ncRNA) centric after the study of different genomes provided enormous information on ncRNA over the past decades. The function of ncRNA is decided by its secondary structure, and across organisms, the secondary structure is more conserved than the sequence itself. In this study, the optimal secondary structure or the minimum free energy (MFE) structure of ncRNA was found based on the thermodynamic nearest neighbor model. MFE of over 2600 ncRNA sequences was analyzed in view of its signal properties. Mathematical models linking MFE to the signal properties were found for each of the four classes of ncRNA analyzed. MFE values computed with the proposed models were in concordance with those obtained with the standard web servers. A total of 95% of the sequences analyzed had deviation of MFE values within ±15% relative to those obtained from standard web servers. PMID:27695341

  14. Novel Approach to Analyzing MFE of Noncoding RNA Sequences.

    PubMed

    George, Tina P; Thomas, Tessamma

    2016-01-01

    Genomic studies have become noncoding RNA (ncRNA) centric after the study of different genomes provided enormous information on ncRNA over the past decades. The function of ncRNA is decided by its secondary structure, and across organisms, the secondary structure is more conserved than the sequence itself. In this study, the optimal secondary structure or the minimum free energy (MFE) structure of ncRNA was found based on the thermodynamic nearest neighbor model. MFE of over 2600 ncRNA sequences was analyzed in view of its signal properties. Mathematical models linking MFE to the signal properties were found for each of the four classes of ncRNA analyzed. MFE values computed with the proposed models were in concordance with those obtained with the standard web servers. A total of 95% of the sequences analyzed had deviation of MFE values within ±15% relative to those obtained from standard web servers.

  15. A HuD-ZBP1 ribonucleoprotein complex localizes GAP-43 mRNA into axons through its 3′ untranslated region AU-rich regulatory element

    PubMed Central

    Yoo, Soonmoon; Kim, Hak Hee; Kim, Paul; Donnelly, Christopher J.; Kalinski, Ashley L.; Vuppalanchi, Deepika; Park, Michael; Lee, Seung Joon; Merianda, Tanuja T.; Perrone-Bizzozero, Nora I.; Twiss, Jeffery L.

    2013-01-01

    Localized translation of axonal mRNAs contributes to developmental and regenerative axon growth. Although untranslated regions (UTRs) of many different axonal mRNAs appear to drive their localization, there has been no consensus RNA structure responsible for this localization. We recently showed that limited expression of ZBP1 protein restricts axonal localization of both β-actin and GAP-43 mRNAs. β-actin 3′UTR has a defined element for interaction with ZBP1, but GAP-43 mRNA shows no homology to this RNA sequence. Here, we show that an AU-rich element (ARE) in GAP-43’s 3′UTR is necessary and sufficient for its axonal localization. Axonal GAP-43 mRNA levels increase after in vivo injury, and GAP-43 mRNA shows an increased half-life in regenerating axons. GAP-43 mRNA interacts with both HuD and ZBP1, and HuD and ZBP1 coimmunoprecipitate in an RNA-dependent fashion. Reporter mRNA with the GAP-43 ARE competes with endogenous β-actin mRNA for axonal localization and decreases axon length and branching similar to the β-actin 3′UTR competing with endogenous GAP-43 mRNA. Conversely, overexpressing GAP-43 coding sequence with it’s 3′UTR ARE increases axonal elongation and this effect is lost when just the ARE is deleted from GAP-43’s 3′UTR. PMID:23586486

  16. Genomic Sequence of the WHO International Standard for Hepatitis A Virus RNA.

    PubMed

    Jenkins, Adrian; Minhas, Rehan; Morris, Clare; Berry, Neil

    2018-05-10

    The World Health Organization (WHO) international standard for hepatitis A virus (HAV) RNA nucleic acid assays was characterized by complete genome sequencing. The entire coding sequence and noncoding regions were assigned HAV genotype IB. This information will aid the design, development, and evaluation of HAV RNA amplification assays. Copyright © 2018 Jenkins et al.

  17. Molecular characterization and expression analysis of a glycine-rich RNA-binding protein gene from Malus hupehensis Rehd.

    PubMed

    Wang, Shuncai; Wang, Rongchao; Liang, Dong; Ma, Fengwang; Shu, Huairui

    2012-04-01

    Members of the plant glycine-rich RNA-binding protein (GR-RBP) family play diverse roles in regulating RNA metabolism for various cellular processes. To understand better their function at the molecular level in stress responses, we cloned a GR-RBP gene, MhGR-RBP1, from Malus hupehensis. Its full-length cDNA is 558 bp long, with a 495-bp open reading frame, and it encodes 164 amino acids. The deduced amino acid sequence contains an RNA-recognition motif (RRM) at the amino terminal and a glycine-rich domain at the carboxyl terminal; these are highly homologous with those from other plant species. Multiple alignment and phylogenetic analyses show that the deduced protein is a novel member of the plant GR-RBP family. To characterize this gene, we also applied a model for predicting its homology of protein structure with other species. Both organ-specific and stress-related expression were detected by quantitative real-time PCR and semi-quantitative RT-PCR, indicating that MhGR-RBP1 is expressed abundantly in young leaves but weakly in roots and shoots. Transcript levels in the leaves were increased markedly by drought, hydrogen peroxide (H(2)O(2)), and mechanical wounding, slightly by salt stress. Furthermore, the transcript is initially up- and down-regulated rapidly within 24 h of abscisic acid (ABA) treatment. After 24 h of ABA and jasmonic acid (JA) treatments with different concentrations, the transcript levels of MhGR-RBP1 were significantly repressed. These results suggest that MhGR-RBP1 may be involved in the responses to abiotic stresses, H(2)O(2), ABA, or JA.

  18. Chromatin-associated RNA sequencing (ChAR-seq) maps genome-wide RNA-to-DNA contacts

    PubMed Central

    Jukam, David; Teran, Nicole A; Risca, Viviana I; Smith, Owen K; Johnson, Whitney L; Skotheim, Jan M; Greenleaf, William James

    2018-01-01

    RNA is a critical component of chromatin in eukaryotes, both as a product of transcription, and as an essential constituent of ribonucleoprotein complexes that regulate both local and global chromatin states. Here, we present a proximity ligation and sequencing method called Chromatin-Associated RNA sequencing (ChAR-seq) that maps all RNA-to-DNA contacts across the genome. Using Drosophila cells, we show that ChAR-seq provides unbiased, de novo identification of targets of chromatin-bound RNAs including nascent transcripts, chromosome-specific dosage compensation ncRNAs, and genome-wide trans-associated RNAs involved in co-transcriptional RNA processing. PMID:29648534

  19. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  20. RISC RNA sequencing for context-specific identification of in vivo microRNA targets.

    PubMed

    Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

    2011-01-07

    MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1645 mRNAs consistently targeted to mouse cardiac RISCs. We used this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing "seed" sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context and is applicable to any tissue and any disease state.

  1. BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.

    PubMed

    Chae, Heejoon; Rhee, Sungmin; Nephew, Kenneth P; Kim, Sun

    2015-01-15

    It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA-mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. The objective of this study was to modify our widely recognized Web server for the integrated mRNA-miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called the BioVLAB-MMIA-NGS, deployed on both Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationship between miRNAs and target genes with high accuracy. http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. Sequence determinants of improved CRISPR sgRNA design.

    PubMed

    Xu, Han; Xiao, Tengfei; Chen, Chen-Hao; Li, Wei; Meyer, Clifford A; Wu, Qiu; Wu, Di; Cong, Le; Zhang, Feng; Liu, Jun S; Brown, Myles; Liu, X Shirley

    2015-08-01

    The CRISPR/Cas9 system has revolutionized mammalian somatic cell genetics. Genome-wide functional screens using CRISPR/Cas9-mediated knockout or dCas9 fusion-mediated inhibition/activation (CRISPRi/a) are powerful techniques for discovering phenotype-associated gene function. We systematically assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. Leveraging the information from multiple designs, we derived a new sequence model for predicting sgRNA efficiency in CRISPR/Cas9 knockout experiments. Our model confirmed known features and suggested new features including a preference for cytosine at the cleavage site. The model was experimentally validated for sgRNA-mediated mutation rate and protein knockout efficiency. Tested on independent data sets, the model achieved significant results in both positive and negative selection conditions and outperformed existing models. We also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout and propose a new model for predicting sgRNA efficiency in CRISPRi/a experiments. These results facilitate the genome-wide design of improved sgRNA for both knockout and CRISPRi/a studies. © 2015 Xu et al.; Published by Cold Spring Harbor Laboratory Press.

  3. Quantifying the relationship between sequence and three-dimensional structure conservation in RNA

    PubMed Central

    2010-01-01

    Background In recent years, the number of available RNA structures has rapidly grown reflecting the increased interest on RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which have allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion The computational analysis here presented quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657

  4. mESAdb: microRNA Expression and Sequence Analysis Database

    PubMed Central

    Kaya, Koray D.; Karakülah, Gökhan; Yakıcıer, Cengiz M.; Acar, Aybar C.; Konu, Özlen

    2011-01-01

    microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data. PMID:21177657

  5. mESAdb: microRNA expression and sequence analysis database.

    PubMed

    Kaya, Koray D; Karakülah, Gökhan; Yakicier, Cengiz M; Acar, Aybar C; Konu, Ozlen

    2011-01-01

    microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data.

  6. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  7. Integrated sequencing of exome and mRNA of large-sized single cells.

    PubMed

    Wang, Lily Yan; Guo, Jiajie; Cao, Wei; Zhang, Meng; He, Jiankui; Li, Zhoufang

    2018-01-10

    Current approaches of single cell DNA-RNA integrated sequencing are difficult to call SNPs, because a large amount of DNA and RNA is lost during DNA-RNA separation. Here, we performed simultaneous single-cell exome and transcriptome sequencing on individual mouse oocytes. Using microinjection, we kept the nuclei intact to avoid DNA loss, while retaining the cytoplasm inside the cell membrane, to maximize the amount of DNA and RNA captured from the single cell. We then conducted exome-sequencing on the isolated nuclei and mRNA-sequencing on the enucleated cytoplasm. For single oocytes, exome-seq can cover up to 92% of exome region with an average sequencing depth of 10+, while mRNA-sequencing reveals more than 10,000 expressed genes in enucleated cytoplasm, with similar performance for intact oocytes. This approach provides unprecedented opportunities to study DNA-RNA regulation, such as RNA editing at single nucleotide level in oocytes. In future, this method can also be applied to other large cells, including neurons, large dendritic cells and large tumour cells for integrated exome and transcriptome sequencing.

  8. Sequence to Structure (S2S): display, manipulate and interconnect RNA data from sequence to structure.

    PubMed

    Jossinet, Fabrice; Westhof, Eric

    2005-08-01

    Efficient RNA sequence manipulations (such as multiple alignments) need to be constrained by rules of RNA structure folding. The structural knowledge has increased dramatically in the last years with the accumulation of several large RNA structures similar to those of the bacterial ribosome subunits. However, no tool in the RNA community provides an easy way to link and integrate progress made at the sequence level using the available three-dimensional information. Sequence to Structure (S2S) proposes a framework in which an user can easily display, manipulate and interconnect heterogeneous RNA data, such as multiple sequence alignments, secondary and tertiary structures. S2S has been implemented using the Java language and has been developed and tested under UNIX systems, such as Linux and MacOSX. S2S is available at http://bioinformatics.org/S2S/.

  9. Integrating RNA sequencing into neuro-oncology practice.

    PubMed

    Rogawski, David S; Vitanza, Nicholas A; Gauthier, Angela C; Ramaswamy, Vijay; Koschmann, Carl

    2017-11-01

    Malignant tumors of the central nervous system (CNS) cause substantial morbidity and mortality, yet efforts to optimize chemo- and radiotherapy have largely failed to improve dismal prognoses. Over the past decade, RNA sequencing (RNA-seq) has emerged as a powerful tool to comprehensively characterize the transcriptome of CNS tumor cells in one high-throughput step, leading to improved understanding of CNS tumor biology and suggesting new routes for targeted therapies. RNA-seq has been instrumental in improving the diagnostic classification of brain tumors, characterizing oncogenic fusion genes, and shedding light on intratumor heterogeneity. Currently, RNA-seq is beginning to be incorporated into regular neuro-oncology practice in the form of precision neuro-oncology programs, which use information from tumor sequencing to guide implementation of personalized targeted therapies. These programs show great promise in improving patient outcomes for tumors where single agent trials have been ineffective. As RNA-seq is a relatively new technique, many further applications yielding new advances in CNS tumor research and management are expected in the coming years. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Equally parsimonious pathways through an RNA sequence space are not equally likely

    NASA Technical Reports Server (NTRS)

    Lee, Y. H.; DSouza, L. M.; Fox, G. E.

    1997-01-01

    An experimental system for determining the potential ability of sequences resembling 5S ribosomal RNA (rRNA) to perform as functional 5S rRNAs in vivo in the Escherichia coli cellular environment was devised previously. Presumably, the only 5S rRNA sequences that would have been fixed by ancestral populations are ones that were functionally valid, and hence the actual historical paths taken through RNA sequence space during 5S rRNA evolution would have most likely utilized valid sequences. Herein, we examine the potential validity of all sequence intermediates along alternative equally parsimonious trajectories through RNA sequence space which connect two pairs of sequences that had previously been shown to behave as valid 5S rRNAs in E. coli. The first trajectory requires a total of four changes. The 14 sequence intermediates provide 24 apparently equally parsimonious paths by which the transition could occur. The second trajectory involves three changes, six intermediate sequences, and six potentially equally parsimonious paths. In total, only eight of the 20 sequence intermediates were found to be clearly invalid. As a consequence of the position of these invalid intermediates in the sequence space, seven of the 30 possible paths consisted of exclusively valid sequences. In several cases, the apparent validity/invalidity of the intermediate sequences could not be anticipated on the basis of current knowledge of the 5S rRNA structure. This suggests that the interdependencies in RNA sequence space may be more complex than currently appreciated. If ancestral sequences predicted by parsimony are to be regarded as actual historical sequences, then the present results would suggest that they should also satisfy a validity requirement and that, in at least limited cases, this conjecture can be tested experimentally.

  11. Transcriptome and Small RNA Deep Sequencing Reveals Deregulation of miRNA Biogenesis in Human Glioma

    PubMed Central

    Moore, Lynette M.; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N.; Zhang, Wei; Nykter, Matti

    2013-01-01

    Altered expression of oncogenic and tumor-suppressing microRNAs (miRNAs) is widely associated with tumorigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumors. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and interrogated expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression. PMID:23007860

  12. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision.

    PubMed

    Denise, Hubert; Moschos, Sterghios A; Sidders, Benjamin; Burden, Frances; Perkins, Hannah; Carter, Nikki; Stroud, Tim; Kennedy, Michael; Fancy, Sally-Ann; Lapthorn, Cris; Lavender, Helen; Kinloch, Ross; Suhy, David; Corbau, Romu

    2014-02-04

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNAs maturation products, we observed that each TT-034-encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5' RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC).Molecular Therapy-Nucleic Acids (2014) 3, e145; doi:10.1038/mtna.2013.73; published online 4 February 2014.

  13. Short RNA indicator sequences are not completely degraded by autoclaving

    PubMed Central

    Unnithan, Veena V.; Unc, Adrian; Joe, Valerisa; Smith, Geoffrey B.

    2014-01-01

    Short indicator RNA sequences (<100 bp) persist after autoclaving and are recovered intact by molecular amplification. Primers targeting longer sequences are most likely to produce false positives due to amplification errors easily verified by melting curves analyses. If short indicator RNA sequences are used for virus identification and quantification then post autoclave RNA degradation methodology should be employed, which may include further autoclaving. PMID:24518856

  14. Sequence-structure relationships in RNA loops: establishing the basis for loop homology modeling.

    PubMed

    Schudoma, Christian; May, Patrick; Nikiforova, Viktoria; Walther, Dirk

    2010-01-01

    The specific function of RNA molecules frequently resides in their seemingly unstructured loop regions. We performed a systematic analysis of RNA loops extracted from experimentally determined three-dimensional structures of RNA molecules. A comprehensive loop-structure data set was created and organized into distinct clusters based on structural and sequence similarity. We detected clear evidence of the hallmark of homology present in the sequence-structure relationships in loops. Loops differing by <25% in sequence identity fold into very similar structures. Thus, our results support the application of homology modeling for RNA loop model building. We established a threshold that may guide the sequence divergence-based selection of template structures for RNA loop homology modeling. Of all possible sequences that are, under the assumption of isosteric relationships, theoretically compatible with actual sequences observed in RNA structures, only a small fraction is contained in the Rfam database of RNA sequences and classes implying that the actual RNA loop space may consist of a limited number of unique loop structures and conserved sequences. The loop-structure data sets are made available via an online database, RLooM. RLooM also offers functionalities for the modeling of RNA loop structures in support of RNA engineering and design efforts.

  15. Assessing the 5S ribosomal RNA heterogeneity in Arabidopsis thaliana using short RNA next generation sequencing data.

    PubMed

    Szymanski, Maciej; Karlowski, Wojciech M

    2016-01-01

    In eukaryotes, ribosomal 5S rRNAs are products of multigene families organized within clusters of tandemly repeated units. Accumulation of genomic data obtained from a variety of organisms demonstrated that the potential 5S rRNA coding sequences show a large number of variants, often incompatible with folding into a correct secondary structure. Here, we present results of an analysis of a large set of short RNA sequences generated by the next generation sequencing techniques, to address the problem of heterogeneity of the 5S rRNA transcripts in Arabidopsis and identification of potentially functional rRNA-derived fragments.

  16. Next-generation sequencing library preparation method for identification of RNA viruses on the Ion Torrent Sequencing Platform.

    PubMed

    Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng

    2018-05-09

    Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. There were multiple NGS library preparation methods published for strand-specific RNA-seq, but some methods are not suitable for identifying and characterizing RNA viruses. In this study, we report a NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform, able to identify multiple species of RNA viruses in clinical samples.

  17. How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives.

    PubMed

    Dal Molin, Alessandra; Di Camillo, Barbara

    2018-01-31

    The sequencing of the transcriptome of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types in heterogeneous cell populations or for the study of stochastic gene expression. In recent years, various experimental methods and computational tools for analysing single-cell RNA-sequencing data have been proposed. However, most of them are tailored to different experimental designs or biological questions, and in many cases, their performance has not been benchmarked yet, thus increasing the difficulty for a researcher to choose the optimal single-cell transcriptome sequencing (scRNA-seq) experiment and analysis workflow. In this review, we aim to provide an overview of the current available experimental and computational methods developed to handle single-cell RNA-sequencing data and, based on their peculiarities, we suggest possible analysis frameworks depending on specific experimental designs. Together, we propose an evaluation of challenges and open questions and future perspectives in the field. In particular, we go through the different steps of scRNA-seq experimental protocols such as cell isolation, messenger RNA capture, reverse transcription, amplification and use of quantitative standards such as spike-ins and Unique Molecular Identifiers (UMIs). We then analyse the current methodological challenges related to preprocessing, alignment, quantification, normalization, batch effect correction and methods to control for confounding effects. © The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  18. Targeted RNA-Sequencing with Competitive Multiplex-PCR Amplicon Libraries

    PubMed Central

    Blomquist, Thomas M.; Crawford, Erin L.; Lovett, Jennie L.; Yeo, Jiyoun; Stanoszek, Lauren M.; Levin, Albert; Li, Jia; Lu, Mei; Shi, Leming; Muldrew, Kenneth; Willey, James C.

    2013-01-01

    Whole transcriptome RNA-sequencing is a powerful tool, but is costly and yields complex data sets that limit its utility in molecular diagnostic testing. A targeted quantitative RNA-sequencing method that is reproducible and reduces the number of sequencing reads required to measure transcripts over the full range of expression would be better suited to diagnostic testing. Toward this goal, we developed a competitive multiplex PCR-based amplicon sequencing library preparation method that a) targets only the sequences of interest and b) controls for inter-target variation in PCR amplification during library preparation by measuring each transcript native template relative to a known number of synthetic competitive template internal standard copies. To determine the utility of this method, we intentionally selected PCR conditions that would cause transcript amplification products (amplicons) to converge toward equimolar concentrations (normalization) during library preparation. We then tested whether this approach would enable accurate and reproducible quantification of each transcript across multiple library preparations, and at the same time reduce (through normalization) total sequencing reads required for quantification of transcript targets across a large range of expression. We demonstrate excellent reproducibility (R2 = 0.997) with 97% accuracy to detect 2-fold change using External RNA Controls Consortium (ERCC) reference materials; high inter-day, inter-site and inter-library concordance (R2 = 0.97–0.99) using FDA Sequencing Quality Control (SEQC) reference materials; and cross-platform concordance with both TaqMan qPCR (R2 = 0.96) and whole transcriptome RNA-sequencing following “traditional” library preparation using Illumina NGS kits (R2 = 0.94). Using this method, sequencing reads required to accurately quantify more than 100 targeted transcripts expressed over a 107-fold range was reduced more than 10,000-fold, from 2.3×109 to 1

  19. Comparative Analysis of Single-Cell RNA Sequencing Methods.

    PubMed

    Ziegenhain, Christoph; Vieth, Beate; Parekh, Swati; Reinius, Björn; Guillaumet-Adkins, Amy; Smets, Martha; Leonhardt, Heinrich; Heyn, Holger; Hellmann, Ines; Enard, Wolfgang

    2017-02-16

    Single-cell RNA sequencing (scRNA-seq) offers new possibilities to address biological and medical questions. However, systematic comparisons of the performance of diverse scRNA-seq protocols are lacking. We generated data from 583 mouse embryonic stem cells to evaluate six prominent scRNA-seq methods: CEL-seq2, Drop-seq, MARS-seq, SCRB-seq, Smart-seq, and Smart-seq2. While Smart-seq2 detected the most genes per cell and across cells, CEL-seq2, Drop-seq, MARS-seq, and SCRB-seq quantified mRNA levels with less amplification noise due to the use of unique molecular identifiers (UMIs). Power simulations at different sequencing depths showed that Drop-seq is more cost-efficient for transcriptome quantification of large numbers of cells, while MARS-seq, SCRB-seq, and Smart-seq2 are more efficient when analyzing fewer cells. Our quantitative comparison offers the basis for an informed choice among six prominent scRNA-seq methods, and it provides a framework for benchmarking further improvements of scRNA-seq protocols. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. incaRNAfbinv: a web server for the fragment-based design of RNA sequences

    PubMed Central

    Drory Retwitzer, Matan; Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme; Barash, Danny

    2016-01-01

    Abstract In recent years, new methods for computational RNA design have been developed and applied to various problems in synthetic biology and nanotechnology. Lately, there is considerable interest in incorporating essential biological information when solving the inverse RNA folding problem. Correspondingly, RNAfbinv aims at including biologically meaningful constraints and is the only program to-date that performs a fragment-based design of RNA sequences. In doing so it allows the design of sequences that do not necessarily exactly fold into the target, as long as the overall coarse-grained tree graph shape is preserved. Augmented by the weighted sampling algorithm of incaRNAtion, our web server called incaRNAfbinv implements the method devised in RNAfbinv and offers an interactive environment for the inverse folding of RNA using a fragment-based design approach. It takes as input: a target RNA secondary structure; optional sequence and motif constraints; optional target minimum free energy, neutrality and GC content. In addition to the design of synthetic regulatory sequences, it can be used as a pre-processing step for the detection of novel natural occurring RNAs. The two complementary methodologies RNAfbinv and incaRNAtion are merged together and fully implemented in our web server incaRNAfbinv, available at http://www.cs.bgu.ac.il/incaRNAfbinv. PMID:27185893

  1. Structator: fast index-based search for RNA sequence-structure patterns

    PubMed Central

    2011-01-01

    Background The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results We present a novel method and readily applicable software for time efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows to exploit base pairing information for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several

  2. Single-Cell RNA-Sequencing: Assessment of Differential Expression Analysis Methods.

    PubMed

    Dal Molin, Alessandra; Baruzzo, Giacomo; Di Camillo, Barbara

    2017-01-01

    The sequencing of the transcriptomes of single-cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types and for the study of stochastic gene expression. In recent years, various tools for analyzing single-cell RNA-sequencing data have been proposed, many of them with the purpose of performing differentially expression analysis. In this work, we compare four different tools for single-cell RNA-sequencing differential expression, together with two popular methods originally developed for the analysis of bulk RNA-sequencing data, but largely applied to single-cell data. We discuss results obtained on two real and one synthetic dataset, along with considerations about the perspectives of single-cell differential expression analysis. In particular, we explore the methods performance in four different scenarios, mimicking different unimodal or bimodal distributions of the data, as characteristic of single-cell transcriptomics. We observed marked differences between the selected methods in terms of precision and recall, the number of detected differentially expressed genes and the overall performance. Globally, the results obtained in our study suggest that is difficult to identify a best performing tool and that efforts are needed to improve the methodologies for single-cell RNA-sequencing data analysis and gain better accuracy of results.

  3. Impact of sequencing depth and read length on single cell RNA sequencing data of T cells.

    PubMed

    Rizzetto, Simone; Eltahla, Auda A; Lin, Peijie; Bull, Rowena; Lloyd, Andrew R; Ho, Joshua W K; Venturi, Vanessa; Luciani, Fabio

    2017-10-06

    Single cell RNA sequencing (scRNA-seq) provides great potential in measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq allowed the characterisation of transcript sequence diversity of functionally relevant T cell subsets, and the identification of the full length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, e.g. RNA library capture, cell quality, and sequencing output affect the quality of scRNA-seq data. We studied the effects of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 single cells from 8 publically available scRNA-seq datasets, and simulation-based analyses. Gene expression was characterised by an increased number of unique genes identified with short read lengths (<50 bp), but these featured higher technical variability compared to profiles from longer reads. Successful TCRαβ reconstruction was achieved for 6 datasets (81% - 100%) with at least 0.25 millions (PE) reads of length >50 bp, while it failed for datasets with <30 bp reads. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.

  4. Spliced synthetic genes as internal controls in RNA sequencing experiments.

    PubMed

    Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R

    2016-09-01

    RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.

  5. Complete nucleotide sequences and genome characterization of a novel double-stranded RNA virus infecting Rosa multiflora.

    PubMed

    Salem, Nidá M; Golino, Deborah A; Falk, Bryce W; Rowhani, Adib

    2008-01-01

    The three double-stranded (ds) RNAs were detected in Rosa multiflora plants showing rose spring dwarf (RSD) symptoms. Northern blot analysis revealed three dsRNAs in preparations of both dsRNA and total RNA from R. multiflora plants. The complete sequences of the dsRNAs (referred to as dsRNA 1, dsRNA 2 and dsRNA 3) were determined based on a combination of shotgun cloning of dsRNA cDNAs and reverse transcription-polymerase chain reaction (RT-PCR). The largest dsRNA (dsRNA 1) was 1,762 bp long with a single open reading frame (ORF) that encoded a putative polypeptide containing 479 amino acid residues with a molecular mass of 55.9 kDa. This polypeptide contains amino acid sequence motifs conserved in the RNA-dependent RNA polymerases (RdRp) of members of the family Partitiviridae. Both dsRNA 2 (1,475 bp) and dsRNA 3 (1,384 bp) contained single ORFs, encoding putative proteins of unknown function. The 5' untranslated regions (UTR) of all three segments shared regions of high sequence homology. Phylogenetic analysis using the RdRp sequences of the various partitiviruses revealed that the new sequences would constitute the genome of a virus in family Partitiviridae. This virus would cluster with Fragaria chiloensis cryptic virus and Raphanus sativus cryptic virus 2. We suggest that the three dsRNA segments constitute the genome of a novel cryptic virus infecting roses; we propose the name Rosa multiflora cryptic virus (RMCV). Detection primers were developed and used for RT-PCR detection of RMCV in rose plants.

  6. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…

  7. Quantitative assessment of RNA-protein interactions with high-throughput sequencing-RNA affinity profiling.

    PubMed

    Ozer, Abdullah; Tome, Jacob M; Friedman, Robin C; Gheba, Dan; Schroth, Gary P; Lis, John T

    2015-08-01

    Because RNA-protein interactions have a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay that couples sequencing on an Illumina GAIIx genome analyzer with the quantitative assessment of protein-RNA interactions. This assay is able to analyze interactions between one or possibly several proteins with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of the EGFP and negative elongation factor subunit E (NELF-E) proteins with their corresponding canonical and mutant RNA aptamers. Here we provide a detailed protocol for HiTS-RAP that can be completed in about a month (8 d hands-on time). This includes the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, HiTS and protein binding with a GAIIx instrument, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, quantitative analysis of RNA on a massively parallel array (RNA-MaP) and RNA Bind-n-Seq (RBNS), for quantitative analysis of RNA-protein interactions.

  8. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  9. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  10. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  11. Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish

    USDA-ARS?s Scientific Manuscript database

    RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...

  12. Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

    PubMed Central

    Benslimane, A A; Dron, M; Hartmann, C; Rode, A

    1986-01-01

    Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. Images PMID:3774553

  13. An Improved Total RNA Extraction Method for White Jelly Mushroom Tremella fuciformis Rich in Polysaccharides

    PubMed Central

    Zhu, Hanyu; Sun, Xueyan; Liu, Dongmei; Zheng, Liesheng; Chen, Liguo

    2017-01-01

    An improved method for extracting high quality and quantity RNA from a jelly mushroom and a dimorphic fungus—Tremella fuciformis which is especially rich in polysaccharides, is described. RNA was extracted from T. fuciformis mycelium M1332 and its parental monokaryotic yeast-like cells Y13 and Y32. The A260/280 and A260/230 ratios were both approximately 2, and the RNA integrity number was larger than 8.9. The yields of RNA were between 108 and 213 µg/g fresh wt. Downstream molecular applications including reverse transcriptional PCR and quantitative real-time PCR were also performed. This protocol is reliable and may be widely applicable for total RNA extraction from other jelly mushrooms or filamentous fungi rich in polysaccharides. PMID:29371814

  14. A method for extracting high-quality total RNA from plant rich in polysaccharides and polyphenols using Dendrobium huoshanense

    PubMed Central

    Yu, Nianjun; Zhang, Wei; Xing, Lihua; Xie, Dongmei; Peng, Daiyin

    2018-01-01

    Acquiring high quality RNA is the basis of plant molecular biology research, plant genetics and other physiological investigations. At present, a large number of nucleotide isolation methods have been exploited or modified, such as commercial kits, CTAB, SDS methods and so on. Due to the nature of different plants, extraction methods vary. Moreover, efficiency of certain approach cannot be guaranteed due to composition of different plants and extracting high quality RNA from plants rich in polysaccharides and polyphenols are often difficult. The physical and chemical properties of polysaccharides which are similar to nucleic acids and other secondary metabolites will be coprecipitated with RNA irreversibly. Therefore, how to remove polysaccharides and other secondary metabolites during RNA extraction is the primary challenge. Dendrobium huoshanense is an Orchidaceae perennial herb that is rich in polysaccharides and other secondary metabolites. By using D. huoshanense as the subject, we improved the method originated from CHAN and made it suitable for plants containing high amount of polysaccharides and polyphenols. The extracted total RNA was clear and non-dispersive, with good integrity and no obvious contamination with DNA and other impurities. And it was also evaluated by gel electrophoresis, nucleic acid quantitative detector and PCR assessment. Thus, as a simple approach, it is suitable and efficient in RNA isolation for plants rich in polysaccharides and polyphenols. PMID:29715304

  15. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments

    PubMed Central

    Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

    2014-01-01

    This article introduces the SARA-Coffee web server; a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831

  16. Characterization of Human Salivary Extracellular RNA by Next-generation Sequencing.

    PubMed

    Li, Feng; Kaczor-Urbanowicz, Karolina Elżbieta; Sun, Jie; Majem, Blanca; Lo, Hsien-Chun; Kim, Yong; Koyano, Kikuye; Liu Rao, Shannon; Young Kang, So; Mi Kim, Su; Kim, Kyoung-Mee; Kim, Sung; Chia, David; Elashoff, David; Grogan, Tristan R; Xiao, Xinshu; Wong, David T W

    2018-04-23

    It was recently discovered that abundant and stable extracellular RNA (exRNA) species exist in bodily fluids. Saliva is an emerging biofluid for biomarker development for noninvasive detection and screening of local and systemic diseases. Use of RNA-Sequencing (RNA-Seq) to profile exRNA is rapidly growing; however, no single preparation and analysis protocol can be used for all biofluids. Specifically, RNA-Seq of saliva is particularly challenging owing to high abundance of bacterial contents and low abundance of salivary exRNA. Given the laborious procedures needed for RNA-Seq library construction, sequencing, data storage, and data analysis, saliva-specific and optimized protocols are essential. We compared different RNA isolation methods and library construction kits for long and small RNA sequencing. The role of ribosomal RNA (rRNA) depletion also was evaluated. The miRNeasy Micro Kit (Qiagen) showed the highest total RNA yield (70.8 ng/mL cell-free saliva) and best small RNA recovery, and the NEBNext library preparation kits resulted in the highest number of detected human genes [5649-6813 at 1 reads per kilobase RNA per million mapped (RPKM)] and small RNAs [482-696 microRNAs (miRNAs) and 190-214 other small RNAs]. The proportion of human RNA-Seq reads was much higher in rRNA-depleted saliva samples (41%) than in samples without rRNA depletion (14%). In addition, the transfer RNA (tRNA)-derived RNA fragments (tRFs), a novel class of small RNAs, were highly abundant in human saliva, specifically tRF-4 (4%) and tRF-5 (15.25%). Our results may help in selection of the best adapted methods of RNA isolation and small and long RNA library constructions for salivary exRNA studies. © 2018 American Association for Clinical Chemistry.

  17. Genomic maps of lincRNA occupancy reveal principles of RNA-chromatin interactions

    PubMed Central

    Chu, Ci; Qu, Kun; Zhong, Franklin; Artandi, Steven E.; Chang, Howard Y.

    2011-01-01

    SUMMARY Long intergenic noncoding RNAs (lincRNAs) are key regulators of chromatin state, yet the nature and sites of RNA-chromatin interaction are mostly unknown. Here we introduce Chromatin Isolation by RNA Purification (ChIRP), where tiling oligonucleotides retrieve specific lincRNAs with bound protein and DNA sequences, which are enumerated by deep sequencing. ChIRP-seq of three lincRNAs reveal that RNA occupancy sites in the genome are focal, sequence-specific, and numerous. Drosophila roX2 RNA occupies male X-linked gene bodies with increasing tendency toward the 3’ end, peaking at CES sites. Human telomerase RNA TERC occupies telomeres and Wnt pathway genes. HOTAIR lincRNA preferentially occupies a GA-rich DNA motif to nucleate broad domains of Polycomb occupancy and histone H3 lysine 27 trimethylation. HOTAIR occupancy occurs independently of EZH2, suggesting the order of RNA guidance of Polycomb occupancy. ChIRP-seq is generally applicable to illuminate the intersection of RNA and chromatin with newfound precision genome-wide. PMID:21963238

  18. High-throughput illumina strand-specific RNA sequencing library preparation

    USDA-ARS?s Scientific Manuscript database

    Conventional Illumina RNA-Seq does not have the resolution to decode the complex eukaryote transcriptome due to the lack of RNA polarity information. Strand-specific RNA sequencing (ssRNA-Seq) can overcome these limitations and as such is better suited for genome annotation, de novo transcriptome as...

  19. Nucleotide sequence and genetic organization of barley stripe mosaic virus RNA gamma.

    PubMed

    Gustafson, G; Hunter, B; Hanau, R; Armour, S L; Jackson, A O

    1987-06-01

    The complete nucleotide sequences of RNA gamma from the Type and ND18 strains of barley stripe mosaic virus (BSMV) have been determined. The sequences are 3164 (Type) and 2791 (ND18) nucleotides in length. Both sequences contain a 5'-noncoding region (87 or 88 nucleotides) which is followed by a long open reading frame (ORF1). A 42-nucleotide intercistronic region separates ORF1 from a second, shorter open reading frame (ORF2) located near the 3'-end of the RNA. There is a high degree of homology between the Type and ND18 strains in the nucleotide sequence of ORF1. However, the Type strain contains a 366 nucleotide direct tandem repeat within ORF1 which is absent in the ND18 strain. Consequently, the predicted translation product of Type RNA gamma ORF1 (mol wt 87,312) is significantly larger than that of ND18 RNA gamma ORF1 (mol wt 74,011). The amino acid sequence of the ORF1 polypeptide contains homologies with putative RNA polymerases from other RNA viruses, suggesting that this protein may function in replication of the BSMV genome. The nucleotide sequence of RNA gamma ORF2 is nearly identical in the Type and ND18 strains. ORF2 codes for a polypeptide with a predicted molecular weight of 17,209 (Type) or 17,074 (ND18) which is known to be translated from a subgenomic (sg) RNA. The initiation point of this sgRNA has been mapped to a location 27 nucleotides upstream of the ORF2 initiation codon in the intercistronic region between ORF1 and ORF2. The sgRNA is not coterminal with the 3'-end of the genomic RNA, but instead contains heterogeneous poly(A) termini up to 150 nucleotides long (J. Stanley, R. Hanau, and A. O. Jackson, 1984, Virology 139, 375-383). In the genomic RNA gamma, ORF2 is followed by a short poly(A) tract and a 238-nucleotide tRNA-like structure.

  20. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud

    PubMed Central

    Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.

    2015-01-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053

  1. Multifunctional G-Rich and RRM-Containing Domains of TbRGG2 Perform Separate yet Essential Functions in Trypanosome RNA Editing

    PubMed Central

    Foda, Bardees M.; Downey, Kurtis M.; Fisk, John C.

    2012-01-01

    Efficient editing of Trypanosoma brucei mitochondrial RNAs involves the actions of multiple accessory factors. T. brucei RGG2 (TbRGG2) is an essential protein crucial for initiation and 3′-to-5′ progression of editing. TbRGG2 comprises an N-terminal G-rich region containing GWG and RG repeats and a C-terminal RNA recognition motif (RRM)-containing domain. Here, we perform in vitro and in vivo separation-of-function studies to interrogate the mechanism of TbRGG2 action in RNA editing. TbRGG2 preferentially binds preedited mRNA in vitro with high affinity attributable to its G-rich region. RNA-annealing and -melting activities are separable, carried out primarily by the G-rich and RRM domains, respectively. In vivo, the G-rich domain partially complements TbRGG2 knockdown, but the RRM domain is also required. Notably, TbRGG2's RNA-melting activity is dispensable for RNA editing in vivo. Interactions between TbRGG2 and MRB1 complex proteins are mediated by both G-rich and RRM-containing domains, depending on the binding partner. Overall, our results are consistent with a model in which the high-affinity RNA binding and RNA-annealing activities of the G-rich domain are essential for RNA editing in vivo. The RRM domain may have key functions involving interactions with the MRB1 complex and/or regulation of the activities of the G-rich domain. PMID:22798390

  2. Diversity of thermophiles in a Malaysian hot spring determined using 16S rRNA and shotgun metagenome sequencing.

    PubMed

    Chan, Chia Sing; Chan, Kok-Gan; Tay, Yea-Ling; Chua, Yi-Heng; Goh, Kian Mau

    2015-01-01

    The Sungai Klah (SK) hot spring is the second hottest geothermal spring in Malaysia. This hot spring is a shallow, 150-m-long, fast-flowing stream, with temperatures varying from 50 to 110°C and a pH range of 7.0-9.0. Hidden within a wooded area, the SK hot spring is continually fed by plant litter, resulting in a relatively high degree of total organic content (TOC). In this study, a sample taken from the middle of the stream was analyzed at the 16S rRNA V3-V4 region by amplicon metagenome sequencing. Over 35 phyla were detected by analyzing the 16S rRNA data. Firmicutes and Proteobacteria represented approximately 57% of the microbiome. Approximately 70% of the detected thermophiles were strict anaerobes; however, Hydrogenobacter spp., obligate chemolithotrophic thermophiles, represented one of the major taxa. Several thermophilic photosynthetic microorganisms and acidothermophiles were also detected. Most of the phyla identified by 16S rRNA were also found using the shotgun metagenome approaches. The carbon, sulfur, and nitrogen metabolism within the SK hot spring community were evaluated by shotgun metagenome sequencing, and the data revealed diversity in terms of metabolic activity and dynamics. This hot spring has a rich diversified phylogenetic community partly due to its natural environment (plant litter, high TOC, and a shallow stream) and geochemical parameters (broad temperature and pH range). It is speculated that symbiotic relationships occur between the members of the community.

  3. Global regulation of alternative RNA splicing by the SR-rich protein RBM39.

    PubMed

    Mai, Sanyue; Qu, Xiuhua; Li, Ping; Ma, Qingjun; Cao, Cheng; Liu, Xuan

    2016-08-01

    RBM39 is a serine/arginine-rich RNA-binding protein that is highly homologous to the splicing factor U2AF65. However, the role of RBM39 in alternative splicing is poorly understood. In this study, RBM39-mediated global alternative splicing was investigated using RNA-Seq and genome-wide RBM39-RNA interactions were mapped via cross-linking and immunoprecipitation coupled with deep sequencing (CLIP-Seq) in wild-type and RBM39-knockdown MCF-7 cells. RBM39 was involved in the up- or down-regulation of the transcript levels of various genes. Hundreds of alternative splicing events regulated by endogenous RBM39 were identified. The majority of these events were cassette exons. Genes containing RBM39-regulated alternative exons were found to be linked to G2/M transition, cellular response to DNA damage, adherens junctions and endocytosis. CLIP-Seq analysis showed that the binding site of RBM39 was mainly in proximity to 5' and 3' splicing sites. Considerable RBM39 binding to mRNAs encoding proteins involved in translation was observed. Of particular importance, ~20% of the alternative splicing events that were significantly regulated by RBM39 were similarly regulated by U2AF65. RBM39 is extensively involved in alternative splicing of RNA and helps regulate transcript levels. RBM39 may modulate alternative splicing similarly to U2AF65 by either directly binding to RNA or recruiting other splicing factors, such as U2AF65. The current study offers a genome-wide view of RBM39's regulatory function in alternative splicing. RBM39 may play important roles in multiple cellular processes by regulating both alternative splicing of RNA molecules and transcript levels. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. StarScan: a web server for scanning small RNA targets from degradome sequencing data.

    PubMed

    Liu, Shun; Li, Jun-Hao; Wu, Jie; Zhou, Ke-Ren; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2015-07-01

    Endogenous small non-coding RNAs (sRNAs), including microRNAs, PIWI-interacting RNAs and small interfering RNAs, play important gene regulatory roles in animals and plants by pairing to the protein-coding and non-coding transcripts. However, computationally assigning these various sRNAs to their regulatory target genes remains technically challenging. Recently, a high-throughput degradome sequencing method was applied to identify biologically relevant sRNA cleavage sites. In this study, an integrated web-based tool, StarScan (sRNA target Scan), was developed for scanning sRNA targets using degradome sequencing data from 20 species. Given a sRNA sequence from plants or animals, our web server performs an ultrafast and exhaustive search for potential sRNA-target interactions in annotated and unannotated genomic regions. The interactions between small RNAs and target transcripts were further evaluated using a novel tool, alignScore. A novel tool, degradomeBinomTest, was developed to quantify the abundance of degradome fragments located at the 9-11th nucleotide from the sRNA 5' end. This is the first web server for discovering potential sRNA-mediated RNA cleavage events in plants and animals, which affords mechanistic insights into the regulatory roles of sRNAs. The StarScan web server is available at http://mirlab.sysu.edu.cn/starscan/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Small-RNA Deep Sequencing Reveals Arctium tomentosum as a Natural Host of Alstroemeria virus X and a New Putative Emaravirus

    PubMed Central

    Bi, Yaqi; Tugume, Arthur K.; Valkonen, Jari P. T.

    2012-01-01

    Background Arctium species (Asteraceae) are distributed worldwide and are used as food and rich sources of secondary metabolites for the pharmaceutical industry, e.g., against avian influenza virus. RNA silencing is an antiviral defense mechanism that detects and destroys virus-derived double-stranded RNA, resulting in accumulation of virus-derived small RNAs (21–24 nucleotides) that can be used for generic detection of viruses by small-RNA deep sequencing (SRDS). Methodology/Principal Findings SRDS was used to detect viruses in the biennial wild plant species Arctium tomentosum (woolly burdock; family Asteraceae) displaying virus-like symptoms of vein yellowing and leaf mosaic in southern Finland. Assembly of the small-RNA reads resulted in contigs homologous to Alstroemeria virus X (AlsVX), a positive/single-stranded RNA virus of genus Potexvirus (family Alphaflexiviridae), or related to negative/single-stranded RNA viruses of the genus Emaravirus. The coat protein gene of AlsVX was 81% and 89% identical to the two AlsVX isolates from Japan and Norway, respectively. The deduced, partial nucleocapsid protein amino acid sequence of the emara-like virus was only 78% or less identical to reported emaraviruses and showed no variability among the virus isolates characterized. This virus—tentatively named as Woolly burdock yellow vein virus—was exclusively associated with yellow vein and leaf mosaic symptoms in woolly burdock, whereas AlsVX was detected in only one of the 52 plants tested. Conclusions/Significance These results provide novel information about natural virus infections in Acrtium species and reveal woolly burdock as the first natural host of AlsVX besides Alstroemeria (family Alstroemeriaceae). Results also revealed a new virus related to the recently emerged Emaravirus genus and demonstrated applicability of SRDS to detect negative-strand RNA viruses. SRDS potentiates virus surveys of wild plants, a research area underrepresented in plant virology

  6. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    PubMed

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57 nucleotides long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases for their DNA analogues is not able of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence, and, thus, is not an aptamer and cannot be used neither for in vivo nor in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. A HuD-ZBP1 ribonucleoprotein complex localizes GAP-43 mRNA into axons through its 3' untranslated region AU-rich regulatory element.

    PubMed

    Yoo, Soonmoon; Kim, Hak H; Kim, Paul; Donnelly, Christopher J; Kalinski, Ashley L; Vuppalanchi, Deepika; Park, Michael; Lee, Seung J; Merianda, Tanuja T; Perrone-Bizzozero, Nora I; Twiss, Jeffery L

    2013-09-01

    Localized translation of axonal mRNAs contributes to developmental and regenerative axon growth. Although untranslated regions (UTRs) of many different axonal mRNAs appear to drive their localization, there has been no consensus RNA structure responsible for this localization. We recently showed that limited expression of ZBP1 protein restricts axonal localization of both β-actin and GAP-43 mRNAs. β-actin 3'UTR has a defined element for interaction with ZBP1, but GAP-43 mRNA shows no homology to this RNA sequence. Here, we show that an AU-rich regulatory element (ARE) in GAP-43's 3'UTR is necessary and sufficient for its axonal localization. Axonal GAP-43 mRNA levels increase after in vivo injury, and GAP-43 mRNA shows an increased half-life in regenerating axons. GAP-43 mRNA interacts with both HuD and ZBP1, and HuD and ZBP1 co-immunoprecipitate in an RNA-dependent fashion. Reporter mRNA with the GAP-43 ARE competes with endogenous β-actin mRNA for axonal localization and decreases axon length and branching similar to the β-actin 3'UTR competing with endogenous GAP-43 mRNA. Conversely, over-expressing GAP-43 coding sequence with its 3'UTR ARE increases axonal elongation and this effect is lost when just the ARE is deleted from GAP-43's 3'UTR. We have recently found that over-expression of GAP-43 using an axonally targeted construct with the 3'UTRs of GAP-43 promoted elongating growth of axons, while restricting the mRNA to the cell body with the 3'UTR of γ-actin had minimal effect on axon length. In this study, we show that the ARE in GAP-43's 3'UTR is responsible for localization of GAP-43 mRNA into axons and is sufficient for GAP-43 protein's role in elongating axonal growth. © 2013 International Society for Neurochemistry.

  8. Exploring Connectivity in Sequence Space of Functional RNA

    NASA Technical Reports Server (NTRS)

    Wei, Chenyu; Pohorille, Andrzej; Popovic, Milena; Ditzler, Mark

    2017-01-01

    Emergence of replicable genetic molecules was one of the marking points in the origin of life, evolution of which can be conceptualized as a walk through the space of all possible sequences. A theoretical concept of fitness landscape helps to understand evolutionary processes through assigning a value of fitness to each genotype. Then, evolution of a phenotype is viewed as a series of consecutive, single-point mutations. Natural selection biases evolution toward peaks of high fitness and away from valleys of low fitness. whereas neutral drift occurs in the sequence space without direction as mutations are introduced at random. Large networks of neutral or near-neutral mutations on a fitness landscape, especially for sufficiently long genomes, are possible or even inevitable. Their detection in experiments, however, has been elusive. Although a few near-neutral evolutionary pathways have been found, recent experimental evidence indicates landscapes consist of largely isolated islands. The generality of these results, however, is not clear, as the genome length or the fraction of functional molecules in the genotypic space might have been insufficient for the emergence of large, neutral networks. Thorough investigation on the structure of the fitness landscape is essential to understand the mechanisms of evolution of early genomes. RNA molecules are commonly assumed to play the pivotal role in the origin of genetic systems. They are widely believed to be early, if not the earliest, genetic and catalytic molecules, with abundant biochemical activities as aptamers and ribozymes, i.e. RNA molecules capable, respectively, to bind small molecules or catalyze chemical reactions. Here, we present results of our recent studies on the structure of the sequence space of RNA ligase ribozymes selected through in vitro evolution. Several hundred thousands of sequences active to a different degree were obtained by way of deep sequencing. Analysis of these sequences revealed

  9. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications.

    PubMed

    Haque, Ashraful; Engel, Jessica; Teichmann, Sarah A; Lönnberg, Tapio

    2017-08-18

    RNA sequencing (RNA-seq) is a genomic approach for the detection and quantitative analysis of messenger RNA molecules in a biological sample and is useful for studying cellular responses. RNA-seq has fueled much discovery and innovation in medicine over recent years. For practical reasons, the technique is usually conducted on samples comprising thousands to millions of cells. However, this has hindered direct assessment of the fundamental unit of biology-the cell. Since the first single-cell RNA-sequencing (scRNA-seq) study was published in 2009, many more have been conducted, mostly by specialist laboratories with unique skills in wet-lab single-cell genomics, bioinformatics, and computation. However, with the increasing commercial availability of scRNA-seq platforms, and the rapid ongoing maturation of bioinformatics approaches, a point has been reached where any biomedical researcher or clinician can use scRNA-seq to make exciting discoveries. In this review, we present a practical guide to help researchers design their first scRNA-seq studies, including introductory information on experimental hardware, protocol choice, quality control, data analysis and biological interpretation.

  10. Effects of RNA integrity on transcript quantification by total RNA sequencing of clinically collected human placental samples.

    PubMed

    Reiman, Mario; Laan, Maris; Rull, Kristiina; Sõber, Siim

    2017-08-01

    RNA degradation is a ubiquitous process that occurs in living and dead cells, as well as during handling and storage of extracted RNA. Reduced RNA quality caused by degradation is an established source of uncertainty for all RNA-based gene expression quantification techniques. RNA sequencing is an increasingly preferred method for transcriptome analyses, and dependence of its results on input RNA integrity is of significant practical importance. This study aimed to characterize the effects of varying input RNA integrity [estimated as RNA integrity number (RIN)] on transcript level estimates and delineate the characteristic differences between transcripts that differ in degradation rate. The study used ribodepleted total RNA sequencing data from a real-life clinically collected set ( n = 32) of human solid tissue (placenta) samples. RIN-dependent alterations in gene expression profiles were quantified by using DESeq2 software. Our results indicate that small differences in RNA integrity affect gene expression quantification by introducing a moderate and pervasive bias in expression level estimates that significantly affected 8.1% of studied genes. The rapidly degrading transcript pool was enriched in pseudogenes, short noncoding RNAs, and transcripts with extended 3' untranslated regions. Typical slowly degrading transcripts (median length, 2389 nt) represented protein coding genes with 4-10 exons and high guanine-cytosine content.-Reiman, M., Laan, M., Rull, K., Sõber, S. Effects of RNA integrity on transcript quantification by total RNA sequencing of clinically collected human placental samples. © FASEB.

  11. Rare Cell Detection by Single-Cell RNA Sequencing as Guided by Single-Molecule RNA FISH.

    PubMed

    Torre, Eduardo; Dueck, Hannah; Shaffer, Sydney; Gospocic, Janko; Gupte, Rohit; Bonasio, Roberto; Kim, Junhyong; Murray, John; Raj, Arjun

    2018-02-28

    Although single-cell RNA sequencing can reliably detect large-scale transcriptional programs, it is unclear whether it accurately captures the behavior of individual genes, especially those that express only in rare cells. Here, we use single-molecule RNA fluorescence in situ hybridization as a gold standard to assess trade-offs in single-cell RNA-sequencing data for detecting rare cell expression variability. We quantified the gene expression distribution for 26 genes that range from ubiquitous to rarely expressed and found that the correspondence between estimates across platforms improved with both transcriptome coverage and increased number of cells analyzed. Further, by characterizing the trade-off between transcriptome coverage and number of cells analyzed, we show that when the number of genes required to answer a given biological question is small, then greater transcriptome coverage is more important than analyzing large numbers of cells. More generally, our report provides guidelines for selecting quality thresholds for single-cell RNA-sequencing experiments aimed at rare cell analyses. Copyright © 2018 Elsevier Inc. All rights reserved.

  12. Sequence-specific inhibition of Dicer measured with a force-based microarray for RNA ligands.

    PubMed

    Limmer, Katja; Aschenbrenner, Daniela; Gaub, Hermann E

    2013-04-01

    Malfunction of protein translation causes many severe diseases, and suitable correction strategies may become the basis of effective therapies. One major regulatory element of protein translation is the nuclease Dicer that cuts double-stranded RNA independently of the sequence into pieces of 19-22 base pairs starting the RNA interference pathway and activating miRNAs. Inhibiting Dicer is not desirable owing to its multifunctional influence on the cell's gene regulation. Blocking specific RNA sequences by small-molecule binding, however, is a promising approach to affect the cell's condition in a controlled manner. A label-free assay for the screening of site-specific interference of small molecules with Dicer activity is thus needed. We used the Molecular Force Assay (MFA), recently developed in our lab, to measure the activity of Dicer. As a model system, we used an RNA sequence that forms an aptamer-binding site for paromomycin, a 615-dalton aminoglycoside. We show that Dicer activity is modulated as a function of concentration and incubation time: the addition of paromomycin leads to a decrease of Dicer activity according to the amount of ligand. The measured dissociation constant of paromomycin to its aptamer was found to agree well with literature values. The parallel format of the MFA allows a large-scale search and analysis for ligands for any RNA sequence.

  13. Investigation of Microbial Diversity in Geothermal Hot Springs in Unkeshwar, India, Based on 16S rRNA Amplicon Metagenome Sequencing

    PubMed Central

    Mehetre, Gajanan T.; Paranjpe, Aditi; Dastager, Syed G.

    2016-01-01

    Microbial diversity in geothermal waters of the Unkeshwar hot springs in Maharashtra, India, was studied using 16S rRNA amplicon metagenomic sequencing. Taxonomic analysis revealed the presence of Bacteroidetes, Proteobacteria, Cyanobacteria, Actinobacteria, Archeae, and OD1 phyla. Metabolic function prediction analysis indicated a battery of biological information systems indicating rich and novel microbial diversity, with potential biotechnological applications in this niche. PMID:26950332

  14. Mechanism of transcription termination by RNA polymerase III utilizes a nontemplate-strand sequence-specific signal element

    PubMed Central

    Arimbasseri, Aneeshkumar G.; Maraia, Richard J.

    2015-01-01

    SUMMARY Understanding the mechanism of transcription termination by a eukaryotic RNA polymerase (RNAP) has been limited by lack of a characterizable intermediate that reflects transition from an elongation complex to a true termination event. While other multisubunit RNAPs require multipartite cis-signals and/or ancillary factors to mediate pausing and release of the nascent transcript from the clutches of these enzymes, RNAP III does so with precision and efficiency on a simple oligo(dT) tract, independent of other cis-elements or trans-factors. We report a RNAP III pre-termination complex that reveals termination mechanisms controlled by sequence-specific elements in the non-template strand. Furthermore, the TFIIF-like, RNAP III subunit, C37 is required for this function of the non-template strand signal. The results reveal the RNAP III terminator as an information-rich control element. While the template strand promotes destabilization via a weak oligo(rU:dA) hybrid, the non-template strand provides distinct sequence-specific destabilizing information through interactions with the C37 subunit. PMID:25959395

  15. Special AT-rich sequence binding protein 1 promotes tumor growth and metastasis of esophageal squamous cell carcinoma.

    PubMed

    Ma, Jun; Wu, Kaiming; Zhao, Zhenxian; Miao, Rong; Xu, Zhe

    2017-03-01

    Esophageal squamous cell carcinoma is one of the most aggressive malignancies worldwide. Special AT-rich sequence binding protein 1 is a nuclear matrix attachment region binding protein which participates in higher order chromatin organization and tissue-specific gene expression. However, the role of special AT-rich sequence binding protein 1 in esophageal squamous cell carcinoma remains unknown. In this study, western blot and quantitative real-time polymerase chain reaction analysis were performed to identify differentially expressed special AT-rich sequence binding protein 1 in a series of esophageal squamous cell carcinoma tissue samples. The effects of special AT-rich sequence binding protein 1 silencing by two short-hairpin RNAs on cell proliferation, migration, and invasion were assessed by the CCK-8 assay and transwell assays in esophageal squamous cell carcinoma in vitro. Special AT-rich sequence binding protein 1 was significantly upregulated in esophageal squamous cell carcinoma tissue samples and cell lines. Silencing of special AT-rich sequence binding protein 1 inhibited the proliferation of KYSE450 and EC9706 cells which have a relatively high level of special AT-rich sequence binding protein 1, and the ability of migration and invasion of KYSE450 and EC9706 cells was distinctly suppressed. Special AT-rich sequence binding protein 1 could be a potential target for the treatment of esophageal squamous cell carcinoma and inhibition of special AT-rich sequence binding protein 1 may provide a new strategy for the prevention of esophageal squamous cell carcinoma invasion and metastasis.

  16. Side chain-side chain interactions of arginine with tyrosine and aspartic acid in Arg/Gly/Tyr-rich domains within plant glycine-rich RNA binding proteins.

    PubMed

    Kumaki, Yasuhiro; Nitta, Katsutoshi; Hikichi, Kunio; Matsumoto, Takeshi; Matsushima, Norio

    2004-07-01

    Plant glycine-rich RNA-binding proteins (GRRBPs) contain a glycine-rich region at the C-terminus whose structure is quite unknown. The C-terminal glycine-rich part is interposed with arginine and tyrosine (arginine/glycine/tyrosine (RGY)-rich domain). Comparative sequence analysis of forty-one GRRBPs revealed that the RGY-rich domain contains multiple repeats of Tyr-(Xaa)h-(Arg)k-(Xaa)l, where Xaa is mainly Gly, "k" is 1 or 2, and "h" and "l" range from 0 to 10. Two peptides, 1 (G1G2Y3G4G5G6R7R8D9G10) and 2 (G1G2R3R4D5G6G7Y8G9G10), corresponding to sections of the RGY-rich domain in Zea mays RAB15, were selected for CD and NMR experiments. The CD spectra indicate a unique, positive band near 228 nm in both peptides that has been ascribed to tyrosine residues in ordered structures. The pH titration by NMR revealed that a side chain-side chain interaction, presumably an H-Nepsilon...O=Cgamma hydrogen bonding interaction in the salt bridge, occurs between Arg (i) and Asp (i + 2). 1D GOESY experiments indicated the presence of NOE between the aromatic side chain proton and the arginine side chain proton in the two peptides suggesting strongly that the Arg (i) aromatic side chain interacts directly with the Tyr (i +/- 4 or i +/- 5) side chain.

  17. Study design requirements for RNA sequencing-based breast cancer diagnostics.

    PubMed

    Mer, Arvind Singh; Klevebring, Daniel; Grönberg, Henrik; Rantalainen, Mattias

    2016-02-01

    Sequencing-based molecular characterization of tumors provides information required for individualized cancer treatment. There are well-defined molecular subtypes of breast cancer that provide improved prognostication compared to routine biomarkers. However, molecular subtyping is not yet implemented in routine breast cancer care. Clinical translation is dependent on subtype prediction models providing high sensitivity and specificity. In this study we evaluate sample size and RNA-sequencing read requirements for breast cancer subtyping to facilitate rational design of translational studies. We applied subsampling to ascertain the effect of training sample size and the number of RNA sequencing reads on classification accuracy of molecular subtype and routine biomarker prediction models (unsupervised and supervised). Subtype classification accuracy improved with increasing sample size up to N = 750 (accuracy = 0.93), although with a modest improvement beyond N = 350 (accuracy = 0.92). Prediction of routine biomarkers achieved accuracy of 0.94 (ER) and 0.92 (Her2) at N = 200. Subtype classification improved with RNA-sequencing library size up to 5 million reads. Development of molecular subtyping models for cancer diagnostics requires well-designed studies. Sample size and the number of RNA sequencing reads directly influence accuracy of molecular subtyping. Results in this study provide key information for rational design of translational studies aiming to bring sequencing-based diagnostics to the clinic.

  18. DNA/RNA hybrid substrates modulate the catalytic activity of purified AID.

    PubMed

    Abdouni, Hala S; King, Justin J; Ghorbani, Atefeh; Fifield, Heather; Berghuis, Lesley; Larijani, Mani

    2018-01-01

    Activation-induced cytidine deaminase (AID) converts cytidine to uridine at Immunoglobulin (Ig) loci, initiating somatic hypermutation and class switching of antibodies. In vitro, AID acts on single stranded DNA (ssDNA), but neither double-stranded DNA (dsDNA) oligonucleotides nor RNA, and it is believed that transcription is the in vivo generator of ssDNA targeted by AID. It is also known that the Ig loci, particularly the switch (S) regions targeted by AID are rich in transcription-generated DNA/RNA hybrids. Here, we examined the binding and catalytic behavior of purified AID on DNA/RNA hybrid substrates bearing either random sequences or GC-rich sequences simulating Ig S regions. If substrates were made up of a random sequence, AID preferred substrates composed entirely of DNA over DNA/RNA hybrids. In contrast, if substrates were composed of S region sequences, AID preferred to mutate DNA/RNA hybrids over substrates composed entirely of DNA. Accordingly, AID exhibited a significantly higher affinity for binding DNA/RNA hybrid substrates composed specifically of S region sequences, than any other substrates composed of DNA. Thus, in the absence of any other cellular processes or factors, AID itself favors binding and mutating DNA/RNA hybrids composed of S region sequences. AID:DNA/RNA complex formation and supporting mutational analyses suggest that recognition of DNA/RNA hybrids is an inherent structural property of AID. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Uncultivated Microbial Eukaryotic Diversity: A Method to Link ssu rRNA Gene Sequences with Morphology

    PubMed Central

    Hirst, Marissa B.; Kita, Kelley N.; Dawson, Scott C.

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA “phylotypes” from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages

  20. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    NASA Astrophysics Data System (ADS)

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M.; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A. C. T.; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M.; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  1. Single Cell Total RNA Sequencing through Isothermal Amplification in Picoliter-Droplet Emulsion.

    PubMed

    Fu, Yusi; Chen, He; Liu, Lu; Huang, Yanyi

    2016-11-15

    Prevalent single cell RNA amplification and sequencing chemistries mainly focus on polyadenylated RNAs in eukaryotic cells by using oligo(dT) primers for reverse transcription. We develop a new RNA amplification method, "easier-seq", to reverse transcribe and amplify the total RNAs, both with and without polyadenylate tails, from a single cell for transcriptome sequencing with high efficiency, reproducibility, and accuracy. By distributing the reverse transcribed cDNA molecules into 1.5 × 10 5 aqueous droplets in oil, the cDNAs are isothermally amplified using random primers in each of these 65-pL reactors separately. This new method greatly improves the ease of single-cell RNA sequencing by reducing the experimental steps. Meanwhile, with less chance to induce errors, this method can easily maintain the quality of single-cell sequencing. In addition, this polyadenylate-tail-independent method can be seamlessly applied to prokaryotic cell RNA sequencing.

  2. Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

    PubMed Central

    Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

    2016-01-01

    Because RNA-protein interactions play a central role in a wide-array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240

  3. Sequence, Structure, and Context Preferences of Human RNA Binding Proteins.

    PubMed

    Dominguez, Daniel; Freese, Peter; Alexis, Maria S; Su, Amanda; Hochman, Myles; Palden, Tsultrim; Bazile, Cassandra; Lambert, Nicole J; Van Nostrand, Eric L; Pratt, Gabriel A; Yeo, Gene W; Graveley, Brenton R; Burge, Christopher B

    2018-06-07

    RNA binding proteins (RBPs) orchestrate the production, processing, and function of mRNAs. Here, we present the affinity landscapes of 78 human RBPs using an unbiased assay that determines the sequence, structure, and context preferences of these proteins in vitro by deep sequencing of bound RNAs. These data enable construction of "RNA maps" of RBP activity without requiring crosslinking-based assays. We found an unexpectedly low diversity of RNA motifs, implying frequent convergence of binding specificity toward a relatively small set of RNA motifs, many with low compositional complexity. Offsetting this trend, however, we observed extensive preferences for contextual features distinct from short linear RNA motifs, including spaced "bipartite" motifs, biased flanking nucleotide composition, and bias away from or toward RNA structure. Our results emphasize the importance of contextual features in RNA recognition, which likely enable targeting of distinct subsets of transcripts by different RBPs that recognize the same linear motif. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  4. A high-throughput RNA extraction for sprouted single-seed malting barley (Hordeum vulgare L.) rich in polysaccharides

    USDA-ARS?s Scientific Manuscript database

    Germinated seed from cereal crops including barley (Hordeum vulgare L.) is an important tissue to extract RNA and analyze expression levels of genes that control aspects of germination. These tissues are rich in polysaccharides and most methods for RNA extraction are not suitable to handle the exces...

  5. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    USDA-ARS?s Scientific Manuscript database

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  6. Short intronic repeat sequences facilitate circular RNA production.

    PubMed

    Liang, Dongming; Wilusz, Jeremy E

    2014-10-15

    Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼ 30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3' end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. © 2014 Liang and Wilusz; Published by Cold Spring Harbor Laboratory Press.

  7. Short intronic repeat sequences facilitate circular RNA production

    PubMed Central

    Liang, Dongming

    2014-01-01

    Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery “backsplices” and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3′ end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. PMID:25281217

  8. The crystal structure of an oligo(U):pre-mRNA duplex from a trypanosome RNA editing substrate

    PubMed Central

    Mooers, Blaine H.M.; Singh, Amritanshu

    2011-01-01

    Guide RNAs bind antiparallel to their target pre-mRNAs to form editing substrates in reaction cycles that insert or delete uridylates (Us) in most mitochondrial transcripts of trypanosomes. The 5′ end of each guide RNA has an anchor sequence that binds to the pre-mRNA by base-pair complementarity. The template sequence in the middle of the guide RNA directs the editing reactions. The 3′ ends of most guide RNAs have ∼15 contiguous Us that bind to the purine-rich unedited pre-mRNA upstream of the editing site. The resulting U-helix is rich in G·U wobble base pairs. To gain insights into the structure of the U-helix, we crystallized 8 bp of the U-helix in one editing substrate for the A6 mRNA of Trypanosoma brucei. The fragment provides three samples of the 5′-AGA-3′/5′-UUU-3′ base-pair triple. The fusion of two identical U-helices head-to-head promoted crystallization. We obtained X-ray diffraction data with a resolution limit of 1.37 Å. The U-helix had low and high twist angles before and after each G·U wobble base pair; this variation was partly due to shearing of the wobble base pairs as revealed in comparisons with a crystal structure of a 16-nt RNA with all Watson–Crick base pairs. Both crystal structures had wider major grooves at the junction between the poly(U) and polypurine tracts. This junction mimics the junction between the template helix and the U-helix in RNA-editing substrates and may be a site of major groove invasion by RNA editing proteins. PMID:21878548

  9. miRanalyzer: a microRNA detection and analysis tool for next-generation sequencing experiments.

    PubMed

    Hackenberg, Michael; Sturm, Martin; Langenberger, David; Falcón-Pérez, Juan Manuel; Aransay, Ana M

    2009-07-01

    Next-generation sequencing allows now the sequencing of small RNA molecules and the estimation of their expression levels. Consequently, there will be a high demand of bioinformatics tools to cope with the several gigabytes of sequence data generated in each single deep-sequencing experiment. Given this scene, we developed miRanalyzer, a web server tool for the analysis of deep-sequencing experiments for small RNAs. The web server tool requires a simple input file containing a list of unique reads and its copy numbers (expression levels). Using these data, miRanalyzer (i) detects all known microRNA sequences annotated in miRBase, (ii) finds all perfect matches against other libraries of transcribed sequences and (iii) predicts new microRNAs. The prediction of new microRNAs is an especially important point as there are many species with very few known microRNAs. Therefore, we implemented a highly accurate machine learning algorithm for the prediction of new microRNAs that reaches AUC values of 97.9% and recall values of up to 75% on unseen data. The web tool summarizes all the described steps in a single output page, which provides a comprehensive overview of the analysis, adding links to more detailed output pages for each analysis module. miRanalyzer is available at http://web.bioinformatics.cicbiogune.es/microRNA/.

  10. A novel class of plant-specific zinc-dependent DNA-binding protein that binds to A/T-rich DNA sequences

    PubMed Central

    Nagano, Yukio; Furuhashi, Hirofumi; Inaba, Takehito; Sasaki, Yukiko

    2001-01-01

    Complementary DNA encoding a DNA-binding protein, designated PLATZ1 (plant AT-rich sequence- and zinc-binding protein 1), was isolated from peas. The amino acid sequence of the protein is similar to those of other uncharacterized proteins predicted from the genome sequences of higher plants. However, no paralogous sequences have been found outside the plant kingdom. Multiple alignments among these paralogous proteins show that several cysteine and histidine residues are invariant, suggesting that these proteins are a novel class of zinc-dependent DNA-binding proteins with two distantly located regions, C-x2-H-x11-C-x2-C-x(4–5)-C-x2-C-x(3–7)-H-x2-H and C-x2-C-x(10–11)-C-x3-C. In an electrophoretic mobility shift assay, the zinc chelator 1,10-o-phenanthroline inhibited DNA binding, and two distant zinc-binding regions were required for DNA binding. A protein blot with 65ZnCl2 showed that both regions are required for zinc-binding activity. The PLATZ1 protein non-specifically binds to A/T-rich sequences, including the upstream region of the pea GTPase pra2 and plastocyanin petE genes. Expression of the PLATZ1 repressed those of the reporter constructs containing the coding sequence of luciferase gene driven by the cauliflower mosaic virus (CaMV) 35S90 promoter fused to the tandem repeat of the A/T-rich sequences. These results indicate that PLATZ1 is a novel class of plant-specific zinc-dependent DNA-binding protein responsible for A/T-rich sequence-mediated transcriptional repression. PMID:11600698

  11. Diversity of thermophiles in a Malaysian hot spring determined using 16S rRNA and shotgun metagenome sequencing

    PubMed Central

    Chan, Chia Sing; Chan, Kok-Gan; Tay, Yea-Ling; Chua, Yi-Heng; Goh, Kian Mau

    2015-01-01

    The Sungai Klah (SK) hot spring is the second hottest geothermal spring in Malaysia. This hot spring is a shallow, 150-m-long, fast-flowing stream, with temperatures varying from 50 to 110°C and a pH range of 7.0–9.0. Hidden within a wooded area, the SK hot spring is continually fed by plant litter, resulting in a relatively high degree of total organic content (TOC). In this study, a sample taken from the middle of the stream was analyzed at the 16S rRNA V3-V4 region by amplicon metagenome sequencing. Over 35 phyla were detected by analyzing the 16S rRNA data. Firmicutes and Proteobacteria represented approximately 57% of the microbiome. Approximately 70% of the detected thermophiles were strict anaerobes; however, Hydrogenobacter spp., obligate chemolithotrophic thermophiles, represented one of the major taxa. Several thermophilic photosynthetic microorganisms and acidothermophiles were also detected. Most of the phyla identified by 16S rRNA were also found using the shotgun metagenome approaches. The carbon, sulfur, and nitrogen metabolism within the SK hot spring community were evaluated by shotgun metagenome sequencing, and the data revealed diversity in terms of metabolic activity and dynamics. This hot spring has a rich diversified phylogenetic community partly due to its natural environment (plant litter, high TOC, and a shallow stream) and geochemical parameters (broad temperature and pH range). It is speculated that symbiotic relationships occur between the members of the community. PMID:25798135

  12. Advancing viral RNA structure prediction: measuring the thermodynamics of pyrimidine-rich internal loops

    PubMed Central

    Phan, Andy; Mailey, Katherine; Saeki, Jessica; Gu, Xiaobo

    2017-01-01

    Accurate thermodynamic parameters improve RNA structure predictions and thus accelerate understanding of RNA function and the identification of RNA drug binding sites. Many viral RNA structures, such as internal ribosome entry sites, have internal loops and bulges that are potential drug target sites. Current models used to predict internal loops are biased toward small, symmetric purine loops, and thus poorly predict asymmetric, pyrimidine-rich loops with >6 nucleotides (nt) that occur frequently in viral RNA. This article presents new thermodynamic data for 40 pyrimidine loops, many of which can form UU or protonated CC base pairs. Uracil and protonated cytosine base pairs stabilize asymmetric internal loops. Accurate prediction rules are presented that account for all thermodynamic measurements of RNA asymmetric internal loops. New loop initiation terms for loops with >6 nt are presented that do not follow previous assumptions that increasing asymmetry destabilizes loops. Since the last 2004 update, 126 new loops with asymmetry or sizes greater than 2 × 2 have been measured. These new measurements significantly deepen and diversify the thermodynamic database for RNA. These results will help better predict internal loops that are larger, pyrimidine-rich, and occur within viral structures such as internal ribosome entry sites. PMID:28213527

  13. Advances in single-cell RNA sequencing and its applications in cancer research.

    PubMed

    Zhu, Sibo; Qing, Tao; Zheng, Yuanting; Jin, Li; Shi, Leming

    2017-08-08

    Unlike population-level approaches, single-cell RNA sequencing enables transcriptomic analysis of an individual cell. Through the combination of high-throughput sequencing and bioinformatic tools, single-cell RNA-seq can detect more than 10,000 transcripts in one cell to distinguish cell subsets and dynamic cellular changes. After several years' development, single-cell RNA-seq can now achieve massively parallel, full-length mRNA sequencing as well as in situ sequencing and even has potential for multi-omic detection. One appealing area of single-cell RNA-seq is cancer research, and it is regarded as a promising way to enhance prognosis and provide more precise target therapy by identifying druggable subclones. Indeed, progresses have been made regarding solid tumor analysis to reveal intratumoral heterogeneity, correlations between signaling pathways, stemness, drug resistance, and tumor architecture shaping the microenvironment. Furthermore, through investigation into circulating tumor cells, many genes have been shown to promote a propensity toward stemness and the epithelial-mesenchymal transition, to enhance anchoring and adhesion, and to be involved in mechanisms of anoikis resistance and drug resistance. This review focuses on advances and progresses of single-cell RNA-seq with regard to the following aspects: 1. Methodologies of single-cell RNA-seq 2. Single-cell isolation techniques 3. Single-cell RNA-seq in solid tumor research 4. Single-cell RNA-seq in circulating tumor cell research 5.

  14. Advances in single-cell RNA sequencing and its applications in cancer research

    PubMed Central

    Zhu, Sibo; Qing, Tao; Zheng, Yuanting; Jin, Li; Shi, Leming

    2017-01-01

    Unlike population-level approaches, single-cell RNA sequencing enables transcriptomic analysis of an individual cell. Through the combination of high-throughput sequencing and bioinformatic tools, single-cell RNA-seq can detect more than 10,000 transcripts in one cell to distinguish cell subsets and dynamic cellular changes. After several years’ development, single-cell RNA-seq can now achieve massively parallel, full-length mRNA sequencing as well as in situ sequencing and even has potential for multi-omic detection. One appealing area of single-cell RNA-seq is cancer research, and it is regarded as a promising way to enhance prognosis and provide more precise target therapy by identifying druggable subclones. Indeed, progresses have been made regarding solid tumor analysis to reveal intratumoral heterogeneity, correlations between signaling pathways, stemness, drug resistance, and tumor architecture shaping the microenvironment. Furthermore, through investigation into circulating tumor cells, many genes have been shown to promote a propensity toward stemness and the epithelial-mesenchymal transition, to enhance anchoring and adhesion, and to be involved in mechanisms of anoikis resistance and drug resistance. This review focuses on advances and progresses of single-cell RNA-seq with regard to the following aspects: 1. Methodologies of single-cell RNA-seq 2. Single-cell isolation techniques 3. Single-cell RNA-seq in solid tumor research 4. Single-cell RNA-seq in circulating tumor cell research 5. Perspectives PMID:28881849

  15. ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data

    PubMed Central

    Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

    2017-01-01

    Abstract RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. PMID:28977546

  16. The nucleotide sequence of 5S rRNA from a cellular slime mold Dictyostelium discoideum.

    PubMed Central

    Hori, H; Osawa, S; Iwabuchi, M

    1980-01-01

    The nucleotide sequence of ribosomal 5S rRNA from a cellular slime mold Dictyostelium discoideum is GUAUACGGCCAUACUAGGUUGGAAACACAUCAUCCCGUUCGAUCUGAUA AGUAAAUCGACCUCAGGCCUUCCAAGUACUCUGGUUGGAGACAACAGGGGAACAUAGGGUGCUGUAUACU. A model for the secondary structure of this 5S rRNA is proposed. The sequence is more similar to those of animals (62% similarity on the average) rather than those of yeasts (56%). Images PMID:7465421

  17. Synchronous detection of ebolavirus conserved RNA sequences and ebolavirus-encoded miRNA-like fragment based on a zwitterionic copper (II) metal-organic framework.

    PubMed

    Qiu, Gui-Hua; Weng, Zi-Hua; Hu, Pei-Pei; Duan, Wen-Jun; Xie, Bao-Ping; Sun, Bin; Tang, Xiao-Yan; Chen, Jin-Xiang

    2018-04-01

    From a three-dimensional (3D) metal-organic framework (MOF) of {[Cu(Cmdcp)(phen)(H 2 O)] 2 ·9H 2 O} n (1, H 3 CmdcpBr = N-carboxymethyl-(3,5-dicarboxyl)pyridinium bromide, phen = phenanthroline), a sensitive and selective fluorescence sensor has been developed for the simultaneous detection of ebolavirus conserved RNA sequences and ebolavirus-encoded microRNA-like (miRNA-like) fragment. The results from molecular dynamics simulation confirmed that MOF 1 absorbs carboxyfluorescein (FAM)-tagged and 5(6)-carboxyrhodamine, triethylammonium salt (ROX)-tagged probe ss-DNA (probe DNA, P-DNA) by π … π stacking and hydrogen bonding, as well as additional electrostatic interactions to form a sensing platform of P-DNAs@1 with quenched FAM and ROX fluorescence. In the presence of targeted ebolavirus conserved RNA sequences or ebolavirus-encoded miRNA-like fragment, the fluorophore-labeled P-DNA hybridizes with the analyte to give a P-DNA@RNA duplex and released from MOF 1, triggering a fluorescence recovery. Simultaneous detection of two target RNAs has also been realized by single and synchronous fluorescence analysis. The formed sensing platform shows high sensitivity for ebolavirus conserved RNA sequences and ebolavirus-encoded miRNA-like fragment with detection limits at the picomolar level and high selectivity without cross-reaction between the two probes. MOF 1 thus shows the potential as an effective fluorescent sensing platform for the synchronous detection of two ebolavirus-related sequences, and offer improved diagnostic accuracy of Ebola virus disease. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Optimized approach for Ion Proton RNA sequencing reveals details of RNA splicing and editing features of the transcriptome.

    PubMed

    Brown, Roger B; Madrid, Nathaniel J; Suzuki, Hideaki; Ness, Scott A

    2017-01-01

    RNA-sequencing (RNA-seq) has become the standard method for unbiased analysis of gene expression but also provides access to more complex transcriptome features, including alternative RNA splicing, RNA editing, and even detection of fusion transcripts formed through chromosomal translocations. However, differences in library methods can adversely affect the ability to recover these different types of transcriptome data. For example, some methods have bias for one end of transcripts or rely on low-efficiency steps that limit the complexity of the resulting library, making detection of rare transcripts less likely. We tested several commonly used methods of RNA-seq library preparation and found vast differences in the detection of advanced transcriptome features, such as alternatively spliced isoforms and RNA editing sites. By comparing several different protocols available for the Ion Proton sequencer and by utilizing detailed bioinformatics analysis tools, we were able to develop an optimized random primer based RNA-seq technique that is reliable at uncovering rare transcript isoforms and RNA editing features, as well as fusion reads from oncogenic chromosome rearrangements. The combination of optimized libraries and rapid Ion Proton sequencing provides a powerful platform for the transcriptome analysis of research and clinical samples.

  19. Genome Sequence of Saccharomyces cerevisiae Double-Stranded RNA Virus L-A-28.

    PubMed

    Konovalovas, Aleksandras; Serviené, Elena; Serva, Saulius

    2016-06-16

    We cloned and sequenced the complete genome of the L-A-28 virus from the Saccharomyces cerevisiae K28 killer strain. This sequence completes the set of currently identified L-A helper viruses required for expression of double-stranded RNA-originated killer phenotypes in baking yeast. Copyright © 2016 Konovalovas et al.

  20. Mapping the miRNA interactome by crosslinking ligation and sequencing of hybrids (CLASH)

    PubMed Central

    Helwak, Aleksandra; Tollervey, David

    2014-01-01

    RNA-RNA interactions play critical roles in many cellular processes but studying them is difficult and laborious. Here, we describe an experimental procedure, termed crosslinking ligation and sequencing of hybrids (CLASH), which allows high-throughput identification of sites of RNA-RNA interaction. During CLASH, a tagged bait protein is UV crosslinked in vivo to stabilise RNA interactions and purified under denaturing conditions. RNAs associated with the bait protein are partially truncated, and the ends of RNA-duplexes are ligated together. Following linker addition, cDNA library preparation and high-throughput sequencing, the ligated duplexes give rise to chimeric cDNAs, which unambiguously identify RNA-RNA interaction sites independent of bioinformatic predictions. This protocol is optimized for studying miRNA targets bound by Argonaute proteins, but should be easily adapted for other RNA-binding proteins and classes of RNA. The protocol requires around 5 days to complete, excluding the time required for high-throughput sequencing and bioinformatic analyses. PMID:24577361

  1. Ni2+-binding RNA motifs with an asymmetric purine-rich internal loop and a G-A base pair.

    PubMed Central

    Hofmann, H P; Limmer, S; Hornung, V; Sprinzl, M

    1997-01-01

    RNA molecules with high affinity for immobilized Ni2+ were isolated from an RNA pool with 50 randomized positions by in vitro selection-amplification. The selected RNAs preferentially bind Ni2+ and Co2+ over other cations from first series transition metals. Conserved structure motifs, comprising about 15 nt, were identified that are likely to represent the Ni2+ binding sites. Two conserved motifs contain an asymmetric purine-rich internal loop and probably a mismatch G-A base pair. The structure of one of these motifs was studied with proton NMR spectroscopy and formation of the G-A pair at the junction of helix and internal loop was demonstrated. Using Ni2+ as a paramagnetic probe, a divalent metal ion binding site near this G-A base pair was identified. Ni2+ ions bound to this motif exert a specific stabilization effect. We propose that small asymmetric purine-rich loops that contain a G-A interaction may represent a divalent metal ion binding site in RNA. PMID:9409620

  2. Investigation of Microbial Diversity in Geothermal Hot Springs in Unkeshwar, India, Based on 16S rRNA Amplicon Metagenome Sequencing.

    PubMed

    Mehetre, Gajanan T; Paranjpe, Aditi; Dastager, Syed G; Dharne, Mahesh S

    2016-02-25

    Microbial diversity in geothermal waters of the Unkeshwar hot springs in Maharashtra, India, was studied using 16S rRNA amplicon metagenomic sequencing. Taxonomic analysis revealed the presence of Bacteroidetes, Proteobacteria, Cyanobacteria, Actinobacteria, Archeae, and OD1 phyla. Metabolic function prediction analysis indicated a battery of biological information systems indicating rich and novel microbial diversity, with potential biotechnological applications in this niche. Copyright © 2016 Mehetre et al.

  3. Direct Sequence Detection of Structured H5 Influenza Viral RNA

    PubMed Central

    Kerby, Matthew B.; Freeman, Sarah; Prachanronarong, Kristina; Artenstein, Andrew W.; Opal, Steven M.; Tripathi, Anubhav

    2008-01-01

    We describe the development of sequence-specific molecular beacons (dual-labeled DNA probes) for identification of the H5 influenza subtype, cleavage motif, and receptor specificity when hybridized directly with in vitro transcribed viral RNA (vRNA). The cloned hemagglutinin segment from a highly pathogenic H5N1 strain, A/Hanoi/30408/2005(H5N1), isolated from humans was used as template for in vitro transcription of sense-strand vRNA. The hybridization behavior of vRNA and a conserved subtype probe was characterized experimentally by varying conditions of time, temperature, and Mg2+ to optimize detection. Comparison of the hybridization rates of probe to DNA and RNA targets indicates that conformational switching of influenza RNA structure is a rate-limiting step and that the secondary structure of vRNA dominates the binding kinetics. The sensitivity and specificity of probe recognition of other H5 strains was calculated from sequence matches to the National Center for Biotechnology Information influenza database. The hybridization specificity of the subtype probes was experimentally verified with point mutations within the probe loop at five locations corresponding to the other human H5 strains. The abundance frequencies of the hemagglutinin cleavage motif and sialic acid recognition sequences were experimentally tested for H5 in all host viral species. Although the detection assay must be coupled with isothermal amplification on the chip, the new probes form the basis of a portable point-of-care diagnostic device for influenza subtyping. PMID:18403607

  4. Identification of extracellular miRNA in archived serum samples by next-generation sequencing from RNA extracted using multiple methods.

    PubMed

    Gautam, Aarti; Kumar, Raina; Dimitrov, George; Hoke, Allison; Hammamieh, Rasha; Jett, Marti

    2016-10-01

    miRNAs act as important regulators of gene expression by promoting mRNA degradation or by attenuating protein translation. Since miRNAs are stably expressed in bodily fluids, there is growing interest in profiling these miRNAs, as it is minimally invasive and cost-effective as a diagnostic matrix. A technical hurdle in studying miRNA dynamics is the ability to reliably extract miRNA as small sample volumes and low RNA abundance create challenges for extraction and downstream applications. The purpose of this study was to develop a pipeline for the recovery of miRNA using small volumes of archived serum samples. The RNA was extracted employing several widely utilized RNA isolation kits/methods with and without addition of a carrier. The small RNA library preparation was carried out using Illumina TruSeq small RNA kit and sequencing was carried out using Illumina platform. A fraction of five microliters of total RNA was used for library preparation as quantification is below the detection limit. We were able to profile miRNA levels in serum from all the methods tested. We found out that addition of nucleic acid based carrier molecules had higher numbers of processed reads but it did not enhance the mapping of any miRBase annotated sequences. However, some of the extraction procedures offer certain advantages: RNA extracted by TRIzol seemed to align to the miRBase best; extractions using TRIzol with carrier yielded higher miRNA-to-small RNA ratios. Nuclease free glycogen can be carrier of choice for miRNA sequencing. Our findings illustrate that miRNA extraction and quantification is influenced by the choice of methodologies. Addition of nucleic acid- based carrier molecules during extraction procedure is not a good choice when assaying miRNA using sequencing. The careful selection of an extraction method permits the archived serum samples to become valuable resources for high-throughput applications.

  5. Advancing viral RNA structure prediction: measuring the thermodynamics of pyrimidine-rich internal loops.

    PubMed

    Phan, Andy; Mailey, Katherine; Saeki, Jessica; Gu, Xiaobo; Schroeder, Susan J

    2017-05-01

    Accurate thermodynamic parameters improve RNA structure predictions and thus accelerate understanding of RNA function and the identification of RNA drug binding sites. Many viral RNA structures, such as internal ribosome entry sites, have internal loops and bulges that are potential drug target sites. Current models used to predict internal loops are biased toward small, symmetric purine loops, and thus poorly predict asymmetric, pyrimidine-rich loops with >6 nucleotides (nt) that occur frequently in viral RNA. This article presents new thermodynamic data for 40 pyrimidine loops, many of which can form UU or protonated CC base pairs. Uracil and protonated cytosine base pairs stabilize asymmetric internal loops. Accurate prediction rules are presented that account for all thermodynamic measurements of RNA asymmetric internal loops. New loop initiation terms for loops with >6 nt are presented that do not follow previous assumptions that increasing asymmetry destabilizes loops. Since the last 2004 update, 126 new loops with asymmetry or sizes greater than 2 × 2 have been measured. These new measurements significantly deepen and diversify the thermodynamic database for RNA. These results will help better predict internal loops that are larger, pyrimidine-rich, and occur within viral structures such as internal ribosome entry sites. © 2017 Phan et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. Nucleotide sequences of Dictyostelium discoideum developmentally regulated cDNAs rich in (AAC) imply proteins that contain clusters of asparagine, glutamine, or threonine.

    PubMed

    Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L

    1989-09-01

    A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different sized Dictyostelium DNA restriction fragments indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts are reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.

  7. High-Throughput Single-Cell RNA Sequencing and Data Analysis.

    PubMed

    Sagar; Herman, Josip Stefan; Pospisilik, John Andrew; Grün, Dominic

    2018-01-01

    Understanding biological systems at a single cell resolution may reveal several novel insights which remain masked by the conventional population-based techniques providing an average readout of the behavior of cells. Single-cell transcriptome sequencing holds the potential to identify novel cell types and characterize the cellular composition of any organ or tissue in health and disease. Here, we describe a customized high-throughput protocol for single-cell RNA-sequencing (scRNA-seq) combining flow cytometry and a nanoliter-scale robotic system. Since scRNA-seq requires amplification of a low amount of endogenous cellular RNA, leading to substantial technical noise in the dataset, downstream data filtering and analysis require special care. Therefore, we also briefly describe in-house state-of-the-art data analysis algorithms developed to identify cellular subpopulations including rare cell types as well as to derive lineage trees by ordering the identified subpopulations of cells along the inferred differentiation trajectories.

  8. Methods for processing high-throughput RNA sequencing data.

    PubMed

    Ares, Manuel

    2014-11-03

    High-throughput sequencing (HTS) methods for analyzing RNA populations (RNA-Seq) are gaining rapid application to many experimental situations. The steps in an RNA-Seq experiment require thought and planning, especially because the expense in time and materials is currently higher and the protocols are far less routine than those used for other high-throughput methods, such as microarrays. As always, good experimental design will make analysis and interpretation easier. Having a clear biological question, an idea about the best way to do the experiment, and an understanding of the number of replicates needed will make the entire process more satisfying. Whether the goal is capturing transcriptome complexity from a tissue or identifying small fragments of RNA cross-linked to a protein of interest, conversion of the RNA to cDNA followed by direct sequencing using the latest methods is a developing practice, with new technical modifications and applications appearing every day. Even more rapid are the development and improvement of methods for analysis of the very large amounts of data that arrive at the end of an RNA-Seq experiment, making considerations regarding reproducibility, validation, visualization, and interpretation increasingly important. This introduction is designed to review and emphasize a pathway of analysis from experimental design through data presentation that is likely to be successful, with the recognition that better methods are right around the corner. © 2014 Cold Spring Harbor Laboratory Press.

  9. 3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

    PubMed

    Goldfarb, Katherine C; Cech, Thomas R

    2013-09-21

    Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.

  10. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    PubMed

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1 accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.

  11. Profiling of drought-responsive microRNA and mRNA in tomato using high-throughput sequencing.

    PubMed

    Liu, Minmin; Yu, Huiyang; Zhao, Gangjun; Huang, Qiufeng; Lu, Yongen; Ouyang, Bo

    2017-06-26

    Abiotic stresses cause severe loss of crop production. Among them, drought is one of the most frequent environmental stresses, which limits crop growth, development and productivity. Plant drought tolerance is fine-tuned by a complex gene regulatory network. Understanding the molecular regulation of this polygenic trait is crucial for the eventual success to improve plant yield and quality. Recent studies have demonstrated that microRNAs play critical roles in plant drought tolerance. However, little is known about the microRNA in drought response of the model plant tomato. Here, we described the profiling of drought-responsive microRNA and mRNA in tomato using high-throughput next-generation sequencing. Drought stress was applied on the seedlings of M82, a drought-sensitive cultivated tomato genotype, and IL9-1, a drought-tolerant introgression line derived from the stress-resistant wild species Solanum pennellii LA0716 and M82. Under drought, IL9-1 performed superior than M82 regarding survival rate, H 2 O 2 elimination and leaf turgor maintenance. A total of four small RNA and eight mRNA libraries were constructed and sequenced using Illumina sequencing technology. 105 conserved and 179 novel microRNAs were identified, among them, 54 and 98 were differentially expressed upon drought stress, respectively. The majority of the differentially-expressed conserved microRNAs was up-regulated in IL9-1 whereas down-regulated in M82. Under drought stress, 2714 and 1161 genes were found to be differentially expressed in M82 and IL9-1, respectively, and many of their homologues are involved in plant stress, such as genes encoding transcription factor and protein kinase. Various pathways involved in abiotic stress were revealed by Gene Ontology and pathway analysis. The mRNA sequencing results indicated that most of the target genes were regulated by their corresponding microRNAs, which suggested that microRNAs may play essential roles in the drought tolerance of tomato. In

  12. Accurate RNA consensus sequencing for high-fidelity detection of transcriptional mutagenesis-induced epimutations.

    PubMed

    Reid-Bayliss, Kate S; Loeb, Lawrence A

    2017-08-29

    Transcriptional mutagenesis (TM) due to misincorporation during RNA transcription can result in mutant RNAs, or epimutations, that generate proteins with altered properties. TM has long been hypothesized to play a role in aging, cancer, and viral and bacterial evolution. However, inadequate methodologies have limited progress in elucidating a causal association. We present a high-throughput, highly accurate RNA sequencing method to measure epimutations with single-molecule sensitivity. Accurate RNA consensus sequencing (ARC-seq) uniquely combines RNA barcoding and generation of multiple cDNA copies per RNA molecule to eliminate errors introduced during cDNA synthesis, PCR, and sequencing. The stringency of ARC-seq can be scaled to accommodate the quality of input RNAs. We apply ARC-seq to directly assess transcriptome-wide epimutations resulting from RNA polymerase mutants and oxidative stress.

  13. Embedding siRNA sequences targeting Apolipoprotein B100 in shRNA and miRNA scaffolds results in differential processing and in vivo efficacy

    PubMed Central

    Maczuga, Piotr; Lubelski, Jacek; van Logtenstein, Richard; Borel, Florie; Blits, Bas; Fakkert, Erwin; Costessi, Adalberto; Butler, Derek; van Deventer, Sander; Petry, Harald; Koornneef, Annemart; Konstantinova, Pavlina

    2013-01-01

    Overexpression of short hairpin RNA (shRNA) often causes cytotoxicity and using microRNA (miRNA) scaffolds can circumvent this problem. In this study, identically predicted small interfering RNA (siRNA) sequences targeting apolipoprotein B100 (siApoB) were embedded in shRNA (shApoB) or miRNA (miApoB) scaffolds and a direct comparison of the processing and long-term in vivo efficacy was performed. Next generation sequencing of small RNAs originating from shApoB- or miApoB-transfected cells revealed substantial differences in processing, resulting in different siApoB length, 5′ and 3′ cleavage sites and abundance of the guide or passenger strands. Murine liver transduction with adeno-associated virus (AAV) vectors expressing shApoB or miApoB resulted in high levels of siApoB expression associated with strong decrease of plasma ApoB protein and cholesterol. Expression of miApoB from the liver-specific LP1 promoter was restricted to the liver, while the H1 promoter-expressed shApoB was ectopically present. Delivery of 1 × 1011 genome copies AAV-shApoB or AAV-miApoB led to a gradual loss of ApoB and plasma cholesterol inhibition, which was circumvented by delivering a 20-fold lower vector dose. In conclusion, incorporating identical siRNA sequences in shRNA or miRNA scaffolds results in differential processing patterns and in vivo efficacy that may have serious consequences for future RNAi-based therapeutics. PMID:23089734

  14. Mapping contacts between gRNA and mRNA in trypanosome RNA editing.

    PubMed

    Leung, S S; Koslowsky, D J

    1999-02-01

    All guide RNAs (gRNAs) identified to date have defined 5' anchor sequences, guiding sequences and a non-encoded 3' uridylate tail. The 5' anchor is required for in vitro editing and is thought to be responsible for selection and binding to the pre-edited mRNA. Little is known, however, about how the gRNAs are used to direct RNA editing. Utilizing the photo-reactive crosslinking agent, azidophenacyl (APA), attached to the 5'- or 3'-terminus of the gRNA, we have begun to map the structural relationships between the different defined regions of the gRNA with the pre-edited mRNA. Analyses of crosslinked conjugates produced with a 5'-terminal APA group confirm that the anchor of the gRNA is correctly positioning the interacting molecules. 3' Crosslinks (X-linker placed at the 3'-end of a U10tail) have also been mapped for three different gRNA/mRNA pairs. In all cases, analyses indicate that the U-tail can interact with a range of nucleotides located upstream of the first edited site. It appears that the U-tail prefers purine-rich sites, close to the first few editing sites. These results suggest that the U-tail may act in concert with the anchor to melt out secondary structure in the mRNA in the immediate editing domain, possibly increasing the accessibility of the editing complex to the proper editing sites.

  15. Deep sequencing and in silico analysis of small RNA library reveals novel miRNA from leaf Persicaria minor transcriptome.

    PubMed

    Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan

    2018-03-01

    In current era, majority of microRNA (miRNA) are being discovered through computational approaches which are more confined towards model plants. Here, for the first time, we have described the identification and characterization of novel miRNA in a non-model plant, Persicaria minor ( P . minor ) using computational approach. Unannotated sequences from deep sequencing were analyzed based on previous well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of the unannotated sequence which represented 11 unique putative miRNA sequences. PsRobot target prediction tool was deployed to identify the target transcripts of putative novel miRNAs. Most of the predicted target transcripts (mRNAs) were known to be involved in plant development and stress responses. Gene ontology showed that majority of the putative novel miRNA targets involved in cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Out of 11 unique putative miRNAs, 7 miRNAs were validated through semi-quantitative PCR. These novel miRNAs discoveries in P . minor may develop and update the current public miRNA database.

  16. Single-Cell RNA Sequencing of the Bronchial Epithelium in Smokers With Lung Cancer

    DTIC Science & Technology

    2015-07-01

    AWARD NUMBER: W81XWH-14-1-0234 TITLE: Single-Cell RNA Sequencing of the Bronchial Epithelium in Smokers With Lung Cancer PRINCIPAL INVESTIGATOR...TITLE AND SUBTITLE Single-Cell RNA Sequencing of the Bronchial Epithelium in Smokers With Lung Cancer 5a. CONTRACT NUMBER 5b. GRANT NUMBER W81XWH...single cell RNA sequencing on airway epithelial cells obtained from smokers with and without lung cancer to identify cell-type dependent gene expression

  17. Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko

    2006-06-10

    Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, amore » cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25.« less

  18. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. The

  19. RNAPattMatch: a web server for RNA sequence/structure motif detection based on pattern matching with flexible gaps

    PubMed Central

    Drory Retwitzer, Matan; Polishchuk, Maya; Churkin, Elena; Kifer, Ilona; Yakhini, Zohar; Barash, Danny

    2015-01-01

    Searching for RNA sequence-structure patterns is becoming an essential tool for RNA practitioners. Novel discoveries of regulatory non-coding RNAs in targeted organisms and the motivation to find them across a wide range of organisms have prompted the use of computational RNA pattern matching as an enhancement to sequence similarity. State-of-the-art programs differ by the flexibility of patterns allowed as queries and by their simplicity of use. In particular—no existing method is available as a user-friendly web server. A general program that searches for RNA sequence-structure patterns is RNA Structator. However, it is not available as a web server and does not provide the option to allow flexible gap pattern representation with an upper bound of the gap length being specified at any position in the sequence. Here, we introduce RNAPattMatch, a web-based application that is user friendly and makes sequence/structure RNA queries accessible to practitioners of various background and proficiency. It also extends RNA Structator and allows a more flexible variable gaps representation, in addition to analysis of results using energy minimization methods. RNAPattMatch service is available at http://www.cs.bgu.ac.il/rnapattmatch. A standalone version of the search tool is also available to download at the site. PMID:25940619

  20. Sequence characterization of 5S ribosomal RNA from eight gram positive procaryotes

    NASA Technical Reports Server (NTRS)

    Woese, C. R.; Luehrsen, K. R.; Pribula, C. D.; Fox, G. E.

    1976-01-01

    Complete nucleotide sequences are presented for 5S rRNA from Bacillus subtilis, B. firmus, B. pasteurii, B. brevis, Lactobacillus brevis, and Streptococcus faecalis, and 5S rRNA oligonucleotide catalogs and partial sequence data are given for B. cereus and Sporosarcina ureae. These data demonstrate a striking consistency of 5S rRNA primary and secondary structure within a given bacterial grouping. An exception is B. brevis, in which the 5S rRNA sequence varies significantly from that of other bacilli in the tuned helix and the procaryotic loop. The localization of these variations suggests that B. brevis occupies an ecological niche that selects such changes. It is noted that this organism produces antibiotics which affect ribosome function.

  1. JNSViewer—A JavaScript-based Nucleotide Sequence Viewer for DNA/RNA secondary structures

    PubMed Central

    Dong, Min; Graham, Mitchell; Yadav, Nehul

    2017-01-01

    Many tools are available for visualizing RNA or DNA secondary structures, but there is scarce implementation in JavaScript that provides seamless integration with the increasingly popular web computational platforms. We have developed JNSViewer, a highly interactive web service, which is bundled with several popular tools for DNA/RNA secondary structure prediction and can provide precise and interactive correspondence among nucleotides, dot-bracket data, secondary structure graphs, and genic annotations. In JNSViewer, users can perform RNA secondary structure predictions with different programs and settings, add customized genic annotations in GFF format to structure graphs, search for specific linear motifs, and extract relevant structure graphs of sub-sequences. JNSViewer also allows users to choose a transcript or specific segment of Arabidopsis thaliana genome sequences and predict the corresponding secondary structure. Popular genome browsers (i.e., JBrowse and BrowserGenome) were integrated into JNSViewer to provide powerful visualizations of chromosomal locations, genic annotations, and secondary structures. In addition, we used StructureFold with default settings to predict some RNA structures for Arabidopsis by incorporating in vivo high-throughput RNA structure profiling data and stored the results in our web server, which might be a useful resource for RNA secondary structure studies in plants. JNSViewer is available at http://bioinfolab.miamioh.edu/jnsviewer/index.html. PMID:28582416

  2. Use of 16S Ribosomal RNA Sequences to Infer Relationships among Archaebacteria.

    DTIC Science & Technology

    1987-04-16

    the rRNAs of one or both other kingdoms , and among the archaebacteria there are also substantial variations) (1), echinoderms (5, 11), major...Security Classification) Ln Use of 16S Ribosomal RNA Sequences to infer Relationships among Archaebacteria : Annual Report (U) q 12 PERSONAL AUTHOR(S...FIELD GROUP SUB-GROUP Archaebacteria ; Eubacteria; Eukaryotes; 16S Ribosomal RNA; 08 I Phylogeny; rRNA; RNA Sequencing; Molecular Clock; Urkingdoms; r

  3. Small molecule alteration of RNA sequence in cells and animals.

    PubMed

    Guan, Lirui; Luo, Yiling; Ja, William W; Disney, Matthew D

    2017-10-18

    RNA regulation and maintenance are critical for proper cell function. Small molecules that specifically alter RNA sequence would be exceptionally useful as probes of RNA structure and function or as potential therapeutics. Here, we demonstrate a photochemical approach for altering the trinucleotide expanded repeat causative of myotonic muscular dystrophy type 1 (DM1), r(CUG) exp . The small molecule, 2H-4-Ru, binds to r(CUG) exp and converts guanosine residues to 8-oxo-7,8-dihydroguanosine upon photochemical irradiation. We demonstrate targeted modification upon irradiation in cell culture and in Drosophila larvae provided a diet containing 2H-4-Ru. Our results highlight a general chemical biology approach for altering RNA sequence in vivo by using small molecules and photochemistry. Furthermore, these studies show that addition of 8-oxo-G lesions into RNA 3' untranslated regions does not affect its steady state levels. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. The chemical structure of DNA sequence signals for RNA transcription

    NASA Technical Reports Server (NTRS)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli NRA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  5. Sequence specificity of the human mRNA N6-adenosine methylase in vitro.

    PubMed Central

    Harper, J E; Miceli, S M; Roberts, R J; Manley, J L

    1990-01-01

    N6-adenosine methylation is a frequent modification of mRNAs and their precursors, but little is known about the mechanism of the reaction or the function of the modification. To explore these questions, we developed conditions to examine N6-adenosine methylase activity in HeLa cell nuclear extracts. Transfer of the methyl group from S-[3H methyl]-adenosylmethionine to unlabeled random copolymer RNA substrates of varying ribonucleotide composition revealed a substrate specificity consistent with a previously deduced consensus sequence, Pu[G greater than A]AC[A/C/U]. 32-P labeled RNA substrates of defined sequence were used to examine the minimum sequence requirements for methylation. Each RNA was 20 nucleotides long, and contained either the core consensus sequence GGACU, or some variation of this sequence. RNAs containing GGACU, either in single or multiple copies, were good substrates for methylation, whereas RNAs containing single base substitutions within the GGACU sequence gave dramatically reduced methylation. These results demonstrate that the N6-adenosine methylase has a strict sequence specificity, and that there is no requirement for extended sequences or secondary structures for methylation. Recognition of this sequence does not require an RNA component, as micrococcal nuclease pretreatment of nuclear extracts actually increased methylation efficiency. Images PMID:2216767

  6. [Influence of antisense RNA and sequences of viral transactivators traps on RNA synthesis of HTLV-1 virus].

    PubMed

    Borisenko, A S; Kotus, E V; Kaloshin, A A

    2008-01-01

    Significant number of scientific publications devoted to inhibition of viral replication by antisense RNA (asRNA) genes shows that this approach is useful for gene therapy of viral infections. To investigate the possibility of suppression of HTLV-1 virus reproduction by asRNA we constructed recombinant plasmids containing asRNA genes against U3 long terminal repeats region and X gene under the control of promoter of myeloproliferative sarcoma virus (MPSV) or without such promoter. Using stable calcium-phosphate transfection method with subsequent selection in the presence of G-418, RaHOS line-based cell clones carrying both asRNA genes and sequences able to bind HTLV-1 transactivator proteins (i.e. "traps" of viral transactivators, TVT) were obtained. Data from dot-hybridization analysis of viral RNA extracted from RaHOS cell clones showed that TVT sequences are able to suppress the viral RNA synthesis on 90% and asRNA against X gene synthesis--on 50%.

  7. RNA Binding of T-cell Intracellular Antigen-1 (TIA-1) C-terminal RNA Recognition Motif Is Modified by pH Conditions*

    PubMed Central

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Persson, Cecilia; Karlsson, B. Göran; Díaz-Moreno, Irene

    2013-01-01

    T-cell intracellular antigen-1 (TIA-1) is a DNA/RNA-binding protein that regulates critical events in cell physiology by the regulation of pre-mRNA splicing and mRNA translation. TIA-1 is composed of three RNA recognition motifs (RRMs) and a glutamine-rich domain and binds to uridine-rich RNA sequences through its C-terminal RRM2 and RRM3 domains. Here, we show that RNA binding mediated by either isolated RRM3 or the RRM23 construct is controlled by slight environmental pH changes due to the protonation/deprotonation of TIA-1 RRM3 histidine residues. The auxiliary role of the C-terminal RRM3 domain in TIA-1 RNA recognition is poorly understood, and this work provides insight into its binding mechanisms. PMID:23902765

  8. Toward Rare Blood Cell Preservation for RNA Sequencing.

    PubMed

    Vickovic, Sanja; Ahmadian, Afshin; Lewensohn, Rolf; Lundeberg, Joakim

    2015-07-01

    Cancer is driven by various events leading to cell differentiation and disease progression. Molecular tools are powerful approaches for describing how and why these events occur. With the growing field of next-generation DNA sequencing, there is an increasing need for high-quality nucleic acids derived from human cells and tissues-a prerequisite for successful cell profiling. Although advances in RNA preservation have been made, some of the largest biobanks still do not employ RNA blood preservation as standard because of limitations in low blood-input volume and RNA stability over the whole gene body. Therefore, we have developed a robust protocol for blood preservation and long-term storage while maintaining RNA integrity. Furthermore, we explored the possibility of using the protocol for preserving rare cell samples, such as circulating tumor cells. The results of our study confirmed that gene expression was not impacted by the preservation procedure (r(2) > 0.88) or by long-term storage (r(2) = 0.95), with RNA integrity number values averaging over 8. Similarly, cell surface antigens were still available for antibody selection (r(2) = 0.95). Lastly, data mining for fusion events showed that it was possible to detect rare tumor cells among a background of other cells present in blood irrespective of fixation. Thus, the developed protocol would be suitable for rare blood cell preservation followed by RNA sequencing analysis. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  9. Molecular equilibria and condensation sequences in carbon rich gases

    NASA Technical Reports Server (NTRS)

    Sharp, C. M.; Wasserburg, G. J.

    1993-01-01

    Chemical equilibria in stellar atmospheres have been investigated by many authors. Lattimer, Schramm, and Grossman presented calculations in both O rich and C rich environments and predicted possible presolar condensates. A recent paper by Cherchneff and Barker considered a C rich composition with PAH's included in the calculations. However, the condensation sequences of C bearing species have not been investigated in detail. In a carbon rich gas surrounding an AGB star, it is often assumed that graphite (or diamond) condenses out before TiC and SiC. However, Lattimer et al. found some conditions under which TiC condenses before graphite. We have performed molecular equilibrium calculations to establish the stability fields of C(s), TiC(s), and SiC(s) and other high temperature phases under conditions of different pressures and C/O. The preserved presolar interstellar dust grains so far discovered in meteorites are graphite, diamond, SiC, TiC, and possibly Al2O3.

  10. Recovering complete mitochondrial genome sequences from RNA-Seq: A case study of Polytomella non-photosynthetic green algae.

    PubMed

    Tian, Yao; Smith, David Roy

    2016-05-01

    Thousands of mitochondrial genomes have been sequenced, but there are comparatively few available mitochondrial transcriptomes. This might soon be changing. High-throughput RNA sequencing (RNA-Seq) techniques have made it fast and cheap to generate massive amounts of mitochondrial transcriptomic data. Here, we explore the utility of RNA-Seq for assembling mitochondrial genomes and studying their expression patterns. Specifically, we investigate the mitochondrial transcriptomes from Polytomella non-photosynthetic green algae, which have among the smallest, most reduced mitochondrial genomes from the Archaeplastida as well as fragmented rRNA-coding regions, palindromic genes, and linear chromosomes with telomeres. Isolation of whole genomic RNA from the four known Polytomella species followed by Illumina paired-end sequencing generated enough mitochondrial-derived reads to easily recover almost-entire mitochondrial genome sequences. Read-mapping and coverage statistics also gave insights into Polytomella mitochondrial transcriptional architecture, revealing polycistronic transcripts and the expression of telomeres and palindromic genes. Ultimately, RNA-Seq is a promising, cost-effective technique for studying mitochondrial genetics, but it does have drawbacks, which are discussed. One of its greatest potentials, as shown here, is that it can be used to generate near-complete mitochondrial genome sequences, which could be particularly useful in situations where there is a lack of available mtDNA data. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  11. PRADA: pipeline for RNA sequencing data analysis.

    PubMed

    Torres-García, Wandaliz; Zheng, Siyuan; Sivachenko, Andrey; Vegesna, Rahulsimham; Wang, Qianghu; Yao, Rong; Berger, Michael F; Weinstein, John N; Getz, Gad; Verhaak, Roel G W

    2014-08-01

    Technological advances in high-throughput sequencing necessitate improved computational tools for processing and analyzing large-scale datasets in a systematic automated manner. For that purpose, we have developed PRADA (Pipeline for RNA-Sequencing Data Analysis), a flexible, modular and highly scalable software platform that provides many different types of information available by multifaceted analysis starting from raw paired-end RNA-seq data: gene expression levels, quality metrics, detection of unsupervised and supervised fusion transcripts, detection of intragenic fusion variants, homology scores and fusion frame classification. PRADA uses a dual-mapping strategy that increases sensitivity and refines the analytical endpoints. PRADA has been used extensively and successfully in the glioblastoma and renal clear cell projects of The Cancer Genome Atlas program.  http://sourceforge.net/projects/prada/  gadgetz@broadinstitute.org or rverhaak@mdanderson.org  Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  12. miRDis: a Web tool for endogenous and exogenous microRNA discovery based on deep-sequencing data analysis.

    PubMed

    Zhang, Hanyuan; Vieira Resende E Silva, Bruno; Cui, Juan

    2018-05-01

    Small RNA sequencing is the most widely used tool for microRNA (miRNA) discovery, and shows great potential for the efficient study of miRNA cross-species transport, i.e., by detecting the presence of exogenous miRNA sequences in the host species. Because of the increased appreciation of dietary miRNAs and their far-reaching implication in human health, research interests are currently growing with regard to exogenous miRNAs bioavailability, mechanisms of cross-species transport and miRNA function in cellular biological processes. In this article, we present microRNA Discovery (miRDis), a new small RNA sequencing data analysis pipeline for both endogenous and exogenous miRNA detection. Specifically, we developed and deployed a Web service that supports the annotation and expression profiling data of known host miRNAs and the detection of novel miRNAs, other noncoding RNAs, and the exogenous miRNAs from dietary species. As a proof-of-concept, we analyzed a set of human plasma sequencing data from a milk-feeding study where 225 human miRNAs were detected in the plasma samples and 44 show elevated expression after milk intake. By examining the bovine-specific sequences, data indicate that three bovine miRNAs (bta-miR-378, -181* and -150) are present in human plasma possibly because of the dietary uptake. Further evaluation based on different sets of public data demonstrates that miRDis outperforms other state-of-the-art tools in both detection and quantification of miRNA from either animal or plant sources. The miRDis Web server is available at: http://sbbi.unl.edu/miRDis/index.php.

  13. Modeling bias and variation in the stochastic processes of small RNA sequencing

    PubMed Central

    Etheridge, Alton; Sakhanenko, Nikita; Galas, David

    2017-01-01

    Abstract The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. PMID:28369495

  14. Three RNA recognition motifs participate in RNA recognition and structural organization by the pro-apoptotic factor TIA-1

    PubMed Central

    Bauer, William J.; Heath, Jason; Jenkins, Jermaine L.; Kielkopf, Clara L.

    2012-01-01

    T-cell intracellular antigen-1 (TIA-1) regulates developmental and stress-responsive pathways through distinct activities at the levels of alternative pre-mRNA splicing and mRNA translation. The TIA-1 polypeptide contains three RNA recognition motifs (RRMs). The central RRM2 and C-terminal RRM3 associate with cellular mRNAs. The N-terminal RRM1 enhances interactions of a C-terminal Q-rich domain of TIA-1 with the U1-C splicing factor, despite linear separation of the domains in the TIA-1 sequence. Given the expanded functional repertoire of the RRM family, it was unknown whether TIA-1 RRM1 contributes to RNA binding as well as documented protein interactions. To address this question, we used isothermal titration calorimetry and small-angle X-ray scattering (SAXS) to dissect the roles of the TIA-1 RRMs in RNA recognition. Notably, the fas RNA exhibited two binding sites with indistinguishable affinities for TIA-1. Analyses of TIA-1 variants established that RRM1 was dispensable for binding AU-rich fas sites, yet all three RRMs were required to bind a polyU RNA with high affinity. SAXS analyses demonstrated a `V' shape for a TIA-1 construct comprising the three RRMs, and revealed that its dimensions became more compact in the RNA-bound state. The sequence-selective involvement of TIA-1 RRM1 in RNA recognition suggests a possible role for RNA sequences in regulating the distinct functions of TIA-1. Further implications for U1-C recruitment by the adjacent TIA-1 binding sites of the fas pre-mRNA and the bent TIA-1 shape, which organizes the N- and C-termini on the same side of the protein, are discussed. PMID:22154808

  15. The conserved CAAGAAAGA spacer sequence is an essential element for the formation of 3' termini of the sea urchin H3 histone mRNA by RNA processing.

    PubMed Central

    Georgiev, O; Birnstiel, M L

    1985-01-01

    Analysis of cDNA sequences obtained from the small nuclear RNA U7 has previously suggested specific contacts, by base pairing, between the conserved stem-loop structure and CAAGAAAGA sequence of the histone pre-mRNA and the 5'-terminal sequence of the U7 RNA during RNA processing. In order to test some aspects of the model we have created a series of linker scan, deletion and insertion mutants of the 3' terminus of a sea urchin H3 histone gene and have injected mutant DNAs or in vitro synthesized precursors into frog oocyte nuclei for interpretation. We find that, in addition to the stem-loop structure of the mRNA, the CAAGAAAGA spacer transcript within the histone pre-mRNA is required absolutely for RNA processing, as predicted from our model. Spacer sequences immediately downstream of the CAAGAAAGA motif are not complementary to U7 RNA. Nevertheless, they are necessary for obtaining a maximal rate of RNA processing, as is the ACCA sequence coding for the 3' terminus of the mature mRNA. An increase of distance between the mRNA palindrome and the CAAGAAAGA by as little as six nucleotides abolishes all processing. It may, therefore, be useful to regard both these sequence motifs as part of one and the same RNA processing signal with narrowly defined topologies. Interestingly, U7 RNA-dependent 3' processing of histone pre-mRNA can occur in RNA injection experiments only when the in vitro synthesized pre-mRNA contains sequence extensions well beyond the region of sequence complementarities to the U7 RNA. In addition to directing 3' processing the terminal mRNA sequences may have a role in histone mRNA stabilization in the cytoplasmic compartment. Images Fig. 3. Fig. 4. Fig. 5. Fig. 6. Fig. 7. PMID:2410259

  16. Gene expression distribution deconvolution in single-cell RNA sequencing.

    PubMed

    Wang, Jingshu; Huang, Mo; Torre, Eduardo; Dueck, Hannah; Shaffer, Sydney; Murray, John; Raj, Arjun; Li, Mingyao; Zhang, Nancy R

    2018-06-26

    Single-cell RNA sequencing (scRNA-seq) enables the quantification of each gene's expression distribution across cells, thus allowing the assessment of the dispersion, nonzero fraction, and other aspects of its distribution beyond the mean. These statistical characterizations of the gene expression distribution are critical for understanding expression variation and for selecting marker genes for population heterogeneity. However, scRNA-seq data are noisy, with each cell typically sequenced at low coverage, thus making it difficult to infer properties of the gene expression distribution from raw counts. Based on a reexamination of nine public datasets, we propose a simple technical noise model for scRNA-seq data with unique molecular identifiers (UMI). We develop deconvolution of single-cell expression distribution (DESCEND), a method that deconvolves the true cross-cell gene expression distribution from observed scRNA-seq counts, leading to improved estimates of properties of the distribution such as dispersion and nonzero fraction. DESCEND can adjust for cell-level covariates such as cell size, cell cycle, and batch effects. DESCEND's noise model and estimation accuracy are further evaluated through comparisons to RNA FISH data, through data splitting and simulations and through its effectiveness in removing known batch effects. We demonstrate how DESCEND can clarify and improve downstream analyses such as finding differentially expressed genes, identifying cell types, and selecting differentiation markers. Copyright © 2018 the Author(s). Published by PNAS.

  17. Single-cell RNA sequencing identifies diverse roles of epithelial cells in idiopathic pulmonary fibrosis

    PubMed Central

    Mizuno, Takako; Sridharan, Anusha; Du, Yina; Guo, Minzhe; Wikenheiser-Brokamp, Kathryn A.; Perl, Anne-Karina T.; Funari, Vincent A.; Gokey, Jason J.; Stripp, Barry R.; Whitsett, Jeffrey A.

    2016-01-01

    Idiopathic pulmonary fibrosis (IPF) is a lethal interstitial lung disease characterized by airway remodeling, inflammation, alveolar destruction, and fibrosis. We utilized single-cell RNA sequencing (scRNA-seq) to identify epithelial cell types and associated biological processes involved in the pathogenesis of IPF. Transcriptomic analysis of normal human lung epithelial cells defined gene expression patterns associated with highly differentiated alveolar type 2 (AT2) cells, indicated by enrichment of RNAs critical for surfactant homeostasis. In contrast, scRNA-seq of IPF cells identified 3 distinct subsets of epithelial cell types with characteristics of conducting airway basal and goblet cells and an additional atypical transitional cell that contributes to pathological processes in IPF. Individual IPF cells frequently coexpressed alveolar type 1 (AT1), AT2, and conducting airway selective markers, demonstrating “indeterminate” states of differentiation not seen in normal lung development. Pathway analysis predicted aberrant activation of canonical signaling via TGF-β, HIPPO/YAP, P53, WNT, and AKT/PI3K. Immunofluorescence confocal microscopy identified the disruption of alveolar structure and loss of the normal proximal-peripheral differentiation of pulmonary epithelial cells. scRNA-seq analyses identified loss of normal epithelial cell identities and unique contributions of epithelial cells to the pathogenesis of IPF. The present study provides a rich data source to further explore lung health and disease. PMID:27942595

  18. A short autocomplementary sequence plays an essential role in avian sarcoma-leukosis virus RNA dimerization.

    PubMed

    Fossé, P; Motté, N; Roumier, A; Gabus, C; Muriaux, D; Darlix, J L; Paoletti, J

    1996-12-24

    Retroviral genomes consist of two identical RNA molecules joined noncovalently near their 5'-ends. Recently, two models have been proposed for RNA dimer formation on the basis of results obtained in vitro with human immunodeficiency virus type 1 RNA and Moloney murine leukemia virus RNA. It was first proposed that viral RNA dimerizes by forming an interstrand quadruple helix with purine tetrads. The second model postulates that RNA dimerization is initiated by a loop-loop interaction between the two RNA molecules. In order to better characterize the dimerization process of retroviral genomic RNA, we analyzed the in vitro dimerization of avian sarcoma-leukosis virus (ASLV) RNA using different transcripts. We determined the requirements for heterodimer formation, the thermal dissociation of RNA dimers, and the influence of antisense DNA oligonucleotides on dimer formation. Our results strongly suggest that purine tetrads are not involved in dimer formation. Data show that an autocomplementary sequence located upstream from the splice donor site and within a major packaging signal plays a crucial role in ASLV RNA dimer formation in vitro. This sequence is able to form a stem-loop structure, and phylogenetic analysis reveals that it is conserved in 28 different avian sarcoma and leukosis viruses. These results suggest that dimerization of ASLV RNA is initiated by a loop-loop interaction between two RNA molecules and provide an additional argument for the ubiquity of the dimerization process via loop-loop interaction.

  19. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    PubMed

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-07

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Phylogenetic origins of the plant mitochondrion based on a comparative analysis of 5S ribosomal RNA sequences

    NASA Technical Reports Server (NTRS)

    Villanueva, E.; Delihas, N.; Luehrsen, K. R.; Fox, G. E.; Gibson, J.

    1985-01-01

    The complete nucleotide sequences of 5S ribosomal RNAs from Rhodocyclus gelatinosa, Rhodobacter sphaeroides, and Pseudomonas cepacia were determined. Comparisons of these 5S RNA sequences show that rather than being phylogenetically related to one another, the two photosynthetic bacterial 5S RNAs share more sequence and signature homology with the RNAs of two nonphotosynthetic strains. Rhodobacter sphaeroides is specifically related to Paracoccus denitrificans and Rc. gelatinosa is related to Ps. cepacia. These results support earlier 16S ribosomal RNA studies and add two important groups to the 5S RNA data base. Unique 5S RNA structural features previously found in P. denitrificans are present also in the 5S RNA of Rb. sphaeroides; these provide the basis for subdivisional signatures. The immediate consequence of obtaining these new sequences is that it is possible to clarify the phylogenetic origins of the plant mitochondrion. In particular, a close phylogenetic relationship is found between the plant mitochondria and members of the alpha subdivision of the purple photosynthetic bacteria, namely, Rb. sphaeroides, P. denitrificans, and Rhodospirillum rubrum.

  1. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  2. Sequencing RNA by a combination of exonuclease digestion and uridine specific chemical cleavage using MALDI-TOF.

    PubMed Central

    Tolson, D A; Nicholson, N H

    1998-01-01

    The determination of DNA sequences by partial exonuclease digestion followed by Matrix-Assisted Laser Desorption Time of Flight Mass Spectrometry (MALDI-TOF) is a well established method. When the same procedure is applied to RNA, difficulties arise due to the small (1 Da) mass difference between the nucleotides U and C, which makes unambiguous assignment difficult using a MALDI-TOF instrument. Here we report our experiences with sequence specific endonucleases and chemical methods followed by MALDI-TOF to resolve these sequence ambiguities. We have found chemical methods superior to endonucleases both in terms of correct specificity and extent of sequence coverage. This methodology can be used in combination with exonuclease digestion to rapidly assign RNA sequences. PMID:9421498

  3. Sequences within the 5' untranslated region regulate the levels of a kinetoplast DNA topoisomerase mRNA during the cell cycle.

    PubMed Central

    Pasion, S G; Hines, J C; Ou, X; Mahmood, R; Ray, D S

    1996-01-01

    Gene expression in trypanosomatids appears to be regulated largely at the posttranscriptional level and involves maturation of mRNA precursors by trans splicing of a 39-nucleotide miniexon sequence to the 5' end of the mRNA and cleavage and polyadenylation at the 3' end of the mRNA. To initiate the identification of sequences involved in the periodic expression of DNA replication genes in trypanosomatids, we have mapped splice acceptor sites in the 5' flanking region of the TOP2 gene, which encodes the kinetoplast DNA topoisomerase, and have carried out deletion analysis of this region on a plasmid-encoded TOP2 gene. Block deletions within the 5' untranslated region (UTR) identified two regions (-608 to -388 and -387 to -186) responsible for periodic accumulation of the mRNA. Deletion of one or the other of these sequences had no effect on periodic expression of the mRNA, while deletion of both regions resulted in constitutive expression of the mRNA throughout the cell cycle. Subcloning of these sequences into the 5' UTR of a construct lacking both regions of the TOP2 5' UTR has shown that an octamer consensus sequence present in the 5' UTR of the TOP2, RPA1, and DHFR-TS mRNAs is required for normal cycling of the TOP2 mRNA. Mutation of the consensus octamer sequence in the TOP2 5' UTR in a plasmid construct containing only a single consensus octamer and that shows normal cycling of the plasmid-encoded TOP2 mRNA resulted in substantial reduction of the cycling of the mRNA level. These results imply a negative regulation of TOP2 mRNA during the cell cycle by a mechanism involving redundant elements containing one or more copies of a conserved octamer sequence within the 5' UTR of TOP2 mRNA. PMID:8943327

  4. Sequences within the 5' untranslated region regulate the levels of a kinetoplast DNA topoisomerase mRNA during the cell cycle.

    PubMed

    Pasion, S G; Hines, J C; Ou, X; Mahmood, R; Ray, D S

    1996-12-01

    Gene expression in trypanosomatids appears to be regulated largely at the posttranscriptional level and involves maturation of mRNA precursors by trans splicing of a 39-nucleotide miniexon sequence to the 5' end of the mRNA and cleavage and polyadenylation at the 3' end of the mRNA. To initiate the identification of sequences involved in the periodic expression of DNA replication genes in trypanosomatids, we have mapped splice acceptor sites in the 5' flanking region of the TOP2 gene, which encodes the kinetoplast DNA topoisomerase, and have carried out deletion analysis of this region on a plasmid-encoded TOP2 gene. Block deletions within the 5' untranslated region (UTR) identified two regions (-608 to -388 and -387 to -186) responsible for periodic accumulation of the mRNA. Deletion of one or the other of these sequences had no effect on periodic expression of the mRNA, while deletion of both regions resulted in constitutive expression of the mRNA throughout the cell cycle. Subcloning of these sequences into the 5' UTR of a construct lacking both regions of the TOP2 5' UTR has shown that an octamer consensus sequence present in the 5' UTR of the TOP2, RPA1, and DHFR-TS mRNAs is required for normal cycling of the TOP2 mRNA. Mutation of the consensus octamer sequence in the TOP2 5' UTR in a plasmid construct containing only a single consensus octamer and that shows normal cycling of the plasmid-encoded TOP2 mRNA resulted in substantial reduction of the cycling of the mRNA level. These results imply a negative regulation of TOP2 mRNA during the cell cycle by a mechanism involving redundant elements containing one or more copies of a conserved octamer sequence within the 5' UTR of TOP2 mRNA.

  5. Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer

    DTIC Science & Technology

    2016-09-01

    AWARD NUMBER: W81XWH-14-1-0080 TITLE: Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer. PRINCIPAL INVESTIGATOR...PREPARED FOR: U.S. Army Medical Research and Materiel Command Fort Detrick, Maryland 21702-5012 DISTRIBUTION STATEMENT: Approved for Public Release...SUBTITLE Total RNA Sequencing Analysis of DCIS Progressing to Invasive Breast Cancer. 5a. CONTRACT NUMBER 5b. GRANT NUMBER W81XWH-14-1-0080 GRANT11489

  6. Structural insights into the recognition of the internal A-rich linker from OxyS sRNA by Escherichia coli Hfq

    PubMed Central

    Wang, Lijun; Wang, Weiwei; Li, Fudong; Zhang, Jiahai; Wu, Jihui; Gong, Qingguo; Shi, Yunyu

    2015-01-01

    Small RNA OxyS is induced during oxidative stress in Escherichia coli and it is an Hfq-dependent negative regulator of mRNA translation. OxyS represses the translation of fhlA and rpoS mRNA, which encode the transcriptional activator and σs subunit of RNA polymerase, respectively. However, little is known regarding how Hfq, an RNA chaperone, interacts with OxyS at the atomic level. Here, using fluorescence polarization and tryptophan fluorescence quenching assays, we verified that the A-rich linker region of OxyS sRNA binds Hfq at its distal side. We also report two crystal structures of Hfq in complex with A-rich RNA fragments from this linker region. Both of these RNA fragments bind to the distal side of Hfq and adopt a different conformation compared with those previously reported for the (A-R-N)n tripartite recognition motif. Furthermore, using fluorescence polarization, electrophoresis mobility shift assays and in vivo translation assays, we found that an Hfq mutant, N48A, increases the binding affinity of OxyS for Hfq in vitro but is defective in the negative regulation of fhlA translation in vivo, suggesting that the normal function of OxyS depends on the details of the interaction with Hfq that may be related to the rapid recycling of Hfq in the cell. PMID:25670676

  7. Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; Willson, Richard C.; Fox, George E.

    2002-01-01

    MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. It is not clear, however, the extent to which such signature sequences exist in the 16S rRNA dataset. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.

  8. RISC RNA sequencing for context-specific identification of in vivo miR targets

    PubMed Central

    Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

    2010-01-01

    Rationale MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. Objective To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). Methods and Results We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias, and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1,645 mRNAs consistently targeted to mouse cardiac RISCs. We employed this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing ‘seed’ sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. Conclusions RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context, and is applicable to any tissue and any disease state. Summary MicroRNAs (miRs) are key regulators of mRNA translation in health and disease. While bioinformatic predictions suggest that a single miR may target hundreds of mRNAs, the number of experimentally verified targets of miRs is low. To enable comprehensive, unbiased examination of miR targets, we have performed deep RNA sequencing of cardiac transcriptomes in parallel with cardiac RNA-induced silencing complex

  9. Sequence heterogeneity in the two 16S rRNA genes of Phormium yellow leaf phytoplasma.

    PubMed Central

    Liefting, L W; Andersen, M T; Beever, R E; Gardner, R C; Forster, R L

    1996-01-01

    Phormium yellow leaf (PYL) phytoplasma causes a lethal disease of the monocotyledon, New Zealand flax (Phormium tenax). The 16S rRNA genes of PYL phytoplasma were amplified from infected flax by PCR and cloned, and the nucleotide sequences were determined. DNA sequencing and Southern hybridization analysis of genomic DNA indicated the presence of two copies of the 16S rRNA gene. The two 16S rRNA genes exhibited sequence heterogeneity in 4 nucleotide positions and could be distinguished by the restriction enzymes BpmI and BsrI. This is the first record in which sequence heterogeneity in the 16S rRNA genes of a phytoplasma has been determined by sequence analysis. A phylogenetic tree based on 16S rRNA gene sequences showed that PYL phytoplasma is most closely related to the stolbur and German grapevine yellows phytoplasmas, which form the stolbur subgroup of the aster yellows group. This phylogenetic position of PYL phytoplasma was supported by 16S/23S spacer region sequence data. PMID:8795200

  10. Nucleotide Sequence Analysis of RNA Synthesized from Rabbit Globin Complementary DNA

    PubMed Central

    Poon, Raymond; Paddock, Gary V.; Heindell, Howard; Whitcome, Philip; Salser, Winston; Kacian, Dan; Bank, Arthur; Gambino, Roberto; Ramirez, Francesco

    1974-01-01

    Rabbit globin complementary DNA made with RNA-dependent DNA polymerase (reverse transcriptase) was used as template for in vitro synthesis of 32P-labeled RNA. The sequences of the nucleotides in most of the fragments resulting from combined ribonuclease T1 and alkaline phosphatase digestion have been determined. Several fragments were long enough to fit uniquely with the α or β globin amino-acid sequences. These data demonstrate that the cDNA was copied from globin mRNA and contained no detectable contaminants. Images PMID:4139714

  11. Simultaneous sequencing of coding and noncoding RNA reveals a human transcriptome dominated by a small number of highly expressed noncoding genes.

    PubMed

    Boivin, Vincent; Deschamps-Francoeur, Gabrielle; Couture, Sonia; Nottingham, Ryan M; Bouchard-Bourelle, Philia; Lambowitz, Alan M; Scott, Michelle S; Abou-Elela, Sherif

    2018-07-01

    Comparing the abundance of one RNA molecule to another is crucial for understanding cellular functions but most sequencing techniques can target only specific subsets of RNA. In this study, we used a new fragmented ribodepleted TGIRT sequencing method that uses a thermostable group II intron reverse transcriptase (TGIRT) to generate a portrait of the human transcriptome depicting the quantitative relationship of all classes of nonribosomal RNA longer than 60 nt. Comparison between different sequencing methods indicated that FRT is more accurate in ranking both mRNA and noncoding RNA than viral reverse transcriptase-based sequencing methods, even those that specifically target these species. Measurements of RNA abundance in different cell lines using this method correlate with biochemical estimates, confirming tRNA as the most abundant nonribosomal RNA biotype. However, the single most abundant transcript is 7SL RNA, a component of the signal recognition particle. S tructured n on c oding RNAs (sncRNAs) associated with the same biological process are expressed at similar levels, with the exception of RNAs with multiple functions like U1 snRNA. In general, sncRNAs forming RNPs are hundreds to thousands of times more abundant than their mRNA counterparts. Surprisingly, only 50 sncRNA genes produce half of the non-rRNA transcripts detected in two different cell lines. Together the results indicate that the human transcriptome is dominated by a small number of highly expressed sncRNAs specializing in functions related to translation and splicing. © 2018 Boivin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  12. Sequence-Based Prediction of RNA-Binding Residues in Proteins.

    PubMed

    Walia, Rasna R; El-Manzalawy, Yasser; Honavar, Vasant G; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.

  13. Sequence-Based Prediction of RNA-Binding Residues in Proteins

    PubMed Central

    Walia, Rasna R.; EL-Manzalawy, Yasser; Honavar, Vasant G.; Dobbs, Drena

    2017-01-01

    Identifying individual residues in the interfaces of protein–RNA complexes is important for understanding the molecular determinants of protein–RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein–RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein–RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner. PMID:27787829

  14. Species richness of arbuscular mycorrhizal fungi: associations with grassland plant richness and biomass.

    PubMed

    Hiiesalu, Inga; Pärtel, Meelis; Davison, John; Gerhold, Pille; Metsis, Madis; Moora, Mari; Öpik, Maarja; Vasar, Martti; Zobel, Martin; Wilson, Scott D

    2014-07-01

    Although experiments show a positive association between vascular plant and arbuscular mycorrhizal fungal (AMF) species richness, evidence from natural ecosystems is scarce. Furthermore, there is little knowledge about how AMF richness varies with belowground plant richness and biomass. We examined relationships among AMF richness, above- and belowground plant richness, and plant root and shoot biomass in a native North American grassland. Root-colonizing AMF richness and belowground plant richness were detected from the same bulk root samples by 454-sequencing of the AMF SSU rRNA and plant trnL genes. In total we detected 63 AMF taxa. Plant richness was 1.5 times greater belowground than aboveground. AMF richness was significantly positively correlated with plant species richness, and more strongly with below- than aboveground plant richness. Belowground plant richness was positively correlated with belowground plant biomass and total plant biomass, whereas aboveground plant richness was positively correlated only with belowground plant biomass. By contrast, AMF richness was negatively correlated with belowground and total plant biomass. Our results indicate that AMF richness and plant belowground richness are more strongly related with each other and with plant community biomass than with the plant aboveground richness measures that have been almost exclusively considered to date. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.

  15. RNA synthesis is modulated by G-quadruplex formation in Hepatitis C virus negative RNA strand.

    PubMed

    Chloé, Jaubert; Amina, Bedrat; Laura, Bartolucci; Carmelo, Di Primo; Michel, Ventura; Jean-Louis, Mergny; Samir, Amrane; Marie-Line, Andreola

    2018-05-25

    DNA and RNA guanine-rich oligonucleotides can form non-canonical structures called G-quadruplexes or "G4" that are based on the stacking of G-quartets. The role of DNA and RNA G4 is documented in eukaryotic cells and in pathogens such as viruses. Yet, G4 have been identified only in a few RNA viruses, including the Flaviviridae family. In this study, we analysed the last 157 nucleotides at the 3'end of the HCV (-) strand. This sequence is known to be the minimal sequence required for an efficient RNA replication. Using bioinformatics and biophysics, we identified a highly conserved G4-prone sequence located in the stem-loop IIy' of the negative strand. We also showed that the formation of this G-quadruplex inhibits the in vitro RNA synthesis by the RdRp. Furthermore, Phen-DC3, a specific G-quadruplex binder, is able to inhibit HCV viral replication in cells in conditions where no cytotoxicity was measured. Considering that this domain of the negative RNA strand is well conserved among HCV genotypes, G4 ligands could be of interest for new antiviral therapies.

  16. Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs.

    PubMed

    Hayashi, Tetsutaro; Ozaki, Haruka; Sasagawa, Yohei; Umeda, Mana; Danno, Hiroki; Nikaido, Itoshi

    2018-02-12

    Total RNA sequencing has been used to reveal poly(A) and non-poly(A) RNA expression, RNA processing and enhancer activity. To date, no method for full-length total RNA sequencing of single cells has been developed despite the potential of this technology for single-cell biology. Here we describe random displacement amplification sequencing (RamDA-seq), the first full-length total RNA-sequencing method for single cells. Compared with other methods, RamDA-seq shows high sensitivity to non-poly(A) RNA and near-complete full-length transcript coverage. Using RamDA-seq with differentiation time course samples of mouse embryonic stem cells, we reveal hundreds of dynamically regulated non-poly(A) transcripts, including histone transcripts and long noncoding RNA Neat1. Moreover, RamDA-seq profiles recursive splicing in >300-kb introns. RamDA-seq also detects enhancer RNAs and their cell type-specific activity in single cells. Taken together, we demonstrate that RamDA-seq could help investigate the dynamics of gene expression, RNA-processing events and transcriptional regulation in single cells.

  17. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    PubMed Central

    2013-01-01

    Background Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  18. Comparison of Species Richness Estimates Obtained Using Nearly Complete Fragments and Simulated Pyrosequencing-Generated Fragments in 16S rRNA Gene-Based Environmental Surveys▿ †

    PubMed Central

    Youssef, Noha; Sheik, Cody S.; Krumholz, Lee R.; Najar, Fares Z.; Roe, Bruce A.; Elshahed, Mostafa S.

    2009-01-01

    Pyrosequencing-based 16S rRNA gene surveys are increasingly utilized to study highly diverse bacterial communities, with special emphasis on utilizing the large number of sequences obtained (tens to hundreds of thousands) for species richness estimation. However, it is not yet clear how the number of operational taxonomic units (OTUs) and, hence, species richness estimates determined using shorter fragments at different taxonomic cutoffs correlates with the number of OTUs assigned using longer, nearly complete 16S rRNA gene fragments. We constructed a 16S rRNA clone library from an undisturbed tallgrass prairie soil (1,132 clones) and used it to compare species richness estimates obtained using eight pyrosequencing candidate fragments (99 to 361 bp in length) and the nearly full-length fragment. Fragments encompassing the V1 and V2 (V1+V2) region and the V6 region (generated using primer pairs 8F-338R and 967F-1046R) overestimated species richness; fragments encompassing the V3, V7, and V7+V8 hypervariable regions (generated using primer pairs 338F-530R, 1046F-1220R, and 1046F-1392R) underestimated species richness; and fragments encompassing the V4, V5+V6, and V6+V7 regions (generated using primer pairs 530F-805R, 805F-1046R, and 967F-1220R) provided estimates comparable to those obtained with the nearly full-length fragment. These patterns were observed regardless of the alignment method utilized or the parameter used to gauge comparative levels of species richness (number of OTUs observed, slope of scatter plots of pairwise distance values for short and nearly complete fragments, and nonparametric and parametric species richness estimates). Similar results were obtained when analyzing three other datasets derived from soil, adult Zebrafish gut, and basaltic formations in the East Pacific Rise. Regression analysis indicated that these observed discrepancies in species richness estimates within various regions could readily be explained by the proportions of

  19. Next-generation sequencing identifies the natural killer cell microRNA transcriptome

    PubMed Central

    Fehniger, Todd A.; Wylie, Todd; Germino, Elizabeth; Leong, Jeffrey W.; Magrini, Vincent J.; Koul, Sunita; Keppel, Catherine R.; Schneider, Stephanie E.; Koboldt, Daniel C.; Sullivan, Ryan P.; Heinz, Michael E.; Crosby, Seth D.; Nagarajan, Rakesh; Ramsingh, Giridharan; Link, Daniel C.; Ley, Timothy J.; Mardis, Elaine R.

    2010-01-01

    Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate the translation of their mRNA targets, and are therefore candidates for mediating this control process. While the expression and importance of miRNAs in T and B lymphocytes have been established, little is known about miRNAs in NK cells. Here, we used two next-generation sequencing (NGS) platforms to define the miRNA transcriptomes of resting and cytokine-activated primary murine NK cells, with confirmation by quantitative real-time PCR (qRT-PCR) and microarrays. We delineate a bioinformatics analysis pipeline that identified 302 known and 21 novel mature miRNAs from sequences obtained from NK cell small RNA libraries. These miRNAs are expressed over a broad range and exhibit isomiR complexity, and a subset is differentially expressed following cytokine activation. Using these miRNA NGS data, miR-223 was identified as a mature miRNA present in resting NK cells with decreased expression following cytokine activation. Furthermore, we demonstrate that miR-223 specifically targets the 3′ untranslated region of murine GzmB in vitro, indicating that this miRNA may contribute to control of GzmB translation in resting NK cells. Thus, the sequenced NK cell miRNA transcriptome provides a valuable framework for further elucidation of miRNA expression and function in NK cell biology. PMID:20935160

  20. Development and Verification of an RNA Sequencing (RNA-Seq) Assay for the Detection of Gene Fusions in Tumors.

    PubMed

    Winters, Jennifer L; Davila, Jaime I; McDonald, Amber M; Nair, Asha A; Fadra, Numrah; Wehrs, Rebecca N; Thomas, Brittany C; Balcom, Jessica R; Jin, Long; Wu, Xianglin; Voss, Jesse S; Klee, Eric W; Oliver, Gavin R; Graham, Rondell P; Neff, Jadee L; Rumilla, Kandelaria M; Aypar, Umut; Kipp, Benjamin R; Jenkins, Robert B; Jen, Jin; Halling, Kevin C

    2018-06-13

    We assessed the performance characteristics of an RNA sequencing (RNA-Seq) assay designed to detect gene fusions in 571 genes to help manage patients with cancer. Polyadenylated RNA was converted to cDNA, which was then used to prepare next-generation sequencing libraries that were sequenced on an Illumina HiSeq 2500 instrument and analyzed with an in-house developed bioinformatic pipeline. The assay identified 38 of 41 gene fusions detected by another method, such as fluorescence in situ hybridization or RT-PCR, for a sensitivity of 93%. No false-positive gene fusions were identified in 15 normal tissue specimens and 10 tumor specimens that were negative for fusions by RNA sequencing or Mate Pair NGS (100% specificity). The assay also identified 22 fusions in 17 tumor specimens that had not been detected by other methods. Eighteen of the 22 fusions had not previously been described. Good intra-assay and interassay reproducibility was observed with complete concordance for the presence or absence of gene fusions in replicates. The analytical sensitivity of the assay was tested by diluting RNA isolated from gene fusion-positive cases with fusion-negative RNA. Gene fusions were generally detectable down to 12.5% dilutions for most fusions and as little as 3% for some fusions. This assay can help identify fusions in patients with cancer; these patients may in turn benefit from both US Food and Drug Administration-approved and investigational targeted therapies. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  1. Structural Requirement in Clostridium perfringens Collagenase mRNA 5′ Leader Sequence for Translational Induction through Small RNA-mRNA Base Pairing

    PubMed Central

    Nomura, Nobuhiko; Nakamura, Kouji

    2013-01-01

    The Gram-positive anaerobic bacterium Clostridium perfringens is pathogenic to humans and animals, and the production of its toxins is strictly regulated during the exponential phase. We recently found that the 5′ leader sequence of the colA transcript encoding collagenase, which is a major toxin of this organism, is processed and stabilized in the presence of the small RNA VR-RNA. The primary colA 5′-untranslated region (5′UTR) forms a long stem-loop structure containing an internal bulge and masks its own ribosomal binding site. Here we found that VR-RNA directly regulates colA expression through base pairing with colA mRNA in vivo. However, when the internal bulge structure was closed by point mutations in colA mRNA, translation ceased despite the presence of VR-RNA. In addition, a mutation disrupting the colA stem-loop structure induced mRNA processing and ColA-FLAG translational activation in the absence of VR-RNA, indicating that the stem-loop and internal bulge structure of the colA 5′ leader sequence is important for regulation by VR-RNA. On the other hand, processing was required for maximal ColA expression but was not essential for VR-RNA-dependent colA regulation. Finally, colA processing and translational activation were induced at a high temperature without VR-RNA. These results suggest that inhibition of the colA 5′ leader structure through base pairing is the primary role of VR-RNA in colA regulation and that the colA 5′ leader structure is a possible thermosensor. PMID:23585542

  2. Small RNA Deep Sequencing and the Effects of microRNA408 on Root Gravitropic Bending in Arabidopsis

    NASA Astrophysics Data System (ADS)

    Li, Huasheng; Lu, Jinying; Sun, Qiao; Chen, Yu; He, Dacheng; Liu, Min

    2015-11-01

    MicroRNA (miRNA) is a non-coding small RNA composed of 20 to 24 nucleotides that influences plant root development. This study analyzed the miRNA expression in Arabidopsis root tip cells using Illumina sequencing and real-time PCR before (sample 0) and 15 min after (sample 15) a 3-D clinostat rotational treatment was administered. After stimulation was performed, the expression levels of seven miRNA genes, including Arabidopsis miR160, miR161, miR394, miR402, miR403, miR408, and miR823, were significantly upregulated. Illumina sequencing results also revealed two novel miRNAsthat have not been previously reported, The target genes of these miRNAs included pentatricopeptide repeat-containing protein and diadenosine tetraphosphate hydrolase. An overexpression vector of Arabidopsis miR408 was constructed and transferred to Arabidopsis plant. The roots of plants over expressing miR408 exhibited a slower reorientation upon gravistimulation in comparison with those of wild-type. This result indicate that miR408 could play a role in root gravitropic response.

  3. A highly efficient method for extracting next-generation sequencing quality RNA from adipose tissue of recalcitrant animal species.

    PubMed

    Sharma, Davinder; Golla, Naresh; Singh, Dheer; Onteru, Suneel K

    2018-03-01

    The next-generation sequencing (NGS) based RNA sequencing (RNA-Seq) and transcriptome profiling offers an opportunity to unveil complex biological processes. Successful RNA-Seq and transcriptome profiling requires a large amount of high-quality RNA. However, NGS-quality RNA isolation is extremely difficult from recalcitrant adipose tissue (AT) with high lipid content and low cell numbers. Further, the amount and biochemical composition of AT lipid varies depending upon the animal species which can pose different degree of resistance to RNA extraction. Currently available approaches may work effectively in one species but can be almost unproductive in another species. Herein, we report a two step protocol for the extraction of NGS quality RNA from AT across a broad range of animal species. © 2017 Wiley Periodicals, Inc.

  4. The maize stripe virus major noncapsid protein messenger RNA transcripts contain heterogeneous leader sequences at their 5' termini.

    PubMed

    Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W

    1993-12-01

    Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.

  5. Combinatory RNA-Sequencing Analyses Reveal a Dual Mode of Gene Regulation by ADAR1 in Gastric Cancer.

    PubMed

    Cho, Charles J; Jung, Jaeeun; Jiang, Lushang; Lee, Eun Ji; Kim, Dae-Soo; Kim, Byung Sik; Kim, Hee Sung; Jung, Hwoon-Yong; Song, Ho-June; Hwang, Sung Wook; Park, Yangsoon; Jung, Min Kyo; Pack, Chan Gi; Myung, Seung-Jae; Chang, Suhwan

    2018-04-25

    Adenosine deaminase acting on RNA 1 (ADAR1) is known to mediate deamination of adenosine-to-inosine through binding to double-stranded RNA, the phenomenon known as RNA editing. Currently, the function of ADAR1 in gastric cancer is unclear. This study was aimed at investigating RNA editing-dependent and editing-independent functions of ADAR1 in gastric cancer, especially focusing on its influence on editing of 3' untranslated regions (UTRs) and subsequent changes in expression of messenger RNAs (mRNAs) as well as microRNAs (miRNAs). RNA-sequencing and small RNA-sequencing were performed on AGS and MKN-45 cells with a stable ADAR1 knockdown. Changed frequencies of editing and mRNA and miRNA expression were then identified by bioinformatic analyses. Targets of RNA editing were further validated in patients' samples. In the Alu region of both gastric cell lines, editing was most commonly of the A-to-I type in 3'-UTR or intron. mRNA and protein levels of PHACTR4 increased in ADAR1 knockdown cells, because of the loss of seed sequences in 3'-UTR of PHACTR4 mRNA that are required for miRNA-196a-3p binding. Immunohistochemical analyses of tumor and paired normal samples from 16 gastric cancer patients showed that ADAR1 expression was higher in tumors than in normal tissues and inversely correlated with PHACTR4 staining. On the other hand, decreased miRNA-148a-3p expression in ADAR1 knockdown cells led to increased mRNA and protein expression of NFYA, demonstrating ADAR1's editing-independent function. ADAR1 regulates post-transcriptional gene expression in gastric cancer through both RNA editing-dependent and editing-independent mechanisms.

  6. Preparation of highly multiplexed small RNA sequencing libraries.

    PubMed

    Persson, Helena; Søkilde, Rolf; Pirona, Anna Chiara; Rovira, Carlos

    2017-08-01

    MicroRNAs (miRNAs) are ~22-nucleotide-long small non-coding RNAs that regulate the expression of protein-coding genes by base pairing to partially complementary target sites, preferentially located in the 3´ untranslated region (UTR) of target mRNAs. The expression and function of miRNAs have been extensively studied in human disease, as well as the possibility of using these molecules as biomarkers for prognostication and treatment guidance. To identify and validate miRNAs as biomarkers, their expression must be screened in large collections of patient samples. Here, we develop a scalable protocol for the rapid and economical preparation of a large number of small RNA sequencing libraries using dual indexing for multiplexing. Combined with the use of off-the-shelf reagents, more samples can be sequenced simultaneously on large-scale sequencing platforms at a considerably lower cost per sample. Sample preparation is simplified by pooling libraries prior to gel purification, which allows for the selection of a narrow size range while minimizing sample variation. A comparison with publicly available data from benchmarking of miRNA analysis platforms showed that this method captures absolute and differential expression as effectively as commercially available alternatives.

  7. Structural Analysis of Single-Point Mutations Given an RNA Sequence: A Case Study with RNAMute

    NASA Astrophysics Data System (ADS)

    Churkin, Alexander; Barash, Danny

    2006-12-01

    We introduce here for the first time the RNAMute package, a pattern-recognition-based utility to perform mutational analysis and detect vulnerable spots within an RNA sequence that affect structure. Mutations in these spots may lead to a structural change that directly relates to a change in functionality. Previously, the concept was tried on RNA genetic control elements called "riboswitches" and other known RNA switches, without an organized utility that analyzes all single-point mutations and can be further expanded. The RNAMute package allows a comprehensive categorization, given an RNA sequence that has functional relevance, by exploring the patterns of all single-point mutants. For illustration, we apply the RNAMute package on an RNA transcript for which individual point mutations were shown experimentally to inactivate spectinomycin resistance in Escherichia coli. Functional analysis of mutations on this case study was performed experimentally by creating a library of point mutations using PCR and screening to locate those mutations. With the availability of RNAMute, preanalysis can be performed computationally before conducting an experiment.

  8. Computer-Aided Design of RNA Origami Structures.

    PubMed

    Sparvath, Steffen L; Geary, Cody W; Andersen, Ebbe S

    2017-01-01

    RNA nanostructures can be used as scaffolds to organize, combine, and control molecular functionalities, with great potential for applications in nanomedicine and synthetic biology. The single-stranded RNA origami method allows RNA nanostructures to be folded as they are transcribed by the RNA polymerase. RNA origami structures provide a stable framework that can be decorated with functional RNA elements such as riboswitches, ribozymes, interaction sites, and aptamers for binding small molecules or protein targets. The rich library of RNA structural and functional elements combined with the possibility to attach proteins through aptamer-based binding creates virtually limitless possibilities for constructing advanced RNA-based nanodevices.In this chapter we provide a detailed protocol for the single-stranded RNA origami design method using a simple 2-helix tall structure as an example. The first step involves 3D modeling of a double-crossover between two RNA double helices, followed by decoration with tertiary motifs. The second step deals with the construction of a 2D blueprint describing the secondary structure and sequence constraints that serves as the input for computer programs. In the third step, computer programs are used to design RNA sequences that are compatible with the structure, and the resulting outputs are evaluated and converted into DNA sequences to order.

  9. Deep sequencing of foot-and-mouth disease virus reveals RNA sequences involved in genome packaging.

    PubMed

    Logan, Grace; Newman, Joseph; Wright, Caroline F; Lasecka-Dykes, Lidia; Haydon, Daniel T; Cottam, Eleanor M; Tuthill, Tobias J

    2017-10-18

    Non-enveloped viruses protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. Packaging and capsid assembly in RNA viruses can involve interactions between capsid proteins and secondary structures in the viral genome as exemplified by the RNA bacteriophage MS2 and as proposed for other RNA viruses of plants, animals and human. In the picornavirus family of non-enveloped RNA viruses, the requirements for genome packaging remain poorly understood. Here we show a novel and simple approach to identify predicted RNA secondary structures involved in genome packaging in the picornavirus foot-and-mouth disease virus (FMDV). By interrogating deep sequencing data generated from both packaged and unpackaged populations of RNA we have determined multiple regions of the genome with constrained variation in the packaged population. Predicted secondary structures of these regions revealed stem loops with conservation of structure and a common motif at the loop. Disruption of these features resulted in attenuation of virus growth in cell culture due to a reduction in assembly of mature virions. This study provides evidence for the involvement of predicted RNA structures in picornavirus packaging and offers a readily transferable methodology for identifying packaging requirements in many other viruses. Importance In order to transmit their genetic material to a new host, non-enveloped viruses must protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. For many non-enveloped RNA viruses the requirements for this critical part of the viral life cycle remain poorly understood. We have identified RNA sequences involved in genome packaging of the picornavirus foot-and-mouth disease virus. This virus causes an economically devastating disease of livestock affecting both the developed and developing world. The experimental methods developed to carry out this work are novel, simple and transferable to the

  10. Characterization of a tandemly repeated DNA sequence family originally derived by retroposition of tRNA(Glu) in the newt.

    PubMed

    Nagahashi, S; Endoh, H; Suzuki, Y; Okada, N

    1991-11-20

    A previous report from this laboratory showed that in vitro transcription of total genomic DNA of the newt Cynopus pyrrhogaster resulted in a discrete sized 8 S RNA, which represented highly repetitive and transcribable sequences with a glutamic acid tRNA-like structure in the newt genome. We isolated four independent clones from a newt genomic library and determined the complete sequences of three 2000 to 2400 base-pair PstI fragments spanning the 8 S RNA gene. The glutamic acid tRNA-related segment in the 8 S RNA gene contains the CCA sequence expected as the 3' terminus of a tRNA molecule. Further, the 11 nucleotides located 13 nucleotides upstream from one of the two transcription initiation sites of the 8 S RNA were found to be repeated in the region upstream from the termination site, suggesting that the original unit, which is shorter than the 8 S RNA, was retrotransposed via cDNA intermediates from the PolIII transcript. In the upstream region of the 8 S RNA gene, a 360 nucleotide unit containing the glutamic acid tRNA-related segment was found to be duplicated (clones NE1 and NE10) or triplicated (clone NE3). Except for the difference in the number of the 360 nucleotide unit, the three sequences of the 2000 to 2400 base-pair PstI fragment were essentially the same with only a few mutations and minor deletions. Inverse polymerase chain reaction and sequence determination of the products, together with a Southern hybridization experiment, demonstrated that the family consists of a tandemly repeated unit of 3300, 3700 or 4100 base-pairs. Thus during evolution, this family in the newt was created by retroposition via cDNA intermediates, followed by duplication or triplication of the 360 nucleotide unit and multiplication of the 3300 to 4100 base-pair region at the DNA level.

  11. Comparative analysis of bacteria associated with different mosses by 16S rRNA and 16S rDNA sequencing.

    PubMed

    Tian, Yang; Li, Yan Hong

    2017-01-01

    To understand the differences of the bacteria associated with different mosses, a phylogenetic study of bacterial communities in three mosses was carried out based on 16S rDNA and 16S rRNA sequencing. The mosses used were Hygroamblystegium noterophilum, Entodon compressus and Grimmia montana, representing hygrophyte, shady plant and xerophyte, respectively. In total, the operational taxonomic units (OTUs), richness and diversity were different regardless of the moss species and the library level. All the examined 1183 clones were assigned to 248 OTUs, 56 genera were assigned in rDNA libraries and 23 genera were determined at the rRNA level. Proteobacteria and Bacteroidetes were considered as the most dominant phyla in all the libraries, whereas abundant Actinobacteria and Acidobacteria were detected in the rDNA library of Entodon compressus and approximately 24.7% clones were assigned to Candidate division TM7 in Grimmia montana at rRNA level. The heatmap showed the bacterial profiles derived from rRNA and rDNA were partly overlapping. However, the principle component analysis of all the profiles derived from rDNA showed sharper differences between the different mosses than that of rRNA-based profiles. This suggests that the metabolically active bacterial compositions in different mosses were more phylogenetically similar and the differences of the bacteria associated with different mosses were mainly detected at the rDNA level. Obtained results clearly demonstrate that combination of 16S rDNA and 16S rRNA sequencing is preferred approach to have a good understanding on the constitution of the microbial communities in mosses. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. G quadruplex RNA structures in PSD-95 mRNA: potential regulators of miR-125a seed binding site accessibility

    PubMed Central

    Stefanovic, Snezana; Bassell, Gary J.

    2015-01-01

    Fragile X syndrome (FXS) is the most common inherited form of intellectual disability caused by the CGG trinucleotide expansion in the 3′-untranslated region of the FMR1 gene on the X chromosome, that silences the expression of the Fragile X mental retardation protein (FMRP). FMRP has been shown to bind to a G-rich region within the PSD-95 mRNA which encodes for the postsynaptic density protein 95 (PSD-95), and together with the microRNA miR-125a, to play an important role in the reversible inhibition of the PSD-95 mRNA translation in neurons. The loss of FMRP in Fmr1 KO mice disables this translation control in the production of the PSD-95 protein. Interestingly, the miR-125a binding site on PSD-95 mRNA is embedded in the G-rich region bound by FMRP and postulated to adopt one or more G quadruplex structures. In this study, we have used different biophysical techniques to validate and characterize the formation of parallel G quadruplex structures and binding of miR-125a to its complementary sequence located within the 3′ UTR of PSD-95 mRNA. Our results indicate that the PSD-95 mRNA G-rich region folds into alternate G quadruplex conformations that coexist in equilibrium. miR-125a forms a stable complex with PSD-95 mRNA, as evident by characteristic Watson–Crick base-pairing that coexists with one of the G quadruplex forms, suggesting a novel mechanism for G quadruplex structures to regulate the access of miR-125a to its binding site. PMID:25406362

  13. Accurate identification of RNA editing sites from primitive sequence with deep neural networks.

    PubMed

    Ouyang, Zhangyi; Liu, Feng; Zhao, Chenghui; Ren, Chao; An, Gaole; Mei, Chuan; Bo, Xiaochen; Shu, Wenjie

    2018-04-16

    RNA editing is a post-transcriptional RNA sequence alteration. Current methods have identified editing sites and facilitated research but require sufficient genomic annotations and prior-knowledge-based filtering steps, resulting in a cumbersome, time-consuming identification process. Moreover, these methods have limited generalizability and applicability in species with insufficient genomic annotations or in conditions of limited prior knowledge. We developed DeepRed, a deep learning-based method that identifies RNA editing from primitive RNA sequences without prior-knowledge-based filtering steps or genomic annotations. DeepRed achieved 98.1% and 97.9% area under the curve (AUC) in training and test sets, respectively. We further validated DeepRed using experimentally verified U87 cell RNA-seq data, achieving 97.9% positive predictive value (PPV). We demonstrated that DeepRed offers better prediction accuracy and computational efficiency than current methods with large-scale, mass RNA-seq data. We used DeepRed to assess the impact of multiple factors on editing identification with RNA-seq data from the Association of Biomolecular Resource Facilities and Sequencing Quality Control projects. We explored developmental RNA editing pattern changes during human early embryogenesis and evolutionary patterns in Drosophila species and the primate lineage using DeepRed. Our work illustrates DeepRed's state-of-the-art performance; it may decipher the hidden principles behind RNA editing, making editing detection convenient and effective.

  14. Genome-wide identification and analysis of A-to-I RNA editing events in bovine by transcriptome sequencing

    PubMed Central

    Salehi, Abdolreza; Rivera, Rocío Melissa

    2018-01-01

    RNA editing increases the diversity of the transcriptome and proteome. Adenosine-to-inosine (A-to-I) editing is the predominant type of RNA editing in mammals and it is catalyzed by the adenosine deaminases acting on RNA (ADARs) family. Here, we used a largescale computational analysis of transcriptomic data from brain, heart, colon, lung, spleen, kidney, testes, skeletal muscle and liver, from three adult animals in order to identify RNA editing sites in bovine. We developed a computational pipeline and used a rigorous strategy to identify novel editing sites from RNA-Seq data in the absence of corresponding DNA sequence information. Our methods take into account sequencing errors, mapping bias, as well as biological replication to reduce the probability of obtaining a false-positive result. We conducted a detailed characterization of sequence and structural features related to novel candidate sites and found 1,600 novel canonical A-to-I editing sites in the nine bovine tissues analyzed. Results show that these sites 1) occur frequently in clusters and short interspersed nuclear elements (SINE) repeats, 2) have a preference for guanines depletion/enrichment in the flanking 5′/3′ nucleotide, 3) occur less often in coding sequences than other regions of the genome, and 4) have low evolutionary conservation. Further, we found that a positive correlation exists between expression of ADAR family members and tissue-specific RNA editing. Most of the genes with predicted A-to-I editing in each tissue were significantly enriched in biological terms relevant to the function of the corresponding tissue. Lastly, the results highlight the importance of the RNA editome in nervous system regulation. The present study extends the list of RNA editing sites in bovine and provides pipelines that may be used to investigate the editome in other organisms. PMID:29470549

  15. Cold shock domain proteins and glycine-rich RNA-binding proteins from Arabidopsis thaliana can promote the cold adaptation process in Escherichia coli

    PubMed Central

    Kim, Jin Sun; Park, Su Jung; Kwak, Kyung Jin; Kim, Yeon Ok; Kim, Joo Yeol; Song, Jinkyung; Jang, Boseung; Jung, Che-Hun; Kang, Hunseung

    2007-01-01

    Despite the fact that cold shock domain proteins (CSDPs) and glycine-rich RNA-binding proteins (GRPs) have been implicated to play a role during the cold adaptation process, their importance and function in eukaryotes, including plants, are largely unknown. To understand the functional role of plant CSDPs and GRPs in the cold response, two CSDPs (CSDP1 and CSDP2) and three GRPs (GRP2, GRP4 and GRP7) from Arabidopsis thaliana were investigated. Heterologous expression of CSDP1 or GRP7 complemented the cold sensitivity of BX04 mutant Escherichia coli that lack four cold shock proteins (CSPs) and is highly sensitive to cold stress, and resulted in better survival rate than control cells during incubation at low temperature. In contrast, CSDP2 and GRP4 had very little ability. Selective evolution of ligand by exponential enrichment (SELEX) revealed that GRP7 does not recognize specific RNAs but binds preferentially to G-rich RNA sequences. CSDP1 and GRP7 had DNA melting activity, and enhanced RNase activity. In contrast, CSDP2 and GRP4 had no DNA melting activity and did not enhance RNAase activity. Together, these results indicate that CSDPs and GRPs help E.coli grow and survive better during cold shock, and strongly imply that CSDP1 and GRP7 exhibit RNA chaperone activity during the cold adaptation process. PMID:17169986

  16. A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy.

    PubMed

    Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng

    2017-05-10

    Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. Despite

  17. Evaluation of nearest-neighbor methods for detection of chimeric small-subunit rRNA sequences

    NASA Technical Reports Server (NTRS)

    Robison-Cox, J. F.; Bateson, M. M.; Ward, D. M.

    1995-01-01

    Detection of chimeric artifacts formed when PCR is used to retrieve naturally occurring small-subunit (SSU) rRNA sequences may rely on demonstrating that different sequence domains have different phylogenetic affiliations. We evaluated the CHECK_CHIMERA method of the Ribosomal Database Project and another method which we developed, both based on determining nearest neighbors of different sequence domains, for their ability to discern artificially generated SSU rRNA chimeras from authentic Ribosomal Database Project sequences. The reliability of both methods decreases when the parental sequences which contribute to chimera formation are more than 82 to 84% similar. Detection is also complicated by the occurrence of authentic SSU rRNA sequences that behave like chimeras. We developed a naive statistical test based on CHECK_CHIMERA output and used it to evaluate previously reported SSU rRNA chimeras. Application of this test also suggests that chimeras might be formed by retrieving SSU rRNAs as cDNA. The amount of uncertainty associated with nearest-neighbor analyses indicates that such tests alone are insufficient and that better methods are needed.

  18. tRNADB-CE: tRNA gene database well-timed in the era of big sequence data.

    PubMed

    Abe, Takashi; Inokuchi, Hachiro; Yamada, Yuko; Muto, Akira; Iwasaki, Yuki; Ikemura, Toshimichi

    2014-01-01

    The tRNA gene data base curated by experts "tRNADB-CE" (http://trna.ie.niigata-u.ac.jp) was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, 171 viruses', 121 chloroplasts', and 12 eukaryotes' genomes plus fragment sequences obtained by metagenome studies of environmental samples. 595,115 tRNA genes in total, and thus two times of genes compiled previously, have been registered, for which sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To provide collective knowledge with help from experts in tRNA researches, we added a column for enregistering comments to each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many species-unknown tRNAs from metagenomic sequences have sequences identical to those found in species-known prokaryotes, the identical sequence group (ISG) can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to a huge amount of short sequences obtained from next-generation sequencers, as showing that tRNADB-CE is a well-timed database in the era of big sequence data. It is also discussed that batch-learning self-organizing-map with oligonucleotide composition is useful for efficient knowledge discovery from big sequence data.

  19. G quadruplex RNA structures in PSD-95 mRNA: potential regulators of miR-125a seed binding site accessibility.

    PubMed

    Stefanovic, Snezana; Bassell, Gary J; Mihailescu, Mihaela Rita

    2015-01-01

    Fragile X syndrome (FXS) is the most common inherited form of intellectual disability caused by the CGG trinucleotide expansion in the 3'-untranslated region of the FMR1 gene on the X chromosome, that silences the expression of the Fragile X mental retardation protein (FMRP). FMRP has been shown to bind to a G-rich region within the PSD-95 mRNA which encodes for the postsynaptic density protein 95 (PSD-95), and together with the microRNA miR-125a, to play an important role in the reversible inhibition of the PSD-95 mRNA translation in neurons. The loss of FMRP in Fmr1 KO mice disables this translation control in the production of the PSD-95 protein. Interestingly, the miR-125a binding site on PSD-95 mRNA is embedded in the G-rich region bound by FMRP and postulated to adopt one or more G quadruplex structures. In this study, we have used different biophysical techniques to validate and characterize the formation of parallel G quadruplex structures and binding of miR-125a to its complementary sequence located within the 3' UTR of PSD-95 mRNA. Our results indicate that the PSD-95 mRNA G-rich region folds into alternate G quadruplex conformations that coexist in equilibrium. miR-125a forms a stable complex with PSD-95 mRNA, as evident by characteristic Watson-Crick base-pairing that coexists with one of the G quadruplex forms, suggesting a novel mechanism for G quadruplex structures to regulate the access of miR-125a to its binding site. © 2014 Stefanovic et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  20. The RNase P RNA from cyanobacteria: short tandemly repeated repetitive (STRR) sequences are present within the RNase P RNA gene in heterocyst-forming cyanobacteria.

    PubMed Central

    Vioque, A

    1997-01-01

    The RNase P RNA gene (rnpB) from 10 cyanobacteria has been characterized. These new RNAs, together with the previously available ones, provide a comprehensive data set of RNase P RNA from diverse cyanobacterial lineages. All heterocystous cyanobacteria, but none of the non-heterocystous strains analyzed, contain short tandemly repeated repetitive (STRR) sequences that increase the length of helix P12. Site-directed mutagenesis experiments indicate that the STRR sequences are not required for catalytic activity in vitro. STRR sequences seem to have recently and independently invaded the RNase P RNA genes in heterocyst-forming cyanobacteria because closely related strains contain unrelated STRR sequences. Most cyanobacteria RNase P RNAs lack the sequence GGU in the loop connecting helices P15 and P16 that has been established to interact with the 3'-end CCA in precursor tRNA substrates in other bacteria. This character is shared with plastid RNase P RNA. Helix P6 is longer than usual in most cyanobacteria as well as in plastid RNase P RNA. PMID:9254706

  1. Expression cloning and characterization of a novel gene that encodes the RNA-binding protein FAU-1 from Pyrococcus furiosus.

    PubMed Central

    Kanai, Akio; Oida, Hanako; Matsuura, Nana; Doi, Hirofumi

    2003-01-01

    We systematically screened a genomic DNA library to identify proteins of the hyperthermophilic archaeon Pyrococcus furiosus using an expression cloning method. One gene product, which we named FAU-1 (P. furiosus AU-binding), demonstrated the strongest binding activity of all the genomic library-derived proteins tested against an AU-rich RNA sequence. The protein was purified to near homogeneity as a 54 kDa single polypeptide, and the gene locus corresponding to this FAU-1 activity was also sequenced. The FAU-1 gene encoded a 472-amino-acid protein that was characterized by highly charged domains consisting of both acidic and basic amino acids. The N-terminal half of the gene had a degree of similarity (25%) with RNase E from Escherichia coli. Five rounds of RNA-binding-site selection and footprinting analysis showed that the FAU-1 protein binds specifically to the AU-rich sequence in a loop region of a possible RNA ligand. Moreover, we demonstrated that the FAU-1 protein acts as an oligomer, and mainly as a trimer. These results showed that the FAU-1 protein is a novel heat-stable protein with an RNA loop-binding characteristic. PMID:12614195

  2. Definition of a high-affinity Gag recognition structure mediating packaging of a retroviral RNA genome

    PubMed Central

    Gherghe, Cristina; Lombo, Tania; Leonard, Christopher W.; Datta, Siddhartha A. K.; Bess, Julian W.; Gorelick, Robert J.; Rein, Alan; Weeks, Kevin M.

    2010-01-01

    All retroviral genomic RNAs contain a cis-acting packaging signal by which dimeric genomes are selectively packaged into nascent virions. However, it is not understood how Gag (the viral structural protein) interacts with these signals to package the genome with high selectivity. We probed the structure of murine leukemia virus RNA inside virus particles using SHAPE, a high-throughput RNA structure analysis technology. These experiments showed that NC (the nucleic acid binding domain derived from Gag) binds within the virus to the sequence UCUG-UR-UCUG. Recombinant Gag and NC proteins bound to this same RNA sequence in dimeric RNA in vitro; in all cases, interactions were strongest with the first U and final G in each UCUG element. The RNA structural context is critical: High-affinity binding requires base-paired regions flanking this motif, and two UCUG-UR-UCUG motifs are specifically exposed in the viral RNA dimer. Mutating the guanosine residues in these two motifs—only four nucleotides per genomic RNA—reduced packaging 100-fold, comparable to the level of nonspecific packaging. These results thus explain the selective packaging of dimeric RNA. This paradigm has implications for RNA recognition in general, illustrating how local context and RNA structure can create information-rich recognition signals from simple single-stranded sequence elements in large RNAs. PMID:20974908

  3. Self-Assembly of Measles Virus Nucleocapsid-like Particles: Kinetics and RNA Sequence Dependence.

    PubMed

    Milles, Sigrid; Jensen, Malene Ringkjøbing; Communie, Guillaume; Maurin, Damien; Schoehn, Guy; Ruigrok, Rob W H; Blackledge, Martin

    2016-08-01

    Measles virus RNA genomes are packaged into helical nucleocapsids (NCs), comprising thousands of nucleo-proteins (N) that bind the entire genome. N-RNA provides the template for replication and transcription by the viral polymerase and is a promising target for viral inhibition. Elucidation of mechanisms regulating this process has been severely hampered by the inability to controllably assemble NCs. Here, we demonstrate self-organization of N into NC-like particles in vitro upon addition of RNA, providing a simple and versatile tool for investigating assembly. Real-time NMR and fluorescence spectroscopy reveals biphasic assembly kinetics. Remarkably, assembly depends strongly on the RNA-sequence, with the genomic 5' end and poly-Adenine sequences assembling efficiently, while sequences such as poly-Uracil are incompetent for NC formation. This observation has important consequences for understanding the assembly process. © 2016 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  4. Different modes of interaction by TIAR and HuR with target RNA and DNA

    PubMed Central

    Kim, Henry S.; Wilce, Matthew C. J.; Yoga, Yano M. K.; Pendini, Nicole R.; Gunzburg, Menachem J.; Cowieson, Nathan P.; Wilson, Gerald M.; Williams, Bryan R. G.; Gorospe, Myriam; Wilce, Jacqueline A.

    2011-01-01

    TIAR and HuR are mRNA-binding proteins that play important roles in the regulation of translation. They both possess three RNA recognition motifs (RRMs) and bind to AU-rich elements (AREs), with seemingly overlapping specificity. Here we show using SPR that TIAR and HuR bind to both U-rich and AU-rich RNA in the nanomolar range, with higher overall affinity for U-rich RNA. However, the higher affinity for U–rich sequences is mainly due to faster association with U-rich RNA, which we propose is a reflection of the higher probability of association. Differences between TIAR and HuR are observed in their modes of binding to RNA. TIAR is able to bind deoxy-oligonucleotides with nanomolar affinity, whereas HuR affinity is reduced to a micromolar level. Studies with U-rich DNA reveal that TIAR binding depends less on the 2′-hydroxyl group of RNA than HuR binding. Finally we show that SAXS data, recorded for the first two domains of TIAR in complex with RNA, are more consistent with a flexible, elongated shape and not the compact shape that the first two domains of Hu proteins adopt upon binding to RNA. We thus propose that these triple-RRM proteins, which compete for the same binding sites in cells, interact with their targets in fundamentally different ways. PMID:21233170

  5. Different modes of interaction by TIAR and HuR with target RNA and DNA.

    PubMed

    Kim, Henry S; Wilce, Matthew C J; Yoga, Yano M K; Pendini, Nicole R; Gunzburg, Menachem J; Cowieson, Nathan P; Wilson, Gerald M; Williams, Bryan R G; Gorospe, Myriam; Wilce, Jacqueline A

    2011-02-01

    TIAR and HuR are mRNA-binding proteins that play important roles in the regulation of translation. They both possess three RNA recognition motifs (RRMs) and bind to AU-rich elements (AREs), with seemingly overlapping specificity. Here we show using SPR that TIAR and HuR bind to both U-rich and AU-rich RNA in the nanomolar range, with higher overall affinity for U-rich RNA. However, the higher affinity for U-rich sequences is mainly due to faster association with U-rich RNA, which we propose is a reflection of the higher probability of association. Differences between TIAR and HuR are observed in their modes of binding to RNA. TIAR is able to bind deoxy-oligonucleotides with nanomolar affinity, whereas HuR affinity is reduced to a micromolar level. Studies with U-rich DNA reveal that TIAR binding depends less on the 2'-hydroxyl group of RNA than HuR binding. Finally we show that SAXS data, recorded for the first two domains of TIAR in complex with RNA, are more consistent with a flexible, elongated shape and not the compact shape that the first two domains of Hu proteins adopt upon binding to RNA. We thus propose that these triple-RRM proteins, which compete for the same binding sites in cells, interact with their targets in fundamentally different ways.

  6. Single-Cell RNA Sequencing of Glioblastoma Cells.

    PubMed

    Sen, Rajeev; Dolgalev, Igor; Bayin, N Sumru; Heguy, Adriana; Tsirigos, Aris; Placantonakis, Dimitris G

    2018-01-01

    Single-cell RNA sequencing (sc-RNASeq) is a recently developed technique used to evaluate the transcriptome of individual cells. As opposed to conventional RNASeq in which entire populations are sequenced in bulk, sc-RNASeq can be beneficial when trying to better understand gene expression patterns in markedly heterogeneous populations of cells or when trying to identify transcriptional signatures of rare cells that may be underrepresented when using conventional bulk RNASeq. In this method, we describe the generation and analysis of cDNA libraries from single patient-derived glioblastoma cells using the C1 Fluidigm system. The protocol details the use of the C1 integrated fluidics circuit (IFC) for capturing, imaging and lysing cells; performing reverse transcription; and generating cDNA libraries that are ready for sequencing and analysis.

  7. Suitability of partial 16S ribosomal RNA gene sequence analysis for the identification of dangerous bacterial pathogens.

    PubMed

    Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F

    2007-03-01

    In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.

  8. RnaSeqSampleSize: real data based sample size estimation for RNA sequencing.

    PubMed

    Zhao, Shilin; Li, Chung-I; Guo, Yan; Sheng, Quanhu; Shyr, Yu

    2018-05-30

    One of the most important and often neglected components of a successful RNA sequencing (RNA-Seq) experiment is sample size estimation. A few negative binomial model-based methods have been developed to estimate sample size based on the parameters of a single gene. However, thousands of genes are quantified and tested for differential expression simultaneously in RNA-Seq experiments. Thus, additional issues should be carefully addressed, including the false discovery rate for multiple statistic tests, widely distributed read counts and dispersions for different genes. To solve these issues, we developed a sample size and power estimation method named RnaSeqSampleSize, based on the distributions of gene average read counts and dispersions estimated from real RNA-seq data. Datasets from previous, similar experiments such as the Cancer Genome Atlas (TCGA) can be used as a point of reference. Read counts and their dispersions were estimated from the reference's distribution; using that information, we estimated and summarized the power and sample size. RnaSeqSampleSize is implemented in R language and can be installed from Bioconductor website. A user friendly web graphic interface is provided at http://cqs.mc.vanderbilt.edu/shiny/RnaSeqSampleSize/ . RnaSeqSampleSize provides a convenient and powerful way for power and sample size estimation for an RNAseq experiment. It is also equipped with several unique features, including estimation for interested genes or pathway, power curve visualization, and parameter optimization.

  9. Genome-wide evidence for local DNA methylation spreading from small RNA-targeted sequences in Arabidopsis.

    PubMed

    Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi

    2011-09-01

    Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.

  10. visnormsc: A Graphical User Interface to Normalize Single-cell RNA Sequencing Data.

    PubMed

    Tang, Lijun; Zhou, Nan

    2017-12-26

    Single-cell RNA sequencing (RNA-seq) allows the analysis of gene expression with high resolution. The intrinsic defects of this promising technology imports technical noise into the single-cell RNA-seq data, increasing the difficulty of accurate downstream inference. Normalization is a crucial step in single-cell RNA-seq data pre-processing. SCnorm is an accurate and efficient method that can be used for this purpose. An R implementation of this method is currently available. On one hand, the R package possesses many excellent features from R. On the other hand, R programming ability is required, which prevents the biologists who lack the skills from learning to use it quickly. To make this method more user-friendly, we developed a graphical user interface, visnormsc, for normalization of single-cell RNA-seq data. It is implemented in Python and is freely available at https://github.com/solo7773/visnormsc . Although visnormsc is based on the existing method, it contributes to this field by offering a user-friendly alternative. The out-of-the-box and cross-platform features make visnormsc easy to learn and to use. It is expected to serve biologists by simplifying single-cell RNA-seq normalization.

  11. The technology and biology of single-cell RNA sequencing.

    PubMed

    Kolodziejczyk, Aleksandra A; Kim, Jong Kyoung; Svensson, Valentine; Marioni, John C; Teichmann, Sarah A

    2015-05-21

    The differences between individual cells can have profound functional consequences, in both unicellular and multicellular organisms. Recently developed single-cell mRNA-sequencing methods enable unbiased, high-throughput, and high-resolution transcriptomic analysis of individual cells. This provides an additional dimension to transcriptomic information relative to traditional methods that profile bulk populations of cells. Already, single-cell RNA-sequencing methods have revealed new biology in terms of the composition of tissues, the dynamics of transcription, and the regulatory relationships between genes. Rapid technological developments at the level of cell capture, phenotyping, molecular biology, and bioinformatics promise an exciting future with numerous biological and medical applications. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Reanalysis of RNA-Sequencing Data Reveals Several Additional Fusion Genes with Multiple Isoforms

    PubMed Central

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts. PMID:23119097

  13. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    PubMed

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  14. RNA sequencing confirms similarities between PPI-responsive oesophageal eosinophilia and eosinophilic oesophagitis.

    PubMed

    Peterson, K A; Yoshigi, M; Hazel, M W; Delker, D A; Lin, E; Krishnamurthy, C; Consiglio, N; Robson, J; Yandell, M; Clayton, F

    2018-06-04

    Although current American guidelines distinguish proton pump inhibitor-responsive oesophageal eosinophilia (PPI-REE) from eosinophilic oesophagitis (EoE), these entities are broadly similar. While two microarray studies showed that they have similar transcriptomes, more extensive RNA sequencing studies have not been done previously. To determine whether RNA sequencing identifies genetic markers distinguishing PPI-REE from EoE. We retrospectively examined 13 PPI-REE and 14 EoE biopsies, matched for tissue eosinophil content, and 14 normal controls. Patients and controls were not PPI-treated at the time of biopsy. We did RNA sequencing on formalin-fixed, paraffin-embedded tissue, with differential expression confirmation by quantitative polymerase chain reaction (PCR). We validated the use of formalin-fixed, paraffin-embedded vs RNAlater-preserved tissue, and compared our formalin-fixed, paraffin-embedded EoE results to a prior EoE study. By RNA sequencing, no genes were differentially expressed between the EoE and PPI-REE groups at the false discovery rate (FDR) ≤0.01 level. Compared to normal controls, 1996 genes were differentially expressed in the PPI-REE group and 1306 genes in the EoE group. By less stringent criteria, only MAPK8IP2 was differentially expressed between PPI-REE and EoE (FDR = 0.029, 2.2-fold less in EoE than in PPI-REE), with similar results by PCR. KCNJ2, which was differentially expressed in a prior study, was similar in the EoE and PPI-REE groups by both RNA sequencing and real-time PCR. Eosinophilic oesophagitis and PPI-REE have comparable transcriptomes, confirming that they are part of the same disease continuum. © 2018 John Wiley & Sons Ltd.

  15. Adenylylation of small RNA sequencing adapters using the TS2126 RNA ligase I.

    PubMed

    Lama, Lodoe; Ryan, Kevin

    2016-01-01

    Many high-throughput small RNA next-generation sequencing protocols use 5' preadenylylated DNA oligonucleotide adapters during cDNA library preparation. Preadenylylation of the DNA adapter's 5' end frees from ATP-dependence the ligation of the adapter to RNA collections, thereby avoiding ATP-dependent side reactions. However, preadenylylation of the DNA adapters can be costly and difficult. The currently available method for chemical adenylylation of DNA adapters is inefficient and uses techniques not typically practiced in laboratories profiling cellular RNA expression. An alternative enzymatic method using a commercial RNA ligase was recently introduced, but this enzyme works best as a stoichiometric adenylylating reagent rather than a catalyst and can therefore prove costly when several variant adapters are needed or during scale-up or high-throughput adenylylation procedures. Here, we describe a simple, scalable, and highly efficient method for the 5' adenylylation of DNA oligonucleotides using the thermostable RNA ligase 1 from bacteriophage TS2126. Adapters with 3' blocking groups are adenylylated at >95% yield at catalytic enzyme-to-adapter ratios and need not be gel purified before ligation to RNA acceptors. Experimental conditions are also reported that enable DNA adapters with free 3' ends to be 5' adenylylated at >90% efficiency. © 2015 Lama and Ryan; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  16. Gel-seq: A Method for Simultaneous Sequencing Library Preparation of DNA and RNA Using Hydrogel Matrices.

    PubMed

    Hoople, Gordon D; Richards, Andrew; Wu, Yan; Pisano, Albert P; Zhang, Kun

    2018-03-26

    The ability to amplify and sequence either DNA or RNA from small starting samples has only been achieved in the last five years. Unfortunately, the standard protocols for generating genomic or transcriptomic libraries are incompatible and researchers must choose whether to sequence DNA or RNA for a particular sample. Gel-seq solves this problem by enabling researchers to simultaneously prepare libraries for both DNA and RNA starting with 100 - 1000 cells using a simple hydrogel device. This paper presents a detailed approach for the fabrication of the device as well as the biological protocol to generate paired libraries. We designed Gel-seq so that it could be easily implemented by other researchers; many genetics labs already have the necessary equipment to reproduce the Gel-seq device fabrication. Our protocol employs commonly-used kits for both whole-transcript amplification (WTA) and library preparation, which are also likely to be familiar to researchers already versed in generating genomic and transcriptomic libraries. Our approach allows researchers to bring to bear the power of both DNA and RNA sequencing on a single sample without splitting and with negligible added cost.

  17. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

    PubMed

    Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

    2018-03-09

    High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.

  18. High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases

    PubMed Central

    Qin, Yidan; Yao, Jun; Wu, Douglas C.; Nottingham, Ryan M.; Mohr, Sabine; Hunicke-Smith, Scott; Lambowitz, Alan M.

    2016-01-01

    Next-generation RNA-sequencing (RNA-seq) has revolutionized transcriptome profiling, gene expression analysis, and RNA-based diagnostics. Here, we developed a new RNA-seq method that exploits thermostable group II intron reverse transcriptases (TGIRTs) and used it to profile human plasma RNAs. TGIRTs have higher thermostability, processivity, and fidelity than conventional reverse transcriptases, plus a novel template-switching activity that can efficiently attach RNA-seq adapters to target RNA sequences without RNA ligation. The new TGIRT-seq method enabled construction of RNA-seq libraries from <1 ng of plasma RNA in <5 h. TGIRT-seq of RNA in 1-mL plasma samples from a healthy individual revealed RNA fragments mapping to a diverse population of protein-coding gene and long ncRNAs, which are enriched in intron and antisense sequences, as well as nearly all known classes of small ncRNAs, some of which have never before been seen in plasma. Surprisingly, many of the small ncRNA species were present as full-length transcripts, suggesting that they are protected from plasma RNases in ribonucleoprotein (RNP) complexes and/or exosomes. This TGIRT-seq method is readily adaptable for profiling of whole-cell, exosomal, and miRNAs, and for related procedures, such as HITS-CLIP and ribosome profiling. PMID:26554030

  19. Discovery of DNA viruses in wild-caught mosquitoes using small RNA high throughput sequencing.

    PubMed

    Ma, Maijuan; Huang, Yong; Gong, Zhengda; Zhuang, Lu; Li, Cun; Yang, Hong; Tong, Yigang; Liu, Wei; Cao, Wuchun

    2011-01-01

    Mosquito-borne infectious diseases pose a severe threat to public health in many areas of the world. Current methods for pathogen detection and surveillance are usually dependent on prior knowledge of the etiologic agents involved. Hence, efficient approaches are required for screening wild mosquito populations to detect known and unknown pathogens. In this study, we explored the use of Next Generation Sequencing to identify viral agents in wild-caught mosquitoes. We extracted total RNA from different mosquito species from South China. Small 18-30 bp length RNA molecules were purified, reverse-transcribed into cDNA and sequenced using Illumina GAIIx instrumentation. Bioinformatic analyses to identify putative viral agents were conducted and the results confirmed by PCR. We identified a non-enveloped single-stranded DNA densovirus in the wild-caught Culex pipiens molestus mosquitoes. The majority of the viral transcripts (.>80% of the region) were covered by the small viral RNAs, with a few peaks of very high coverage obtained. The +/- strand sequence ratio of the small RNAs was approximately 7∶1, indicating that the molecules were mainly derived from the viral RNA transcripts. The small viral RNAs overlapped, enabling contig assembly of the viral genome sequence. We identified some small RNAs in the reverse repeat regions of the viral 5'- and 3' -untranslated regions where no transcripts were expected. Our results demonstrate for the first time that high throughput sequencing of small RNA is feasible for identifying viral agents in wild-caught mosquitoes. Our results show that it is possible to detect DNA viruses by sequencing the small RNAs obtained from insects, although the underlying mechanism of small viral RNA biogenesis is unclear. Our data and those of other researchers show that high throughput small RNA sequencing can be used for pathogen surveillance in wild mosquito vectors.

  20. Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

    PubMed

    Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

    2017-11-28

    Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.

  1. Specific binding of a HeLa cell nuclear protein to RNA sequences in the human immunodeficiency virus transactivating region.

    PubMed Central

    Gaynor, R; Soultanakis, E; Kuwabara, M; Garcia, J; Sigman, D S

    1989-01-01

    The transactivator protein, tat, encoded by the human immunodeficiency virus is a key regulator of viral transcription. Activation by the tat protein requires sequences downstream of the transcription initiation site called the transactivating region (TAR). RNA derived from the TAR is capable of forming a stable stem-loop structure and the maintenance of both the stem structure and the loop sequences located between +19 and +44 is required for complete in vivo activation by tat. Gel retardation assays with RNA from both wild-type and mutant TAR constructs generated in vitro with SP6 polymerase indicated specific binding of HeLa nuclear proteins to the TAR. To characterize this RNA-protein interaction, a method of chemical "imprinting" has been developed using photoactivated uranyl acetate as the nucleolytic agent. This reagent nicks RNA under physiological conditions at all four nucleotides in a reaction that is independent of sequence and secondary structure. Specific interaction of cellular proteins with TAR RNA could be detected by enhanced cleavages or imprints surrounding the loop region. Mutations that either disrupted stem base-pairing or extensively changed the primary sequence resulted in alterations in the cleavage pattern of the TAR RNA. Structural features of the TAR RNA stem-loop essential for tat activation are also required for specific binding of the HeLa cell nuclear protein. Images PMID:2544877

  2. RNAfbinv: an interactive Java application for fragment-based design of RNA sequences.

    PubMed

    Weinbrand, Lina; Avihoo, Assaf; Barash, Danny

    2013-11-15

    In RNA design problems, it is plausible to assume that the user would be interested in preserving a particular RNA secondary structure motif, or fragment, for biological reasons. The preservation could be in structure or sequence, or both. Thus, the inverse RNA folding problem could benefit from considering fragment constraints. We have developed a new interactive Java application called RNA fragment-based inverse that allows users to insert an RNA secondary structure in dot-bracket notation. It then performs sequence design that conforms to the shape of the input secondary structure, the specified thermodynamic stability, the specified mutational robustness and the user-selected fragment after shape decomposition. In this shape-based design approach, specific RNA structural motifs with known biological functions are strictly enforced, while others can possess more flexibility in their structure in favor of preserving physical attributes and additional constraints. RNAfbinv is freely available for download on the web at http://www.cs.bgu.ac.il/~RNAexinv/RNAfbinv. The site contains a help file with an explanation regarding the exact use.

  3. Empirical analysis of RNA robustness and evolution using high-throughput sequencing of ribozyme reactions.

    PubMed

    Hayden, Eric J

    2016-08-15

    RNA molecules provide a realistic but tractable model of a genotype to phenotype relationship. This relationship has been extensively investigated computationally using secondary structure prediction algorithms. Enzymatic RNA molecules, or ribozymes, offer access to genotypic and phenotypic information in the laboratory. Advancements in high-throughput sequencing technologies have enabled the analysis of sequences in the lab that now rivals what can be accomplished computationally. This has motivated a resurgence of in vitro selection experiments and opened new doors for the analysis of the distribution of RNA functions in genotype space. A body of computational experiments has investigated the persistence of specific RNA structures despite changes in the primary sequence, and how this mutational robustness can promote adaptations. This article summarizes recent approaches that were designed to investigate the role of mutational robustness during the evolution of RNA molecules in the laboratory, and presents theoretical motivations, experimental methods and approaches to data analysis. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. The siRNA Non-seed Region and Its Target Sequences Are Auxiliary Determinants of Off-Target Effects.

    PubMed

    Kamola, Piotr J; Nakano, Yuko; Takahashi, Tomoko; Wilson, Paul A; Ui-Tei, Kumiko

    2015-12-01

    RNA interference (RNAi) is a powerful tool for post-transcriptional gene silencing. However, the siRNA guide strand may bind unintended off-target transcripts via partial sequence complementarity by a mechanism closely mirroring micro RNA (miRNA) silencing. To better understand these off-target effects, we investigated the correlation between sequence features within various subsections of siRNA guide strands, and its corresponding target sequences, with off-target activities. Our results confirm previous reports that strength of base-pairing in the siRNA seed region is the primary factor determining the efficiency of off-target silencing. However, the degree of downregulation of off-target transcripts with shared seed sequence is not necessarily similar, suggesting that there are additional auxiliary factors that influence the silencing potential. Here, we demonstrate that both the melting temperature (Tm) in a subsection of siRNA non-seed region, and the GC contents of its corresponding target sequences, are negatively correlated with the efficiency of off-target effect. Analysis of experimentally validated miRNA targets demonstrated a similar trend, indicating a putative conserved mechanistic feature of seed region-dependent targeting mechanism. These observations may prove useful as parameters for off-target prediction algorithms and improve siRNA 'specificity' design rules.

  5. Evaluating Quality of Aged Archival Formalin-Fixed Paraffin-Embedded Samples for RNA-Sequencing

    EPA Science Inventory

    Archival formalin-fixed paraffin-embedded (FFPE) samples offer a vast, untapped source of genomic data for biomarker discovery. However, the quality of FFPE samples is often highly variable, and conventional methods to assess RNA quality for RNA-sequencing (RNA-seq) are not infor...

  6. Ensemble-based classification approach for micro-RNA mining applied on diverse metagenomic sequences.

    PubMed

    ElGokhy, Sherin M; ElHefnawi, Mahmoud; Shoukry, Amin

    2014-05-06

    MicroRNAs (miRNAs) are endogenous ∼22 nt RNAs that are identified in many species as powerful regulators of gene expressions. Experimental identification of miRNAs is still slow since miRNAs are difficult to isolate by cloning due to their low expression, low stability, tissue specificity and the high cost of the cloning procedure. Thus, computational identification of miRNAs from genomic sequences provide a valuable complement to cloning. Different approaches for identification of miRNAs have been proposed based on homology, thermodynamic parameters, and cross-species comparisons. The present paper focuses on the integration of miRNA classifiers in a meta-classifier and the identification of miRNAs from metagenomic sequences collected from different environments. An ensemble of classifiers is proposed for miRNA hairpin prediction based on four well-known classifiers (Triplet SVM, Mipred, Virgo and EumiR), with non-identical features, and which have been trained on different data. Their decisions are combined using a single hidden layer neural network to increase the accuracy of the predictions. Our ensemble classifier achieved 89.3% accuracy, 82.2% f-measure, 74% sensitivity, 97% specificity, 92.5% precision and 88.2% negative predictive value when tested on real miRNA and pseudo sequence data. The area under the receiver operating characteristic curve of our classifier is 0.9 which represents a high performance index.The proposed classifier yields a significant performance improvement relative to Triplet-SVM, Virgo and EumiR and a minor refinement over MiPred.The developed ensemble classifier is used for miRNA prediction in mine drainage, groundwater and marine metagenomic sequences downloaded from the NCBI sequence reed archive. By consulting the miRBase repository, 179 miRNAs have been identified as highly probable miRNAs. Our new approach could thus be used for mining metagenomic sequences and finding new and homologous miRNAs. The paper investigates a

  7. A Universal Next-Generation Sequencing Protocol To Generate Noninfectious Barcoded cDNA Libraries from High-Containment RNA Viruses

    PubMed Central

    Moser, Lindsey A.; Ramirez-Carvajal, Lisbeth; Puri, Vinita; Pauszek, Steven J.; Matthews, Krystal; Dilley, Kari A.; Mullan, Clancy; McGraw, Jennifer; Khayat, Michael; Beeri, Karen; Yee, Anthony; Dugan, Vivien; Heise, Mark T.; Frieman, Matthew B.; Rodriguez, Luis L.; Bernard, Kristen A.; Wentworth, David E.

    2016-01-01

    ABSTRACT Several biosafety level 3 and/or 4 (BSL-3/4) pathogens are high-consequence, single-stranded RNA viruses, and their genomes, when introduced into permissive cells, are infectious. Moreover, many of these viruses are select agents (SAs), and their genomes are also considered SAs. For this reason, cDNAs and/or their derivatives must be tested to ensure the absence of infectious virus and/or viral RNA before transfer out of the BSL-3/4 and/or SA laboratory. This tremendously limits the capacity to conduct viral genomic research, particularly the application of next-generation sequencing (NGS). Here, we present a sequence-independent method to rapidly amplify viral genomic RNA while simultaneously abolishing both viral and genomic RNA infectivity across multiple single-stranded positive-sense RNA (ssRNA+) virus families. The process generates barcoded DNA amplicons that range in length from 300 to 1,000 bp, which cannot be used to rescue a virus and are stable to transport at room temperature. Our barcoding approach allows for up to 288 barcoded samples to be pooled into a single library and run across various NGS platforms without potential reconstitution of the viral genome. Our data demonstrate that this approach provides full-length genomic sequence information not only from high-titer virion preparations but it can also recover specific viral sequence from samples with limited starting material in the background of cellular RNA, and it can be used to identify pathogens from unknown samples. In summary, we describe a rapid, universal standard operating procedure that generates high-quality NGS libraries free of infectious virus and infectious viral RNA. IMPORTANCE This report establishes and validates a standard operating procedure (SOP) for select agents (SAs) and other biosafety level 3 and/or 4 (BSL-3/4) RNA viruses to rapidly generate noninfectious, barcoded cDNA amenable for next-generation sequencing (NGS). This eliminates the burden of testing all

  8. CapZyme-Seq Comprehensively Defines Promoter-Sequence Determinants for RNA 5' Capping with NAD.

    PubMed

    Vvedenskaya, Irina O; Bird, Jeremy G; Zhang, Yuanchao; Zhang, Yu; Jiao, Xinfu; Barvík, Ivan; Krásný, Libor; Kiledjian, Megerditch; Taylor, Deanne M; Ebright, Richard H; Nickels, Bryce E

    2018-05-03

    Nucleoside-containing metabolites such as NAD + can be incorporated as 5' caps on RNA by serving as non-canonical initiating nucleotides (NCINs) for transcription initiation by RNA polymerase (RNAP). Here, we report CapZyme-seq, a high-throughput-sequencing method that employs NCIN-decapping enzymes NudC and Rai1 to detect and quantify NCIN-capped RNA. By combining CapZyme-seq with multiplexed transcriptomics, we determine efficiencies of NAD + capping by Escherichia coli RNAP for ∼16,000 promoter sequences. The results define preferred transcription start site (TSS) positions for NAD + capping and define a consensus promoter sequence for NAD + capping: HRRASWW (TSS underlined). By applying CapZyme-seq to E. coli total cellular RNA, we establish that sequence determinants for NCIN capping in vivo match the NAD + -capping consensus defined in vitro, and we identify and quantify NCIN-capped small RNAs (sRNAs). Our findings define the promoter-sequence determinants for NCIN capping with NAD + and provide a general method for analysis of NCIN capping in vitro and in vivo. Copyright © 2018 Elsevier Inc. All rights reserved.

  9. A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control consortium

    PubMed Central

    2014-01-01

    We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the United States Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed, for these and qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings. PMID:25150838

  10. The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus.

    PubMed Central

    Gustafson, G; Armour, S L

    1986-01-01

    The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus (BSMV) has been determined. The sequence is 3289 nucleotides in length and contains four open reading frames (ORFs) which code for proteins of Mr 22,147 (ORF1), Mr 58,098 (ORF2), Mr 17,378 (ORF3), and Mr 14,119 (ORF4). The predicted N-terminal amino acid sequence of the polypeptide encoded by the ORF nearest the 5'-end of the RNA (ORF1) is identical (after the initiator methionine) to the published N-terminal amino acid sequence of BSMV coat protein for 29 of the first 30 amino acids. ORF2 occupies the central portion of the coding region of RNA beta and ORF3 is located at the 3'-end. The ORF4 sequence overlaps the 3'-region of ORF2 and the 5'-region of ORF3 and differs in codon usage from the other three RNA beta ORFs. The coding region of RNA beta is followed by a poly(A) tract and a 238 nucleotide tRNA-like structure which are common to all three BSMV genomic RNAs. Images PMID:3754962

  11. DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases

    PubMed Central

    Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

    2015-01-01

    Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827

  12. An Optimized Transient Dual Luciferase Assay for Quantifying MicroRNA Directed Repression of Targeted Sequences

    PubMed Central

    Moyle, Richard L.; Carvalhais, Lilia C.; Pretorius, Lara-Simone; Nowak, Ekaterina; Subramaniam, Gayathery; Dalton-Morgan, Jessica; Schenk, Peer M.

    2017-01-01

    Studies investigating the action of small RNAs on computationally predicted target genes require some form of experimental validation. Classical molecular methods of validating microRNA action on target genes are laborious, while approaches that tag predicted target sequences to qualitative reporter genes encounter technical limitations. The aim of this study was to address the challenge of experimentally validating large numbers of computationally predicted microRNA-target transcript interactions using an optimized, quantitative, cost-effective, and scalable approach. The presented method combines transient expression via agroinfiltration of Nicotiana benthamiana leaves with a quantitative dual luciferase reporter system, where firefly luciferase is used to report the microRNA-target sequence interaction and Renilla luciferase is used as an internal standard to normalize expression between replicates. We report the appropriate concentration of N. benthamiana leaf extracts and dilution factor to apply in order to avoid inhibition of firefly LUC activity. Furthermore, the optimal ratio of microRNA precursor expression construct to reporter construct and duration of the incubation period post-agroinfiltration were determined. The optimized dual luciferase assay provides an efficient, repeatable and scalable method to validate and quantify microRNA action on predicted target sequences. The optimized assay was used to validate five predicted targets of rice microRNA miR529b, with as few as six technical replicates. The assay can be extended to assess other small RNA-target sequence interactions, including assessing the functionality of an artificial miRNA or an RNAi construct on a targeted sequence. PMID:28979287

  13. Using small RNA (sRNA) deep sequencing to understand global virus distribution in plants

    USDA-ARS?s Scientific Manuscript database

    Small RNAs (sRNAs), a class of regulatory RNAs, have been used to serve as the specificity determinants of suppressing gene expression in plants and animals. Next generation sequencing (NGS) uncovered the sRNA landscape in most organisms including their associated microbes. In the current study, w...

  14. Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns

    PubMed Central

    2013-01-01

    Background It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. Conclusions The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected

  15. Annealing to sequences within the primer binding site loop promotes an HIV-1 RNA conformation favoring RNA dimerization and packaging

    PubMed Central

    Seif, Elias; Niu, Meijuan; Kleiman, Lawrence

    2013-01-01

    The 5′ untranslated region (5′ UTR) of HIV-1 genomic RNA (gRNA) includes structural elements that regulate reverse transcription, transcription, translation, tRNALys3 annealing to the gRNA, and gRNA dimerization and packaging into viruses. It has been reported that gRNA dimerization and packaging are regulated by changes in the conformation of the 5′-UTR RNA. In this study, we show that annealing of tRNALys3 or a DNA oligomer complementary to sequences within the primer binding site (PBS) loop of the 5′ UTR enhances its dimerization in vitro. Structural analysis of the 5′-UTR RNA using selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) shows that the annealing promotes a conformational change of the 5′ UTR that has been previously reported to favor gRNA dimerization and packaging into virus. The model predicted by SHAPE analysis is supported by antisense experiments designed to test which annealed sequences will promote or inhibit gRNA dimerization. Based on reports showing that the gRNA dimerization favors its incorporation into viruses, we tested the ability of a mutant gRNA unable to anneal to tRNALys3 to be incorporated into virions. We found a ∼60% decrease in mutant gRNA packaging compared with wild-type gRNA. Together, these data further support a model for viral assembly in which the initial annealing of tRNALys3 to gRNA is cytoplasmic, which in turn aids in the promotion of gRNA dimerization and its incorporation into virions. PMID:23960173

  16. Integrated analysis of RNA-binding protein complexes using in vitro selection and high-throughput sequencing and sequence specificity landscapes (SEQRS).

    PubMed

    Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T

    2017-04-15

    RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. CoLIde: a bioinformatics tool for CO-expression-based small RNA Loci Identification using high-throughput sequencing data.

    PubMed

    Mohorianu, Irina; Stocks, Matthew Benedict; Wood, John; Dalmay, Tamas; Moulton, Vincent

    2013-07-01

    Small RNAs (sRNAs) are 20-25 nt non-coding RNAs that act as guides for the highly sequence-specific regulatory mechanism known as RNA silencing. Due to the recent increase in sequencing depth, a highly complex and diverse population of sRNAs in both plants and animals has been revealed. However, the exponential increase in sequencing data has also made the identification of individual sRNA transcripts corresponding to biological units (sRNA loci) more challenging when based exclusively on the genomic location of the constituent sRNAs, hindering existing approaches to identify sRNA loci. To infer the location of significant biological units, we propose an approach for sRNA loci detection called CoLIde (Co-expression based sRNA Loci Identification) that combines genomic location with the analysis of other information such as variation in expression levels (expression pattern) and size class distribution. For CoLIde, we define a locus as a union of regions sharing the same pattern and located in close proximity on the genome. Biological relevance, detected through the analysis of size class distribution, is also calculated for each locus. CoLIde can be applied on ordered (e.g., time-dependent) or un-ordered (e.g., organ, mutant) series of samples both with or without biological/technical replicates. The method reliably identifies known types of loci and shows improved performance on sequencing data from both plants (e.g., A. thaliana, S. lycopersicum) and animals (e.g., D. melanogaster) when compared with existing locus detection techniques. CoLIde is available for use within the UEA Small RNA Workbench which can be downloaded from: http://srna-workbench.cmp.uea.ac.uk.

  18. Avian acute leukemia viruses MC29 and MH2 share specific RNA sequences: Evidence for a second class of transforming genes

    PubMed Central

    Duesberg, Peter H.; Vogt, Peter K.

    1979-01-01

    The genome of the defective avian tumor virus MH2 was identified as a RNA of 5.7 kilobases by its presence in different MH2-helper virus complexes and its absence from pure helper virus, by its unique fingerprint pattern of RNase T1-resistant (T1) oligonucleotides that differed from those of two helper virus RNAs, and by its structural analogy to the RNA of MC29, another avian acute leukemia virus. Two sets of sequences were distinguished in MH2 RNA: 66% hybridized with DNA complementary to helper-independent avian tumor viruses, termed group-specific, and 34% were specific. The percentage of specific sequences is considered a minimal estimate because the MH2 RNA used was about 30% contaminated by helper virus RNA. No sequences related to the transforming src gene of avian sarcoma viruses were found in MH2. MH2 shared three large T1 oligonucleotides with MC29, two of which could also be isolated from a RNase A- and T1-resistant hybrid formed between MH2 RNA and MC29 specific cDNA. These oligonucleotides belong to a group of six that define the specific segment of MC29 RNA described previously. The group-specific sequences of MH2 and MC29 RNA shared only the two smallest out of about 20 T1 oligonucleotides associated with MH2 RNA. It is concluded that the specific sequences of MH2 and MC29 are related, and it is proposed that they are necessary for, or identical with, the onc genes of these viruses. These sequences would define a related class of transforming genes in avian tumor viruses that differs from the src genes of avian sarcoma viruses. Images PMID:221900

  19. Deep sequencing of Salmonella RNA associated with heterologous Hfq proteins in vivo reveals small RNAs as a major target class and identifies RNA processing phenotypes.

    PubMed

    Sittka, Alexandra; Sharma, Cynthia M; Rolle, Katarzyna; Vogel, Jörg

    2009-01-01

    The bacterial Sm-like protein, Hfq, is a key factor for the stability and function of small non-coding RNAs (sRNAs) in Escherichia coli. Homologues of this protein have been predicted in many distantly related organisms yet their functional conservation as sRNA-binding proteins has not entirely been clear. To address this, we expressed in Salmonella the Hfq proteins of two eubacteria (Neisseria meningitides, Aquifex aeolicus) and an archaeon (Methanocaldococcus jannaschii), and analyzed the associated RNA by deep sequencing. This in vivo approach identified endogenous Salmonella sRNAs as a major target of the foreign Hfq proteins. New Salmonella sRNA species were also identified, and some of these accumulated specifically in the presence of a foreign Hfq protein. In addition, we observed specific RNA processing defects, e.g., suppression of precursor processing of SraH sRNA by Methanocaldococcus Hfq, or aberrant accumulation of extracytoplasmic target mRNAs of the Salmonella GcvB, MicA or RybB sRNAs. Taken together, our study provides evidence of a conserved inherent sRNA-binding property of Hfq, which may facilitate the lateral transmission of regulatory sRNAs among distantly related species. It also suggests that the expression of heterologous RNA-binding proteins combined with deep sequencing analysis of RNA ligands can be used as a molecular tool to dissect individual steps of RNA metabolism in vivo.

  20. mRNA stability in mammalian cells.

    PubMed Central

    Ross, J

    1995-01-01

    This review concerns how cytoplasmic mRNA half-lives are regulated and how mRNA decay rates influence gene expression. mRNA stability influences gene expression in virtually all organisms, from bacteria to mammals, and the abundance of a particular mRNA can fluctuate manyfold following a change in the mRNA half-life, without any change in transcription. The processes that regulate mRNA half-lives can, in turn, affect how cells grow, differentiate, and respond to their environment. Three major questions are addressed. Which sequences in mRNAs determine their half-lives? Which enzymes degrade mRNAs? Which (trans-acting) factors regulate mRNA stability, and how do they function? The following specific topics are discussed: techniques for measuring eukaryotic mRNA stability and for calculating decay constants, mRNA decay pathways, mRNases, proteins that bind to sequences shared among many mRNAs [like poly(A)- and AU-rich-binding proteins] and proteins that bind to specific mRNAs (like the c-myc coding-region determinant-binding protein), how environmental factors like hormones and growth factors affect mRNA stability, and how translation and mRNA stability are linked. Some perspectives and predictions for future research directions are summarized at the end. PMID:7565413

  1. DNA sequence analysis of a 10 624 bp fragment of the left arm of chromosome XV from Saccharomyces cerevisiae reveals a RNA binding protein, a mitochondrial protein, two ribosomal proteins and two new open reading frames.

    PubMed

    Lafuente, M J; Gamo, F J; Gancedo, C

    1996-09-01

    We have determined the sequence of a 10624 bp DNA segment located in the left arm of chromosome XV of Saccharomyces cerevisiae. The sequence contains eight open reading frames (ORFs) longer than 100 amino acids. Two of them do not present significant homology with sequences found in the databases. The product of ORF o0553 is identical to the protein encoded by the gene SMF1. Internal to it there is another ORF, o0555 that is apparently expressed. The proteins encoded by ORFs o0559 and o0565 are identical to ribosomal proteins S19.e and L18 respectively. ORF o0550 encodes a protein with an RNA binding signature including RNP motifs and stretches rich in asparagine, glutamine and arginine.

  2. RNA Sequencing and Bioinformatics Analysis Implicate the Regulatory Role of a Long Noncoding RNA-mRNA Network in Hepatic Stellate Cell Activation.

    PubMed

    Guo, Can-Jie; Xiao, Xiao; Sheng, Li; Chen, Lili; Zhong, Wei; Li, Hai; Hua, Jing; Ma, Xiong

    2017-01-01

    To analyze the long noncoding (lncRNA)-mRNA expression network and potential roles in rat hepatic stellate cells (HSCs) during activation. LncRNA expression was analyzed in quiescent and culture-activated HSCs by RNA sequencing, and differentially expressed lncRNAs verified by quantitative reverse transcription polymerase chain reaction (qRT-PCR) were subjected to bioinformatics analysis. In vivo analyses of differential lncRNA-mRNA expression were performed on a rat model of liver fibrosis. We identified upregulation of 12 lncRNAs and 155 mRNAs and downregulation of 12 lncRNAs and 374 mRNAs in activated HSCs. Additionally, we identified the differential expression of upregulated lncRNAs (NONRATT012636.2, NONRATT016788.2, and NONRATT021402.2) and downregulated lncRNAs (NONRATT007863.2, NONRATT019720.2, and NONRATT024061.2) in activated HSCs relative to levels observed in quiescent HSCs, and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses showed that changes in lncRNAs associated with HSC activation revealed 11 significantly enriched pathways according to their predicted targets. Moreover, based on the predicted co-expression network, the relative dynamic levels of NONRATT013819.2 and lysyl oxidase (Lox) were compared during HSC activation both in vitro and in vivo. Our results confirmed the upregulation of lncRNA NONRATT013819.2 and Lox mRNA associated with the extracellular matrix (ECM)-related signaling pathway in HSCs and fibrotic livers. Our results detailing a dysregulated lncRNA-mRNA network might provide new treatment strategies for hepatic fibrosis based on findings indicating potentially critical roles for NONRATT013819.2 and Lox in ECM remodeling during HSC activation. © 2017 The Author(s). Published by S. Karger AG, Basel.

  3. Sequencing and Characterisation of an Extensive Atlantic Salmon (Salmo salar L.) MicroRNA Repertoire

    PubMed Central

    Bekaert, Michaël; Lowe, Natalie R.; Bishop, Stephen C.; Bron, James E.; Taggart, John B.; Houston, Ross D.

    2013-01-01

    Atlantic salmon (Salmo salar L.), a member of the family Salmonidae, is a totemic species of ecological and cultural significance that is also economically important in terms of both sports fisheries and aquaculture. These factors have promoted the continuous development of genomic resources for this species, furthering both fundamental and applied research. MicroRNAs (miRNA) are small endogenous non-coding RNA molecules that control spatial and temporal expression of targeted genes through post-transcriptional regulation. While miRNA have been characterised in detail for many other species, this is not yet the case for Atlantic salmon. To identify miRNAs from Atlantic salmon, we constructed whole fish miRNA libraries for 18 individual juveniles (fry, four months post hatch) and characterised them by Illumina high-throughput sequencing (total of 354,505,167 paired-ended reads). We report an extensive and partly novel repertoire of miRNA sequences, comprising 888 miRNA genes (547 unique mature miRNA sequences), quantify their expression levels in basal conditions, examine their homology to miRNAs from other species and identify their predicted target genes. We also identify the location and putative copy number of the miRNA genes in the draft Atlantic salmon reference genome sequence. The Atlantic salmon miRNAs experimentally identified in this study provide a robust large-scale resource for functional genome research in salmonids. There is an opportunity to explore the evolution of salmonid miRNAs following the relatively recent whole genome duplication event in salmonid species and to investigate the role of miRNAs in the regulation of gene expression in particular their contribution to variation in economically and ecologically important traits. PMID:23922936

  4. Single-cell mRNA cytometry via sequence-specific nanoparticle clustering and trapping

    NASA Astrophysics Data System (ADS)

    Labib, Mahmoud; Mohamadi, Reza M.; Poudineh, Mahla; Ahmed, Sharif U.; Ivanov, Ivaylo; Huang, Ching-Lung; Moosavi, Maral; Sargent, Edward H.; Kelley, Shana O.

    2018-05-01

    Cell-to-cell variation in gene expression creates a need for techniques that can characterize expression at the level of individual cells. This is particularly true for rare circulating tumour cells, in which subtyping and drug resistance are of intense interest. Here we describe a method for cell analysis—single-cell mRNA cytometry—that enables the isolation of rare cells from whole blood as a function of target mRNA sequences. This approach uses two classes of magnetic particles that are labelled to selectively hybridize with different regions of the target mRNA. Hybridization leads to the formation of large magnetic clusters that remain localized within the cells of interest, thereby enabling the cells to be magnetically separated. Targeting specific intracellular mRNAs enablescirculating tumour cells to be distinguished from normal haematopoietic cells. No polymerase chain reaction amplification is required to determine RNA expression levels and genotype at the single-cell level, and minimal cell manipulation is required. To demonstrate this approach we use single-cell mRNA cytometry to detect clinically important sequences in prostate cancer specimens.

  5. The zinc fingers of YY1 bind single-stranded RNA with low sequence specificity.

    PubMed

    Wai, Dorothy C C; Shihab, Manar; Low, Jason K K; Mackay, Joel P

    2016-11-02

    Classical zinc fingers (ZFs) are traditionally considered to act as sequence-specific DNA-binding domains. More recently, classical ZFs have been recognised as potential RNA-binding modules, raising the intriguing possibility that classical-ZF transcription factors are involved in post-transcriptional gene regulation via direct RNA binding. To date, however, only one classical ZF-RNA complex, that involving TFIIIA, has been structurally characterised. Yin Yang-1 (YY1) is a multi-functional transcription factor involved in many regulatory processes, and binds DNA via four classical ZFs. Recent evidence suggests that YY1 also interacts with RNA, but the molecular nature of the interaction remains unknown. In the present work, we directly assess the ability of YY1 to bind RNA using in vitro assays. Systematic Evolution of Ligands by EXponential enrichment (SELEX) was used to identify preferred RNA sequences bound by the YY1 ZFs from a randomised library over multiple rounds of selection. However, a strong motif was not consistently recovered, suggesting that the RNA sequence selectivity of these domains is modest. YY1 ZF residues involved in binding to single-stranded RNA were identified by NMR spectroscopy and found to be largely distinct from the set of residues involved in DNA binding, suggesting that interactions between YY1 and ssRNA constitute a separate mode of nucleic acid binding. Our data are consistent with recent reports that YY1 can bind to RNA in a low-specificity, yet physiologically relevant manner. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Synthetic Spike-in Standards Improve Run-Specific Systematic Error Analysis for DNA and RNA Sequencing

    PubMed Central

    Zook, Justin M.; Samarov, Daniel; McDaniel, Jennifer; Sen, Shurjo K.; Salit, Marc

    2012-01-01

    While the importance of random sequencing errors decreases at higher DNA or RNA sequencing depths, systematic sequencing errors (SSEs) dominate at high sequencing depths and can be difficult to distinguish from biological variants. These SSEs can cause base quality scores to underestimate the probability of error at certain genomic positions, resulting in false positive variant calls, particularly in mixtures such as samples with RNA editing, tumors, circulating tumor cells, bacteria, mitochondrial heteroplasmy, or pooled DNA. Most algorithms proposed for correction of SSEs require a data set used to calculate association of SSEs with various features in the reads and sequence context. This data set is typically either from a part of the data set being “recalibrated” (Genome Analysis ToolKit, or GATK) or from a separate data set with special characteristics (SysCall). Here, we combine the advantages of these approaches by adding synthetic RNA spike-in standards to human RNA, and use GATK to recalibrate base quality scores with reads mapped to the spike-in standards. Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 Phred-scaled quality score units, and by as much as 13 units at CpG sites. In addition, since the spike-in data used for recalibration are independent of the genome being sequenced, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database. We also use GATK with the spike-in standards to demonstrate that the Illumina RNA sequencing runs overestimate quality scores for AC, CC, GC, GG, and TC dinucleotides, while SOLiD has less dinucleotide SSEs but more SSEs for certain cycles. We conclude that using these DNA and RNA spike-in standards with GATK improves base quality score recalibration. PMID:22859977

  7. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    PubMed

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes; enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules in plant tissue from crops such as lettuce, tomato, corn, soy and rice with complementarity to human genes that have a long history of safe consumption supports a conclusion that long dsRNAs do not present a significant dietary risk.

  8. The adenovirus L4-22K protein regulates transcription and RNA splicing via a sequence-specific single-stranded RNA binding.

    PubMed

    Lan, Susan; Kamel, Wael; Punga, Tanel; Akusjärvi, Göran

    2017-02-28

    The adenovirus L4-22K protein both activates and suppresses transcription from the adenovirus major late promoter (MLP) by binding to DNA elements located downstream of the MLP transcriptional start site: the so-called DE element (positive) and the R1 region (negative). Here we show that L4-22K preferentially binds to the RNA form of the R1 region, both to the double-stranded RNA and the single-stranded RNA of the same polarity as the nascent MLP transcript. Further, L4-22K binds to a 5΄-CAAA-3΄ motif in the single-stranded RNA, which is identical to the sequence motif characterized for L4-22K DNA binding. L4-22K binding to single-stranded RNA results in an enhancement of U1 snRNA recruitment to the major late first leader 5΄ splice site. This increase in U1 snRNA binding results in a suppression of MLP transcription and a concurrent stimulation of major late first intron splicing. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Identifying mRNA sequence elements for target recognition by human Argonaute proteins

    PubMed Central

    Li, Jingjing; Kim, TaeHyung; Nutiu, Razvan; Ray, Debashish; Hughes, Timothy R.; Zhang, Zhaolei

    2014-01-01

    It is commonly known that mammalian microRNAs (miRNAs) guide the RNA-induced silencing complex (RISC) to target mRNAs through the seed-pairing rule. However, recent experiments that coimmunoprecipitate the Argonaute proteins (AGOs), the central catalytic component of RISC, have consistently revealed extensive AGO-associated mRNAs that lack seed complementarity with miRNAs. We herein test the hypothesis that AGO has its own binding preference within target mRNAs, independent of guide miRNAs. By systematically analyzing the data from in vivo cross-linking experiments with human AGOs, we have identified a structurally accessible and evolutionarily conserved region (∼10 nucleotides in length) that alone can accurately predict AGO–mRNA associations, independent of the presence of miRNA binding sites. Within this region, we further identified an enriched motif that was replicable on independent AGO-immunoprecipitation data sets. We used RNAcompete to enumerate the RNA-binding preference of human AGO2 to all possible 7-mer RNA sequences and validated the AGO motif in vitro. These findings reveal a novel function of AGOs as sequence-specific RNA-binding proteins, which may aid miRNAs in recognizing their targets with high specificity. PMID:24663241

  10. Geoseq: a tool for dissecting deep-sequencing datasets.

    PubMed

    Gurtowski, James; Cancio, Anthony; Shah, Hardik; Levovitz, Chaya; George, Ajish; Homann, Robert; Sachidanandam, Ravi

    2010-10-12

    Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (ddbj). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Geoseq http://geoseq.mssm.edu provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to, a) identify differential isoform expression in mRNA-seq datasets, b) identify miRNAs (microRNAs) in libraries, and identify mature and star sequences in miRNAS and c) to identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.

  11. A framework for establishing predictive relationships between specific bacterial 16S rRNA sequence abundances and biotransformation rates.

    PubMed

    Helbling, Damian E; Johnson, David R; Lee, Tae Kwon; Scheidegger, Andreas; Fenner, Kathrin

    2015-03-01

    The rates at which wastewater treatment plant (WWTP) microbial communities biotransform specific substrates can differ by orders of magnitude among WWTP communities. Differences in taxonomic compositions among WWTP communities may predict differences in the rates of some types of biotransformations. In this work, we present a novel framework for establishing predictive relationships between specific bacterial 16S rRNA sequence abundances and biotransformation rates. We selected ten WWTPs with substantial variation in their environmental and operational metrics and measured the in situ ammonia biotransformation rate constants in nine of them. We isolated total RNA from samples from each WWTP and analyzed 16S rRNA sequence reads. We then developed multivariate models between the measured abundances of specific bacterial 16S rRNA sequence reads and the ammonia biotransformation rate constants. We constructed model scenarios that systematically explored the effects of model regularization, model linearity and non-linearity, and aggregation of 16S rRNA sequences into operational taxonomic units (OTUs) as a function of sequence dissimilarity threshold (SDT). A large percentage (greater than 80%) of model scenarios resulted in well-performing and significant models at intermediate SDTs of 0.13-0.14 and 0.26. The 16S rRNA sequences consistently selected into the well-performing and significant models at those SDTs were classified as Nitrosomonas and Nitrospira groups. We then extend the framework by applying it to the biotransformation rate constants of ten micropollutants measured in batch reactors seeded with the ten WWTP communities. We identified phylogenetic groups that were robustly selected into all well-performing and significant models constructed with biotransformation rates of isoproturon, propachlor, ranitidine, and venlafaxine. These phylogenetic groups can be used as predictive biomarkers of WWTP microbial community activity towards these specific

  12. INFO-RNA--a fast approach to inverse RNA folding.

    PubMed

    Busch, Anke; Backofen, Rolf

    2006-08-01

    The structure of RNA molecules is often crucial for their function. Therefore, secondary structure prediction has gained much interest. Here, we consider the inverse RNA folding problem, which means designing RNA sequences that fold into a given structure. We introduce a new algorithm for the inverse folding problem (INFO-RNA) that consists of two parts; a dynamic programming method for good initial sequences and a following improved stochastic local search that uses an effective neighbor selection method. During the initialization, we design a sequence that among all sequences adopts the given structure with the lowest possible energy. For the selection of neighbors during the search, we use a kind of look-ahead of one selection step applying an additional energy-based criterion. Afterwards, the pre-ordered neighbors are tested using the actual optimization criterion of minimizing the structure distance between the target structure and the mfe structure of the considered neighbor. We compared our algorithm to RNAinverse and RNA-SSD for artificial and biological test sets. Using INFO-RNA, we performed better than RNAinverse and in most cases, we gained better results than RNA-SSD, the probably best inverse RNA folding tool on the market. www.bioinf.uni-freiburg.de?Subpages/software.html.

  13. JingleBells: A Repository of Immune-Related Single-Cell RNA-Sequencing Datasets.

    PubMed

    Ner-Gaon, Hadas; Melchior, Ariel; Golan, Nili; Ben-Haim, Yael; Shay, Tal

    2017-05-01

    Recent advances in single-cell RNA-sequencing (scRNA-seq) technology increase the understanding of immune differentiation and activation processes, as well as the heterogeneity of immune cell types. Although the number of available immune-related scRNA-seq datasets increases rapidly, their large size and various formats render them hard for the wider immunology community to use, and read-level data are practically inaccessible to the non-computational immunologist. To facilitate datasets reuse, we created the JingleBells repository for immune-related scRNA-seq datasets ready for analysis and visualization of reads at the single-cell level (http://jinglebells.bgu.ac.il/). To this end, we collected the raw data of publicly available immune-related scRNA-seq datasets, aligned the reads to the relevant genome, and saved aligned reads in a uniform format, annotated for cell of origin. We also added scripts and a step-by-step tutorial for visualizing each dataset at the single-cell level, through the commonly used Integrated Genome Viewer (www.broadinstitute.org/igv/). The uniform scRNA-seq format used in JingleBells can facilitate reuse of scRNA-seq data by computational biologists. It also enables immunologists who are interested in a specific gene to visualize the reads aligned to this gene to estimate cell-specific preferences for splicing, mutation load, or alleles. Thus JingleBells is a resource that will extend the usefulness of scRNA-seq datasets outside the programming aficionado realm. Copyright © 2017 by The American Association of Immunologists, Inc.

  14. Missing data and technical variability in single-cell RNA-sequencing experiments.

    PubMed

    Hicks, Stephanie C; Townes, F William; Teng, Mingxiang; Irizarry, Rafael A

    2017-11-06

    Until recently, high-throughput gene expression technology, such as RNA-Sequencing (RNA-seq) required hundreds of thousands of cells to produce reliable measurements. Recent technical advances permit genome-wide gene expression measurement at the single-cell level. Single-cell RNA-Seq (scRNA-seq) is the most widely used and numerous publications are based on data produced with this technology. However, RNA-seq and scRNA-seq data are markedly different. In particular, unlike RNA-seq, the majority of reported expression levels in scRNA-seq are zeros, which could be either biologically-driven, genes not expressing RNA at the time of measurement, or technically-driven, genes expressing RNA, but not at a sufficient level to be detected by sequencing technology. Another difference is that the proportion of genes reporting the expression level to be zero varies substantially across single cells compared to RNA-seq samples. However, it remains unclear to what extent this cell-to-cell variation is being driven by technical rather than biological variation. Furthermore, while systematic errors, including batch effects, have been widely reported as a major challenge in high-throughput technologies, these issues have received minimal attention in published studies based on scRNA-seq technology. Here, we use an assessment experiment to examine data from published studies and demonstrate that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability. Specifically, we present evidence that some of these reported zeros are driven by technical variation by demonstrating that scRNA-seq produces more zeros than expected and that this bias is greater for lower expressed genes. In addition, this missing data problem is exacerbated by the fact that this technical variation varies cell-to-cell. Then, we show how this technical cell-to-cell variability can be confused with novel biological results. Finally, we demonstrate and discuss how batch

  15. RNA Sequencing Identifies New RNase III Cleavage Sites in Escherichia coli and Reveals Increased Regulation of mRNA

    DOE PAGES

    Gordon, Gina C.; Cameron, Jeffrey C.; Pfleger, Brian F.

    2017-03-28

    Ribonucleases facilitate rapid turnover of RNA, providing cells with another mechanism to adjust transcript and protein levels in response to environmental conditions. While many examples have been documented, a comprehensive list of RNase targets is not available. To address this knowledge gap, we compared levels of RNA sequencing coverage of Escherichia coli and a corresponding RNase III mutant to expand the list of known RNase III targets. RNase III is a widespread endoribonuclease that binds and cleaves double-stranded RNA in many critical transcripts. RNase III cleavage at novel sites found in aceEF, proP, tnaC, dctA, pheM, sdhC, yhhQ, glpT, aceK,more » and gluQ accelerated RNA decay, consistent with previously described targets wherein RNase III cleavage initiates rapid degradation of secondary messages by other RNases. In contrast, cleavage at three novel sites in the ahpF, pflB, and yajQ transcripts led to stabilized secondary transcripts. Two other novel sites in hisL and pheM overlapped with transcriptional attenuators that likely serve to ensure turnover of these highly structured RNAs. Many of the new RNase III target sites are located on transcripts encoding metabolic enzymes. For instance, two novel RNase III sites are located within transcripts encoding enzymes near a key metabolic node connecting glycolysis and the tricarboxylic acid (TCA) cycle. Pyruvate dehydrogenase activity was increased in an rnc deletion mutant compared to the wild-type (WT) strain in early stationary phase, confirming the novel link between RNA turnover and regulation of pathway activity. Identification of these novel sites suggests that mRNA turnover may be an underappreciated mode of regulating metabolism. IMPORTANCE: The concerted action and overlapping functions of endoribonucleases, exoribonucleases, and RNA processing enzymes complicate the study of global RNA turnover and recycling of specific transcripts. More information about RNase specificity and activity is

  16. RNA Sequencing Identifies New RNase III Cleavage Sites in Escherichia coli and Reveals Increased Regulation of mRNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gordon, Gina C.; Cameron, Jeffrey C.; Pfleger, Brian F.

    Ribonucleases facilitate rapid turnover of RNA, providing cells with another mechanism to adjust transcript and protein levels in response to environmental conditions. While many examples have been documented, a comprehensive list of RNase targets is not available. To address this knowledge gap, we compared levels of RNA sequencing coverage of Escherichia coli and a corresponding RNase III mutant to expand the list of known RNase III targets. RNase III is a widespread endoribonuclease that binds and cleaves double-stranded RNA in many critical transcripts. RNase III cleavage at novel sites found in aceEF, proP, tnaC, dctA, pheM, sdhC, yhhQ, glpT, aceK,more » and gluQ accelerated RNA decay, consistent with previously described targets wherein RNase III cleavage initiates rapid degradation of secondary messages by other RNases. In contrast, cleavage at three novel sites in the ahpF, pflB, and yajQ transcripts led to stabilized secondary transcripts. Two other novel sites in hisL and pheM overlapped with transcriptional attenuators that likely serve to ensure turnover of these highly structured RNAs. Many of the new RNase III target sites are located on transcripts encoding metabolic enzymes. For instance, two novel RNase III sites are located within transcripts encoding enzymes near a key metabolic node connecting glycolysis and the tricarboxylic acid (TCA) cycle. Pyruvate dehydrogenase activity was increased in an rnc deletion mutant compared to the wild-type (WT) strain in early stationary phase, confirming the novel link between RNA turnover and regulation of pathway activity. Identification of these novel sites suggests that mRNA turnover may be an underappreciated mode of regulating metabolism. IMPORTANCE: The concerted action and overlapping functions of endoribonucleases, exoribonucleases, and RNA processing enzymes complicate the study of global RNA turnover and recycling of specific transcripts. More information about RNase specificity and activity is

  17. MicroRNAs Form Triplexes with Double Stranded DNA at Sequence-Specific Binding Sites; a Eukaryotic Mechanism via which microRNAs Could Directly Alter Gene Expression

    PubMed Central

    Grace, Christy R.; Ferreira, Antonio M.; Waddell, M. Brett; Ridout, Granger; Naeve, Deanna; Leuze, Michael; LoCascio, Philip F.; Panetta, John C.; Wilkinson, Mark R.; Pui, Ching-Hon; Naeve, Clayton W.; Uberbacher, Edward C.; Bonten, Erik J.; Evans, William E.

    2016-01-01

    MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p<2.2 × 10−16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription. PMID:26844769

  18. Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

    NASA Technical Reports Server (NTRS)

    Fox, G. E.

    1985-01-01

    Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.

  19. Unifying cancer and normal RNA sequencing data from different sources

    PubMed Central

    Wang, Qingguo; Armenia, Joshua; Zhang, Chao; Penson, Alexander V.; Reznik, Ed; Zhang, Liguo; Minet, Thais; Ochoa, Angelica; Gross, Benjamin E.; Iacobuzio-Donahue, Christine A.; Betel, Doron; Taylor, Barry S.; Gao, Jianjiong; Schultz, Nikolaus

    2018-01-01

    Driven by the recent advances of next generation sequencing (NGS) technologies and an urgent need to decode complex human diseases, a multitude of large-scale studies were conducted recently that have resulted in an unprecedented volume of whole transcriptome sequencing (RNA-seq) data, such as the Genotype Tissue Expression project (GTEx) and The Cancer Genome Atlas (TCGA). While these data offer new opportunities to identify the mechanisms underlying disease, the comparison of data from different sources remains challenging, due to differences in sample and data processing. Here, we developed a pipeline that processes and unifies RNA-seq data from different studies, which includes uniform realignment, gene expression quantification, and batch effect removal. We find that uniform alignment and quantification is not sufficient when combining RNA-seq data from different sources and that the removal of other batch effects is essential to facilitate data comparison. We have processed data from GTEx and TCGA and successfully corrected for study-specific biases, enabling comparative analysis between TCGA and GTEx. The normalized datasets are available for download on figshare. PMID:29664468

  20. Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences

    PubMed Central

    2018-01-01

    Prediction of taxonomy for marker gene sequences such as 16S ribosomal RNA (rRNA) is a fundamental task in microbiology. Most experimentally observed sequences are diverged from reference sequences of authoritatively named organisms, creating a challenge for prediction methods. I assessed the accuracy of several algorithms using cross-validation by identity, a new benchmark strategy which explicitly models the variation in distances between query sequences and the closest entry in a reference database. When the accuracy of genus predictions was averaged over a representative range of identities with the reference database (100%, 99%, 97%, 95% and 90%), all tested methods had ≤50% accuracy on the currently-popular V4 region of 16S rRNA. Accuracy was found to fall rapidly with identity; for example, better methods were found to have V4 genus prediction accuracy of ∼100% at 100% identity but ∼50% at 97% identity. The relationship between identity and taxonomy was quantified as the probability that a rank is the lowest shared by a pair of sequences with a given pair-wise identity. With the V4 region, 95% identity was found to be a twilight zone where taxonomy is highly ambiguous because the probabilities that the lowest shared rank between pairs of sequences is genus, family, order or class are approximately equal. PMID:29682424

  1. Mapping RNA Structure In Vitro with SHAPE Chemistry and Next-Generation Sequencing (SHAPE-Seq).

    PubMed

    Watters, Kyle E; Lucks, Julius B

    2016-01-01

    Mapping RNA structure with selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry has proven to be a versatile method for characterizing RNA structure in a variety of contexts. SHAPE reagents covalently modify RNAs in a structure-dependent manner to create adducts at the 2'-OH group of the ribose backbone at nucleotides that are structurally flexible. The positions of these adducts are detected using reverse transcriptase (RT) primer extension, which stops one nucleotide before the modification, to create a pool of cDNAs whose lengths reflect the location of SHAPE modification. Quantification of the cDNA pools is used to estimate the "reactivity" of each nucleotide in an RNA molecule to the SHAPE reagent. High reactivities indicate nucleotides that are structurally flexible, while low reactivities indicate nucleotides that are inflexible. These SHAPE reactivities can then be used to infer RNA structures by restraining RNA structure prediction algorithms. Here, we provide a state-of-the-art protocol describing how to perform in vitro RNA structure probing with SHAPE chemistry using next-generation sequencing to quantify cDNA pools and estimate reactivities (SHAPE-Seq). The use of next-generation sequencing allows for higher throughput, more consistent data analysis, and multiplexing capabilities. The technique described herein, SHAPE-Seq v2.0, uses a universal reverse transcription priming site that is ligated to the RNA after SHAPE modification. The introduced priming site allows for the structural analysis of an RNA independent of its sequence.

  2. Application of Stochastic Labeling with Random-Sequence Barcodes for Simultaneous Quantification and Sequencing of Environmental 16S rRNA Genes.

    PubMed

    Hoshino, Tatsuhiko; Inagaki, Fumio

    2017-01-01

    Next-generation sequencing (NGS) is a powerful tool for analyzing environmental DNA and provides the comprehensive molecular view of microbial communities. For obtaining the copy number of particular sequences in the NGS library, however, additional quantitative analysis as quantitative PCR (qPCR) or digital PCR (dPCR) is required. Furthermore, number of sequences in a sequence library does not always reflect the original copy number of a target gene because of biases caused by PCR amplification, making it difficult to convert the proportion of particular sequences in the NGS library to the copy number using the mass of input DNA. To address this issue, we applied stochastic labeling approach with random-tag sequences and developed a NGS-based quantification protocol, which enables simultaneous sequencing and quantification of the targeted DNA. This quantitative sequencing (qSeq) is initiated from single-primer extension (SPE) using a primer with random tag adjacent to the 5' end of target-specific sequence. During SPE, each DNA molecule is stochastically labeled with the random tag. Subsequently, first-round PCR is conducted, specifically targeting the SPE product, followed by second-round PCR to index for NGS. The number of random tags is only determined during the SPE step and is therefore not affected by the two rounds of PCR that may introduce amplification biases. In the case of 16S rRNA genes, after NGS sequencing and taxonomic classification, the absolute number of target phylotypes 16S rRNA gene can be estimated by Poisson statistics by counting random tags incorporated at the end of sequence. To test the feasibility of this approach, the 16S rRNA gene of Sulfolobus tokodaii was subjected to qSeq, which resulted in accurate quantification of 5.0 × 103 to 5.0 × 104 copies of the 16S rRNA gene. Furthermore, qSeq was applied to mock microbial communities and environmental samples, and the results were comparable to those obtained using digital PCR and

  3. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Posner, Richard G.; Marrone, Babetta L.; Hammond, Mark L.; Simpson, Daniel J.

    1995-01-01

    Method for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand.

  4. Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

    2016-03-01

    Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.

  5. Double-stranded RNA interferes in a sequence-specific manner with the infection of representative members of the two viroid families

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carbonell, Alberto; Martinez de Alba, Angel-Emilio; Flores, Ricardo

    2008-02-05

    Infection by viroids, non-protein-coding circular RNAs, occurs with the accumulation of 21-24 nt viroid-derived small RNAs (vd-sRNAs) with characteristic properties of small interfering RNAs (siRNAs) associated to RNA silencing. The vd-sRNAs most likely derive from dicer-like (DCL) enzymes acting on viroid-specific dsRNA, the key elicitor of RNA silencing, or on the highly structured genomic RNA. Previously, viral dsRNAs delivered mechanically or agroinoculated have been shown to interfere with virus infection in a sequence-specific manner. Here, we report similar results with members of the two families of nuclear- and chloroplast-replicating viroids. Moreover, homologous vd-sRNAs co-delivered mechanically also interfered with one ofmore » the viroids examined. The interference was sequence-specific, temperature-dependent and, in some cases, also dependent on the dose of the co-inoculated dsRNA or vd-sRNAs. The sequence-specific nature of these effects suggests the involvement of the RNA induced silencing complex (RISC), which provides sequence specificity to RNA silencing machinery. Therefore, viroid titer in natural infections might be regulated by the concerted action of DCL and RISC. Viroids could have evolved their secondary structure as a compromise between resistance to DCL and RISC, which act preferentially against RNAs with compact and relaxed secondary structures, respectively. In addition, compartmentation, association with proteins or active replication might also help viroids to elude their host RNA silencing machinery.« less

  6. Near-Complete Genome Sequence of a Novel Single-Stranded RNA Virus Discovered in Indoor Air

    PubMed Central

    2018-01-01

    ABSTRACT Viral metagenomic analysis of heating, ventilation, and air conditioning (HVAC) filters recovered the near-complete genome sequence of a novel virus, named HVAC-associated RNA virus 1 (HVAC-RV1). The HVAC-RV1 genome is most similar to those of picorna-like viruses identified in arthropods but encodes a small domain observed only in negative-sense single-stranded RNA viruses. PMID:29567746

  7. Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

    PubMed

    Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

    2013-05-01

    Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized method. Each step of searching for inversions, chunking, and predictions can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structure.

  8. Functional Gene Analysis of Freshwater Iron-Rich Flocs at Circumneutral pH and Isolation of a Stalk-Forming Microaerophilic Iron-Oxidizing Bacterium

    PubMed Central

    Chan, Clara; Itoh, Takashi; Ohkuma, Moriya

    2013-01-01

    Iron-rich flocs often occur where anoxic water containing ferrous iron encounters oxygenated environments. Culture-independent molecular analyses have revealed the presence of 16S rRNA gene sequences related to diverse bacteria, including autotrophic iron oxidizers and methanotrophs in iron-rich flocs; however, the metabolic functions of the microbial communities remain poorly characterized, particularly regarding carbon cycling. In the present study, we cultivated iron-oxidizing bacteria (FeOB) and performed clone library analyses of functional genes related to carbon fixation and methane oxidization (cbbM and pmoA, respectively), in addition to bacterial and archaeal 16S rRNA genes, in freshwater iron-rich flocs at groundwater discharge points. The analyses of 16S rRNA, cbbM, and pmoA genes strongly suggested the coexistence of autotrophic iron oxidizers and methanotrophs in the flocs. Furthermore, a novel stalk-forming microaerophilic FeOB, strain OYT1, was isolated and characterized phylogenetically and physiologically. The 16S rRNA and cbbM gene sequences of OYT1 are related to those of other microaerophilic FeOB in the family Gallionellaceae, of the Betaproteobacteria, isolated from freshwater environments at circumneutral pH. The physiological characteristics of OYT1 will help elucidate the ecophysiology of microaerophilic FeOB. Overall, this study demonstrates functional roles of microorganisms in iron flocs, suggesting several possible linkages between Fe and C cycling. PMID:23811518

  9. RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay.

    PubMed

    Dean, Kimberly M; Grayhack, Elizabeth J

    2012-12-01

    We have developed a robust and sensitive method, called RNA-ID, to screen for cis-regulatory sequences in RNA using fluorescence-activated cell sorting (FACS) of yeast cells bearing a reporter in which expression of both superfolder green fluorescent protein (GFP) and yeast codon-optimized mCherry red fluorescent protein (RFP) is driven by the bidirectional GAL1,10 promoter. This method recapitulates previously reported progressive inhibition of translation mediated by increasing numbers of CGA codon pairs, and restoration of expression by introduction of a tRNA with an anticodon that base pairs exactly with the CGA codon. This method also reproduces effects of paromomycin and context on stop codon read-through. Five key features of this method contribute to its effectiveness as a selection for regulatory sequences: The system exhibits greater than a 250-fold dynamic range, a quantitative and dose-dependent response to known inhibitory sequences, exquisite resolution that allows nearly complete physical separation of distinct populations, and a reproducible signal between different cells transformed with the identical reporter, all of which are coupled with simple methods involving ligation-independent cloning, to create large libraries. Moreover, we provide evidence that there are sequences within a 9-nt library that cause reduced GFP fluorescence, suggesting that there are novel cis-regulatory sequences to be found even in this short sequence space. This method is widely applicable to the study of both RNA-mediated and codon-mediated effects on expression.

  10. [Screening specific recognition motif of RNA-binding proteins by SELEX in combination with next-generation sequencing technique].

    PubMed

    Zhang, Lu; Xu, Jinhao; Ma, Jinbiao

    2016-07-25

    RNA-binding protein exerts important biological function by specifically recognizing RNA motif. SELEX (Systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motif with high-affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study the protein-RNA interaction in vitro. A pool of RNAs with 20 bp random sequences were transcribed by T7 promoter, and target protein was inserted into plasmid containing SBP-tag, which can be captured by streptavidin beads. Through only one cycle, the specific RNA motif can be obtained, which dramatically improved the selection efficiency. Using this method, we found that human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this method provides a rapid and effective method to study the RNA binding specificity of proteins.

  11. The complete nucleotide sequence of RNA 3 of a peach isolate of Prunus necrotic ringspot virus.

    PubMed

    Hammond, R W; Crosslin, J M

    1995-04-01

    The complete nucleotide sequence of RNA 3 of the PE-5 peach isolate of Prunus necrotic ringspot ilarvirus (PNRSV) was obtained from cloned cDNA. The RNA sequence is 1941 nucleotides and contains two open reading frames (ORFs). ORF 1 consisted of 284 amino acids with a calculated molecular weight of 31,729 Da and ORF 2 contained 224 amino acids with a calculated molecular weight of 25,018 Da. ORF 2 corresponds to the coat protein gene. Expression of ORF 2 engineered into a pTrcHis vector in Escherichia coli results in a fusion polypeptide of approximately 28 kDa which cross-reacts with PNRSV polyclonal antiserum. Analysis of the coat protein amino acid sequence reveals a putative "zinc-finger" domain at the amino-terminal portion of the protein. Two tetranucleotide AUGC motifs occur in the 3'-UTR of the RNA and may function in coat protein binding and genome activation. ORF 1 homologies to other ilarviruses and alfalfa mosaic virus are confined to limited regions of conserved amino acids. The translated amino acid sequence of the coat protein gene shows 92% similarity to one isolate of apple mosaic virus, a closely related member of the ilarvirus group of plant viruses, but only 66% similarity to the amino acid sequence of the coat protein gene of a second isolate. These relationships are also reflected at the nucleotide sequence level. These results in one instance confirm the close similarities observed at the biophysical and serological levels between these two viruses, but on the other hand call into question the nomenclature used to describe these viruses.

  12. Freiburg RNA tools: a central online resource for RNA-focused research and teaching.

    PubMed

    Raden, Martin; Ali, Syed M; Alkhnbashi, Omer S; Busch, Anke; Costa, Fabrizio; Davis, Jason A; Eggenhofer, Florian; Gelhausen, Rick; Georg, Jens; Heyne, Steffen; Hiller, Michael; Kundu, Kousik; Kleinkauf, Robert; Lott, Steffen C; Mohamed, Mostafa M; Mattheis, Alexander; Miladi, Milad; Richter, Andreas S; Will, Sebastian; Wolff, Joachim; Wright, Patrick R; Backofen, Rolf

    2018-05-21

    The Freiburg RNA tools webserver is a well established online resource for RNA-focused research. It provides a unified user interface and comprehensive result visualization for efficient command line tools. The webserver includes RNA-RNA interaction prediction (IntaRNA, CopraRNA, metaMIR), sRNA homology search (GLASSgo), sequence-structure alignments (LocARNA, MARNA, CARNA, ExpaRNA), CRISPR repeat classification (CRISPRmap), sequence design (antaRNA, INFO-RNA, SECISDesign), structure aberration evaluation of point mutations (RaSE), and RNA/protein-family models visualization (CMV), and other methods. Open education resources offer interactive visualizations of RNA structure and RNA-RNA interaction prediction as well as basic and advanced sequence alignment algorithms. The services are freely available at http://rna.informatik.uni-freiburg.de.

  13. Development of a functional cell-based assay that probes the specific interaction between influenza A virus NP and its packaging signal sequence RNA.

    PubMed

    Woo, Jiwon; Yu, Kyung Lee; Lee, Sun Hee; You, Ji Chang

    2015-02-06

    Although cis-acting packaging signal RNA sequences for the influenza virus NP encoding vRNA have been identified recently though genetic studies, little is known about the interaction between NP and the vRNA packaging signals either in vivo or in vitro. Here, we provide evidence that NP is able to interact specifically with the vRNA packaging sequence RNA within living cells and that the specific RNA binding activity of NP in vivo requires both the N-terminal and central region of the protein. This assay established would be a valuable tool for further detailed studies of the NP-packaging signal RNA interaction in living cells. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Posner, R.G.; Marrone, B.L.; Hammond, M.L.; Simpson, D.J.

    1995-04-11

    A method is described for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand. 4 figures.

  15. DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

    DOEpatents

    Lee, James W.; Thundat, Thomas G.

    2005-06-14

    An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.

  16. 16S ribosomal RNA sequence analysis for determination of phylogenetic relationship among methylotrophs.

    PubMed

    Tsuji, K; Tsien, H C; Hanson, R S; DePalma, S R; Scholtz, R; LaRoche, S

    1990-01-01

    16S ribosomal RNAs (rRNA) of 12 methylotrophic bacteria have been almost completely sequenced to establish their phylogenetic relationships. Methylotrophs that are physiologically related are phylogenetically diverse and are scattered among the purple eubacteria (class Proteobacteria). Group I methylotrophs can be classified in the beta- and the gamma-subdivisions and group II methylotrophs in the alpha-subdivision of the purple eubacteria, respectively. Pink-pigmented facultative and non-pigmented obligate group II methylotrophs form two distinctly separate branches within the alpha-subdivision. The secondary structures of the 16S rRNA sequences of 'Methylocystis parvus' strain OBBP, 'Methylosinus trichosporium' strain OB3b, 'Methylosporovibrio methanica' strain 81Z and Hyphomicrobium sp. strain DM2 are similar, and these non-pigmented obligate group II methylotrophs form one tight cluster in the alpha-subdivision. The pink-pigmented facultative methylotrophs, Methylobacterium extorquens strain AM1, Methylobacterium sp. strain DM4 and Methylobacterium organophilum strain XX form another cluster within the alpha-subdivision. Although similar in phenotypic characteristics, Methylobacterium organophilum strain XX and Methylobacterium extorquens strain AM1 are clearly distinguishable by their 16S rRNA sequences. The group I methylotrophs, Methylophilus methylotrophus strain AS1 and methylotrophic species DM11, which do not utilize methane, are similar in 16S rRNA sequence to bacteria in the beta-subdivision. The methane-utilizing, obligate group I methanotrophs, Methylococcus capsulatus strain BATH and Methylomonas methanica, are placed in the gamma-subdivision. The results demonstrate that it is possible to distinguish and classify the methylotrophic bacteria using 16S rRNA sequence analysis.

  17. Discovery and molecular characterization of a new cryptovirus dsRNA genome from Japanese persimmon through conventional cloning and high-throughput sequencing.

    PubMed

    Morelli, M; Chiumenti, M; De Stradis, A; La Notte, P; Minafra, A

    2015-02-01

    Through the application of next generation sequencing, in synergy with conventional cloning of DOP-PCR fragments, two double-stranded RNA (dsRNA) molecules of about 1.5 kbp in size were isolated from leaf tissue of a Japanese persimmon (accession SSPI) from Apulia (southern Italy) showing veinlets necrosis. High-throughput sequencing allowed whole genome sequence assembly, yielding a 1,577 and a 1,491 bp contigs identified as dsRNA-1 and dsRNA-2 of a previously undescribed virus, provisionally named as Persimmon cryptic virus (PeCV). In silico analysis showed that both dsRNA fragments were monocistronic and comprised the RNA-dependent RNA polymerase (RdRp) and the capsid protein (CP) genes, respectively. Phylogenetic reconstruction revealed a close relationship of these dsRNAs with those of cryptoviruses described in woody and herbaceous hosts, recently gathered in genus Deltapartitivirus. Virus-specific primers for RT-PCR, designed in the CP cistron, detected viral RNAs also in symptomless persimmon trees sampled from the same geographical area of SSPI, thus proving that PeCV infection may be fairly common and presumably latent.

  18. Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues

    PubMed Central

    Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.

    2014-01-01

    RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209

  19. Detection and characterization of Pasteuria 16S rRNA gene sequences from nematodes and soils.

    PubMed

    Duan, Y P; Castro, H F; Hewlett, T E; White, J H; Ogram, A V

    2003-01-01

    Various bacterial species in the genus Pasteuria have great potential as biocontrol agents against plant-parasitic nematodes, although study of this important genus is hampered by the current inability to cultivate Pasteuria species outside their host. To aid in the study of this genus, an extensive 16S rRNA gene sequence phylogeny was constructed and this information was used to develop cultivation-independent methods for detection of Pasteuria in soils and nematodes. Thirty new clones of Pasteuria 16S rRNA genes were obtained directly from nematodes and soil samples. These were sequenced and used to construct an extensive phylogeny of this genus. These sequences were divided into two deeply branching clades within the low-G + C, Gram-positive division; some sequences appear to represent novel species within the genus Pasteuria. In addition, a surprising degree of 16S rRNA gene sequence diversity was observed within what had previously been designated a single strain of Pasteuria penetrans (P-20). PCR primers specific to Pasteuria 16S rRNA for detection of Pasteuria in soils were also designed and evaluated. Detection limits for soil DNA were 100-10,000 Pasteuria endospores (g soil)(-1).

  20. Improved Annotation of 3′ Untranslated Regions and Complex Loci by Combination of Strand-Specific Direct RNA Sequencing, RNA-Seq and ESTs

    PubMed Central

    Song, Junfang; Duc, Céline; Storey, Kate G.; McLean, W. H. Irwin; Brown, Sara J.; Simpson, Gordon G.; Barton, Geoffrey J.

    2014-01-01

    The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation in addition to the underlying genomic sequence is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3′ untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences which locates 3′ polyadenylation sites to within +/− 2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3′ UTR re-annotation (including extension of one 3′ UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publically available annotations in the context of their own experimental data. PMID:24722185

  1. An improved and validated RNA HLA class I SBT approach for obtaining full length coding sequences.

    PubMed

    Gerritsen, K E H; Olieslagers, T I; Groeneweg, M; Voorter, C E M; Tilanus, M G J

    2014-11-01

    The functional relevance of human leukocyte antigen (HLA) class I allele polymorphism beyond exons 2 and 3 is difficult to address because more than 70% of the HLA class I alleles are defined by exons 2 and 3 sequences only. For routine application on clinical samples we improved and validated the HLA sequence-based typing (SBT) approach based on RNA templates, using either a single locus-specific or two overlapping group-specific polymerase chain reaction (PCR) amplifications, with three forward and three reverse sequencing reactions for full length sequencing. Locus-specific HLA typing with RNA SBT of a reference panel, representing the major antigen groups, showed identical results compared to DNA SBT typing. Alleles encountered with unknown exons in the IMGT/HLA database and three samples, two with Null and one with a Low expressed allele, have been addressed by the group-specific RNA SBT approach to obtain full length coding sequences. This RNA SBT approach has proven its value in our routine full length definition of alleles. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. BrAD-seq: Breath Adapter Directional sequencing: a streamlined, ultra-simple and fast library preparation protocol for strand specific mRNA library construction.

    PubMed

    Townsley, Brad T; Covington, Michael F; Ichihashi, Yasunori; Zumstein, Kristina; Sinha, Neelima R

    2015-01-01

    Next Generation Sequencing (NGS) is driving rapid advancement in biological understanding and RNA-sequencing (RNA-seq) has become an indispensable tool for biology and medicine. There is a growing need for access to these technologies although preparation of NGS libraries remains a bottleneck to wider adoption. Here we report a novel method for the production of strand specific RNA-seq libraries utilizing the terminal breathing of double-stranded cDNA to capture and incorporate a sequencing adapter. Breath Adapter Directional sequencing (BrAD-seq) reduces sample handling and requires far fewer enzymatic steps than most available methods to produce high quality strand-specific RNA-seq libraries. The method we present is optimized for 3-prime Digital Gene Expression (DGE) libraries and can easily extend to full transcript coverage shotgun (SHO) type strand-specific libraries and is modularized to accommodate a diversity of RNA and DNA input materials. BrAD-seq offers a highly streamlined and inexpensive option for RNA-seq libraries.

  3. SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

    PubMed

    Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin

    2017-01-01

    Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.

  4. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server

    PubMed Central

    Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles

    2015-01-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960

  5. An evolutionary conserved pattern of 18S rRNA sequence complementarity to mRNA 5′ UTRs and its implications for eukaryotic gene translation regulation

    PubMed Central

    Pánek, Josef; Kolář, Michal; Vohradský, Jiří; Shivaya Valášek, Leoš

    2013-01-01

    There are several key mechanisms regulating eukaryotic gene expression at the level of protein synthesis. Interestingly, the least explored mechanisms of translational control are those that involve the translating ribosome per se, mediated for example via predicted interactions between the ribosomal RNAs (rRNAs) and mRNAs. Here, we took advantage of robustly growing large-scale data sets of mRNA sequences for numerous organisms, solved ribosomal structures and computational power to computationally explore the mRNA–rRNA complementarity that is statistically significant across the species. Our predictions reveal highly specific sequence complementarity of 18S rRNA sequences with mRNA 5′ untranslated regions (UTRs) forming a well-defined 3D pattern on the rRNA sequence of the 40S subunit. Broader evolutionary conservation of this pattern may imply that 5′ UTRs of eukaryotic mRNAs, which have already emerged from the mRNA-binding channel, may contact several complementary spots on 18S rRNA situated near the exit of the mRNA binding channel and on the middle-to-lower body of the solvent-exposed 40S ribosome including its left foot. We discuss physiological significance of this structurally conserved pattern and, in the context of previously published experimental results, propose that it modulates scanning of the 40S subunit through 5′ UTRs of mRNAs. PMID:23804757

  6. Identification of Alternative Splicing and Fusion Transcripts in Non-Small Cell Lung Cancer by RNA Sequencing.

    PubMed

    Hong, Yoonki; Kim, Woo Jin; Bang, Chi Young; Lee, Jae Cheol; Oh, Yeon-Mok

    2016-04-01

    Lung cancer is the most common cause of cancer related death. Alterations in gene sequence, structure, and expression have an important role in the pathogenesis of lung cancer. Fusion genes and alternative splicing of cancer-related genes have the potential to be oncogenic. In the current study, we performed RNA-sequencing (RNA-seq) to investigate potential fusion genes and alternative splicing in non-small cell lung cancer. RNA was isolated from lung tissues obtained from 86 subjects with lung cancer. The RNA samples from lung cancer and normal tissues were processed with RNA-seq using the HiSeq 2000 system. Fusion genes were evaluated using Defuse and ChimeraScan. Candidate fusion transcripts were validated by Sanger sequencing. Alternative splicing was analyzed using multivariate analysis of transcript sequencing and validated using quantitative real time polymerase chain reaction. RNA-seq data identified oncogenic fusion genes EML4-ALK and SLC34A2-ROS1 in three of 86 normal-cancer paired samples. Nine distinct fusion transcripts were selected using DeFuse and ChimeraScan; of which, four fusion transcripts were validated by Sanger sequencing. In 33 squamous cell carcinoma, 29 tumor specific skipped exon events and six mutually exclusive exon events were identified. ITGB4 and PYCR1 were top genes that showed significant tumor specific splice variants. In conclusion, RNA-seq data identified novel potential fusion transcripts and splice variants. Further evaluation of their functional significance in the pathogenesis of lung cancer is required.

  7. Evidence for Context-Dependent Complementarity of Non-Shine-Dalgarno Ribosome Binding Sites to Escherichia coli rRNA

    PubMed Central

    Barendt, Pamela A.; Shah, Najaf A.; Barendt, Gregory A.; Kothari, Parth A.; Sarkar, Casim A.

    2013-01-01

    While the ribosome has evolved to function in complex intracellular environments, these contexts do not easily allow for the study of its inherent capabilities. We have used a synthetic, well-defined, Escherichia coli (E. coli)-based translation system in conjunction with ribosome display, a powerful in vitro selection method, to identify ribosome binding sites (RBSs) that can promote the efficient translation of messenger RNAs (mRNAs) with a leader length representative of natural E. coli mRNAs. In previous work, we used a longer leader sequence and unexpectedly recovered highly efficient cytosine-rich sequences with complementarity to the 16S ribosomal RNA (rRNA) and similarity to eukaryotic RBSs. In the current study, Shine-Dalgarno (SD) sequences were prevalent but non-SD sequences were also heavily enriched and were dominated by novel guanine- and uracil-rich motifs which showed statistically significant complementarity to the 16S rRNA. Additionally, only SD motifs exhibited position-dependent decreases in sequence entropy, indicating that non-SD motifs likely operate by increasing the local concentration of ribosomes in the vicinity of the start codon, rather than by a position-dependent mechanism. These results further support the putative generality of mRNA-rRNA complementarity in facilitating mRNA translation, but also suggest that context (e.g., leader length and composition) dictates the specific subset of possible RBSs that are used for efficient translation of a given transcript. PMID:23427812

  8. Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

    PubMed Central

    Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

    2014-01-01

    Listeria monocytogenes, a gram-positive pathogen, and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular condition and additionally was size fractioned into three different size ranges from <40 nt, 40–150 nt and >150 nt. We report here, the identification of nine new sRNAs candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated to the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259

  9. Isoform-level gene expression patterns in single-cell RNA-sequencing data.

    PubMed

    Vu, Trung Nghia; Wills, Quin F; Kalari, Krishna R; Niu, Nifang; Wang, Liewei; Pawitan, Yudi; Rantalainen, Mattias

    2018-02-27

    RNA sequencing of single cells enables characterization of transcriptional heterogeneity in seemingly homogeneous cell populations. Single-cell sequencing has been applied in a wide range of researches fields. However, few studies have focus on characterization of isoform-level expression patterns at the single-cell level. In this study we propose and apply a novel method, ISOform-Patterns (ISOP), based on mixture modeling, to characterize the expression patterns of isoform pairs from the same gene in single-cell isoform-level expression data. We define six principal patterns of isoform expression relationships and describe a method for differential-pattern analysis. We demonstrate ISOP through analysis of single-cell RNA-sequencing data from a breast cancer cell line, with replication in three independent datasets. We assigned the pattern types to each of 16,562 isoform-pairs from 4,929 genes. Among those, 26% of the discovered patterns were significant (p<0.05), while remaining patterns are possibly effects of transcriptional bursting, drop-out and stochastic biological heterogeneity. Furthermore, 32% of genes discovered through differential-pattern analysis were not detected by differential-expression analysis. The effect of drop-out events, mean expression level, and properties of the expression distribution on the performances of ISOP were also investigated through simulated datasets. To conclude, ISOP provides a novel approach for characterization of isoformlevel preference, commitment and heterogeneity in single-cell RNA-sequencing data. The ISOP method has been implemented as a R package and is available at https://github.com/nghiavtr/ISOP under a GPL-3 license. mattias.rantalainen@ki.se. Supplementary data are available at Bioinformatics online.

  10. Fast selection of miRNA candidates based on large-scale pre-computed MFE sets of randomized sequences.

    PubMed

    Warris, Sven; Boymans, Sander; Muiser, Iwe; Noback, Michiel; Krijnen, Wim; Nap, Jan-Peter

    2014-01-13

    Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. It allows on-the-fly calculation of the normal distribution for any candidate sequence composition. The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be able to distinguish miRNAs from other sequences sufficiently discriminative, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification.

  11. Fast selection of miRNA candidates based on large-scale pre-computed MFE sets of randomized sequences

    PubMed Central

    2014-01-01

    Background Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Results Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. It allows on-the-fly calculation of the normal distribution for any candidate sequence composition. Conclusion The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be able to distinguish miRNAs from other sequences sufficiently discriminative, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification. PMID:24418292

  12. External Guide Sequences Targeting the aac(6′)-Ib mRNA Induce Inhibition of Amikacin Resistance▿

    PubMed Central

    Bistué, Alfonso J. C. Soler; Ha, Hongphuc; Sarno, Renee; Don, Michelle; Zorreguieta, Angeles; Tolmasky, Marcelo E.

    2007-01-01

    The dissemination of AAC(6′)-I-type acetyltransferases have rendered amikacin and other aminoglycosides all but useless in some parts of the world. Antisense technologies could be an alternative to extend the life of these antibiotics. External guide sequences are short antisense oligoribonucleotides that induce RNase P-mediated cleavage of a target RNA by forming a precursor tRNA-like complex. Thirteen-nucleotide external guide sequences complementary to locations within five regions accessible for interaction with antisense oligonucleotides in the mRNA that encodes AAC(6′)-Ib were analyzed. While small variations in the location targeted by different external guide sequences resulted in big changes in efficiency of binding to native aac(6′)-Ib mRNA, most of them induced high levels of RNase P-mediated cleavage in vitro. Recombinant plasmids coding for selected external guide sequences were introduced into Escherichia coli harboring aac(6′)-Ib, and the transformant strains were tested to determine their resistance to amikacin. The two external guide sequences that showed the strongest binding efficiency to the mRNA in vitro, EGSC3 and EGSA2, interfered with expression of the resistance phenotype at different degrees. Growth curve experiments showed that E. coli cells harboring a plasmid coding for EGSC3, the external guide sequence with the highest mRNA binding affinity in vitro, did not grow for at least 300 min in the presence of 15 μg of amikacin/ml. EGSA2, which had a lower mRNA-binding affinity in vitro than EGSC3, inhibited the expression of amikacin resistance at a lesser level; growth of E. coli harboring a plasmid coding for EGSA2, in the presence of 15 μg of amikacin/ml was undetectable for 200 min but reached an optical density at 600 nm of 0.5 after 5 h of incubation. Our results indicate that the use of external guide sequences could be a viable strategy to preserve the efficacy of amikacin. PMID:17387154

  13. How to Tackle the Challenge of siRNA Delivery with Sequence-Defined Oligoamino Amides.

    PubMed

    Reinhard, Sören; Wagner, Ernst

    2017-01-01

    RNA interference (RNAi) as a mechanism of gene regulation provides exciting opportunities for medical applications. Synthetic small interfering RNA (siRNA) triggers the knockdown of complementary mRNA sequences in a catalytic fashion and has to be delivered into the cytosol of the targeted cells. The design of adequate carrier systems to overcome multiple extracellular and intracellular roadblocks within the delivery process has utmost importance. Cationic polymers form polyplexes through electrostatic interaction with negatively charged nucleic acids and present a promising class of carriers. Issues of polycations regarding toxicity, heterogeneity, and polydispersity can be overcome by solid-phase-assisted synthesis of sequence-defined cationic oligomers. These medium-sized highly versatile nucleic acid carriers display low cytotoxicity and can be modified and tailored in multiple ways to meet specific requirements of nucleic acid binding, polyplex size, shielding, targeting, and intracellular release of the cargo. In this way, sequence-defined cationic oligomers can mimic the dynamic and bioresponsive behavior of viruses. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. The Rhinovirus Subviral A-Particle Exposes 3′-Terminal Sequences of Its Genomic RNA

    PubMed Central

    Harutyunyan, Shushan; Kowalski, Heinrich

    2014-01-01

    ABSTRACT Enteroviruses, which represent a large genus within the family Picornaviridae, undergo important conformational modifications during infection of the host cell. Once internalized by receptor-mediated endocytosis, receptor binding and/or the acidic endosomal environment triggers the native virion to expand and convert into the subviral (altered) A-particle. The A-particle is lacking the internal capsid protein VP4 and exposes N-terminal amphipathic sequences of VP1, allowing for its direct interaction with a lipid bilayer. The genomic single-stranded (+)RNA then exits through a hole close to a 2-fold axis of icosahedral symmetry and passes through a pore in the endosomal membrane into the cytosol, leaving behind the empty shell. We demonstrate that in vitro acidification of a prototype of the minor receptor group of common cold viruses, human rhinovirus A2 (HRV-A2), also results in egress of the poly(A) tail of the RNA from the A-particle, along with adjacent nucleotides totaling ∼700 bases. However, even after hours of incubation at pH 5.2, 5′-proximal sequences remain inside the capsid. In contrast, the entire RNA genome is released within minutes of exposure to the acidic endosomal environment in vivo. This finding suggests that the exposed 3′-poly(A) tail facilitates the positioning of the RNA exit site onto the putative channel in the lipid bilayer, thereby preventing the egress of viral RNA into the endosomal lumen, where it may be degraded. IMPORTANCE For host cell infection, a virus transfers its genome from within the protective capsid into the cytosol; this requires modifications of the viral shell. In common cold viruses, exit of the RNA genome is prepared by the acidic environment in endosomes converting the native virion into the subviral A-particle. We demonstrate that acidification in vitro results in RNA exit starting from the 3′-terminal poly(A). However, the process halts as soon as about 700 bases have left the viral shell

  15. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Ruiying; Zheng, Han; Preamplume, Gan

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of amore » noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.« less

  16. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

    PubMed

    Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

    2014-09-10

    Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. Copyright © 2014 Elsevier B.V. All rights reserved.

  17. Mapping Hfq-RNA interaction surfaces using tryptophan fluorescence quenching

    PubMed Central

    Robinson, Kirsten E.; Orans, Jillian; Kovach, Alexander R.; Link, Todd M.; Brennan, Richard G.

    2014-01-01

    Hfq is a posttranscriptional riboregulator and RNA chaperone that binds small RNAs and target mRNAs to effect their annealing and message-specific regulation in response to environmental stressors. Structures of Hfq-RNA complexes indicate that U-rich sequences prefer the proximal face and A-rich sequences the distal face; however, the Hfq-binding sites of most RNAs are unknown. Here, we present an Hfq-RNA mapping approach that uses single tryptophan-substituted Hfq proteins, all of which retain the wild-type Hfq structure, and tryptophan fluorescence quenching (TFQ) by proximal RNA binding. TFQ properly identified the respective distal and proximal binding of A15 and U6 RNA to Gram-negative Escherichia coli (Ec) Hfq and the distal face binding of (AA)3A, (AU)3A and (AC)3A to Gram-positive Staphylococcus aureus (Sa) Hfq. The inability of (GU)3G to bind the distal face of Sa Hfq reveals the (R-L)n binding motif is a more restrictive (A-L)n binding motif. Remarkably Hfq from Gram-positive Listeria monocytogenes (Lm) binds (GU)3G on its proximal face. TFQ experiments also revealed the Ec Hfq (A-R-N)n distal face-binding motif should be redefined as an (A-A-N)n binding motif. TFQ data also demonstrated that the 5′-untranslated region of hfq mRNA binds both the proximal and distal faces of Ec Hfq and the unstructured C-terminus. PMID:24288369

  18. QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2017-08-31

    As a newly emerged research area, RNA epigenetics has drawn increasing attention recently for the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high throughput sequencing techniques, such as, MeRIP-Seq, transcriptome-wide RNA methylation profile is now available in the form of count-based data, with which it is often of interests to study the dynamics at epitranscriptomic layer. However, the sample size of RNA methylation experiment is usually very small due to its costs; and additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as DRME model based on a statistical test covering the IP samples only with 2 negative binomial distributions, QNB is based on 4 independent negative binomial distributions with their variances and means linked by local regressions, and in the way, the input control samples are also properly taken care of. In addition, different from DRME approach, which relies only the input control sample only for estimating the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which could largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. And the QNB model is also applicable to other datasets related RNA modifications, including but not limited to RNA bisulfite sequencing, m 1 A-Seq, Par-CLIP, RIP-Seq, etc.

  19. YM500v2: a small RNA sequencing (smRNA-seq) database for human cancer miRNome research

    PubMed Central

    Cheng, Wei-Chung; Chung, I-Fang; Tsai, Cheng-Fong; Huang, Tse-Shun; Chen, Chen-Yang; Wang, Shao-Chuan; Chang, Ting-Yu; Sun, Hsing-Jen; Chao, Jeffrey Yung-Chuan; Cheng, Cheng-Chung; Wu, Cheng-Wen; Wang, Hsei-Wei

    2015-01-01

    We previously presented YM500, which is an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from 468 human smRNA-seq datasets. Here in this updated YM500v2 database (http://ngs.ym.edu.tw/ym500/), we focus on the cancer miRNome to make the database more disease-orientated. New miRNA-related algorithms developed after YM500 were included in YM500v2, and, more significantly, more than 8000 cancer-related smRNA-seq datasets (including those of primary tumors, paired normal tissues, PBMC, recurrent tumors, and metastatic tumors) were incorporated into YM500v2. Novel miRNAs (miRNAs not included in the miRBase R21) were not only predicted by three independent algorithms but also cleaned by a new in silico filtration strategy and validated by wetlab data such as Cross-Linked ImmunoPrecipitation sequencing (CLIP-seq) to reduce the false-positive rate. A new function ‘Meta-analysis’ is additionally provided for allowing users to identify real-time differentially expressed miRNAs and arm-switching events according to customer-defined sample groups and dozens of clinical criteria tidying up by proficient clinicians. Cancer miRNAs identified hold the potential for both basic research and biotech applications. PMID:25398902

  20. YM500v2: a small RNA sequencing (smRNA-seq) database for human cancer miRNome research.

    PubMed

    Cheng, Wei-Chung; Chung, I-Fang; Tsai, Cheng-Fong; Huang, Tse-Shun; Chen, Chen-Yang; Wang, Shao-Chuan; Chang, Ting-Yu; Sun, Hsing-Jen; Chao, Jeffrey Yung-Chuan; Cheng, Cheng-Chung; Wu, Cheng-Wen; Wang, Hsei-Wei

    2015-01-01

    We previously presented YM500, which is an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from 468 human smRNA-seq datasets. Here in this updated YM500v2 database (http://ngs.ym.edu.tw/ym500/), we focus on the cancer miRNome to make the database more disease-orientated. New miRNA-related algorithms developed after YM500 were included in YM500v2, and, more significantly, more than 8000 cancer-related smRNA-seq datasets (including those of primary tumors, paired normal tissues, PBMC, recurrent tumors, and metastatic tumors) were incorporated into YM500v2. Novel miRNAs (miRNAs not included in the miRBase R21) were not only predicted by three independent algorithms but also cleaned by a new in silico filtration strategy and validated by wetlab data such as Cross-Linked ImmunoPrecipitation sequencing (CLIP-seq) to reduce the false-positive rate. A new function 'Meta-analysis' is additionally provided for allowing users to identify real-time differentially expressed miRNAs and arm-switching events according to customer-defined sample groups and dozens of clinical criteria tidying up by proficient clinicians. Cancer miRNAs identified hold the potential for both basic research and biotech applications. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. The nucleotide sequence of the intergenic region between the 5.8S and 26S rRNA genes of the yeast ribosomal RNA operon. Possible implications for the interaction between 5.8S and 26S rRNA and the processing of the primary transcript.

    PubMed Central

    Veldman, G M; Klootwijk, J; van Heerikhuizen, H; Planta, R J

    1981-01-01

    We have determined the nucleotide sequence of part of a cloned yeast ribosomal RNA operon extending from the 5.8S RNA gene downstream into the 5' -terminal region of the 26S RNA gene. We mapped the pertinent processing sites, viz. the 5' end of 26S rRNA and the 3'ends of 5.8S rRNA and its immediate precursor, 7S RNA. At the 3' end of 7S RNA we find the sequence UCGUUU which is very similar to the type I consensus sequence UCAUUA/U present at the 3' ends of 17S, 5.8S and 26S rRNA as well as 18S precursor rRNA in yeast. At the 5' end of the 26S RNA gene we find a sequence of thirteen nucleotides which is homologous to the type II sequence present at the 5' termini of both the 17S and the 5.8S RNA gene. These findings further support the suggestion put forward earlier (G.M. Veldman et al. (1980) Nucl. Acids Res. 8, 2907-2920) that both consensus sequences are involved in the recognition of precursor rRNA by the processing nuclease(s). We discuss a model for the processing of yeast rRNA in which a processing enzyme sequentially recognizes several combinations of a type I and a type II consensus sequence. We also describe the existence of a significant base complementarity between sequences in the 5' -terminal region of 26S rRNA and the 3' -terminal region of 5.8S rRNA. We suggest that base pairing between these sequences contributes to the binding between 5.8S and 26S rRNA. Images PMID:7312619

  2. Deep Sequence Analysis of AgoshRNA Processing Reveals 3' A Addition and Trimming.

    PubMed

    Harwig, Alex; Herrera-Carrillo, Elena; Jongejan, Aldo; van Kampen, Antonius Hubertus; Berkhout, Ben

    2015-07-14

    The RNA interference (RNAi) pathway, in which microprocessor and Dicer collaborate to process microRNAs (miRNA), was recently expanded by the description of alternative processing routes. In one of these noncanonical pathways, Dicer action is replaced by the Argonaute2 (Ago2) slicer function. It was recently shown that the stem-length of precursor-miRNA or short hairpin RNA (shRNA) molecules is a major determinant for Dicer versus Ago2 processing. Here we present the results of a deep sequence study on the processing of shRNAs with different stem length and a top G·U wobble base pair (bp). This analysis revealed some unexpected properties of these so-called AgoshRNA molecules that are processed by Ago2 instead of Dicer. First, we confirmed the gradual shift from Dicer to Ago2 processing upon shortening of the hairpin length. Second, hairpins with a stem larger than 19 base pair are inefficiently cleaved by Ago2 and we noticed a shift in the cleavage site. Third, the introduction of a top G·U bp in a regular shRNA can promote Ago2-cleavage, which coincides with a loss of Ago2-loading of the Dicer-cleaved 3' strand. Fourth, the Ago2-processed AgoshRNAs acquire a short 3' tail of 1-3 A-nucleotides (nt) and we present evidence that this product is subsequently trimmed by the poly(A)-specific ribonuclease (PARN).

  3. Deep Sequence Analysis of AgoshRNA Processing Reveals 3' A Addition and Trimming

    PubMed Central

    Harwig, Alex; Herrera-Carrillo, Elena; Jongejan, Aldo; van Kampen, Antonius Hubertus; Berkhout, Ben

    2015-01-01

    The RNA interference (RNAi) pathway, in which microprocessor and Dicer collaborate to process microRNAs (miRNA), was recently expanded by the description of alternative processing routes. In one of these noncanonical pathways, Dicer action is replaced by the Argonaute2 (Ago2) slicer function. It was recently shown that the stem-length of precursor-miRNA or short hairpin RNA (shRNA) molecules is a major determinant for Dicer versus Ago2 processing. Here we present the results of a deep sequence study on the processing of shRNAs with different stem length and a top G·U wobble base pair (bp). This analysis revealed some unexpected properties of these so-called AgoshRNA molecules that are processed by Ago2 instead of Dicer. First, we confirmed the gradual shift from Dicer to Ago2 processing upon shortening of the hairpin length. Second, hairpins with a stem larger than 19 base pair are inefficiently cleaved by Ago2 and we noticed a shift in the cleavage site. Third, the introduction of a top G·U bp in a regular shRNA can promote Ago2-cleavage, which coincides with a loss of Ago2-loading of the Dicer-cleaved 3' strand. Fourth, the Ago2-processed AgoshRNAs acquire a short 3' tail of 1–3 A-nucleotides (nt) and we present evidence that this product is subsequently trimmed by the poly(A)-specific ribonuclease (PARN). PMID:26172504

  4. iRNA-2methyl: Identify RNA 2'-O-methylation Sites by Incorporating Sequence-Coupled Effects into General PseKNC and Ensemble Classifier.

    PubMed

    Qiu, Wang-Ren; Jiang, Shi-Yu; Sun, Bi-Qian; Xiao, Xuan; Cheng, Xiang; Chou, Kuo-Chen

    2017-01-01

    Being a kind of post-transcriptional modification (PTCM) in RNA, the 2'-Omethylation modification occurs in the processes of life development and disease formation as well. Accordingly, from the angles of both basic research and drug development, we are facing a challenging problem: given an uncharacterized RNA sequence formed by many nucleotides of A (adenine), C (cytosine), G (guanine), and U (uracil), which one can be of 2-O'-methylation modification, and which one cannot? Unfortunately, so far no computational method whatsoever has been developed to address such a problem. To fill this empty area, we propose a predictor called iRNA-2methyl. It is formed by incorporating a series of sequence-coupled factors into the general PseKNC (pseudo nucleotide composition), followed by fusing 12 basic random forest classifier into four ensemble predictors, with each aimed to identify the cases of A, C, G, and U along the RNA sequence concerned, respectively. Rigorous jackknife cross-validations have indicated that the success rates are very high (>93%). For the convenience of most experimental scientists, a user-friendly web-server for iRNA-2methyl has been established at http://www.jci-bioinfo.cn/iRNA-2methyl, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. The proposed predictor iRNA-2methyl will become a very useful bioinformatics tool for medicinal chemistry, helping to design effective drugs against the diseases related to the 2'-Omethylation modification. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  5. The nucleotide sequence of 5S ribosomal RNA from Micrococcus lysodeikticus.

    PubMed Central

    Hori, H; Osawa, S; Murao, K; Ishikura, H

    1980-01-01

    The nucleotide sequence of ribosomal 5S RNA from Micrococcus lysodeikticus is pGUUACGGCGGCUAUAGCGUGGGGGAAACGCCCGGCCGUAUAUCGAACCCGGAAGCUAAGCCCCAUAGCGCCGAUGGUUACUGUAACCGGGAGGUUGUGGGAGAGUAGGUCGCCGCCGUGAOH. When compared to other 5S RNAs, the sequence homology is greatest with Thermus aquaticus, and these two 5S RNAs reveal several features intermediate between those of typical gram-positive bacteria and gram-negative bacteria. PMID:6780979

  6. A multiplicity of factors contributes to selective RNA polymerase III occupancy of a subset of RNA polymerase III genes in mouse liver

    PubMed Central

    Canella, Donatella; Bernasconi, David; Gilardi, Federica; LeMartelot, Gwendal; Migliavacca, Eugenia; Praz, Viviane; Cousin, Pascal; Delorenzi, Mauro; Hernandez, Nouria; Hernandez, Nouria; Delorenzi, Mauro; Deplancke, Bart; Desvergne, Béatrice; Guex, Nicolas; Herr, Winship; Naef, Felix; Rougemont, Jacques; Schibler, Ueli; Deplancke, Bart; Guex, Nicolas; Herr, Winship; Guex, Nicolas; Andersin, Teemu; Cousin, Pascal; Gilardi, Federica; Gos, Pascal; Le Martelot, Gwendal; Lammers, Fabienne; Canella, Donatella; Gilardi, Federica; Raghav, Sunil; Fabbretti, Roberto; Fortier, Arnaud; Long, Li; Vlegel, Volker; Xenarios, Ioannis; Migliavacca, Eugenia; Praz, Viviane; Guex, Nicolas; Naef, Felix; Rougemont, Jacques; David, Fabrice; Jarosz, Yohan; Kuznetsov, Dmitry; Liechti, Robin; Martin, Olivier; Ross, Frederick; Sinclair, Lucas; Cajan, Julia; Krier, Irina; Leleu, Marion; Migliavacca, Eugenia; Molina, Nacho; Naldi, Aurélien; Rey, Guillaume; Symul, Laura; Guex, Nicolas; Naef, Felix; Rougemont, Jacques; Bernasconi, David; Delorenzi, Mauro; Andersin, Teemu; Canella, Donatella; Gilardi, Federica; Le Martelot, Gwendal; Lammers, Fabienne; Raghav, Sunil

    2012-01-01

    The genomic loci occupied by RNA polymerase (RNAP) III have been characterized in human culture cells by genome-wide chromatin immunoprecipitations, followed by deep sequencing (ChIP-seq). These studies have shown that only ∼40% of the annotated 622 human tRNA genes and pseudogenes are occupied by RNAP-III, and that these genes are often in open chromatin regions rich in active RNAP-II transcription units. We have used ChIP-seq to characterize RNAP-III-occupied loci in a differentiated tissue, the mouse liver. Our studies define the mouse liver RNAP-III-occupied loci including a conserved mammalian interspersed repeat (MIR) as a potential regulator of an RNAP-III subunit-encoding gene. They reveal that synteny relationships can be established between a number of human and mouse RNAP-III genes, and that the expression levels of these genes are significantly linked. They establish that variations within the A and B promoter boxes, as well as the strength of the terminator sequence, can strongly affect RNAP-III occupancy of tRNA genes. They reveal correlations with various genomic features that explain the observed variation of 81% of tRNA scores. In mouse liver, loci represented in the NCBI37/mm9 genome assembly that are clearly occupied by RNAP-III comprise 50 Rn5s (5S RNA) genes, 14 known non-tRNA RNAP-III genes, nine Rn4.5s (4.5S RNA) genes, and 29 SINEs. Moreover, out of the 433 annotated tRNA genes, half are occupied by RNAP-III. Transfer RNA gene expression levels reflect both an underlying genomic organization conserved in dividing human culture cells and resting mouse liver cells, and the particular promoter and terminator strengths of individual genes. PMID:22287103

  7. Influence of gag and RRE Sequences on HIV-1 RNA Packaging Signal Structure and Function.

    PubMed

    Kharytonchyk, Siarhei; Brown, Joshua D; Stilger, Krista; Yasin, Saif; Iyer, Aishwarya S; Collins, John; Summers, Michael F; Telesnitsky, Alice

    2018-07-06

    The packaging signal (Ψ) and Rev-responsive element (RRE) enable unspliced HIV-1 RNAs' export from the nucleus and packaging into virions. For some retroviruses, engrafting Ψ onto a heterologous RNA is sufficient to direct encapsidation. In contrast, HIV-1 RNA packaging requires 5' leader Ψ elements plus poorly defined additional features. We previously defined minimal 5' leader sequences competitive with intact Ψ for HIV-1 packaging, and here examined the potential roles of additional downstream elements. The findings confirmed that together, HIV-1 5' leader Ψ sequences plus a nuclear export element are sufficient to specify packaging. However, RNAs trafficked using a heterologous export element did not compete well with RNAs using HIV-1's RRE. Furthermore, some RNA additions to well-packaged minimal vectors rendered them packaging-defective. These defects were rescued by extending gag sequences in their native context. To understand these packaging defects' causes, in vitro dimerization properties of RNAs containing minimal packaging elements were compared to RNAs with sequence extensions that were or were not compatible with packaging. In vitro dimerization was found to correlate with packaging phenotypes, suggesting that HIV-1 evolved to prevent 5' leader residues' base pairing with downstream residues and misfolding of the packaging signal. Our findings explain why gag sequences have been implicated in packaging and show that RRE's packaging contributions appear more specific than nuclear export alone. Paired with recent work showing that sequences upstream of Ψ can dictate RNA folds, the current work explains how genetic context of minimal packaging elements contributes to HIV-1 RNA fate determination. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. Small RNA sequencing and functional characterization reveals microRNA-143 tumor suppressor activity in liposarcoma

    PubMed Central

    Ugras, Stacy; Brill, Elliott; Jacobsen, Anders; Hafner, Markus; Socci, Nicholas D.; DeCarolis, Penelope L.; Khanin, Raya; O'Connor, Rachael; Mihailovic, Aleksandra; Taylor, Barry S.; Sheridan, Robert; Gimble, Jeffrey M.; Viale, Agnes; Crago, Aimee; Antonescu, Cristina R.; Sander, Chris; Tuschl, Thomas; Singer, Samuel

    2011-01-01

    Liposarcoma remains the most common mesenchymal cancer, with a mortality rate of 60% among patients with this disease. To address the present lack of therapeutic options, we embarked upon a study of microRNA (miRNA) expression alterations associated with liposarcomagenesis with the goal of exploiting differentially expressed miRNAs and the gene products they regulate as potential therapeutic targets. MicroRNA expression was profiled in samples of normal adipose tissue, well-differentiated liposarcoma, and dedifferentiated liposarcoma by both deep sequencing of small RNA libraries and hybridization-based Agilent microarrays. The expression profiles discriminated liposarcoma from normal adipose tissue and well-differentiated from dedifferentiated disease. We defined over 40 miRNAs that were dysregulated in dedifferentiated liposarcomas in both the sequencing and the microarray analysis. The upregulated miRNAs included two cancer-associated species (miR-21, miR-26a), and the downregulated miRNAs included two species that were highly abundant in adipose tissue (miR-143, miR-145). Restoring miR-143 expression in dedifferentiated liposarcoma cells inhibited proliferation, induced apoptosis, and decreased expression of BCL2, TOP2A, PRC1, and PLK1. The downregulation of PRC1 and its docking partner PLK1 suggests that miR-143 inhibits cytokinesis in these cells. In support of this idea, treatment with a PLK1 inhibitor potently induced G2/M growth arrest and apoptosis in liposarcoma cells. Taken together, our findings suggest that miR-143 re-expression vectors or selective agents directed at miR-143 or its targets may have therapeutic value in dedifferentiated liposarcoma. PMID:21693658

  9. Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench

    PubMed Central

    Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent

    2017-01-01

    Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus, an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information. A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. PMID:28289155

  10. The nucleotide sequence of a major glycine transfer RNA from the posterior silk gland of Bombyx mori L.

    PubMed Central

    Zúñiga, M C; Steitz, J A

    1977-01-01

    The nucleotide sequence of tRNA1Gly isolated from the posterior silk gland of Bombyx mori has been determined. This transfer RNA is present in high amounts in the posterior silk gland during the fifth larval instar. It has a GCC anticodon, capable of decoding a major glycine codon in the fibroin messenger RNA, GGU. Structural features of Bombyx tRNA1Gly and its homology to other eukaryotic glycine tRNAs are discussed. Images PMID:414206

  11. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

    PubMed

    Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

    2015-07-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. High protists diversity in the plankton of sulfurous lakes and lagoons examined by 18s rRNA gene sequence analyses.

    PubMed

    Triadó-Margarit, Xavier; Casamayor, Emilio O

    2015-12-01

    Diversity of small protists was studied in sulfidic and anoxic (euxinic) stratified karstic lakes and coastal lagoons by 18S rRNA gene analyses. We hypothesized a major sulfide effect, reducing protist diversity and richness with only a few specialized populations adapted to deal with low-redox conditions and high-sulfide concentrations. However, genetic fingerprinting suggested similar ecological diversity in anoxic and sulfurous than in upper oxygen rich water compartments with specific populations inhabiting euxinic waters. Many of them agreed with genera previously identified by microscopic observations, but also new and unexpected groups were detected. Most of the sequences matched a rich assemblage of Ciliophora (i.e., Coleps, Prorodon, Plagiopyla, Strombidium, Metopus, Vorticella and Caenomorpha, among others) and algae (mainly Cryptomonadales). Unidentified Cercozoa, Fungi, Stramenopiles and Discoba were recurrently found. The lack of GenBank counterparts was higher in deep hypolimnetic waters and appeared differentially allocated in the different taxa, being higher within Discoba and lower in Cryptophyceae. A larger number of populations than expected were specifically detected in the deep sulfurous waters, with unknown ecological interactions and metabolic capabilities. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  13. Sequence-Specific Affinity Chromatography of Bacterial Small Regulatory RNA-Binding Proteins from Bacterial Cells.

    PubMed

    Gans, Jonathan; Osborne, Jonathan; Cheng, Juliet; Djapgne, Louise; Oglesby-Sherrouse, Amanda G

    2018-01-01

    Bacterial small RNA molecules (sRNAs) are increasingly recognized as central regulators of bacterial stress responses and pathogenesis. In many cases, RNA-binding proteins are critical for the stability and function of sRNAs. Previous studies have adopted strategies to genetically tag an sRNA of interest, allowing isolation of RNA-protein complexes from cells. Here we present a sequence-specific affinity purification protocol that requires no prior genetic manipulation of bacterial cells, allowing isolation of RNA-binding proteins bound to native RNA molecules.

  14. Analysis and Prediction of Exon Skipping Events from RNA-Seq with Sequence Information Using Rotation Forest.

    PubMed

    Du, Xiuquan; Hu, Changlin; Yao, Yu; Sun, Shiwei; Zhang, Yanping

    2017-12-12

    In bioinformatics, exon skipping (ES) event prediction is an essential part of alternative splicing (AS) event analysis. Although many methods have been developed to predict ES events, a solution has yet to be found. In this study, given the limitations of machine learning algorithms with RNA-Seq data or genome sequences, a new feature, called RS (RNA-seq and sequence) features, was constructed. These features include RNA-Seq features derived from the RNA-Seq data and sequence features derived from genome sequences. We propose a novel Rotation Forest classifier to predict ES events with the RS features (RotaF-RSES). To validate the efficacy of RotaF-RSES, a dataset from two human tissues was used, and RotaF-RSES achieved an accuracy of 98.4%, a specificity of 99.2%, a sensitivity of 94.1%, and an area under the curve (AUC) of 98.6%. When compared to the other available methods, the results indicate that RotaF-RSES is efficient and can predict ES events with RS features.

  15. Effect of substrate RNA sequence on the cleavage reaction by a short ribozyme.

    PubMed Central

    Ohmichi, T; Okumoto, Y; Sugimoto, N

    1998-01-01

    Leadzyme is a ribozyme that requires Pb2+. The catalytic sequence, CUGGGAGUCC, binds to an RNA substrate, GGACC downward arrowGAGCCAG, cleaving the RNA substrate at one site. We have investigated the effect of the substrate sequence on the cleavage activity of leadzyme using mutant substrates in order to structurally understand the RNA catalysis. The results showed that leadzyme acted as a catalyst for single site cleavage of a C5 deletion mutant substrate, GGAC downward arrowGAGCCAG, as well as the wild-type substrate. However, a mutant substrate GGACCGACCAG, which had G8 deleted from the wild-type substrate, was not cleaved. Kinetic studies by surface plasmon resonance indicated that the difference between active and inactive structures reflected the slow association and dissociation rate constants of complex formation induced by Pb2+rather than differences in complex stability. CD spectra showed that the active form of the substrate-leadzyme complex was rearranged by Pb2+binding. The G8 of the wild-type substrate, which was absent in the inactive complex, is not near the cleavage site. Thus, these results show that the active substrate-leadzyme complex has a Pb2+binding site at the junction between the unpaired region (asymmetric internal loop) and the stem region, which is distal to the cleavage site. Pb2+may play a role in rearranging the bases in the asymmetric internal loop to the correct position for catalysis. PMID:9837996

  16. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus.

    PubMed

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-05-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic "virion factory," we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various time post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5' and 3' extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus "virophage." These results-validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent)--will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system.

  17. RNA sequencing reveals significant miRNAs in Atypical endometrial hyperplasia.

    PubMed

    Tang, Shiqian; Dai, Yinmei

    2018-06-01

    In this paper, we aimed to investigate the miRNAs that played a regulatory role in the development of atypical endometrial hyperplasia (AEH). RNA sequencing was performed for endometrial tissues from 3 AEH patients and 3 endometrial normal hyperplasia patients. RNA sequencing data were processed and differentially expressed (DE) miRNAs were identified between AEH and controls. The target genes for DE miRNAs were identified and mapped to the protein-protein interaction (PPI) network. The miRNA related functions were predicted and miRNA-disease gene network was constructed. Total 18 DE miRNAs were overlapped in three sample groups, among which hsa-miR-577, hsa-miR-182-5p and hsa-miR-183-5p were top three miRNAs that targeting largest number of genes. Function analysis showed that the 18 overlapped miRNAs mainly related with cancer and signaling transduction related pathways. PPI network showed that total 12 genes were among top 20 genes based on three network topological features including BCL2, UMPS, MAPK13, PRKCB, CREB1, IGF1, SP1, SMAD3, IGF1R, NOTCH2, WNT5A, TK2. Top 10 miRNAs in miRNA-disease gene network were identified such as hsa-miR-577 (degree = 17), hsa-miR-182-5p (degree = 16) and hsa-miR-3609 (degree = 13). hsa-miR-577 and hsa-miR-182-5p may play regulatory role in AEH through AMPK signal pathway and Wnt signaling pathway. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Detecting cooperative sequences in the binding of RNA Polymerase-II

    NASA Astrophysics Data System (ADS)

    Glass, Kimberly; Rozenberg, Julian; Girvan, Michelle; Losert, Wolfgang; Ott, Ed; Vinson, Charles

    2008-03-01

    Regulation of the expression level of genes is a key biological process controlled largely by the 1000 base pair (bp) sequence preceding each gene (the promoter region). Within that region transcription factor binding sites (TFBS), 5-10 bp long sequences, act individually or cooperate together in the recruitment of, and therefore subsequent gene transcription by, RNA Polymerase-II (RNAP). We have measured the binding of RNAP to promoters on a genome-wide basis using Chromatin Immunoprecipitation (ChIP-on-Chip) microarray assays. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters with high RNAP binding values. We are able to demonstrate that virtually all sequences enriched in such promoters contain a CpG dinucleotide, indicating that TFBS that contain the CpG dinucleotide are involved in RNAP binding to promoters. Further analysis shows that the presence of pairs of CpG containing sequences cooperate to enhance the binding of RNAP to the promoter.

  19. Hsp70 Is a Novel Posttranscriptional Regulator of Gene Expression That Binds and Stabilizes Selected mRNAs Containing AU-Rich Elements

    PubMed Central

    Kishor, Aparna; Tandukar, Bishal; Ly, Yann V.; Toth, Eric A.; Suarez, Yvelisse; Brewer, Gary

    2013-01-01

    The AU-rich elements (AREs) encoded within many mRNA 3′ untranslated regions (3′UTRs) are targets for factors that control transcript longevity and translational efficiency. Hsp70, best known as a protein chaperone with well-defined peptide-refolding properties, is known to interact with ARE-like RNA substrates in vitro. Here, we show that cofactor-free preparations of Hsp70 form direct, high-affinity complexes with ARE substrates based on specific recognition of U-rich sequences by both the ATP- and peptide-binding domains. Suppressing Hsp70 in HeLa cells destabilized an ARE reporter mRNA, indicating a novel ARE-directed mRNA-stabilizing role for this protein. Hsp70 also bound and stabilized endogenous ARE-containing mRNAs encoding vascular endothelial growth factor (VEGF) and Cox-2, which involved a mechanism that was unaffected by an inhibitor of its protein chaperone function. Hsp70 recognition and stabilization of VEGF mRNA was mediated by an ARE-like sequence in the proximal 3′UTR. Finally, stabilization of VEGF mRNA coincided with the accumulation of Hsp70 protein in HL60 promyelocytic leukemia cells recovering from acute thermal stress. We propose that the binding and stabilization of selected ARE-containing mRNAs may contribute to the cytoprotective effects of Hsp70 following cellular stress but may also provide a novel mechanism linking constitutively elevated Hsp70 expression to the development of aggressive neoplastic phenotypes. PMID:23109422

  20. Linking maternal and somatic 5S rRNA types with different sequence-specific non-LTR retrotransposons

    PubMed Central

    Pagano, Johanna F.B.; Ensink, Wim A.; van Olst, Marina; van Leeuwen, Selina; Nehrdich, Ulrike; Zhu, Kongju; Spaink, Herman P.; Girard, Geneviève; Rauwerda, Han; Jonker, Martijs J.; Dekker, Rob J.

    2017-01-01

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo, and adult tissue identified maternal-type 5S rRNA that is exclusively accumulated during oogenesis, replaced throughout the embryogenesis by a somatic-type, and thus virtually absent in adult somatic tissue. The maternal-type 5S rDNA contains several thousands of gene copies on chromosome 4 in tandem repeats with small intergenic regions, whereas the somatic-type is present in only 12 gene copies on chromosome 18 with large intergenic regions. The nine-nucleotide variation between the two 5S rRNA types likely affects TFIII binding and riboprotein L5 binding, probably leading to storage of maternal-type rRNA. Remarkably, these sequence differences are located exactly at the sequence-specific target site for genome integration by the 5S rRNA-specific Mutsu retrotransposon family. Thus, we could define maternal- and somatic-type MutsuDr subfamilies. Furthermore, we identified four additional maternal-type and two new somatic-type MutsuDr subfamilies, each with their own target sequence. This target-site specificity, frequently intact maternal-type retrotransposon elements, plus specific presence of Mutsu retrotransposon RNA and piRNA in egg and adult tissue, suggest an involvement of retrotransposons in achieving the differential copy number of the two types of 5S rDNA loci. PMID:28003516

  1. Linking maternal and somatic 5S rRNA types with different sequence-specific non-LTR retrotransposons.

    PubMed

    Locati, Mauro D; Pagano, Johanna F B; Ensink, Wim A; van Olst, Marina; van Leeuwen, Selina; Nehrdich, Ulrike; Zhu, Kongju; Spaink, Herman P; Girard, Geneviève; Rauwerda, Han; Jonker, Martijs J; Dekker, Rob J; Breit, Timo M

    2017-04-01

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo, and adult tissue identified maternal-type 5S rRNA that is exclusively accumulated during oogenesis, replaced throughout the embryogenesis by a somatic-type, and thus virtually absent in adult somatic tissue. The maternal-type 5S rDNA contains several thousands of gene copies on chromosome 4 in tandem repeats with small intergenic regions, whereas the somatic-type is present in only 12 gene copies on chromosome 18 with large intergenic regions. The nine-nucleotide variation between the two 5S rRNA types likely affects TFIII binding and riboprotein L5 binding, probably leading to storage of maternal-type rRNA. Remarkably, these sequence differences are located exactly at the sequence-specific target site for genome integration by the 5S rRNA-specific Mutsu retrotransposon family. Thus, we could define maternal- and somatic-type MutsuDr subfamilies. Furthermore, we identified four additional maternal-type and two new somatic-type MutsuDr subfamilies, each with their own target sequence. This target-site specificity, frequently intact maternal-type retrotransposon elements, plus specific presence of Mutsu retrotransposon RNA and piRNA in egg and adult tissue, suggest an involvement of retrotransposons in achieving the differential copy number of the two types of 5S rDNA loci. © 2017 Locati et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  2. Identification of Bacillus Probiotics Isolated from Soil Rhizosphere Using 16S rRNA, recA, rpoB Gene Sequencing and RAPD-PCR.

    PubMed

    Mohkam, Milad; Nezafat, Navid; Berenjian, Aydin; Mobasher, Mohammad Ali; Ghasemi, Younes

    2016-03-01

    Some Bacillus species, especially Bacillus subtilis and Bacillus pumilus groups, have highly similar 16S rRNA gene sequences, which are hard to identify based on 16S rDNA sequence analysis. To conquer this drawback, rpoB, recA sequence analysis along with randomly amplified polymorphic (RAPD) fingerprinting was examined as an alternative method for differentiating Bacillus species. The 16S rRNA, rpoB and recA genes were amplified via a polymerase chain reaction using their specific primers. The resulted PCR amplicons were sequenced, and phylogenetic analysis was employed by MEGA 6 software. Identification based on 16S rRNA gene sequencing was underpinned by rpoB and recA gene sequencing as well as RAPD-PCR technique. Subsequently, concatenation and phylogenetic analysis showed that extent of diversity and similarity were better obtained by rpoB and recA primers, which are also reinforced by RAPD-PCR methods. However, in one case, these approaches failed to identify one isolate, which in combination with the phenotypical method offsets this issue. Overall, RAPD fingerprinting, rpoB and recA along with concatenated genes sequence analysis discriminated closely related Bacillus species, which highlights the significance of the multigenic method in more precisely distinguishing Bacillus strains. This research emphasizes the benefit of RAPD fingerprinting, rpoB and recA sequence analysis superior to 16S rRNA gene sequence analysis for suitable and effective identification of Bacillus species as recommended for probiotic products.

  3. Plant tRNA ligases are multifunctional enzymes that have diverged in sequence and substrate specificity from RNA ligases of other phylogenetic origins

    PubMed Central

    Englert, Markus; Beier, Hildburg

    2005-01-01

    Pre-tRNA splicing is an essential process in all eukaryotes. It requires the concerted action of an endonuclease to remove the intron and a ligase for joining the resulting tRNA halves as studied best in the yeast Saccharomyces cerevisiae. Here, we report the first characterization of an RNA ligase protein and its gene from a higher eukaryotic organism that is an essential component of the pre-tRNA splicing process. Purification of tRNA ligase from wheat germ by successive column chromatographic steps has identified a protein of 125 kDa by its potentiality to covalently bind AMP, and by its ability to catalyse the ligation of tRNA halves and the circularization of linear introns. Peptide sequences obtained from the purified protein led to the elucidation of the corresponding proteins and their genes in Arabidopsis and Oryza databases. The plant tRNA ligases exhibit no overall sequence homologies to any known RNA ligases, however, they harbour a number of conserved motifs that indicate the presence of three intrinsic enzyme activities: an adenylyltransferase/ligase domain in the N-terminal region, a polynucleotide kinase in the centre and a cyclic phosphodiesterase domain at the C-terminal end. In vitro expression of the recombinant Arabidopsis tRNA ligase and functional analyses revealed all expected individual activities. Plant RNA ligases are active on a variety of substrates in vitro and are capable of inter- and intramolecular RNA joining. Hence, we conclude that their role in vivo might comprise yet unknown essential functions besides their involvement in pre-tRNA splicing. PMID:15653639

  4. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes

    PubMed Central

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-01-01

    ABSTRACT Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233

  5. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes.

    PubMed

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-01-01

    Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation.

  6. RNA regulators responding to ribosomal protein S15 are frequent in sequence space

    PubMed Central

    Slinger, Betty L.; Meyer, Michelle M.

    2016-01-01

    There are several natural examples of distinct RNA structures that interact with the same ligand to regulate the expression of homologous genes in different organisms. One essential question regarding this phenomenon is whether such RNA regulators are the result of convergent or divergent evolution. Are the RNAs derived from some common ancestor and diverged to the point where we cannot identify the similarity, or have multiple solutions to the same biological problem arisen independently? A key variable in assessing these alternatives is how frequently such regulators arise within sequence space. Ribosomal protein S15 is autogenously regulated via an RNA regulator in many bacterial species; four apparently distinct regulators have been functionally validated in different bacterial phyla. Here, we explore how frequently such regulators arise within a partially randomized sequence population. We find many RNAs that interact specifically with ribosomal protein S15 from Geobacillus kaustophilus with biologically relevant dissociation constants. Furthermore, of the six sequences we characterize, four show regulatory activity in an Escherichia coli reporter assay. Subsequent footprinting and mutagenesis analysis indicates that protein binding proximal to regulatory features such as the Shine–Dalgarno sequence is sufficient to enable regulation, suggesting that regulation in response to S15 is relatively easily acquired. PMID:27580716

  7. Identification and analysis of pig chimeric mRNAs using RNA sequencing data

    PubMed Central

    2012-01-01

    Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain unclear. Here we identified some chimeric mRNAs in pigs and analyzed the expression of them across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. The 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction was overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events were considered middle variances in the expression across individuals and breeds, and revealed non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. The 81 of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We proposed a model that trans-acting factors, such as CTCF, induced the spatial organisation of parental genes to the same transcriptional factory so that parental genes were coordinatively transcribed to give birth to chimeric mRNAs. PMID:22925561

  8. Evaluation of sequence alignments and oligonucleotide probes with respect to three-dimensional structure of ribosomal RNA using ARB software package

    PubMed Central

    Kumar, Yadhu; Westram, Ralf; Kipfer, Peter; Meier, Harald; Ludwig, Wolfgang

    2006-01-01

    Background Availability of high-resolution RNA crystal structures for the 30S and 50S ribosomal subunits and the subsequent validation of comparative secondary structure models have prompted the biologists to use three-dimensional structure of ribosomal RNA (rRNA) for evaluating sequence alignments of rRNA genes. Furthermore, the secondary and tertiary structural features of rRNA are highly useful and successfully employed in designing rRNA targeted oligonucleotide probes intended for in situ hybridization experiments. RNA3D, a program to combine sequence alignment information with three-dimensional structure of rRNA was developed. Integration into ARB software package, which is used extensively by the scientific community for phylogenetic analysis and molecular probe designing, has substantially extended the functionality of ARB software suite with 3D environment. Results Three-dimensional structure of rRNA is visualized in OpenGL 3D environment with the abilities to change the display and overlay information onto the molecule, dynamically. Phylogenetic information derived from the multiple sequence alignments can be overlaid onto the molecule structure in a real time. Superimposition of both statistical and non-statistical sequence associated information onto the rRNA 3D structure can be done using customizable color scheme, which is also applied to a textual sequence alignment for reference. Oligonucleotide probes designed by ARB probe design tools can be mapped onto the 3D structure along with the probe accessibility models for evaluation with respect to secondary and tertiary structural conformations of rRNA. Conclusion Visualization of three-dimensional structure of rRNA in an intuitive display provides the biologists with the greater possibilities to carry out structure based phylogenetic analysis. Coupled with secondary structure models of rRNA, RNA3D program aids in validating the sequence alignments of rRNA genes and evaluating probe target sites

  9. Screening for plant viruses by next generation sequencing using a modified double strand RNA extraction protocol with an internal amplification control.

    PubMed

    Kesanakurti, Prasad; Belton, Mark; Saeed, Hanaa; Rast, Heidi; Boyes, Ian; Rott, Michael

    2016-10-01

    The majority of plant viruses contain RNA genomes. Detection of viral RNA genomes in infected plant material by next generation sequencing (NGS) is possible through the extraction and sequencing of total RNA, total RNA devoid of ribosomal RNA, small RNA interference (RNAi) molecules, or double stranded RNA (dsRNA). Plants do not typically produce high molecular weight dsRNA, therefore the presence of dsRNA makes it an attractive target for plant virus diagnostics. The sensitivity of NGS as a diagnostic method demands an effective dsRNA protocol that is both representative of the sample and minimizes sample cross contamination. We have developed a modified dsRNA extraction protocol that is more efficient compared to traditional protocols, requiring reduced amounts of starting material, that is less prone to sample cross contamination. This was accomplished by using bead based homogenization of plant material in closed, disposable 50ml tubes. To assess the quality of extraction, we also developed an internal control by designing a real-time (quantitative) PCR (qPCR) assay that targets endornaviruses present in Phaseolus vulgaris cultivar Black Turtle Soup (BTS). Crown Copyright © 2016. Published by Elsevier B.V. All rights reserved.

  10. The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

    PubMed

    Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

    2015-09-18

    La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. [Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    NASA Technical Reports Server (NTRS)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  12. Fingerprints of Modified RNA Bases from Deep Sequencing Profiles.

    PubMed

    Kietrys, Anna M; Velema, Willem A; Kool, Eric T

    2017-11-29

    Posttranscriptional modifications of RNA bases are not only found in many noncoding RNAs but have also recently been identified in coding (messenger) RNAs as well. They require complex and laborious methods to locate, and many still lack methods for localized detection. Here we test the ability of next-generation sequencing (NGS) to detect and distinguish between ten modified bases in synthetic RNAs. We compare ultradeep sequencing patterns of modified bases, including miscoding, insertions and deletions (indels), and truncations, to unmodified bases in the same contexts. The data show widely varied responses to modification, ranging from no response, to high levels of mutations, insertions, deletions, and truncations. The patterns are distinct for several of the modifications, and suggest the future use of ultradeep sequencing as a fingerprinting strategy for locating and identifying modifications in cellular RNAs.

  13. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    PubMed

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein RNA-binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNACompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/ bab@mit.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by

  14. The nucleotide sequence of RNA1 of Lettuce big-vein virus, genus Varicosavirus, reveals its relation to nonsegmented negative-strand RNA viruses.

    PubMed

    Sasaya, Takahide; Ishikawa, Koichi; Koganezawa, Hiroki

    2002-06-05

    The complete nucleotide sequence of RNA1 from Lettuce big-vein virus (LBVV), the type member of the genus Varicosavirus, was determined. LBVV RNA1 consists of 6797 nucleotides and contains one large ORF that encodes a large (L) protein of 2040 amino acids with a predicted M(r) of 232,092. Northern blot hybridization analysis indicated that the LBVV RNA1 is a negative-sense RNA. Database searches showed that the amino acid sequence of L protein is homologous to those of L polymerases of nonsegmented negative-strand RNA viruses. A cluster dendrogram derived from alignments of the LBVV L protein and the L polymerases indicated that the L protein is most closely related to the L polymerases of plant rhabdoviruses. Transcription termination/polyadenylation signal-like poly(U) tracts that resemble those in rhabdovirus and paramyxovirus RNAs were present upstream and downstream of the coding region. Although LBVV is related to rhabdoviruses, a key distinguishing feature is that the genome of LBVV is segmented. The results reemphasize the need to reconsider the taxonomic position of varicosaviruses.

  15. Extraordinary GU-rich single-strand RNA identified from SARS coronavirus contributes an excessive innate immune response.

    PubMed

    Li, Yan; Chen, Ming; Cao, Hongwei; Zhu, Yuanfeng; Zheng, Jiang; Zhou, Hong

    2013-02-01

    A dangerous cytokine storm occurs in the SARS involving in immune disorder, but many aspects of the pathogenetic mechanism remain obscure since its outbreak. To deeply reveal the interaction of host and SARS-CoV, based on the basic structural feature of pathogen-associated molecular pattern, we created a new bioinformatics method for searching potential pathogenic molecules and identified a set of SARS-CoV specific GU-rich ssRNA fragments with a high-density distribution in the genome. In vitro experiments, the result showed the representative SARS-CoV ssRNAs had powerful immunostimulatory activities to induce considerable level of pro-inflammatory cytokine TNF-a, IL-6 and IL-12 release via the TLR7 and TLR8, almost 2-fold higher than the strong stimulatory ssRNA40 that was found previously from other virus. Moreover, SARS-CoV ssRNA was able to cause acute lung injury in mice with a high mortality rate in vivo experiment. It suggests that SARS-CoV specific GU-rich ssRNA plays a very important role in the cytokine storm associated with a dysregulation of the innate immunity. This study not only presents new evidence about the immunopathologic damage caused by overactive inflammation during the SARS-CoV infection, but also provides a useful clue for a new therapeutic strategy. Copyright © 2012 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  16. Automated Identification of Medically Important Bacteria by 16S rRNA Gene Sequencing Using a Novel Comprehensive Database, 16SpathDB▿

    PubMed Central

    Woo, Patrick C. Y.; Teng, Jade L. L.; Yeung, Juilian M. Y.; Tse, Herman; Lau, Susanna K. P.; Yuen, Kwok-Yung

    2011-01-01

    Despite the increasing use of 16S rRNA gene sequencing, interpretation of 16S rRNA gene sequence results is one of the most difficult problems faced by clinical microbiologists and technicians. To overcome the problems we encountered in the existing databases during 16S rRNA gene sequence interpretation, we built a comprehensive database, 16SpathDB (http://147.8.74.24/16SpathDB) based on the 16S rRNA gene sequences of all medically important bacteria listed in the Manual of Clinical Microbiology and evaluated its use for automated identification of these bacteria. Among 91 nonduplicated bacterial isolates collected in our clinical microbiology laboratory, 71 (78%) were reported by 16SpathDB as a single bacterial species having >98.0% nucleotide identity with the query sequence, 19 (20.9%) were reported as more than one bacterial species having >98.0% nucleotide identity with the query sequence, and 1 (1.1%) was reported as no match. For the 71 bacterial isolates reported as a single bacterial species, all results were identical to their true identities as determined by a polyphasic approach. For the 19 bacterial isolates reported as more than one bacterial species, all results contained their true identities as determined by a polyphasic approach and all of them had their true identities as the “best match in 16SpathDB.” For the isolate (Gordonibacter pamelaeae) reported as no match, the bacterium has never been reported to be associated with human disease and was not included in the Manual of Clinical Microbiology. 16SpathDB is an automated, user-friendly, efficient, accurate, and regularly updated database for 16S rRNA gene sequence interpretation in clinical microbiology laboratories. PMID:21389154

  17. Pseudomonas sp. strain CA5 (a selenite-reducing bacterium) 16S rRNA gene complete sequence. National Institute of Health, National Center for Biotechnology Information, GenBank sequence. Accession FJ422810.1.

    USDA-ARS?s Scientific Manuscript database

    This study used 1321 base pair 16S rRNA gene sequence methods to confirm the phylogenetic position of a soil isolate as a bacterium belonging to the genus Pesudomonas sp. Morphological, biochemical characteristics, and fatty acid profiles are consistent with the 16S rRNA gene sequence identification...

  18. The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

    DOE PAGES

    Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

    2015-07-22

    La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. Amore » putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.« less

  19. Three Cases of Anaerobiospirillum succiniciproducens Bacteremia Confirmed by 16S rRNA Gene Sequencing

    PubMed Central

    Tee, Wee; Korman, Tony M.; Waters, Mary Jo; Macphee, Andrew; Jenney, Adam; Joyce, Linda; Dyall-Smith, Michael L.

    1998-01-01

    We describe three cases of Anaerobiospirillum succiniciproducens bacteremia from Australia. We believe one of these cases represents the first report of A. succiniciproducens bacteremia in a human immunodeficiency virus (HIV)-infected individual. The other two patients had an underlying disorder (one patient had bleeding esophageal varices complicating alcohol liver disease and one patient had non-Hodgkin’s lymphoma). A motile, gram-negative, spiral anaerobe was isolated by culturing blood from all patients. Electron microscopy showed a curved bacterium with bipolar tufts of flagella resembling Anaerobiospirillum spp. Sequencing of the 16S rRNA genes of the isolates revealed no close relatives (organisms likely to be in the same genus) in the sequence databases, nor were any sequence data available for A. succiniciproducens. This report presents for the first time the 16S rRNA gene sequence of the type strain of A. succiniciproducens, strain ATCC 29305. Two of the three clinical isolates have sequences identical to that of the type strain, while the sequence of the other strain differs from that of the type strain at 4 nucleotides. PMID:9574678

  20. Next-generation small RNA sequencing for microRNAs profiling in the honey bee Apis mellifera.

    PubMed

    Chen, X; Yu, X; Cai, Y; Zheng, H; Yu, D; Liu, G; Zhou, Q; Hu, S; Hu, F

    2010-12-01

    MicroRNAs (miRNAs) are key regulators in various physiological and pathological processes via post-transcriptional regulation of gene expression. The honey bee (Apis mellifera) is a key model for highly social species, and its complex social behaviour can be interpreted theoretically as changes in gene regulation, in which miRNAs are thought to be involved. We used the SOLiD sequencing system to identify the repertoire of miRNAs in the honey bee by sequencing a mixed small RNA library from different developmental stages. We obtained a total of 36,796,459 raw sequences; of which 5,491,100 short sequences were fragments of mRNA and other noncoding RNAs (ncRNA), and 1,759,346 reads mapped to the known miRNAs. We predicted 267 novel honey bee miRNAs representing 380,182 short reads, including eight miRNAs of other insects in 14,107,583 genome-mapped sequences. We verified 50 of them using stem-loop reverse-transcription PCR (RT-PCR), in which 35 yielded PCR products. Cross-species analyses showed 81 novel miRNAs with homologues in other insects, suggesting that they were authentic miRNAs and have similar functions. The results of this study provide a basis for studies of the miRNA-modulating networks in development and some intriguing phenomena such as caste differentiation in A. mellifera. © 2010 The Authors. Insect Molecular Biology © 2010 The Royal Entomological Society.

  1. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data

    PubMed Central

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R.; Krogh, Anders; Vinther, Jeppe

    2015-01-01

    Selective 2′ Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA–RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing. PMID:25805860

  2. DrImpute: imputing dropout events in single cell RNA sequencing data.

    PubMed

    Gong, Wuming; Kwak, Il-Youp; Pota, Pruthvi; Koyano-Nakagawa, Naoko; Garry, Daniel J

    2018-06-08

    The single cell RNA sequencing (scRNA-seq) technique begin a new era by allowing the observation of gene expression at the single cell level. However, there is also a large amount of technical and biological noise. Because of the low number of RNA transcriptomes and the stochastic nature of the gene expression pattern, there is a high chance of missing nonzero entries as zero, which are called dropout events. We develop DrImpute to impute dropout events in scRNA-seq data. We show that DrImpute has significantly better performance on the separation of the dropout zeros from true zeros than existing imputation algorithms. We also demonstrate that DrImpute can significantly improve the performance of existing tools for clustering, visualization and lineage reconstruction of nine published scRNA-seq datasets. DrImpute can serve as a very useful addition to the currently existing statistical tools for single cell RNA-seq analysis. DrImpute is implemented in R and is available at https://github.com/gongx030/DrImpute .

  3. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus

    PubMed Central

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-01-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic “virion factory,” we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various time post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5′ and 3′ extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus “virophage.” These results—validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent)—will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system. PMID:20360389

  4. High-Throughput Sequencing of RNA Silencing-Associated Small RNAs in Olive (Olea europaea L.)

    PubMed Central

    Donaire, Livia; Pedrola, Laia; de la Rosa, Raúl; Llave, César

    2011-01-01

    Small RNAs (sRNAs) of 20 to 25 nucleotides (nt) in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Despite RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.). sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA) regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive. PMID:22140484

  5. Uncovering Small RNA-Mediated Responses to Cold Stress in a Wheat Thermosensitive Genic Male-Sterile Line by Deep Sequencing1[W][OA

    PubMed Central

    Tang, Zhonghui; Zhang, Liping; Xu, Chenguang; Yuan, Shaohua; Zhang, Fengting; Zheng, Yonglian; Zhao, Changping

    2012-01-01

    The male sterility of thermosensitive genic male sterile (TGMS) lines of wheat (Triticum aestivum) is strictly controlled by temperature. The early phase of anther development is especially susceptible to cold stress. MicroRNAs (miRNAs) play an important role in plant development and in responses to environmental stress. In this study, deep sequencing of small RNA (smRNA) libraries obtained from spike tissues of the TGMS line under cold and control conditions identified a total of 78 unique miRNA sequences from 30 families and trans-acting small interfering RNAs (tasiRNAs) derived from two TAS3 genes. To identify smRNA targets in the wheat TGMS line, we applied the degradome sequencing method, which globally and directly identifies the remnants of smRNA-directed target cleavage. We identified 26 targets of 16 miRNA families and three targets of tasiRNAs. Comparing smRNA sequencing data sets and TaqMan quantitative polymerase chain reaction results, we identified six miRNAs and one tasiRNA (tasiRNA-ARF [for Auxin-Responsive Factor]) as cold stress-responsive smRNAs in spike tissues of the TGMS line. We also determined the expression profiles of target genes that encode transcription factors in response to cold stress. Interestingly, the expression of cold stress-responsive smRNAs integrated in the auxin-signaling pathway and their target genes was largely noncorrelated. We investigated the tissue-specific expression of smRNAs using a tissue microarray approach. Our data indicated that miR167 and tasiRNA-ARF play roles in regulating the auxin-signaling pathway and possibly in the developmental response to cold stress. These data provide evidence that smRNA regulatory pathways are linked with male sterility in the TGMS line during cold stress. PMID:22508932

  6. Induction of cysteine-rich motor neuron 1 mRNA expression in vascular endothelial cells.

    PubMed

    Nakashima, Yukiko; Takahashi, Satoru

    2014-08-22

    Cysteine-rich motor neuron 1 (CRIM1) is expressed in vascular endothelial cells and plays a crucial role in angiogenesis. In this study, we investigated the expression of CRIM1 mRNA in human umbilical vein endothelial cells (HUVECs). CRIM1 mRNA levels were not altered in vascular endothelial growth factor (VEGF)-stimulated monolayer HUVECs or in cells in collagen gels without VEGF. In contrast, the expression of CRIM1 mRNA was elevated in VEGF-stimulated cells in collagen gels. The increase in CRIM1 mRNA expression was observed even at 2h when HUVECs did not form tubular structures in collagen gels. Extracellular signal-regulated kinase (Erk) 1/2, Akt and focal adhesion kinase (FAK) were activated by VEGF in HUVECs. The VEGF-induced expression of CRIM1 mRNA was significantly abrogated by PD98059 or PF562271, but was not affected by LY294002. These results demonstrate that CRIM1 is an early response gene in the presence of both angiogenic stimulation (VEGF) and environmental (extracellular matrix) factors, and Erk and FAK might be involved in the upregulation of CRIM1 mRNA expression in vascular endothelial cells. Copyright © 2014 Elsevier Inc. All rights reserved.

  7. Ebola virus RNA editing depends on the primary editing site sequence and an upstream secondary structure.

    PubMed

    Mehedi, Masfique; Hoenen, Thomas; Robertson, Shelly; Ricklefs, Stacy; Dolan, Michael A; Taylor, Travis; Falzarano, Darryl; Ebihara, Hideki; Porcella, Stephen F; Feldmann, Heinz

    2013-01-01

    Ebolavirus (EBOV), the causative agent of a severe hemorrhagic fever and a biosafety level 4 pathogen, increases its genome coding capacity by producing multiple transcripts encoding for structural and nonstructural glycoproteins from a single gene. This is achieved through RNA editing, during which non-template adenosine residues are incorporated into the EBOV mRNAs at an editing site encoding for 7 adenosine residues. However, the mechanism of EBOV RNA editing is currently not understood. In this study, we report for the first time that minigenomes containing the glycoprotein gene editing site can undergo RNA editing, thereby eliminating the requirement for a biosafety level 4 laboratory to study EBOV RNA editing. Using a newly developed dual-reporter minigenome, we have characterized the mechanism of EBOV RNA editing, and have identified cis-acting sequences that are required for editing, located between 9 nt upstream and 9 nt downstream of the editing site. Moreover, we show that a secondary structure in the upstream cis-acting sequence plays an important role in RNA editing. EBOV RNA editing is glycoprotein gene-specific, as a stretch encoding for 7 adenosine residues located in the viral polymerase gene did not serve as an editing site, most likely due to an absence of the necessary cis-acting sequences. Finally, the EBOV protein VP30 was identified as a trans-acting factor for RNA editing, constituting a novel function for this protein. Overall, our results provide novel insights into the RNA editing mechanism of EBOV, further understanding of which might result in novel intervention strategies against this viral pathogen.

  8. High-throughput sequencing reveals circular substrates for an archaeal RNA ligase

    PubMed Central

    Becker, Hubert F.; Héliou, Alice; Djaout, Kamel; Lestini, Roxane; Regnier, Mireille; Myllykallio, Hannu

    2017-01-01

    ABSTRACT It is only recently that the abundant presence of circular RNAs (circRNAs) in all kingdoms of Life, including the hyperthermophilic archaeon Pyrococcus abyssi, has emerged. This led us to investigate the physiologic significance of a previously observed weak intramolecular ligation activity of Pab1020 RNA ligase. Here we demonstrate that this enzyme, despite sharing significant sequence similarity with DNA ligases, is indeed an RNA-specific polynucleotide ligase efficiently acting on physiologically significant substrates. Using a combination of RNA immunoprecipitation assays and RNA-seq, our genome-wide studies revealed 133 individual circRNA loci in P. abyssi. The large majority of these loci interacted with Pab1020 in cells and circularization of selected C/D Box and 5S rRNA transcripts was confirmed biochemically. Altogether these studies revealed that Pab1020 is required for RNA circularization. Our results further suggest the functional speciation of an ancestral NTase domain and/or DNA ligase toward RNA ligase activity and prompt for further characterization of the widespread functions of circular RNAs in prokaryotes. Detailed insight into the cellular substrates of Pab1020 may facilitate the development of new biotechnological applications e.g. in ligation of preadenylated adaptors to RNA molecules. PMID:28277897

  9. Conserved RNA binding activity of a Yin-Yang 1 homologue in the ova of the purple sea urchin Strongylocentrotus purpuratus.

    PubMed

    Belak, Zachery R; Ovsenek, Nicholas; Eskiw, Christopher H

    2018-05-23

    Yin-Yang 1 (YY1) is a highly conserved transcription factor possessing RNA-binding activity. A putative YY1 homologue was previously identified in the developmental model organism Strongylocentrotus purpuratus (the purple sea urchin) by genomic sequencing. We identified a high degree of sequence similarity with YY1 homologues of vertebrate origin which shared 100% protein sequence identity over the DNA- and RNA-binding zinc-finger region with high similarity in the N-terminal transcriptional activation domain. SpYY1 demonstrated identical DNA- and RNA-binding characteristics between Xenopus laevis and S. purpuratus indicating that it maintains similar functional and biochemical properties across widely divergent deuterostome species. SpYY1 binds to the consensus YY1 DNA element, and also to U-rich RNA sequences. Although we detected SpYY1 RNA-binding activity in ova lysates and observed cytoplasmic localization, SpYY1 was not associated with maternal mRNA in ova. SpYY1 expressed in Xenopus oocytes was excluded from the nucleus and associated with maternally expressed cytoplasmic mRNA molecules. These data demonstrate the existence of an YY1 homologue in S. purpuratus with similar structural and biochemical features to those of the well-studied vertebrate YY1; however, the data reveal major differences in the biological role of YY1 in the regulation of maternally expressed mRNA in the two species.

  10. Evaluating the Detection of Hydrocarbon-Degrading Bacteria in 16S rRNA Gene Sequencing Surveys

    PubMed Central

    Berry, David; Gutierrez, Tony

    2017-01-01

    Hydrocarbonoclastic bacteria (HCB) play a key role in the biodegradation of oil hydrocarbons in marine and other environments. A small number of taxa have been identified as obligate HCB, notably the Gammaproteobacterial genera Alcanivorax, Cycloclasticus, Marinobacter, Neptumonas, Oleiphilus, Oleispira, and Thalassolituus, as well as the Alphaproteobacterial genus Thalassospira. Detection of HCB in amplicon-based sequencing surveys relies on high coverage by PCR primers and accurate taxonomic classification. In this study, we performed a phylogenetic analysis to identify 16S rRNA gene sequence regions that represent the breadth of sequence diversity within these taxa. Using validated sequences, we evaluated 449 universal 16S rRNA gene-targeted bacterial PCR primer pairs for their coverage of these taxa. The results of this analysis provide a practical framework for selection of suitable primer sets for optimal detection of HCB in sequencing surveys. PMID:28567035

  11. Evaluating the Detection of Hydrocarbon-Degrading Bacteria in 16S rRNA Gene Sequencing Surveys.

    PubMed

    Berry, David; Gutierrez, Tony

    2017-01-01

    Hydrocarbonoclastic bacteria (HCB) play a key role in the biodegradation of oil hydrocarbons in marine and other environments. A small number of taxa have been identified as obligate HCB, notably the Gammaproteobacterial genera Alcanivorax, Cycloclasticus, Marinobacter, Neptumonas, Oleiphilus, Oleispira , and Thalassolituus , as well as the Alphaproteobacterial genus Thalassospira . Detection of HCB in amplicon-based sequencing surveys relies on high coverage by PCR primers and accurate taxonomic classification. In this study, we performed a phylogenetic analysis to identify 16S rRNA gene sequence regions that represent the breadth of sequence diversity within these taxa. Using validated sequences, we evaluated 449 universal 16S rRNA gene-targeted bacterial PCR primer pairs for their coverage of these taxa. The results of this analysis provide a practical framework for selection of suitable primer sets for optimal detection of HCB in sequencing surveys.

  12. Genome wide assessment of mRNA in astrocyte protrusions by direct RNA sequencing reveals mRNA localization for the intermediate filament protein nestin.

    PubMed

    Thomsen, Rune; Pallesen, Jonatan; Daugaard, Tina F; Børglum, Anders D; Nielsen, Anders L

    2013-11-01

    Subcellular RNA localization plays an important role in development, cell differentiation, and cell migration. For a comprehensive description of the population of protrusion localized mRNAs in astrocytes we separated protrusions from cell bodies in a Boyden chamber and performed high-throughput direct RNA sequencing. The mRNAs with localization in astrocyte protrusions encode proteins belonging to a variety of functional groups indicating involvement of RNA localization for a palette of cellular functions. The mRNA encoding the intermediate filament protein Nestin was among the identified mRNAs. By RT-qPCR and RNA FISH analysis we confirmed Nestin mRNA localization in cell protrusions and also protrusion localization of Nestin protein. Nestin mRNA localization was dependent of Fragile X mental retardation syndrome proteins Fmrp and Fxr1, and the Nestin 3'-UTR was sufficient to mediate protrusion mRNA localization. The mRNAs for two other intermediate filament proteins in astrocytes, Gfap and Vimentin, have moderate and no protrusion localization, respectively, showing that individual intermediate filament components have different localization mechanisms. The correlated localization of Nestin mRNA with Nestin protein in cell protrusions indicates the presence of a regulatory mechanism at the mRNA localization level for the Nestin intermediate filament protein with potential importance for astrocyte functions during brain development and maintenance. Copyright © 2013 Wiley Periodicals, Inc.

  13. The Landscape of Circular RNA Expression Profiles in Papillary Thyroid Carcinoma Based on RNA Sequencing.

    PubMed

    Lan, Xiabin; Xu, Jiajie; Chen, Chao; Zheng, Chuanming; Wang, Jiafeng; Cao, Jun; Zhu, Xuhang; Ge, Minghua

    2018-05-25

    Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. However, the molecular mechanisms responsible for its tumorigenesis and progression remain largely unknown. Circular RNA (circRNA) is a novel type of noncoding RNA that can serve as an ideal biomarker due to its stability. Recent evidence suggests that circRNAs play important roles in tumorigenesis. This study aims to investigate circRNA expression profiles and their potential biological functions in PTC. High-throughput RNA sequencing was used to assess circRNA expression profiles in PTC, and quantitative real-time polymerase chain reaction (qRT-PCR) was used to validate dysregulated circRNAs. Receiver operating characteristic (ROC) curves were generated to evaluate the diagnostic value of circRNAs for PTC. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were employed to determine the biological functions of differentially expressed circRNAs. Bioinformatic analyses were applied to predict interactions between circRNAs and microRNAs (miRNAs), and a circRNA-miRNA-mRNA network was constructed using Cytoscape software. We identified a number of differentially expressed circRNAs in PTC tissues compared with paired normal thyroid tissues, with chr5: 160757890-160763776-, chr12: 40696591-40697936+, chr7: 22330794-22357656-, and chr21: 16386665-16415895- being upregulated, and chr7: 91924203-91957214+, chr2: 179514891-179516047-, chr9: 16435553-16437522-, and chr22: 36006931-36007153- being downregulated. These findings were confirmed by qRT-PCR, and ROC curves indicated that they can serve as potential biomarkers for PTC. GO and KEGG pathway analyses showed that some of these circRNAs are related to cancers. Additionally, bioinformatic analyses revealed a potential competing-endogenous-RNA-regulating network among circRNAs, miRNAs, and mRNAs. Our study results depict the landscape of circRNA expression profiles in PTC and also provide potential

  14. Striking similarities in amino acid sequence among nonstructural proteins encoded by RNA viruses that have dissimilar genomic organization.

    PubMed Central

    Haseloff, J; Goelet, P; Zimmern, D; Ahlquist, P; Dasgupta, R; Kaesberg, P

    1984-01-01

    The plant viruses alfalfa mosaic virus (AMV) and brome mosaic virus (BMV) each divide their genetic information among three RNAs while tobacco mosaic virus (TMV) contains a single genomic RNA. Amino acid sequence comparisons suggest that the single proteins encoded by AMV RNA 1 and BMV RNA 1 and by AMV RNA 2 and BMV RNA 2 are related to the NH2-terminal two-thirds and the COOH-terminal one-third, respectively, of the largest protein encoded by TMV. Separating these two domains in the TMV RNA sequence is an amber termination codon, whose partial suppression allows translation of the downstream domain. Many of the residues that the TMV read-through domain and the segmented plant viruses have in common are also conserved in a read-through domain found in the nonstructural polyprotein of the animal alphaviruses Sindbis and Middelburg. We suggest that, despite substantial differences in gene organization and expression, all of these viruses use related proteins for common functions in RNA replication. Reassortment of functional modules of coding and regulatory sequence from preexisting viral or cellular sources, perhaps via RNA recombination, may be an important mechanism in RNA virus evolution. PMID:6611550

  15. Technologically important extremophile 16S rRNA sequence Shannon entropy and fractal property comparison with long term dormant microbes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Gadura, N.; Dehipawala, S.; Cheung, E.; Tuffour, M.; Schneider, P.; Tremberger, G., Jr.; Lieberman, D.; Cheung, T.

    2011-10-01

    Technologically important extremophiles including oil eating microbes, uranium and rocket fuel perchlorate reduction microbes, electron producing microbes and electrode electrons feeding microbes were compared in terms of their 16S rRNA sequences, a standard targeted sequence in comparative phylogeny studies. Microbes that were reported to have survived a prolonged dormant duration were also studied. Examples included the recently discovered microbe that survives after 34,000 years in a salty environment while feeding off organic compounds from other trapped dead microbes. Shannon entropy of the 16S rRNA nucleotide composition and fractal dimension of the nucleotide sequence in terms of its atomic number fluctuation analyses suggest a selected range for these extremophiles as compared to other microbes; consistent with the experience of relatively mild evolutionary pressure. However, most of the microbes that have been reported to survive in prolonged dormant duration carry sequences with fractal dimension between 1.995 and 2.005 (N = 10 out of 13). Similar results are observed for halophiles, red-shifted chlorophyll and radiation resistant microbes. The results suggest that prolonged dormant duration, in analogous to high salty or radiation environment, would select high fractal 16S rRNA sequences. Path analysis in structural equation modeling supports a causal relation between entropy and fractal dimension for the studied 16S rRNA sequences (N = 7). Candidate choices for high fractal 16S rRNA microbes could offer protection for prolonged spaceflights. BioBrick gene network manipulation could include extremophile 16S rRNA sequences in synthetic biology and shed more light on exobiology and future colonization in shielded spaceflights. Whether the high fractal 16S rRNA sequences contain an asteroidlike extra-terrestrial source could be speculative but interesting.

  16. Sequence-engineered mRNA Without Chemical Nucleoside Modifications Enables an Effective Protein Therapy in Large Animals

    PubMed Central

    Thess, Andreas; Grund, Stefanie; Mui, Barbara L; Hope, Michael J; Baumhof, Patrick; Fotin-Mleczek, Mariola; Schlake, Thomas

    2015-01-01

    Being a transient carrier of genetic information, mRNA could be a versatile, flexible, and safe means for protein therapies. While recent findings highlight the enormous therapeutic potential of mRNA, evidence that mRNA-based protein therapies are feasible beyond small animals such as mice is still lacking. Previous studies imply that mRNA therapeutics require chemical nucleoside modifications to obtain sufficient protein expression and avoid activation of the innate immune system. Here we show that chemically unmodified mRNA can achieve those goals as well by applying sequence-engineered molecules. Using erythropoietin (EPO) driven production of red blood cells as the biological model, engineered Epo mRNA elicited meaningful physiological responses from mice to nonhuman primates. Even in pigs of about 20 kg in weight, a single adequate dose of engineered mRNA encapsulated in lipid nanoparticles (LNPs) induced high systemic Epo levels and strong physiological effects. Our results demonstrate that sequence-engineered mRNA has the potential to revolutionize human protein therapies. PMID:26050989

  17. A glutamine-rich carrier efficiently delivers anti-CD47 siRNA driven by "glutamine trap" to inhibit lung cancer cell growth.

    PubMed

    Wu, JiaMin; Li, Zhi; Yang, Zeping; Guo, Ling; Zhang, Ye; Deng, Huihui; Wang, Cuifeng; Feng, Min

    2018-06-25

    It is not efficient enough using the current approaches for tumor-selective drug delivery based on the EPR effect and ligand-receptor interactions, and they have largely failed to translate into the clinic. So it is urgent to explore an enhanced strategy for effective delivery of anticancer agents. Clinically, many cancers require large amounts of glutamine for their continued growth and survival, resulting in circulating glutamine extraction by the tumor being much greater than that for any organs, behaving as a "glutamine trap". In the present study, we sought to elucidate whether the glutamine trap effect could be exploited to deliver therapeutic agents to selectively kill cancer cells. Here, a macromolecular glutamine analog, glutamine-functionalized branched polyethylenimine (GPI), was constructed as the carrier to deliver anti-CD47 siRNA for the blockage of CD47 "don't eat me" signals on cancer cells. The GPI/siRNA glutamine-rich polyplexes exhibited remarkably high levels of cellular uptake by glutamine-dependent lung cancer cells, wild-type A549 cells (A549WT) and its cisplatin-resistant cells (A549DDP), specifically under glutamine-depleted conditions. It was noted that the glutamine transporter ASCT2 was highly expressed both on A549WT and A549DDP, but almost no expression in normal human lung fibroblasts cells. Inhibition of ASCT2 significantly prevented the internalization of GPI polyplexes. These findings raised the intriguing possibility that the glutamine-rich GPI polyplexes utilize the ASCT2 pathway to selectively facilitate their cellular uptake by cancer cells. GPI further delivered anti-CD47 siRNA efficiently both in vitro and in vivo to down-regulate the intratumoral mRNA and protein expression levels of CD47. CD47 functions as a "don't eat me" signal and binds to the immunoreceptor SIRPα inducing evasion of phagocytic clearance. GPI/anti-CD47 siRNA polyplexes achieved significant antitumor activities both on A549WT and A549DDP tumor

  18. Long-read sequencing of nascent RNA reveals coupling among RNA processing events.

    PubMed

    Herzel, Lydia; Straube, Korinna; Neugebauer, Karla M

    2018-06-14

    Pre-mRNA splicing is accomplished by the spliceosome, a megadalton complex that assembles de novo on each intron. Because spliceosome assembly and catalysis occur cotranscriptionally, we hypothesized that introns are removed in the order of their transcription in genomes dominated by constitutive splicing. Remarkably little is known about splicing order and the regulatory potential of nascent transcript remodeling by splicing, due to the limitations of existing methods that focus on analysis of mature splicing products (mRNAs) rather than substrates and intermediates. Here, we overcome this obstacle through long-read RNA sequencing of nascent, multi-intron transcripts in the fission yeast Schizosaccharomyces pombe Most multi-intron transcripts were fully spliced, consistent with rapid cotranscriptional splicing. However, an unexpectedly high proportion of transcripts were either fully spliced or fully unspliced, suggesting that splicing of any given intron is dependent on the splicing status of other introns in the transcript. Supporting this, mild inhibition of splicing by a temperature-sensitive mutation in prp2 , the homolog of vertebrate U2AF65, increased the frequency of fully unspliced transcripts. Importantly, fully unspliced transcripts displayed transcriptional read-through at the polyA site and were degraded cotranscriptionally by the nuclear exosome. Finally, we show that cellular mRNA levels were reduced in genes with a high number of unspliced nascent transcripts during caffeine treatment, showing regulatory significance of cotranscriptional splicing. Therefore, overall splicing of individual nascent transcripts, 3' end formation, and mRNA half-life depend on the splicing status of neighboring introns, suggesting crosstalk among spliceosomes and the polyA cleavage machinery during transcription elongation. © 2018 Herzel et al.; Published by Cold Spring Harbor Laboratory Press.

  19. Import of desired nucleic acid sequences using addressing motif of mitochondrial ribosomal 5S-rRNA for fluorescent in vivo hybridization of mitochondrial DNA and RNA.

    PubMed

    Zelenka, Jaroslav; Alán, Lukáš; Jabůrek, Martin; Ježek, Petr

    2014-04-01

    Based on the matrix-addressing sequence of mitochondrial ribosomal 5S-rRNA (termed MAM), which is naturally imported into mitochondria, we have constructed an import system for in vivo targeting of mitochondrial DNA (mtDNA) or mt-mRNA, in order to provide fluorescence hybridization of the desired sequences. Thus DNA oligonucleotides were constructed, containing the 5'-flanked T7 RNA polymerase promoter. After in vitro transcription and fluorescent labeling with Alexa Fluor(®) 488 or 647 dye, we obtained the fluorescent "L-ND5 probe" containing MAM and exemplar cargo, i.e., annealing sequence to a short portion of ND5 mRNA and to the light-strand mtDNA complementary to the heavy strand nd5 mt gene (5'-end 21 base pair sequence). For mitochondrial in vivo fluorescent hybridization, HepG2 cells were treated with dequalinium micelles, containing the fluorescent probes, bringing the probes proximally to the mitochondrial outer membrane and to the natural import system. A verification of import into the mitochondrial matrix of cultured HepG2 cells was provided by confocal microscopy colocalizations. Transfections using lipofectamine or probes without 5S-rRNA addressing MAM sequence or with MAM only were ineffective. Alternatively, the same DNA oligonucleotides with 5'-CACC overhang (substituting T7 promoter) were transcribed from the tetracycline-inducible pENTRH1/TO vector in human embryonic kidney T-REx®-293 cells, while mitochondrial matrix localization after import of the resulting unlabeled RNA was detected by PCR. The MAM-containing probe was then enriched by three-order of magnitude over the natural ND5 mRNA in the mitochondrial matrix. In conclusion, we present a proof-of-principle for mitochondrial in vivo hybridization and mitochondrial nucleic acid import.

  20. Deciphering mRNA Sequence Determinants of Protein Production Rate

    NASA Astrophysics Data System (ADS)

    Szavits-Nossan, Juraj; Ciandrini, Luca; Romano, M. Carmen

    2018-03-01

    One of the greatest challenges in biophysical models of translation is to identify coding sequence features that affect the rate of translation and therefore the overall protein production in the cell. We propose an analytic method to solve a translation model based on the inhomogeneous totally asymmetric simple exclusion process, which allows us to unveil simple design principles of nucleotide sequences determining protein production rates. Our solution shows an excellent agreement when compared to numerical genome-wide simulations of S. cerevisiae transcript sequences and predicts that the first 10 codons, which is the ribosome footprint length on the mRNA, together with the value of the initiation rate, are the main determinants of protein production rate under physiological conditions. Finally, we interpret the obtained analytic results based on the evolutionary role of the codons' choice for regulating translation rates and ribosome densities.

  1. Efficient experimental design and analysis strategies for the detection of differential expression using RNA-Sequencing

    PubMed Central

    2012-01-01

    Background RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available, these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Results Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. Conclusions This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates. PMID:22985019

  2. Efficient experimental design and analysis strategies for the detection of differential expression using RNA-Sequencing.

    PubMed

    Robles, José A; Qureshi, Sumaira E; Stephen, Stuart J; Wilson, Susan R; Burden, Conrad J; Taylor, Jennifer M

    2012-09-17

    RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available, these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates.

  3. A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution.

    PubMed

    Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme

    2013-07-01

    The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most of the current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and more importantly do not allow the user to control explicitly the nucleotide distribution such as the GC-content in their sequences. However, the latter is an important criterion for large-scale applications as it could presumably be used to design sequences with better transcription rates and/or structural plasticity. In this article, we introduce IncaRNAtion, a novel algorithm to design RNA sequences folding into target secondary structures with a predefined nucleotide distribution. IncaRNAtion uses a global sampling approach and weighted sampling techniques. We show that our approach is fast (i.e. running time comparable or better than local search methods), seedless (we remove the bias of the seed in local search heuristics) and successfully generates high-quality sequences (i.e. thermodynamically stable) for any GC-content. To complete this study, we develop a hybrid method combining our global sampling approach with local search strategies. Remarkably, our glocal methodology overcomes both local and global approaches for sampling sequences with a specific GC-content and target structure. IncaRNAtion is available at csb.cs.mcgill.ca/incarnation/. Supplementary data are available at Bioinformatics online.

  4. Selective amplification and sequencing of cyclic phosphate-containing RNAs by the cP-RNA-seq method.

    PubMed

    Honda, Shozo; Morichika, Keisuke; Kirino, Yohei

    2016-03-01

    RNA digestions catalyzed by many ribonucleases generate RNA fragments that contain a 2',3'-cyclic phosphate (cP) at their 3' termini. However, standard RNA-seq methods are unable to accurately capture cP-containing RNAs because the cP inhibits the adapter ligation reaction. We recently developed a method named cP-RNA-seq that is able to selectively amplify and sequence cP-containing RNAs. Here we describe the cP-RNA-seq protocol in which the 3' termini of all RNAs, except those containing a cP, are cleaved through a periodate treatment after phosphatase treatment; hence, subsequent adapter ligation and cDNA amplification steps are exclusively applied to cP-containing RNAs. cP-RNA-seq takes ∼6 d, excluding the time required for sequencing and bioinformatics analyses, which are not covered in detail in this protocol. Biochemical validation of the existence of cP in the identified RNAs takes ∼3 d. Even though the cP-RNA-seq method was developed to identify angiogenin-generating 5'-tRNA halves as a proof of principle, the method should be applicable to global identification of cP-containing RNA repertoires in various transcriptomes.

  5. Computational biology of RNA interactions.

    PubMed

    Dieterich, Christoph; Stadler, Peter F

    2013-01-01

    The biodiversity of the RNA world has been underestimated for decades. RNA molecules are key building blocks, sensors, and regulators of modern cells. The biological function of RNA molecules cannot be separated from their ability to bind to and interact with a wide space of chemical species, including small molecules, nucleic acids, and proteins. Computational chemists, physicists, and biologists have developed a rich tool set for modeling and predicting RNA interactions. These interactions are to some extent determined by the binding conformation of the RNA molecule. RNA binding conformations are approximated with often acceptable accuracy by sequence and secondary structure motifs. Secondary structure ensembles of a given RNA molecule can be efficiently computed in many relevant situations by employing a standard energy model for base pair interactions and dynamic programming techniques. The case of bi-molecular RNA-RNA interactions can be seen as an extension of this approach. However, unbiased transcriptome-wide scans for local RNA-RNA interactions are computationally challenging yet become efficient if the binding motif/mode is known and other external information can be used to confine the search space. Computational methods are less developed for proteins and small molecules, which bind to RNA with very high specificity. Binding descriptors of proteins are usually determined by in vitro high-throughput assays (e.g., microarrays or sequencing). Intriguingly, recent experimental advances, which are mostly based on light-induced cross-linking of binding partners, render in vivo binding patterns accessible yet require new computational methods for careful data interpretation. The grand challenge is to model the in vivo situation where a complex interplay of RNA binders competes for the same target RNA molecule. Evidently, bioinformaticians are just catching up with the impressive pace of these developments. Copyright © 2012 John Wiley & Sons, Ltd.

  6. Design of the hairpin ribozyme for targeting specific RNA sequences.

    PubMed

    Hampel, A; DeYoung, M B; Galasinski, S; Siwkowski, A

    1997-01-01

    The following steps should be taken when designing the hairpin ribozyme to cleave a specific target sequence: 1. Select a target sequence containing BN*GUC where B is C, G, or U. 2. Select the target sequence in areas least likely to have extensive interfering structure. 3. Design the conventional hairpin ribozyme as shown in Fig. 1, such that it can form a 4 bp helix 2 and helix 1 lengths up to 10 bp. 4. Synthesize this ribozyme from single-stranded DNA templates with a double-stranded T7 promoter. 5. Prepare a series of short substrates capable of forming a range of helix 1 lengths of 5-10 bp. 6. Identify these by direct RNA sequencing. 7. Assay the extent of cleavage of each substrate to identify the optimal length of helix 1. 8. Prepare the hairpin tetraloop ribozyme to determine if catalytic efficiency can be improved.

  7. Structural analysis of the human U3 ribonucleoprotein particle reveal a conserved sequence available for base pairing with pre-rRNA.

    PubMed Central

    Parker, K A; Steitz, J A

    1987-01-01

    The human U3 ribonucleoprotein (RNP) has been analyzed to determine its protein constituents, sites of protein-RNA interaction, and RNA secondary structure. By using anti-U3 RNP antibodies and extracts prepared from HeLa cells labeled in vivo, the RNP was found to contain four nonphosphorylated proteins of 36, 30, 13, and 12.5 kilodaltons and two phosphorylated proteins of 74 and 59 kilodaltons. U3 nucleotides 72-90, 106-121, 154-166, and 190-217 must contain sites that interact with proteins since these regions are immunoprecipitated after treatment of the RNP with RNase A or T1. The secondary structure was probed with specific nucleases and by chemical modification with single-strand-specific reagents that block subsequent reverse transcription. Regions that are single stranded (and therefore potentially able to interact with a substrate RNA) include an evolutionarily conserved sequence at nucleotides 104-112 and nonconserved sequences at nucleotides 65-74, 80-84, and 88-93. Nucleotides 159-168 do not appear to be highly accessible, thus making it unlikely that this U3 sequence base pairs with sequences near the 5.8S rRNA-internal transcribed spacer II junction, as previously proposed. Alternative functions of the U3 RNP are discussed, including the possibility that U3 may participate in a processing event near the 3' end of 28S rRNA. Images PMID:2959855

  8. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.

    PubMed

    Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R

    1984-01-11

    We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product KBk2, the apparent second order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of +/-4.1 in KBk2 over a range of 10(4) in the same parameter. The quantitative evaluation was linked to an automated (Apple II) procedure for searching and evaluating possible promoters in DNA sequence files.

  9. Conserved small mRNA with an unique, extended Shine-Dalgarno sequence

    PubMed Central

    Hahn, Julia; Migur, Anzhela; von Boeselager, Raphael Freiherr; Kubatova, Nina; Kubareva, Elena; Schwalbe, Harald

    2017-01-01

    ABSTRACT Up to now, very small protein-coding genes have remained unrecognized in sequenced genomes. We identified an mRNA of 165 nucleotides (nt), which is conserved in Bradyrhizobiaceae and encodes a polypeptide with 14 amino acid residues (aa). The small mRNA harboring a unique Shine-Dalgarno sequence (SD) with a length of 17 nt was localized predominantly in the ribosome-containing P100 fraction of Bradyrhizobium japonicum USDA 110. Strong interaction between the mRNA and 30S ribosomal subunits was demonstrated by their co-sedimentation in sucrose density gradient. Using translational fusions with egfp, we detected weak translation and found that it is impeded by both the extended SD and the GTG start codon (instead of ATG). Biophysical characterization (CD- and NMR-spectroscopy) showed that synthesized polypeptide remained unstructured in physiological puffer. Replacement of the start codon by a stop codon increased the stability of the transcript, strongly suggesting additional posttranscriptional regulation at the ribosome. Therefore, the small gene was named rreB (ribosome-regulated expression in Bradyrhizobiaceae). Assuming that the unique ribosome binding site (RBS) is a hallmark of rreB homologs or similarly regulated genes, we looked for similar putative RBS in bacterial genomes and detected regions with at least 16 nt complementarity to the 3′-end of 16S rRNA upstream of sORFs in Caulobacterales, Rhizobiales, Rhodobacterales and Rhodospirillales. In the Rhodobacter/Roseobacter lineage of α-proteobacteria the corresponding gene (rreR) is conserved and encodes an 18 aa protein. This shows how specific RBS features can be used to identify new genes with presumably similar control of expression at the RNA level. PMID:27834614

  10. Cleavage of rRNA ensures translational cessation in sperm at fertilization

    PubMed Central

    Johnson, G.D.; Sendler, E.; Lalancette, C.; Hauser, R.; Diamond, M.P.; Krawetz, S.A.

    2011-01-01

    Intact ribosomal RNAs (rRNAs) comprise the majority of somatic transcripts, yet appear conspicuously absent in spermatozoa, perhaps reflecting cytoplasmic expulsion during spermatogenesis. To discern their fate, total RNA retained in mature spermatozoa from three fertile donors was characterized by Next Generation Sequencing. In all samples, >75% of total sequence reads aligned to rRNAs. The distribution of reads along the length of these transcripts exhibited a high degree of non-uniformity that was reiterated between donors. The coverage of sequencing reads was inversely correlated with guanine-cytosine (GC)-richness such that sequences greater than ∼70% GC were virtually absent in all sperm RNA samples. To confirm the loss of sequence, the relative abundance of specific regions of the 28S transcripts in sperm was established by 7-Deaza-2′-deoxy-guanosine-5′-triphosphate RT–PCR. The inability to amplify specific regions of the 28S sequence from sperm despite the abundant representation of this transcript in the sequencing libraries demonstrates that approximately three-quarters of RNA retained in the mature male gamete are products of rRNA fragmentation. Hence, cleavage (not expulsion of the RNA component of the translational machinery) is responsible for preventing spurious translation following spermiogenesis. These results highlight the potential importance of those transcripts, including many mRNAs, which evade fragmentation and remain intact when sperm are delivered at fertilization. Sequencing data are deposited in GEO as: GSE29160. PMID:21831882

  11. Utility of next-generation RNA-sequencing in identifying chimeric transcription involving human endogenous retroviruses.

    PubMed

    Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou

    2016-01-01

    Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters are commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-widely. Based on integrative analyses of RNA-seq and ChIP-seq, data we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region. Taken together, this article highlights

  12. Nucleotide sequence of the ribosomal RNA gene of Physarum polycephalum: intron 2 and its flanking regions of the 26S rRNA gene.

    PubMed Central

    Nomiyama, H; Kuhara, S; Kukita, T; Otsuka, T; Sakaki, Y

    1981-01-01

    The 26S ribosomal RNA gene of Physarum polycephalum is interrupted by two introns, and we have previously determined the sequence of one of them (intron 1) (Nomiyama et al. Proc.Natl.Acad.Sci.USA 78, 1376-1380, 1981). In this study we sequenced the second intron (intron 2) of about 0.5 kb length and its flanking regions, and found that one nucleotide at each junction is identical in intron 1 and intron 2, though the junction regions share no other sequence homology. Comparison of the flanking exon sequences to E. coli 23S rRNA sequences shows that conserved sequences are interspersed with tracts having little homology. In particular, the region encompassing the intron 2 interruption site is highly conserved. The E. coli ribosomal protein L1 binding region is also conserved. Images PMID:6171776

  13. Sequence-specific inhibition of microRNA-130a gene by CRISPR/Cas9 system in breast cancer cell line

    NASA Astrophysics Data System (ADS)

    Ainina Abdollah, Nur; Das Kumitaa, Theva; Yusof Narazah, Mohd; Razak, Siti Razila Abdul

    2017-05-01

    MicroRNAs (miRNAs) are short stranded noncoding RNA that play important roles in apoptosis, cell survival, development and cell proliferation. However, gene expression control via small regulatory RNA, particularly miRNA in breast cancer is still less explored. Therefore, this project aims to develop an approach to target microRNA-130a using the Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)/Cas9 system in MCF7, breast cancer cell line. The 20 bp sequences target at stem loop, 3ʹ and 5ʹ end of miR130a were cloned into pSpCas9(BB)-2A-GFP (PX458) plasmid, and the positive clones were confirmed by sequencing. A total of 5 μg of PX458-miR130a was transfected to MCF7 using Lipofectamine® 3000 according to manufacturer’s protocol. The transfected cells were maintained in the incubator at 37 °C under humidified 5% CO2. After 48 hours, cells were harvested and total RNA was extracted using miRNeasy Mini Kit (Qiagen). cDNAs were synthesised specific to miR-130a using TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems). Then, qRT-PCR was carried out using TaqMan Universal Master Mix (Applied Biosystems) to quantify the knockdown level of mature miRNAs in the cells. Result showed that miR-130a-5p was significantly downregulated in MCF7 cell line. However, no significant changes were observed for sequences targeting miR-130a-3p and stem loop. Thus, this study showed that the expression of miR-130a-5p was successfully down-regulated using CRISPR silencing system. This technique may be useful to manipulate the level of miRNA in various cell types to answer clinical questions at the molecular level.

  14. Interconnections Between RNA-Processing Pathways Revealed by a Sequencing-Based Genetic Screen for Pre-mRNA Splicing Mutants in Fission Yeast.

    PubMed

    Larson, Amy; Fair, Benjamin Jung; Pleiss, Jeffrey A

    2016-06-01

    Pre-mRNA splicing is an essential component of eukaryotic gene expression and is highly conserved from unicellular yeasts to humans. Here, we present the development and implementation of a sequencing-based reverse genetic screen designed to identify nonessential genes that impact pre-mRNA splicing in the fission yeast Schizosaccharomyces pombe, an organism that shares many of the complex features of splicing in higher eukaryotes. Using a custom-designed barcoding scheme, we simultaneously queried ∼3000 mutant strains for their impact on the splicing efficiency of two endogenous pre-mRNAs. A total of 61 nonessential genes were identified whose deletions resulted in defects in pre-mRNA splicing; enriched among these were factors encoding known or predicted components of the spliceosome. Included among the candidates identified here are genes with well-characterized roles in other RNA-processing pathways, including heterochromatic silencing and 3' end processing. Splicing-sensitive microarrays confirm broad splicing defects for many of these factors, revealing novel functional connections between these pathways. Copyright © 2016 Larson et al.

  15. Interconnections Between RNA-Processing Pathways Revealed by a Sequencing-Based Genetic Screen for Pre-mRNA Splicing Mutants in Fission Yeast

    PubMed Central

    Larson, Amy; Fair, Benjamin Jung; Pleiss, Jeffrey A.

    2016-01-01

    Pre-mRNA splicing is an essential component of eukaryotic gene expression and is highly conserved from unicellular yeasts to humans. Here, we present the development and implementation of a sequencing-based reverse genetic screen designed to identify nonessential genes that impact pre-mRNA splicing in the fission yeast Schizosaccharomyces pombe, an organism that shares many of the complex features of splicing in higher eukaryotes. Using a custom-designed barcoding scheme, we simultaneously queried ∼3000 mutant strains for their impact on the splicing efficiency of two endogenous pre-mRNAs. A total of 61 nonessential genes were identified whose deletions resulted in defects in pre-mRNA splicing; enriched among these were factors encoding known or predicted components of the spliceosome. Included among the candidates identified here are genes with well-characterized roles in other RNA-processing pathways, including heterochromatic silencing and 3ʹ end processing. Splicing-sensitive microarrays confirm broad splicing defects for many of these factors, revealing novel functional connections between these pathways. PMID:27172183

  16. Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing

    PubMed Central

    Tourlousse, Dieter M.; Yoshiike, Satowa; Ohashi, Akiko; Matsukura, Satoko; Noda, Naohiro

    2017-01-01

    Abstract High-throughput sequencing of 16S rRNA gene amplicons (16S-seq) has become a widely deployed method for profiling complex microbial communities but technical pitfalls related to data reliability and quantification remain to be fully addressed. In this work, we have developed and implemented a set of synthetic 16S rRNA genes to serve as universal spike-in standards for 16S-seq experiments. The spike-ins represent full-length 16S rRNA genes containing artificial variable regions with negligible identity to known nucleotide sequences, permitting unambiguous identification of spike-in sequences in 16S-seq read data from any microbiome sample. Using defined mock communities and environmental microbiota, we characterized the performance of the spike-in standards and demonstrated their utility for evaluating data quality on a per-sample basis. Further, we showed that staggered spike-in mixtures added at the point of DNA extraction enable concurrent estimation of absolute microbial abundances suitable for comparative analysis. Results also underscored that template-specific Illumina sequencing artifacts may lead to biases in the perceived abundance of certain taxa. Taken together, the spike-in standards represent a novel bioanalytical tool that can substantially improve 16S-seq-based microbiome studies by enabling comprehensive quality control along with absolute quantification. PMID:27980100

  17. Complete Genome Sequence of Diaphorina citri-associated C virus, a Novel Putative RNA Virus of the Asian Citrus Psyllid, Diaphorina citri.

    PubMed

    Nouri, Shahideh; Salem, Nidà; Falk, Bryce W

    2016-07-21

    We present here the complete nucleotide sequence and genome organization of a novel putative RNA virus identified in field populations of the Asian citrus psyllid, Diaphorina citri, through sequencing of the transcriptome followed by reverse transcription-PCR (RT-PCR). We tentatively named this virus Diaphorina citri-associated C virus (DcACV). DcACV is an unclassified positive-sense RNA virus. Copyright © 2016 Nouri et al.

  18. A novel polydopamine-based chemiluminescence resonance energy transfer method for microRNA detection coupling duplex-specific nuclease-aided target recycling strategy.

    PubMed

    Wang, Qian; Yin, Bin-Cheng; Ye, Bang-Ce

    2016-06-15

    MicroRNAs (miRNAs), functioning as oncogenes or tumor suppressors, play significant regulatory roles in regulating gene expression and become as biomarkers for disease diagnostics and therapeutics. In this work, we have coupled a polydopamine (PDA) nanosphere-assisted chemiluminescence resonance energy transfer (CRET) platform and a duplex-specific nuclease (DSN)-assisted signal amplification strategy to develop a novel method for specific miRNA detection. With the assistance of hemin, luminol, and H2O2, the horseradish peroxidase (HRP)-mimicking G-rich sequence in the sensing probe produces chemiluminescence, which is quickly quenched by the CRET effect between PDA as energy acceptor and excited luminol as energy donor. The target miRNA triggers DSN to partially degrade the sensing probe in the DNA-miRNA heteroduplex to repeatedly release G-quadruplex formed by G-rich sequence from PDA for the production of chemiluminescence. The method allows quantitative detection of target miRNA in the range of 80 pM-50 nM with a detection limit of 49.6 pM. The method also shows excellent specificity to discriminate single-base differences, and can accurately quantify miRNA in biological samples, with good agreement with the result from a commercial miRNA detection kit. The procedure requires no organic dyes or labels, and is a simple and cost-effective method for miRNA detection for early clinical diagnosis. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Functional Analysis of RNA Interference-Related Soybean Pod Borer (Lepidoptera) Genes Based on Transcriptome Sequences.

    PubMed

    Meng, Fanli; Yang, Mingyu; Li, Yang; Li, Tianyu; Liu, Xinxin; Wang, Guoyue; Wang, Zhanchun; Jin, Xianhao; Li, Wenbin

    2018-01-01

    RNA interference (RNAi) is useful for controlling pests of agriculturally important crops. The soybean pod borer (SPB) is the most important soybean pest in Northeastern Asia. In an earlier study, we confirmed that the SPB could be controlled via transgenic plant-mediated RNAi. Here, the SPB transcriptome was sequenced to identify RNAi-related genes, and also to establish an RNAi-of-RNAi assay system for evaluating genes involved in the SPB systemic RNAi response. The core RNAi genes, as well as genes potentially involved in double-stranded RNA (dsRNA) uptake were identified based on SPB transcriptome sequences. A phylogenetic analysis and the characterization of these core components as well as dsRNA uptake related genes revealed that they contain conserved domains essential for the RNAi pathway. The results of the RNAi-of-RNAi assay involving Laccas e 2 (a critical cuticle pigmentation gene) as a marker showed that genes encoding the sid-like ( Sil1 ), scavenger receptor class C ( Src ), and scavenger receptor class B ( Srb3 and Srb4 ) proteins of the endocytic pathway were required for SPB cellular uptake of dsRNA. The SPB response was inferred to contain three functional small RNA pathways (i.e., miRNA, siRNA, and piRNA pathways). Additionally, the SPB systemic RNA response may rely on systemic RNA interference deficient transmembrane channel-mediated and receptor-mediated endocytic pathways. The results presented herein may be useful for developing RNAi-mediated methods to control SPB infestations in soybean.

  20. Functional Analysis of RNA Interference-Related Soybean Pod Borer (Lepidoptera) Genes Based on Transcriptome Sequences

    PubMed Central

    Meng, Fanli; Yang, Mingyu; Li, Yang; Li, Tianyu; Liu, Xinxin; Wang, Guoyue; Wang, Zhanchun; Jin, Xianhao; Li, Wenbin

    2018-01-01

    RNA interference (RNAi) is useful for controlling pests of agriculturally important crops. The soybean pod borer (SPB) is the most important soybean pest in Northeastern Asia. In an earlier study, we confirmed that the SPB could be controlled via transgenic plant-mediated RNAi. Here, the SPB transcriptome was sequenced to identify RNAi-related genes, and also to establish an RNAi-of-RNAi assay system for evaluating genes involved in the SPB systemic RNAi response. The core RNAi genes, as well as genes potentially involved in double-stranded RNA (dsRNA) uptake were identified based on SPB transcriptome sequences. A phylogenetic analysis and the characterization of these core components as well as dsRNA uptake related genes revealed that they contain conserved domains essential for the RNAi pathway. The results of the RNAi-of-RNAi assay involving Laccase 2 (a critical cuticle pigmentation gene) as a marker showed that genes encoding the sid-like (Sil1), scavenger receptor class C (Src), and scavenger receptor class B (Srb3 and Srb4) proteins of the endocytic pathway were required for SPB cellular uptake of dsRNA. The SPB response was inferred to contain three functional small RNA pathways (i.e., miRNA, siRNA, and piRNA pathways). Additionally, the SPB systemic RNA response may rely on systemic RNA interference deficient transmembrane channel-mediated and receptor-mediated endocytic pathways. The results presented herein may be useful for developing RNAi-mediated methods to control SPB infestations in soybean. PMID:29773992

  1. Sequence-based heuristics for faster annotation of non-coding RNA families.

    PubMed

    Weinberg, Zasha; Ruzzo, Walter L

    2006-01-01

    Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy, while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be. In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, where our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that--unlike family-specific solutions--can scale to hundreds of ncRNA families. The source code is available under GNU Public License at the supplementary web site.

  2. Toward an Understanding of Changes in Diversity Associated with Fecal Microbiome Transplantation Based on 16S rRNA Gene Deep Sequencing

    PubMed Central

    Shahinas, Dea; Silverman, Michael; Sittler, Taylor; Chiu, Charles; Kim, Peter; Allen-Vercoe, Emma; Weese, Scott; Wong, Andrew; Low, Donald E.; Pillai, Dylan R.

    2012-01-01

    ABSTRACT Fecal microbiome transplantation by low-volume enema is an effective, safe, and inexpensive alternative to antibiotic therapy for patients with chronic relapsing Clostridium difficile infection (CDI). We explored the microbial diversity of pre- and posttransplant stool specimens from CDI patients (n = 6) using deep sequencing of the 16S rRNA gene. While interindividual variability in microbiota change occurs with fecal transplantation and vancomycin exposure, in this pilot study we note that clinical cure of CDI is associated with an increase in diversity and richness. Genus- and species-level analysis may reveal a cocktail of microorganisms or products thereof that will ultimately be used as a probiotic to treat CDI. PMID:23093385

  3. Molecular-Sized DNA or RNA Sequencing Machine | NCI Technology Transfer Center | TTC

    Cancer.gov

    The National Cancer Institute's Gene Regulation and Chromosome Biology Laboratory is seeking statements of capability or interest from parties interested in collaborative research to co-develop a molecular-sized DNA or RNA sequencing machine.

  4. How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity

    NASA Technical Reports Server (NTRS)

    Fox, G. E.; Wisotzkey, J. D.; Jurtshuk, P. Jr

    1992-01-01

    16S rRNA (genes coding for rRNA) sequence comparisons were conducted with the following three psychrophilic strains: Bacillus globisporus W25T (T = type strain) and Bacillus psychrophilus W16AT, and W5. These strains exhibited more than 99.5% sequence identity and within experimental uncertainty could be regarded as identical. Their close taxonomic relationship was further documented by phenotypic similarities. In contrast, previously published DNA-DNA hybridization results have convincingly established that these strains do not belong to the same species if current standards are used. These results emphasize the important point that effective identity of 16S rRNA sequences is not necessarily a sufficient criterion to guarantee species identity. Thus, although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, very recently diverged species may not be recognizable.

  5. The nucleotide sequence of the putative transcription initiation site of a cloned ribosomal RNA gene of the mouse.

    PubMed Central

    Urano, Y; Kominami, R; Mishima, Y; Muramatsu, M

    1980-01-01

    Approximately one kilobase pairs surrounding and upstream the transcription initiation site of a cloned ribosomal DNA (rDNA) of the mouse were sequenced. The putative transcription initiation site was determined by two independent methods: one nuclease S1 protection and the other reverse transcriptase elongation mapping using isolated 45S ribosomal RNA precursor (45S RNA) and appropriate restriction fragments of rDNA. Both methods gave an identical result; 45S RNA had a structure starting from ACTCTTAG---. Characteristically, mouse rDNA had many T clusters (greater than or equal to 5) upstream the initiation site, the longest being 21 consecutive T's. A pentadecanucleotide, TGCCTCCCGAGTGCA, appeared twice within 260 nucleotides upstream the putative initiation site. No such characteristic sequences were found downstream this site. Little similarity was found in the upstream of the transcription initiation site between the mouse, Xenopus laevis and Saccharomyces cerevisiae rDNA. Images PMID:6162156

  6. Transcriptogenomics identification and characterization of RNA editing sites in human primary monocytes using high-depth next generation sequencing data.

    PubMed

    Leong, Wai-Mun; Ripen, Adiratna Mat; Mirsafian, Hoda; Mohamad, Saharuddin Bin; Merican, Amir Feisal

    2018-06-07

    High-depth next generation sequencing data provide valuable insights into the number and distribution of RNA editing events. Here, we report the RNA editing events at cellular level of human primary monocyte using high-depth whole genomic and transcriptomic sequencing data. We identified over a ten thousand putative RNA editing sites and 69% of the sites were A-to-I editing sites. The sites enriched in repetitive sequences and intronic regions. High-depth sequencing datasets revealed that 90% of the canonical sites were edited at lower frequencies (<0.7). Single and multiple human monocytes and brain tissues samples were analyzed through genome sequence independent approach. The later approach was observed to identify more editing sites. Monocytes was observed to contain more C-to-U editing sites compared to brain tissues. Our results establish comparable pipeline that can address current limitations as well as demonstrate the potential for highly sensitive detection of RNA editing events in single cell type. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. A Burst of miRNA Innovation in the Early Evolution of Butterflies and Moths

    PubMed Central

    Quah, Shan; Hui, Jerome H.L.; Holland, Peter W.H.

    2015-01-01

    MicroRNAs (miRNAs) are involved in posttranscriptional regulation of gene expression. Because several miRNAs are known to affect the stability or translation of developmental regulatory genes, the origin of novel miRNAs may have contributed to the evolution of developmental processes and morphology. Lepidoptera (butterflies and moths) is a species-rich clade with a well-established phylogeny and abundant genomic resources, thereby representing an ideal system in which to study miRNA evolution. We sequenced small RNA libraries from developmental stages of two divergent lepidopterans, Cameraria ohridella (Horse chestnut Leafminer) and Pararge aegeria (Speckled Wood butterfly), discovering 90 and 81 conserved miRNAs, respectively, and many species-specific miRNA sequences. Mapping miRNAs onto the lepidopteran phylogeny reveals rapid miRNA turnover and an episode of miRNA fixation early in lepidopteran evolution, implying that miRNA acquisition accompanied the early radiation of the Lepidoptera. One lepidopteran-specific miRNA gene, miR-2768, is located within an intron of the homeobox gene invected, involved in insect segmental and wing patterning. We identified cubitus interruptus (ci) as a likely direct target of miR-2768, and validated this suppression using a luciferase assay system. We propose a model by which miR-2768 modulates expression of ci in the segmentation pathway and in patterning of lepidopteran wing primordia. PMID:25576364

  8. Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

    PubMed

    Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

    2015-12-16

    Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications led by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthew's correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variations. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.

  9. ModeRNA: a tool for comparative modeling of RNA 3D structure

    PubMed Central

    Rother, Magdalena; Rother, Kristian; Puton, Tomasz; Bujnicki, Janusz M.

    2011-01-01

    RNA is a large group of functionally important biomacromolecules. In striking analogy to proteins, the function of RNA depends on its structure and dynamics, which in turn is encoded in the linear sequence. However, while there are numerous methods for computational prediction of protein three-dimensional (3D) structure from sequence, with comparative modeling being the most reliable approach, there are very few such methods for RNA. Here, we present ModeRNA, a software tool for comparative modeling of RNA 3D structures. As an input, ModeRNA requires a 3D structure of a template RNA molecule, and a sequence alignment between the target to be modeled and the template. It must be emphasized that a good alignment is required for successful modeling, and for large and complex RNA molecules the development of a good alignment usually requires manual adjustments of the input data based on previous expertise of the respective RNA family. ModeRNA can model post-transcriptional modifications, a functionally important feature analogous to post-translational modifications in proteins. ModeRNA can also model DNA structures or use them as templates. It is equipped with many functions for merging fragments of different nucleic acid structures into a single model and analyzing their geometry. Windows and UNIX implementations of ModeRNA with comprehensive documentation and a tutorial are freely available. PMID:21300639

  10. Exploration of sequence space as the basis of viral RNA genome segmentation.

    PubMed

    Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

    2014-05-06

    The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the relative fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. Thus, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration.

  11. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    PubMed Central

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by increase in neurodegenerative diseases incidence, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that rapidly becomes a commonplace. However, there is lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected controls brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  12. Optimal packaging of FIV genomic RNA depends upon a conserved long-range interaction and a palindromic sequence within gag.

    PubMed

    Rizvi, Tahir A; Kenyon, Julia C; Ali, Jahabar; Aktar, Suriya J; Phillip, Pretty S; Ghazawi, Akela; Mustafa, Farah; Lever, Andrew M L

    2010-10-15

    The feline immunodeficiency virus (FIV) is a lentivirus that is related to human immunodeficiency virus (HIV), causing a similar pathology in cats. It is a potential small animal model for AIDS and the FIV-based vectors are also being pursued for human gene therapy. Previous studies have mapped the FIV packaging signal (ψ) to two or more discontinuous regions within the 5' 511 nt of the genomic RNA and structural analyses have determined its secondary structure. The 5' and 3' sequences within ψ region interact through extensive long-range interactions (LRIs), including a conserved heptanucleotide interaction between R/U5 and gag. Other secondary structural elements identified include a conserved 150 nt stem-loop (SL2) and a small palindromic stem-loop within gag open reading frame that might act as a viral dimerization initiation site. We have performed extensive mutational analysis of these sequences and structures and ascertained their importance in FIV packaging using a trans-complementation assay. Disrupting the conserved heptanucleotide LRI to prevent base pairing between R/U5 and gag reduced packaging by 2.8-5.5 fold. Restoration of pairing using an alternative, non-wild type (wt) LRI sequence restored RNA packaging and propagation to wt levels, suggesting that it is the structure of the LRI, rather than its sequence, that is important for FIV packaging. Disrupting the palindrome within gag reduced packaging by 1.5-3-fold, but substitution with a different palindromic sequence did not restore packaging completely, suggesting that the sequence of this region as well as its palindromic nature is important. Mutation of individual regions of SL2 did not have a pronounced effect on FIV packaging, suggesting that either it is the structure of SL2 as a whole that is necessary for optimal packaging, or that there is redundancy within this structure. The mutational analysis presented here has further validated the previously predicted RNA secondary structure of FIV

  13. Phylogenetic analysis of Fusobacterium prausnitzii based upon the 16S rRNA gene sequence and PCR confirmation.

    PubMed

    Wang, R F; Cao, W W; Cerniglia, C E

    1996-01-01

    In order to develop a PCR method to detect Fusobacterium prausnitzii in human feces and to clarify the phylogenetic position of this species, its 16S rRNA gene sequence was determined. The sequence described in this paper is different from the 16S rRNA gene sequence is specific for F. prausnitzii, and the results of this assay confirmed that F. prausnitzii is the most common species in human feces. However, a PCR assay based on the original GenBank sequence was negative when it was performed with two strains of F. prausnitzii obtained from the American Type Culture Collection. A phylogenetic tree based on the new 16S rRNA gene sequence was constructed. On this tree F. prausnitzii was not a member of the Fusobacterium group but was closer to some Eubacterium spp. and located between Clostridium "clusters III and IV" (M.D. Collins, P.A. Lawson, A. Willems, J.J. Cordoba, J. Fernandez-Garayzabal, P. Garcia, J. Cai, H. Hippe, and J.A.E. Farrow, Int. J. Syst. Bacteriol. 44:812-826, 1994).

  14. Construction of armored RNA containing long-size chimeric RNA by increasing the number and affinity of the pac site in exogenous rna and sequence coding coat protein of the MS2 bacteriophage.

    PubMed

    Wei, Baojun; Wei, Yuxiang; Zhang, Kuo; Yang, Changmei; Wang, Jing; Xu, Ruihuan; Zhan, Sien; Lin, Guigao; Wang, Wei; Liu, Min; Wang, Lunan; Zhang, Rui; Li, Jinming

    2008-01-01

    To construct a one-plasmid expression system of the armored RNA containing long chimeric RNA by increasing the number and affinity of the pac site. The plasmid pET-MS2-pac was constructed with one C-variant pac site, and then the plasmid pM-CR-2C containing 1,891-bp chimeric sequences and two C-variant pac sites was produced. Meanwhile, three plasmids (pM-CR-C, pM-CR-2W and pM-CR-W) were obtained as parallel controls with a different number and affinity of the pac site. Finally, the armored RNA was expressed and purified. The armored RNA with 1,891 bases target RNA was expressed successfully by the one-plasmid expression system with two C-variant pac sites, while for one pac site, no matter whether the affinity was changed or not, only the 1,200 bases target RNA was packaged. It was also found that the C-variant pac site could increase the expression efficiency of the armored RNA. The armored RNA with 1,891-bp exogenous RNA in our study showed the characterization of ribonuclease resistance and stability at different time points and temperature conditions. The armored RNA with 1,891 bases exogenous RNA was constructed and the expression system can be used as a platform for preparation of the armored RNA containing long RNA sequences. Copyright 2008 S. Karger AG, Basel.

  15. Detection of canonical A-to-G editing events at 3' UTRs and microRNA target sites in human lungs using next-generation sequencing.

    PubMed

    Soundararajan, Ramani; Stearns, Timothy M; Griswold, Anthony L; Mehta, Arpit; Czachor, Alexander; Fukumoto, Jutaro; Lockey, Richard F; King, Benjamin L; Kolliputi, Narasaiah

    2015-11-03

    RNA editing is a post-transcriptional modification of RNA. The majority of these changes result from adenosine deaminase acting on RNA (ADARs) catalyzing the conversion of adenosine residues to inosine in double-stranded RNAs (dsRNAs). Massively parallel sequencing has enabled the identification of RNA editing sites in human transcriptomes. In this study, we sequenced DNA and RNA from human lungs and identified RNA editing sites with high confidence via a computational pipeline utilizing stringent analysis thresholds. We identified a total of 3,447 editing sites that overlapped in three human lung samples, and with 50% of these sites having canonical A-to-G base changes. Approximately 27% of the edited sites overlapped with Alu repeats, and showed A-to-G clustering (>3 clusters in 100 bp). The majority of edited sites mapped to either 3' untranslated regions (UTRs) or introns close to splice sites; whereas, only few sites were in exons resulting in non-synonymous amino acid changes. Interestingly, we identified 652 A-to-G editing events in the 3' UTR of 205 target genes that mapped to 932 potential miRNA target binding sites. Several of these miRNA edited sites were validated in silico. Additionally, we validated several A-to-G edited sites by Sanger sequencing. Altogether, our study suggests a role for RNA editing in miRNA-mediated gene regulation and splicing in human lungs. In this study, we have generated a RNA editome of human lung tissue that can be compared with other RNA editomes across different lung tissues to delineate a role for RNA editing in normal and diseased states.

  16. Selective amplification and sequencing of cyclic phosphate-containing RNAs by the cP-RNA-seq method

    PubMed Central

    Honda, Shozo; Morichika, Keisuke; Kirino, Yohei

    2016-01-01

    RNA digestions catalyzed by many ribonucleases generate RNA fragments containing a 2′,3′-cyclic phosphate (cP) at their 3′-termini. However, standard RNA-seq methods are unable to accurately capture cP-containing RNAs because the cP inhibits the adapter ligation reaction. We recently developed a method named “cP-RNA-seq” that is able to selectively amplify and sequence cP-containing RNAs. Here we describe the cP-RNA-seq protocol in which the 3′-termini of all RNAs, except those containing a cP, are cleaved through a periodate treatment after phosphatase treatment, hence subsequent adapter ligation and cDNA amplification steps are exclusively applied to cP-containing RNAs. cP-RNA-seq takes ~6 d, excluding the time required for sequencing and bioinformatics analyses, such downstream assays are not covered in detail in this protocol. Biochemical validation of the existence of cP in the identified RNAs takes ~3 d. Even though the cP-RNA-seq method was developed to identify angiogenin-generating 5′-tRNA halves as a proof of principle, the method should be applicable to global identification of cP-containing RNA repertoires in various transcriptomes. PMID:26866791

  17. Delivery of siRNA using ternary complexes containing branched cationic peptides: the role of peptide sequence, branching and targeting.

    PubMed

    Kudsiova, Laila; Welser, Katharina; Campbell, Frederick; Mohammadi, Atefeh; Dawson, Natalie; Cui, Lili; Hailes, Helen C; Lawrence, M Jayne; Tabor, Alethea B

    2016-03-01

    Ternary nanocomplexes, composed of bifunctional cationic peptides, lipids and siRNA, as delivery vehicles for siRNA have been investigated. The study is the first to determine the optimal sequence and architecture of the bifunctional cationic peptide used for siRNA packaging and delivery using lipopolyplexes. Specifically three series of cationic peptides of differing sequence, degrees of branching and cell-targeting sequences were co-formulated with siRNA and vesicles prepared from a 1 : 1 molar ratio of the cationic lipid DOTMA and the helper lipid, DOPE. The level of siRNA knockdown achieved in the human alveolar cell line, A549-luc cells, in both reduced serum and in serum supplemented media was evaluated, and the results correlated to the nanocomplex structure (established using a range of physico-chemical tools, namely small angle neutron scattering, transmission electron microscopy, dynamic light scattering and zeta potential measurement); the conformational properties of each component (circular dichroism); the degree of protection of the siRNA in the lipopolyplex (using gel shift assays) and to the cellular uptake, localisation and toxicity of the nanocomplexes (confocal microscopy). Although the size, charge, structure and stability of the various lipopolyplexes were broadly similar, it was clear that lipopolyplexes formulated from branched peptides containing His-Lys sequences perform best as siRNA delivery agents in serum, with protection of the siRNA in serum balanced against efficient release of the siRNA into the cytoplasm of the cell.

  18. Single-cell isolation by a modular single-cell pipette for RNA-sequencing.

    PubMed

    Zhang, Kai; Gao, Min; Chong, Zechen; Li, Ying; Han, Xin; Chen, Rui; Qin, Lidong

    2016-11-29

    Single-cell transcriptome sequencing highly requires a convenient and reliable method to rapidly isolate a live cell into a specific container such as a PCR tube. Here, we report a modular single-cell pipette (mSCP) consisting of three modular components, a SCP-Tip, an air-displacement pipette (ADP), and ADP-Tips, that can be easily assembled, disassembled, and reassembled. By assembling the SCP-Tip containing a hydrodynamic trap, the mSCP can isolate single cells from 5-10 cells per μL of cell suspension. The mSCP is compatible with microscopic identification of captured single cells to finally achieve 100% single-cell isolation efficiency. The isolated live single cells are in submicroliter volumes and well suitable for single-cell PCR analysis and RNA-sequencing. The mSCP possesses merits of convenience, rapidness, and high efficiency, making it a powerful tool to isolate single cells for transcriptome analysis.

  19. Conserved structures formed by heterogeneous RNA sequences drive silencing of an inflammation responsive post-transcriptional operon

    PubMed Central

    Basu, Abhijit; Jain, Niyati; Tolbert, Blanton S.; Komar, Anton A.

    2017-01-01

    Abstract RNA–protein interactions with physiological outcomes usually rely on conserved sequences within the RNA element. By contrast, activity of the diverse gamma-interferon-activated inhibitor of translation (GAIT)-elements relies on the conserved RNA folding motifs rather than the conserved sequence motifs. These elements drive the translational silencing of a group of chemokine (CC/CXC) and chemokine receptor (CCR) mRNAs, thereby helping to resolve physiological inflammation. Despite sequence dissimilarity, these RNA elements adopt common secondary structures (as revealed by 2D-1H NMR spectroscopy), providing a basis for their interaction with the RNA-binding GAIT complex. However, many of these elements (e.g. those derived from CCL22, CXCL13, CCR4 and ceruloplasmin (Cp) mRNAs) have substantially different affinities for GAIT complex binding. Toeprinting analysis shows that different positions within the overall conserved GAIT element structure contribute to differential affinities of the GAIT protein complex towards the elements. Thus, heterogeneity of GAIT elements may provide hierarchical fine-tuning of the resolution of inflammation. PMID:29069516

  20. The Role of 16S rRNA Gene Sequencing in Identification of Microorganisms Misidentified by Conventional Methods

    PubMed Central

    Petti, C. A.; Polage, C. R.; Schreckenberger, P.

    2005-01-01

    Traditional methods for microbial identification require the recognition of differences in morphology, growth, enzymatic activity, and metabolism to define genera and species. Full and partial 16S rRNA gene sequencing methods have emerged as useful tools for identifying phenotypically aberrant microorganisms. We report on three bacterial blood isolates from three different College of American Pathologists-certified laboratories that were referred to ARUP Laboratories for definitive identification. Because phenotypic identification suggested unusual organisms not typically associated with the submitted clinical diagnosis, consultation with the Medical Director was sought and further testing was performed including partial 16S rRNA gene sequencing. All three patients had endocarditis, and conventional methods identified isolates from patients A, B, and C as a Facklamia sp., Eubacterium tenue, and a Bifidobacterium sp. 16S rRNA gene sequencing identified the isolates as Enterococcus faecalis, Cardiobacterium valvarum, and Streptococcus mutans, respectively. We conclude that the initial identifications of these three isolates were erroneous, may have misled clinicians, and potentially impacted patient care. 16S rRNA gene sequencing is a more objective identification tool, unaffected by phenotypic variation or technologist bias, and has the potential to reduce laboratory errors. PMID:16333109

  1. RNA deep sequencing as a tool for selection of cell lines for systematic subcellular localization of all human proteins.

    PubMed

    Danielsson, Frida; Wiking, Mikaela; Mahdessian, Diana; Skogs, Marie; Ait Blal, Hammou; Hjelmare, Martin; Stadler, Charlotte; Uhlén, Mathias; Lundberg, Emma

    2013-01-04

    One of the major challenges of a chromosome-centric proteome project is to explore in a systematic manner the potential proteins identified from the chromosomal genome sequence, but not yet characterized on a protein level. Here, we describe the use of RNA deep sequencing to screen human cell lines for RNA profiles and to use this information to select cell lines suitable for characterization of the corresponding gene product. In this manner, the subcellular localization of proteins can be analyzed systematically using antibody-based confocal microscopy. We demonstrate the usefulness of selecting cell lines with high expression levels of RNA transcripts to increase the likelihood of high quality immunofluorescence staining and subsequent successful subcellular localization of the corresponding protein. The results show a path to combine transcriptomics with affinity proteomics to characterize the proteins in a gene- or chromosome-centric manner.

  2. De novo transcriptome sequencing reveals a considerable bias in the incidence of simple sequence repeats towards the downstream of 'Pre-miRNAs' of black pepper.

    PubMed

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

    2013-01-01

    Next generation sequencing has an advantageon transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out the de novo sequencing using illuminaHiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using Trinity program generated 2,23,386 contigs and 1,28,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated, for the in silico prediction and detection of '43 pre-miRNA candidates bearing different types of SSR motifs'. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of thetranscriptome. About 0.033% of the transcriptome constituted 'pre-miRNA candidates bearing SSRs'. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors, showed a significant bias in the position of SSRs towards the downstream of predicted 'pre-miRNA candidates'. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of 'tandem repeats' in miRNAs.

  3. Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing.

    PubMed

    Tourlousse, Dieter M; Yoshiike, Satowa; Ohashi, Akiko; Matsukura, Satoko; Noda, Naohiro; Sekiguchi, Yuji

    2017-02-28

    High-throughput sequencing of 16S rRNA gene amplicons (16S-seq) has become a widely deployed method for profiling complex microbial communities but technical pitfalls related to data reliability and quantification remain to be fully addressed. In this work, we have developed and implemented a set of synthetic 16S rRNA genes to serve as universal spike-in standards for 16S-seq experiments. The spike-ins represent full-length 16S rRNA genes containing artificial variable regions with negligible identity to known nucleotide sequences, permitting unambiguous identification of spike-in sequences in 16S-seq read data from any microbiome sample. Using defined mock communities and environmental microbiota, we characterized the performance of the spike-in standards and demonstrated their utility for evaluating data quality on a per-sample basis. Further, we showed that staggered spike-in mixtures added at the point of DNA extraction enable concurrent estimation of absolute microbial abundances suitable for comparative analysis. Results also underscored that template-specific Illumina sequencing artifacts may lead to biases in the perceived abundance of certain taxa. Taken together, the spike-in standards represent a novel bioanalytical tool that can substantially improve 16S-seq-based microbiome studies by enabling comprehensive quality control along with absolute quantification. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Conserved noncoding sequences (CNSs) in higher plants.

    PubMed

    Freeling, Michael; Subramaniam, Shabarinath

    2009-04-01

    Plant conserved noncoding sequences (CNSs)--a specific category of phylogenetic footprint--have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and occupies a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs.

  5. LncRNA Expression Profile of Human Thoracic Aortic Dissection by High-Throughput Sequencing.

    PubMed

    Sun, Jie; Chen, Guojun; Jing, Yuanwen; He, Xiang; Dong, Jianting; Zheng, Junmeng; Zou, Meisheng; Li, Hairui; Wang, Shifei; Sun, Yili; Liao, Wangjun; Liao, Yulin; Feng, Li; Bin, Jianping

    2018-01-01

    In this study, the long non-coding RNA (lncRNA) expression profile in human thoracic aortic dissection (TAD), a highly lethal cardiovascular disease, was investigated. Human TAD (n=3) and normal aortic tissues (NA) (n=3) were examined by high-throughput sequencing. Bioinformatics analyses were performed to predict the roles of aberrantly expressed lncRNAs. Quantitative real-time polymerase chain reaction (qRT-PCR) was applied to validate the results. A total of 269 lncRNAs (159 up-regulated and 110 down-regulated) and 2, 255 mRNAs (1 294 up-regulated and 961 down-regulated) were aberrantly expressed in human TAD (fold-change> 1.5, P< 0.05). QRT-PCR results of five dysregulated genes were consistent with HTS data. A lncRNA-mRNA coexpression analysis showed positive correlations between the up-regulated lncRNA (ENSG00000269936) and its adjacent up-regulated mRNA (MAP2K6, R=0.940, P< 0.01), and between the down-regulated lncRNA_1421 and its down-regulated mRNAs (FBLN5, R=0.950, P< 0.01; ACTA2, R=0.96, P< 0.01; TIMP3, R=0.96, P< 0.05). The lncRNA-miRNA-mRNA network indicated that the up-regulated lncRNA XIST and p21 had similar sequences targeted by has-miR-17-5p. The results of luciferase assay and fluorescence immuno-cytochemistry were consistent with that. And qRT-PCR results showed that lncRNA XIST and p21 were expressed at a higher level and has-miR-17-5p was expressed at a lower level in TAD than in NA. The predicted binding motifs of three up-regulated lncRNAs (ENSG00000248508, ENSG00000226530, and EG00000259719) were correlated with up-regulated RUNX1 (R=0.982, P< 0.001; R=0.967, P< 0.01; R=0.960, P< 0.01, respectively). Our study revealed a set of dysregulated lncRNAs and predicted their multiple potential functions in human TAD. These findings suggest that lncRNAs are novel potential therapeutic targets for human TAD. © 2018 The Author(s). Published by S. Karger AG, Basel.

  6. Phytoplasma phylogenetics based on analysis of secA and 23S rRNA gene sequences for improved resolution of candidate species of 'Candidatus Phytoplasma'.

    PubMed

    Hodgetts, Jennifer; Boonham, Neil; Mumford, Rick; Harrison, Nigel; Dickinson, Matthew

    2008-08-01

    Phytoplasma phylogenetics has focused primarily on sequences of the non-coding 16S rRNA gene and the 16S-23S rRNA intergenic spacer region (16-23S ISR), and primers that enable amplification of these regions from all phytoplasmas by PCR are well established. In this study, primers based on the secA gene have been developed into a semi-nested PCR assay that results in a sequence of the expected size (about 480 bp) from all 34 phytoplasmas examined, including strains representative of 12 16Sr groups. Phylogenetic analysis of secA gene sequences showed similar clustering of phytoplasmas when compared with clusters resolved by similar sequence analyses of a 16-23S ISR-23S rRNA gene contig or of the 16S rRNA gene alone. The main differences between trees were in the branch lengths, which were elongated in the 16-23S ISR-23S rRNA gene tree when compared with the 16S rRNA gene tree and elongated still further in the secA gene tree, despite this being a shorter sequence. The improved resolution in the secA gene-derived phylogenetic tree resulted in the 16SrII group splitting into two distinct clusters, while phytoplasmas associated with coconut lethal yellowing-type diseases split into three distinct groups, thereby supporting past proposals that they represent different candidate species within 'Candidatus Phytoplasma'. The ability to differentiate 16Sr groups and subgroups by virtual RFLP analysis of secA gene sequences suggests that this gene may provide an informative alternative molecular marker for pathogen identification and diagnosis of phytoplasma diseases.

  7. Mapping specificity landscapes of RNA-protein interactions by high throughput sequencing.

    PubMed

    Jankowsky, Eckhard; Harris, Michael E

    2017-04-15

    To function in a biological setting, RNA binding proteins (RBPs) have to discriminate between alternative binding sites in RNAs. This discrimination can occur in the ground state of an RNA-protein binding reaction, in its transition state, or in both. The extent by which RBPs discriminate at these reaction states defines RBP specificity landscapes. Here, we describe the HiTS-Kin and HiTS-EQ techniques, which combine kinetic and equilibrium binding experiments with high throughput sequencing to quantitatively assess substrate discrimination for large numbers of substrate variants at ground and transition states of RNA-protein binding reactions. We discuss experimental design, practical considerations and data analysis and outline how a combination of HiTS-Kin and HiTS-EQ allows the mapping of RBP specificity landscapes. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Exome sequencing coupled with mRNA analysis identifies NDUFAF6 as a Leigh gene.

    PubMed

    Bianciardi, Laura; Imperatore, Valentina; Fernandez-Vizarra, Erika; Lopomo, Angela; Falabella, Micol; Furini, Simone; Galluzzi, Paolo; Grosso, Salvatore; Zeviani, Massimo; Renieri, Alessandra; Mari, Francesca; Frullanti, Elisa

    2016-11-01

    We report here the case of a young male who started to show verbal fluency disturbance, clumsiness and gait anomalies at the age of 3.5years and presented bilateral striatal necrosis. Clinically, the diagnosis was compatible with Leigh syndrome but the underlying molecular defect remained elusive even after exome analysis using autosomal/X-linked recessive or de novo models. Dosage of respiratory chain activity on fibroblasts, but not in muscle, underlined a deficit in complex I. Re-analysis of heterozygous probably pathogenic variants, inherited from one healthy parent, identified the p.Ala178Pro in NDUFAF6, a complex I assembly factor. RNA analysis showed an almost mono-allelic expression of the mutated allele in blood and fibroblasts and puromycin treatment on cultured fibroblasts did not lead to the rescue of the maternal allele expression, not supporting the involvement of nonsense-mediated RNA decay mechanism. Complementation assay underlined a recovery of complex I activity after transduction of the wild-type gene. Since the second mutation was not detected and promoter methylation analysis resulted normal, we hypothesized a non-exonic event in the maternal allele affecting a regulatory element that, in conjunction with the paternal mutation, leads to the autosomal recessive disorder and the different allele expression in various tissues. This paper confirms NDUFAF6 as a genuine morbid gene and proposes the coupling of exome sequencing with mRNA analysis as a method useful for enhancing the exome sequencing detection rate when the simple application of classical inheritance models fails. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing

    PubMed Central

    Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li

    2010-01-01

    Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome. PMID:20392818

  10. Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing.

    PubMed

    Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li

    2010-08-01

    Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome.

  11. Phosphorylation of poly(rC) binding protein 1 (PCBP1) contributes to stabilization of mu opioid receptor (MOR) mRNA via interaction with AU-rich element RNA-binding protein 1 (AUF1) and poly A binding protein (PABP)

    PubMed Central

    Hwang, Cheol Kyu; Wagley, Yadav; Law, Ping-Yee; Wei, Li-Na; Loh, Horace H.

    2016-01-01

    Gene regulation at the post-transcriptional level is frequently based on cis- and trans-acting factors on target mRNAs. We found a C-rich element (CRE) in mu-opioid receptor (MOR) 3′-untranslated region (UTR) to which poly (rC) binding protein 1 (PCBP1) binds, resulting in MOR mRNA stabilization. RNA immunoprecipitation and RNA EMSA revealed the formation of PCBP1-RNA complexes at the element. Knockdown of PCBP1 decreased MOR mRNA half-life and protein expression. Stimulation by forskolin increased cytoplasmic localization of PCBP1 and PCBP1/MOR 3′-UTR interactions via increased serine phosphorylation that was blocked by protein kinase A (PKA) or (phosphatidyl inositol-3) PI3-kinase inhibitors. The forskolin treatment also enhanced serine- and tyrosine-phosphorylation of AU-rich element binding protein (AUF1), concurrent with its increased binding to the CRE, and led to an increased interaction of poly A binding protein (PABP) with the CRE and poly(A) sites. AUF1 phosphorylation also led to an increased interaction with PCBP1. These findings suggest that a single co-regulator, PCBP1, plays a crucial role in stabilizing MOR mRNA, and is induced by PKA signaling by conforming to AUF1 and PABP. PMID:27836661

  12. Methylation guide RNA evolution in archaea: structure, function and genomic organization of 110 C/D box sRNA families across six Pyrobaculum species.

    PubMed

    Lui, Lauren M; Uzilov, Andrew V; Bernick, David L; Corredor, Andrea; Lowe, Todd M; Dennis, Patrick P

    2018-05-16

    Archaeal homologs of eukaryotic C/D box small nucleolar RNAs (C/D box sRNAs) guide precise 2'-O-methyl modification of ribosomal and transfer RNAs. Although C/D box sRNA genes constitute one of the largest RNA gene families in archaeal thermophiles, most genomes have incomplete sRNA gene annotation because reliable, fully automated detection methods are not available. We expanded and curated a comprehensive gene set across six species of the crenarchaeal genus Pyrobaculum, particularly rich in C/D box sRNA genes. Using high-throughput small RNA sequencing, specialized computational searches and comparative genomics, we analyzed 526 Pyrobaculum C/D box sRNAs, organizing them into 110 families based on synteny and conservation of guide sequences which determine methylation targets. We examined gene duplications and rearrangements, including one family that has expanded in a pattern similar to retrotransposed repetitive elements in eukaryotes. New training data and inclusion of kink-turn secondary structural features enabled creation of an improved search model. Our analyses provide the most comprehensive, dynamic view of C/D box sRNA evolutionary history within a genus, in terms of modification function, feature plasticity, and gene mobility.

  13. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    USDA-ARS?s Scientific Manuscript database

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  14. Human CST Facilitates Genome-wide RAD51 Recruitment to GC-Rich Repetitive Sequences in Response to Replication Stress.

    PubMed

    Chastain, Megan; Zhou, Qing; Shiva, Olga; Fadri-Moskwik, Maria; Whitmore, Leanne; Jia, Pingping; Dai, Xueyu; Huang, Chenhui; Ye, Ping; Chai, Weihang

    2016-08-02

    The telomeric CTC1/STN1/TEN1 (CST) complex has been implicated in promoting replication recovery under replication stress at genomic regions, yet its precise role is unclear. Here, we report that STN1 is enriched at GC-rich repetitive sequences genome-wide in response to hydroxyurea (HU)-induced replication stress. STN1 deficiency exacerbates the fragility of these sequences under replication stress, resulting in chromosome fragmentation. We find that upon fork stalling, CST proteins form distinct nuclear foci that colocalize with RAD51. Furthermore, replication stress induces physical association of CST with RAD51 in an ATR-dependent manner. Strikingly, CST deficiency diminishes HU-induced RAD51 foci formation and reduces RAD51 recruitment to telomeres and non-telomeric GC-rich fragile sequences. Collectively, our findings establish that CST promotes RAD51 recruitment to GC-rich repetitive sequences in response to replication stress to facilitate replication restart, thereby providing insights into the mechanism underlying genome stability maintenance. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.

  15. Sequence Variation in the Small-Subunit rRNA Gene of Plasmodium malariae and Prevalence of Isolates with the Variant Sequence in Sichuan, China

    PubMed Central

    Liu, Qing; Zhu, Shenghua; Mizuno, Sahoko; Kimura, Masatsugu; Liu, Peina; Isomura, Shin; Wang, Xingzhen; Kawamoto, Fumihiko

    1998-01-01

    By two PCR-based diagnostic methods, Plasmodium malariae infections have been rediscovered at two foci in the Sichuan province of China, a region where no cases of P. malariae have been officially reported for the last 2 decades. In addition, a variant form of P. malariae which has a deletion of 19 bp and seven substitutions of base pairs in the target sequence of the small-subunit (SSU) rRNA gene was detected with high frequency. Alignment analysis of Plasmodium sp. SSU rRNA gene sequences revealed that the 5′ region of the variant sequence is identical to that of P. vivax or P. knowlesi and its 3′ region is identical to that of P. malariae. The same sequence variations were also found in P. malariae isolates collected along the Thai-Myanmar border, suggesting a wide distribution of this variant form from southern China to Southeast Asia. PMID:9774600

  16. Molecular characterizations of somatic hybrids developed between Pleurotus florida and Lentinus squarrosulus through inter-simple sequence repeat markers and sequencing of ribosomal RNA-ITS gene.

    PubMed

    Mallick, Pijush; Chattaraj, Shruti; Sikdar, Samir Ranjan

    2017-10-01

    The 12 pfls somatic hybrids and 2 parents of Pleurotus florida and Lentinus s quarrosulus were characterized by ISSR and sequencing of rRNA-ITS genes. Five ISSR primers were used and amplified a total of 54 reproducible fragments with 98.14% polymorphism among all the pfls hybrid populations and parental strains. UPGMA-based cluster exhibited a dendrogram with three major groups between the parents and pfls hybrids. Parent P . florida and L . squarrosulus showed different degrees of genetic distance with all the hybrid lines and they showed closeness to hybrid pfls 1m and pfls 1h , respectively. ITS1(F) and ITS4(R) amplified the rRNA-ITS gene with 611-867 bp sequence length. The nucleotide polymorphisms were found in the ITS1, ITS2 and 5.8S rRNA region with different number of bases. Based on rRNA-ITS sequence, UPGMA cluster exhibited three distinct groups between L. squarrosulus and pfls 1p , pfls 1m and pfls 1s , and pfls 1e and P. florida .

  17. Optimization of High-Throughput Sequencing Kinetics for determining enzymatic rate constants of thousands of RNA substrates

    PubMed Central

    Niland, Courtney N.; Jankowsky, Eckhard; Harris, Michael E.

    2016-01-01

    Quantification of the specificity of RNA binding proteins and RNA processing enzymes is essential to understanding their fundamental roles in biological processes. High Throughput Sequencing Kinetics (HTS-Kin) uses high throughput sequencing and internal competition kinetics to simultaneously monitor the processing rate constants of thousands of substrates by RNA processing enzymes. This technique has provided unprecedented insight into the substrate specificity of the tRNA processing endonuclease ribonuclease P. Here, we investigate the accuracy and robustness of measurements associated with each step of the HTS-Kin procedure. We examine the effect of substrate concentration on the observed rate constant, determine the optimal kinetic parameters, and provide guidelines for reducing error in amplification of the substrate population. Importantly, we find that high-throughput sequencing, and experimental reproducibility contribute their own sources of error, and these are the main sources of imprecision in the quantified results when otherwise optimized guidelines are followed. PMID:27296633

  18. Determining RNA quality for NextGen sequencing: some exceptions to the gold standard rule of 23S to 16S rRNA ratio

    USDA-ARS?s Scientific Manuscript database

    Using next-generation-sequencing technology to assess entire transcriptomes requires high quality starting RNA. Currently, RNA quality is routinely judged using automated microfluidic gel electrophoresis platforms and associated algorithms. Here we report that such automated methods generate false-n...

  19. A powerful and flexible approach to the analysis of RNA sequence count data

    PubMed Central

    Zhou, Yi-Hui; Xia, Kai; Wright, Fred A.

    2011-01-01

    Motivation: A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean–variance relationships provides a flexible testing regimen that ‘borrows’ information across genes, while easily incorporating design effects and additional covariates. Results: We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data and (ii) an extension of an expression mean–variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with other alternate methods to handle RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility. Availability: An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq Contact: yzhou@bios.unc.edu; fwright@bios.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21810900

  20. A powerful and flexible approach to the analysis of RNA sequence count data.

    PubMed

    Zhou, Yi-Hui; Xia, Kai; Wright, Fred A

    2011-10-01

    A number of penalization and shrinkage approaches have been proposed for the analysis of microarray gene expression data. Similar techniques are now routinely applied to RNA sequence transcriptional count data, although the value of such shrinkage has not been conclusively established. If penalization is desired, the explicit modeling of mean-variance relationships provides a flexible testing regimen that 'borrows' information across genes, while easily incorporating design effects and additional covariates. We describe BBSeq, which incorporates two approaches: (i) a simple beta-binomial generalized linear model, which has not been extensively tested for RNA-Seq data and (ii) an extension of an expression mean-variance modeling approach to RNA-Seq data, involving modeling of the overdispersion as a function of the mean. Our approaches are flexible, allowing for general handling of discrete experimental factors and continuous covariates. We report comparisons with other alternate methods to handle RNA-Seq data. Although penalized methods have advantages for very small sample sizes, the beta-binomial generalized linear model, combined with simple outlier detection and testing approaches, appears to have favorable characteristics in power and flexibility. An R package containing examples and sample datasets is available at http://www.bios.unc.edu/research/genomic_software/BBSeq yzhou@bios.unc.edu; fwright@bios.unc.edu Supplementary data are available at Bioinformatics online.

  1. RNA-Sequencing of Primary Retinoblastoma Tumors Provides New Insights and Challenges Into Tumor Development.

    PubMed

    Elchuri, Sailaja V; Rajasekaran, Swetha; Miles, Wayne O

    2018-01-01

    Retinoblastoma is rare tumor of the retina caused by the homozygous loss of the Retinoblastoma 1 tumor suppressor gene (RB1). Loss of the RB1 protein, pRB, results in de-regulated activity of the E2F transcription factors, chromatin changes and developmental defects leading to tumor development. Extensive microarray profiles of these tumors have enabled the identification of genes sensitive to pRB disruption, however, this technology has a number of limitations in the RNA profiles that they generate. The advent of RNA-sequencing has enabled the global profiling of all of the RNA within the cell including both coding and non-coding features and the detection of aberrant RNA processing events. In this perspective, we focus on discussing how RNA-sequencing of rare Retinoblastoma tumors will build on existing data and open up new area's to improve our understanding of the biology of these tumors. In particular, we discuss how the RB-research field may be to use this data to determine how RB1 loss results in the expression of; non-coding RNAs, causes aberrant RNA processing events and how a deeper analysis of metabolic RNA changes can be utilized to model tumor specific shifts in metabolism. Each section discusses new opportunities and challenges associated with these types of analyses and aims to provide an honest assessment of how understanding these different processes may contribute to the treatment of Retinoblastoma.

  2. QUANTIFYING ALTERNATIVE SPLICING FROM PAIRED-END RNA-SEQUENCING DATA.

    PubMed

    Rossell, David; Stephan-Otto Attolini, Camille; Kroiss, Manuel; Stöcker, Almond

    2014-03-01

    RNA-sequencing has revolutionized biomedical research and, in particular, our ability to study gene alternative splicing. The problem has important implications for human health, as alternative splicing may be involved in malfunctions at the cellular level and multiple diseases. However, the high-dimensional nature of the data and the existence of experimental biases pose serious data analysis challenges. We find that the standard data summaries used to study alternative splicing are severely limited, as they ignore a substantial amount of valuable information. Current data analysis methods are based on such summaries and are hence sub-optimal. Further, they have limited flexibility in accounting for technical biases. We propose novel data summaries and a Bayesian modeling framework that overcome these limitations and determine biases in a non-parametric, highly flexible manner. These summaries adapt naturally to the rapid improvements in sequencing technology. We provide efficient point estimates and uncertainty assessments. The approach allows to study alternative splicing patterns for individual samples and can also be the basis for downstream analyses. We found a several fold improvement in estimation mean square error compared popular approaches in simulations, and substantially higher consistency between replicates in experimental data. Our findings indicate the need for adjusting the routine summarization and analysis of alternative splicing RNA-seq studies. We provide a software implementation in the R package casper.

  3. RNA-Mediated Gene Duplication and Retroposons: Retrogenes, LINEs, SINEs, and Sequence Specificity

    PubMed Central

    2013-01-01

    A substantial number of “retrogenes” that are derived from the mRNA of various intron-containing genes have been reported. A class of mammalian retroposons, long interspersed element-1 (LINE1, L1), has been shown to be involved in the reverse transcription of retrogenes (or processed pseudogenes) and non-autonomous short interspersed elements (SINEs). The 3′-end sequences of various SINEs originated from a corresponding LINE. As the 3′-untranslated regions of several LINEs are essential for retroposition, these LINEs presumably require “stringent” recognition of the 3′-end sequence of the RNA template. However, the 3′-ends of mammalian L1s do not exhibit any similarity to SINEs, except for the presence of 3′-poly(A) repeats. Since the 3′-poly(A) repeats of L1 and Alu SINE are critical for their retroposition, L1 probably recognizes the poly(A) repeats, thereby mobilizing not only Alu SINE but also cytosolic mRNA. Many flowering plants only harbor L1-clade LINEs and a significant number of SINEs with poly(A) repeats, but no homology to the LINEs. Moreover, processed pseudogenes have also been found in flowering plants. I propose that the ancestral L1-clade LINE in the common ancestor of green plants may have recognized a specific RNA template, with stringent recognition then becoming relaxed during the course of plant evolution. PMID:23984183

  4. Thermodynamic and spectroscopic investigations of TMPyP4 association with guanine- and cytosine-rich DNA and RNA repeats of C9orf72.

    PubMed

    Alniss, Hasan; Zamiri, Bita; Khalaj, Melisa; Pearson, Christopher E; Macgregor, Robert B

    2018-01-22

    An expansion of the hexanucleotide repeat (GGGGCC)n·(GGCCCC)n in the C9orf72 promoter has been shown to be the cause of Amyotrophic lateral sclerosis and frontotemporal dementia (ALS-FTD). The C9orf72 repeat can form four-stranded structures; the cationic porphyrin (TMPyP4) binds and distorts these structures. Isothermal titration calorimetry (ITC), and circular dichroism (CD) were used to study the binding of TMPyP4 to the C-rich and G-rich DNA and RNA oligos containing the hexanucleotide repeat at pH 7.5 and 0.1 M K + . The CD spectra of G-rich DNA and RNA TMPyP4 complexes showed features of antiparallel and parallel G-quadruplexes, respectively. The shoulder at 260 nm in the CD spectrum becomes more intense upon formation of complexes between TMPyP4 and the C-rich DNA. The peak at 290 nm becomes more intense in the c-rich RNA molecules, suggesting induction of an i-motif structure. The ITC data showed that TMPyP4 binds at two independent sites for all DNA and RNA molecules. For DNA, the data are consistent with TMPyP4 stacking on the terminal tetrads and intercalation. For RNA, the thermodynamics of the two binding modes are consistent with groove binding and intercalation. In both cases, intercalation is the weaker binding mode. These findings are considered with respect to the structural differences of the folded DNA and RNA molecules and the energetics of the processes that drive site-specific recognition by TMPyP4; these data will be helpful in efforts to optimize the specificity and affinity of the binding of porphyrin-like molecules. Copyright © 2018 Elsevier Inc. All rights reserved.

  5. Nucleotide sequence and regulatory studies of VGF, a nervous system-specific mRNA that is rapidly and relatively selectively induced by nerve growth factor.

    PubMed

    Salton, S R

    1991-09-01

    A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.

  6. Complete mitochondrial genome sequence of a phytophagous ladybird beetle, Henosepilachna pusillanima (Mulsant) (Coleoptera: Coccinellidae).

    PubMed

    Behere, G T; Firake, D M; Tay, W T; Azad Thakur, N S; Ngachan, S V

    2016-01-01

    Ladybird beetles are generally considered as agriculturally beneficial insects, but the ladybird beetles in the coleopteran subfamily Epilachninae are phytophagous and major plant feeding pest species which causes severe economic losses to cucurbitaceous and solanaceous crops. Henosepilachna pusillanima (Mulsant) is one of the important pest species of ladybird beetle. In this report, we sequenced and characterized the complete mitochondrial genome of H. pusillanima. For sequencing of the complete mitochondrial genome, we used the Ion Torrent sequencing platform. The complete circular mitochondrial genome of the H. pusillanima was determined to be 16,216 bp long. There were totally 13 protein coding genes, 22 transfer RNA, 2 ribosomal RNA and a control (A + T-rich) region estimated to be 1690 bp. The gene arrangement and orientations of assembled mitogenome were identical to the reported predatory ladybird beetle Coccinella septempunctata L. This is the first completely sequenced coleopteran mitochondrial genome from the beetle subfamily Epilachninae from India. Data generated in this study will benefit future comparative genomics studies for understanding the evolutionary relationships between predatory and phytophagous coccinellid beetles.

  7. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.

    2014-12-23

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5more » lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.« less

  8. Role of the Pepino mosaic virus 3'-untranslated region elements in negative-strand RNA synthesis in vitro.

    PubMed

    Osman, Toba A M; Olsthoorn, René C L; Livieratos, Ioannis C

    2014-09-22

    Pepino mosaic virus (PepMV) is a mechanically-transmitted positive-strand RNA potexvirus, with a 6410 nt long single-stranded (ss) RNA genome flanked by a 5'-methylguanosine cap and a 3' poly-A tail. Computer-assisted folding of the 64 nt long PepMV 3'-untranslated region (UTR) resulted in the prediction of three stem-loop structures (hp1, hp2, and hp3 in the 3'-5' direction). The importance of these structures and/or sequences for promotion of negative-strand RNA synthesis and binding to the RNA dependent RNA polymerase (RdRp) was tested in vitro using a specific RdRp assay. Hp1, which is highly variable among different PepMV isolates, appeared dispensable for negative-strand synthesis. Hp2, which is characterized by a large U-rich loop, tolerated base-pair changes in its stem as long as they maintained the stem integrity but was very sensitive to changes in the U-rich loop. Hp3, which harbours the conserved potexvirus ACUUAA hexamer motif, was essential for template activity. Template-RNA polymerase binding competition experiments showed that the ACUUAA sequence represents a high-affinity RdRp binding element. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper

    PubMed Central

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

    2013-01-01

    Next generation sequencing has an advantageon transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out the de novo sequencing using illuminaHiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using Trinity program generated 2,23,386 contigs and 1,28,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated, for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of thetranscriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors, showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176

  10. Activated charcoal-mediated RNA extraction method for Azadirachta indica and plants highly rich in polyphenolics, polysaccharides and other complex secondary compounds

    PubMed Central

    2013-01-01

    Background High quality RNA is a primary requisite for numerous molecular biological applications but is difficult to isolate from several plants rich in polysaccharides, polyphenolics and other secondary metabolites. These compounds either bind with nucleic acids or often co-precipitate at the final step and many times cannot be removed by conventional methods and kits. Addition of vinyl-pyrollidone polymers in extraction buffer efficiently removes polyphenolics to some extent, but, it failed in case of Azadirachta indica and several other medicinal and aromatic plants. Findings Here we report the use of adsorption property of activated charcoal (0.03%–0.1%) in RNA isolation procedures to remove complex secondary metabolites and polyphenolics to yield good quality RNA from Azadirachta indica. We tested and validated our modified RNA isolation method across 21 different plants including Andrographis paniculata, Aloe vera, Rosa damascena, Pelargonium graveolens, Phyllanthus amarus etc. from 13 other different families, many of which are considered as tough system for isolating RNA. The A260/280 ratio of the extracted RNA ranged between 1.8-2.0 and distinct 28S and 18S ribosomal RNA bands were observed in denaturing agarose gel electrophoresis. Analysis using Agilent 2100 Bioanalyzer revealed intact total RNA yield with very good RNA Integrity Number. Conclusions The RNA isolated by our modified method was found to be of high quality and amenable for sensitive downstream molecular applications like subtractive library construction and RT-PCR. This modified RNA isolation procedure would aid and accelerate the biotechnological studies in complex medicinal and aromatic plants which are extremely rich in secondary metabolic compounds. PMID:23537338

  11. Single-Stranded Condensation Stochastically Blocks G-Quadruplex Assembly in Human Telomeric RNA.

    PubMed

    Gutiérrez, Irene; Garavís, Miguel; de Lorenzo, Sara; Villasante, Alfredo; González, Carlos; Arias-Gonzalez, J Ricardo

    2018-05-17

    TERRA is an RNA molecule transcribed from human subtelomeric regions toward chromosome ends potentially involved in regulation of heterochromatin stability, semiconservative replication, and telomerase inhibition, among others. TERRA contains tandem repeats of the sequence GGGUUA, with a strong tendency to fold into a four-stranded arrangement known as a parallel G-quadruplex. Here, we demonstrate by using single-molecule force spectroscopy that this potential is limited by the inherent capacity of RNA to self-associate randomly and further condense into entropically more favorable structures. We stretched RNA constructions with more than four and less than eight hexanucleotide repeats, thus unable to form several G-quadruplexes in tandem, flanked by non-G-rich overhangs of random sequence by optical tweezers on a one by one basis. We found that condensed RNA stochastically blocks G-quadruplex folding pathways with a near 20% probability, a behavior that is not found in DNA analogous molecules.

  12. Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

    PubMed Central

    Hall, L; Laird, J E; Craig, R K

    1984-01-01

    Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. Images Fig. 1. PMID:6548375

  13. Small RNA Library Preparation Method for Next-Generation Sequencing Using Chemical Modifications to Prevent Adapter Dimer Formation.

    PubMed

    Shore, Sabrina; Henderson, Jordana M; Lebedev, Alexandre; Salcedo, Michelle P; Zon, Gerald; McCaffrey, Anton P; Paul, Natasha; Hogrefe, Richard I

    2016-01-01

    For most sample types, the automation of RNA and DNA sample preparation workflows enables high throughput next-generation sequencing (NGS) library preparation. Greater adoption of small RNA (sRNA) sequencing has been hindered by high sample input requirements and inherent ligation side products formed during library preparation. These side products, known as adapter dimer, are very similar in size to the tagged library. Most sRNA library preparation strategies thus employ a gel purification step to isolate tagged library from adapter dimer contaminants. At very low sample inputs, adapter dimer side products dominate the reaction and limit the sensitivity of this technique. Here we address the need for improved specificity of sRNA library preparation workflows with a novel library preparation approach that uses modified adapters to suppress adapter dimer formation. This workflow allows for lower sample inputs and elimination of the gel purification step, which in turn allows for an automatable sRNA library preparation protocol.

  14. Enzymatic synthesis of random sequences of RNA and RNA analogues by DNA polymerase theta mutants for the generation of aptamer libraries.

    PubMed

    Randrianjatovo-Gbalou, Irina; Rosario, Sandrine; Sismeiro, Odile; Varet, Hugo; Legendre, Rachel; Coppée, Jean-Yves; Huteau, Valérie; Pochet, Sylvie; Delarue, Marc

    2018-05-21

    Nucleic acid aptamers, especially RNA, exhibit valuable advantages compared to protein therapeutics in terms of size, affinity and specificity. However, the synthesis of libraries of large random RNAs is still difficult and expensive. The engineering of polymerases able to directly generate these libraries has the potential to replace the chemical synthesis approach. Here, we start with a DNA polymerase that already displays a significant template-free nucleotidyltransferase activity, human DNA polymerase theta, and we mutate it based on the knowledge of its three-dimensional structure as well as previous mutational studies on members of the same polA family. One mutant exhibited a high tolerance towards ribonucleotides (NTPs) and displayed an efficient ribonucleotidyltransferase activity that resulted in the assembly of long RNA polymers. HPLC analysis and RNA sequencing of the products were used to quantify the incorporation of the four NTPs as a function of initial NTP concentrations and established the randomness of each generated nucleic acid sequence. The same mutant revealed a propensity to accept other modified nucleotides and to extend them in long fragments. Hence, this mutant can deliver random natural and modified RNA polymers libraries ready to use for SELEX, with custom lengths and balanced or unbalanced ratios.

  15. RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

    PubMed

    Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2014-01-01

    Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence

  16. RNase-Resistant Virus-Like Particles Containing Long Chimeric RNA Sequences Produced by Two-Plasmid Coexpression System▿

    PubMed Central

    Wei, Yuxiang; Yang, Changmei; Wei, Baojun; Huang, Jie; Wang, Lunan; Meng, Shuang; Zhang, Rui; Li, Jinming

    2008-01-01

    RNase-resistant, noninfectious virus-like particles containing exogenous RNA sequences (armored RNA) are good candidates as RNA controls and standards in RNA virus detection. However, the length of RNA packaged in the virus-like particles with high efficiency is usually less than 500 bases. In this study, we describe a method for producing armored L-RNA. Armored L-RNA is a complex of MS2 bacteriophage coat protein and RNA produced in Escherichia coli by the induction of a two-plasmid coexpression system in which the coat protein and maturase are expressed from one plasmid and the target RNA sequence with modified MS2 stem-loop (pac site) is transcribed from another plasmid. A 3V armored L-RNA of 2,248 bases containing six gene fragments—hepatitis C virus, severe acute respiratory syndrome coronavirus (SARS-CoV1, SARS-CoV2, and SARS-CoV3), avian influenza virus matrix gene (M300), and H5N1 avian influenza virus (HA300)—was successfully expressed by the two-plasmid coexpression system and was demonstrated to have all of the characteristics of armored RNA. We evaluated the 3V armored L-RNA as a calibrator for multiple virus assays. We used the WHO International Standard for HCV RNA (NIBSC 96/790) to calibrate the chimeric armored L-RNA, which was diluted by 10-fold serial dilutions to obtain samples containing 106 to 102 copies. In conclusion, the approach we used for armored L-RNA preparation is practical and could reduce the labor and cost of quality control in multiplex RNA virus assays. Furthermore, we can assign the chimeric armored RNA with an international unit for quantitative detection. PMID:18305135

  17. Comparison of 16S rRNA sequencing with biochemical testing for species-level identification of clinical isolates of Neisseria spp.

    PubMed

    Mechergui, Arij; Achour, Wafa; Ben Hassen, Assia

    2014-08-01

    We aimed to compare accuracy of genus and species level identification of Neisseria spp. using biochemical testing and 16S rRNA sequence analysis. These methods were evaluated using 85 Neisseria spp. clinical isolates initially identified to the genus level by conventional biochemical tests and API NH system (Bio-Mérieux(®)). In 34 % (29/85), more than one possibility was given by 16S rRNA sequence analysis. In 6 % (5/85), one of the possibilities offered by 16S rRNA gene sequencing, agreed with the result given by biochemical testing. In 4 % (3/85), the same species was given by both methods. 16S rRNA gene sequencing results did not correlate well with biochemical tests.

  18. 16S rRNA partial gene sequencing for the differentiation and molecular subtyping of Listeria species.

    PubMed

    Hellberg, Rosalee S; Martin, Keely G; Keys, Ashley L; Haney, Christopher J; Shen, Yuelian; Smiley, R Derike

    2013-12-01

    Use of 16S rRNA partial gene sequencing within the regulatory workflow could greatly reduce the time and labor needed for confirmation and subtyping of Listeria monocytogenes. The goal of this study was to build a 16S rRNA partial gene reference library for Listeria spp. and investigate the potential for 16S rRNA molecular subtyping. A total of 86 isolates of Listeria representing L. innocua, L. seeligeri, L. welshimeri, and L. monocytogenes were obtained for use in building the custom library. Seven non-Listeria species and three additional strains of Listeria were obtained for use in exclusivity and food spiking tests. Isolates were sequenced for the partial 16S rRNA gene using the MicroSeq ID 500 Bacterial Identification Kit (Applied Biosystems). High-quality sequences were obtained for 84 of the custom library isolates and 23 unique 16S sequence types were discovered for use in molecular subtyping. All of the exclusivity strains were negative for Listeria and the three Listeria strains used in food spiking were consistently recovered and correctly identified at the species level. The spiking results also allowed for differentiation beyond the species level, as 87% of replicates for one strain and 100% of replicates for the other two strains consistently matched the same 16S type. Copyright © 2013 Elsevier Ltd. All rights reserved.

  19. Non-codingRNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer.

    PubMed

    Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A

    2010-02-01

    Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.

  20. Non-codingRNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer

    PubMed Central

    Wojcik, Sylwia E.; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S.; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z.; Rai, Kanti R.; Kipps, Thomas J.; Keating, Michael J.

    2010-01-01

    Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas. PMID:19926640