Science.gov

Sample records for rna instability sequences

  1. The role of topoisomerase I in suppressing genome instability associated with a highly transcribed guanine-rich sequence is not restricted to preventing RNA:DNA hybrid accumulation.

    PubMed

    Yadav, Puja; Owiti, Norah; Kim, Nayun

    2016-01-29

    Highly transcribed guanine-run containing sequences, in Saccharomyces cerevisiae, become unstable when topoisomerase I (Top1) is disrupted. Topological changes, such as the formation of extended RNA:DNA hybrids or R-loops or non-canonical DNA structures including G-quadruplexes, have been proposed as the major underlying cause of the transcription-linked genome instability. Here, we report that R-loop accumulation at a guanine-rich sequence, which is capable of assembling into the four-stranded G4 DNA structure, is dependent on the level and the orientation of transcription. In the absence of Top1 or RNase Hs, R-loops accumulated to a substantially higher extent when guanine-runs were located on the non-transcribed strand. This coincides with the orientation where higher genome instability was observed. However, we further report that there are significant differences between the disruption of RNase Hs and Top1 with regard to the orientation-specific elevation in genome instability at the guanine-rich sequence. Additionally, genome instability in Top1-deficient yeasts is not completely suppressed by removal of negative supercoils and is further aggravated by expression of mutant Top1. Together, our data provide strong support for a function of Top1 in suppressing genome instability at the guanine-run containing sequence that goes beyond preventing the transcription-associated RNA:DNA hybrid formation.
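
    The strand dependence described in this abstract can be illustrated with a minimal sketch that reports which strand of a transcribed region carries a candidate G4 motif. The regular expression (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used heuristic and an assumption here, not the authors' method; the function names and the example sequence are hypothetical.

```python
import re

# Heuristic G4 motif: four G-runs (>= 3 G) separated by loops of 1-7 nt.
# This pattern is an assumption for illustration, not taken from the abstract.
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def g4_strand_report(non_transcribed: str) -> dict:
    """Report which strand carries a candidate G4 motif.

    `non_transcribed` is the 5'->3' sequence of the non-transcribed (coding)
    strand; the transcribed strand is its reverse complement.
    """
    seq = non_transcribed.upper()
    return {
        "non_transcribed": bool(G4_PATTERN.search(seq)),
        "transcribed": bool(G4_PATTERN.search(reverse_complement(seq))),
    }


# Hypothetical example: guanine runs on the non-transcribed strand only.
example = "ATGGGTTAGGGTTAGGGTTAGGGCA"
report = g4_strand_report(example)
print(report)
```

    In this toy case the guanine runs sit on the non-transcribed strand, the orientation in which the abstract reports higher R-loop accumulation and instability.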

  2. Nuclear RNA Isolation and Sequencing.

    PubMed

    Dhaliwal, Navroop K; Mitchell, Jennifer A

    2016-01-01

    Most transcriptome studies involve sequencing and quantification of steady-state mRNA by isolating and sequencing poly(A) RNA. Although this type of sequencing data is informative for determining steady-state mRNA levels, it does not provide information on transcriptional output and thus may not always reflect changes in transcriptional regulation of gene expression. Furthermore, sequencing poly(A) RNA may miss transcribed regions of the genome not usually modified by polyadenylation, which include many long noncoding RNAs. Here, we describe nuclear-RNA sequencing (nucRNA-seq), which investigates the transcriptional landscape through sequencing and quantification of nuclear RNAs, which include both unspliced and spliced transcripts for protein-coding genes and nuclear-retained long noncoding RNAs.

  3. AMPLIFICATION OF RIBOSOMAL RNA SEQUENCES

    EPA Science Inventory

    This book chapter offers an overview of the use of ribosomal RNA sequences. A history of the technology traces the evolution of techniques to measure bacterial phylogenetic relationships and recent advances in obtaining rRNA sequence information. The manual also describes procedu...

  4. CTLA-8, cloned from an activated T cell, bearing AU-rich messenger RNA instability sequences, and homologous to a herpesvirus saimiri gene.

    PubMed

    Rouvier, E; Luciani, M F; Mattéi, M G; Denizot, F; Golstein, P

    1993-06-15

    To detect novel molecules involved in immune functions, a subtracted cDNA library between closely related murine lymphoid cells was prepared using improved technology. Differential screening of this library yielded several clones with a very restricted tissue specificity, including one that we named CTLA-8. CTLA-8 transcripts could be detected only in T cell hybridoma clones related to the one used to prepare the library. Southern blots showed that the CTLA-8 gene was single copy in mice, rats, and humans. By radioactive in situ hybridization, the CTLA-8 gene was mapped at a single site on mouse chromosome 1A and human chromosome 2q31, in a known interspecific syntenic region. The CTLA-8 cDNA sequence indicated the presence, in the 3'-untranslated region of the mRNA, of AU-rich repeats previously found in the mRNA of various cytokines, growth factors, and oncogenes. The CTLA-8 cDNA contained an open reading frame encoding a putative protein of 150 amino acids. This protein was 57% homologous to the putative protein encoded by the ORF13 gene of herpesvirus Saimiri, a T lymphotropic virus. These findings are discussed in the context of other genes of this herpesvirus homologous to known immunologically active molecules. More generally, CTLA-8 may belong to the growing set of virus-captured functionally important cellular genes related to the immune system or to cell death and cell survival.
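
    A minimal sketch of how AU-rich repeats in a 3'-untranslated region might be located. It assumes the repeats are of the canonical AUUUA pentamer type, which the abstract does not state; the function name and the example sequence are hypothetical, and the sequence is not the CTLA-8 3'-UTR.

```python
import re


def find_au_pentamers(utr: str) -> list[int]:
    """Return 0-based start positions of every AUUUA pentamer.

    Overlapping occurrences are reported (a lookahead is used), and
    cDNA-style input is accepted by converting T to U first.
    """
    rna = utr.upper().replace("T", "U")
    return [m.start() for m in re.finditer(r"(?=AUUUA)", rna)]


# Hypothetical 3'-UTR fragment with three overlapping pentamers and one isolated one.
example_utr = "GCAUUUAUUUAUUUAGGCCAUUUACG"
positions = find_au_pentamers(example_utr)
print(positions)  # [2, 6, 10, 19]
```

    Counting overlapping pentamers matters because tandem repeats such as AUUUAUUUA share their boundary adenosines.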

  5. RNA polymerase backtracking in gene regulation and genome instability.

    PubMed

    Nudler, Evgeny

    2012-06-22

    RNA polymerase is a ratchet machine that oscillates between productive and backtracked states at numerous DNA positions. Since its first description 15 years ago, backtracking--the reversible sliding of RNA polymerase along DNA and RNA--has been implicated in many critical processes in bacteria and eukaryotes, including the control of transcription elongation, pausing, termination, fidelity, and genome instability.

  6. Deciphering the RNA landscape by RNAome sequencing

    PubMed Central

    Derks, Kasper WJ; Misovic, Branislav; van den Hout, Mirjam CGN; Kockx, Christel EM; Payan Gomez, Cesar; Brouwer, Rutger WW; Vrieling, Harry; Hoeijmakers, Jan HJ; van IJcken, Wilfred FJ; Pothof, Joris

    2015-01-01

    Current RNA expression profiling methods rely on enrichment steps for specific RNA classes, thereby not detecting all RNA species in an unperturbed manner. We report strand-specific RNAome sequencing that determines expression of small and large RNAs from rRNA-depleted total RNA in a single sequence run. Since current analysis pipelines cannot reliably analyze small and large RNAs simultaneously, we developed TRAP, Total Rna Analysis Pipeline, a robust interface that is also compatible with existing RNA sequencing protocols. RNAome sequencing quantitatively preserved all RNA classes, allowing cross-class comparisons that facilitate the identification of relationships between different RNA classes. We demonstrate the strength of RNAome sequencing in mouse embryonic stem cells treated with cisplatin. MicroRNA and mRNA expression in RNAome sequencing significantly correlated between replicates and was in concordance with both existing RNA sequencing methods and gene expression arrays generated from the same samples. Moreover, RNAome sequencing also detected additional RNA classes such as enhancer RNAs, anti-sense RNAs, novel RNA species and numerous differentially expressed RNAs undetectable by other methods. At the level of complete RNA classes, RNAome sequencing also identified a specific global repression of the microRNA and microRNA isoform classes after cisplatin treatment whereas all other classes such as mRNAs were unchanged. These characteristics of RNAome sequencing will significantly improve expression analysis as well as studies on RNA biology not covered by existing methods. PMID:25826412

  7. RNA sequence analysis using covariance models.

    PubMed Central

    Eddy, S R; Durbin, R

    1994-01-01

    We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model and hence a consensus secondary structure from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences. PMID:8029015

  8. Mechanisms of genome instability induced by RNA-processing defects.

    PubMed

    Chan, Yujia A; Hieter, Philip; Stirling, Peter C

    2014-06-01

    The role of normal transcription and RNA processing in maintaining genome integrity is becoming increasingly appreciated in organisms ranging from bacteria to humans. Several mutations in RNA biogenesis factors have been implicated in human cancers, but the mechanisms and potential connections to tumor genome instability are not clear. Here, we discuss how RNA-processing defects could destabilize genomes through mutagenic R-loop structures and by altering expression of genes required for genome stability. A compelling body of evidence now suggests that researchers should be directly testing these mechanisms in models of human cancer.

  9. Functional mapping of the translation-dependent instability element of yeast MATalpha1 mRNA.

    PubMed Central

    Hennigan, A N; Jacobson, A

    1996-01-01

    The determinants of mRNA stability include specific cis-acting destabilizing sequences located within mRNA coding and noncoding regions. We have developed an approach for mapping coding-region instability sequences in unstable yeast mRNAs that exploits the link between mRNA translation and turnover and the dependence of nonsense-mediated mRNA decay on the activity of the UPF1 gene product. This approach, which involves the systematic insertion of in-frame translational termination codons into the coding sequence of a gene of interest in a upf1delta strain, differs significantly from conventional methods for mapping cis-acting elements in that it causes minimal perturbations to overall mRNA structure. Using the previously characterized MATalpha1 mRNA as a model, we have accurately localized its 65-nucleotide instability element (IE) within the protein coding region. Termination of translation 5' to this element stabilized the MATalpha1 mRNA two- to threefold relative to wild-type transcripts. Translation through the element was sufficient to restore an unstable decay phenotype, while internal termination resulted in different extents of mRNA stabilization dependent on the precise location of ribosome stalling. Detailed mutagenesis of the element's rare-codon/AU-rich sequence boundary revealed that the destabilizing activity of the MATalpha1 IE is observed when the terminal codon of the element's rare-codon interval is translated. This region of stability transition corresponds precisely to a MATalpha1 IE sequence previously shown to be complementary to 18S rRNA. Deletion of three nucleotides 3' to this sequence shifted the stability boundary one codon 5' to its wild-type location. Conversely, constructs containing an additional three nucleotides at this same location shifted the transition downstream by an equivalent sequence distance. Our results suggest a model in which the triggering of MATalpha1 mRNA destabilization results from establishment of an interaction

  10. Experimental investigation of an RNA sequence space

    NASA Technical Reports Server (NTRS)

    Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

    1993-01-01

    Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.

  11. Sequence Fingerprints of MicroRNA Conservation

    PubMed Central

    Shi, Bing; Gao, Wei; Wang, Juan

    2012-01-01

    It is known that the conservation of protein-coding genes is associated with their sequences in various species, such as animals and plants. However, the association between microRNA (miRNA) conservation and their sequences in various species remains unexplored. Here we report the association of miRNA conservation with its sequence features, such as base content and cleavage sites, suggesting that miRNA sequences contain the fingerprints for miRNA conservation. More interestingly, different species show different and even opposite patterns between miRNA conservation and sequence features. For example, mammalian miRNAs show a positive/negative correlation between conservation and AU/GC content, whereas plant miRNAs show a negative/positive correlation between conservation and AU/GC content. Further analysis puts forward the hypothesis that the introns of protein-coding genes may be a main driving force for the origin and evolution of mammalian miRNAs. At the 5′ end, conserved miRNAs have a preference for base U, while less-conserved miRNAs have a preference for a non-U base in mammals. This difference does not exist in insects and plants, in which both conserved miRNAs and less-conserved miRNAs have a preference for base U at the 5′ end. We further revealed that the non-U preference at the 5′ end of less-conserved mammalian miRNAs is associated with miRNA function diversity, which may have evolved from the pressure of a highly sophisticated environmental stimulus the mammals encountered during evolution. These results indicated that miRNA sequences contain the fingerprints for conservation, and these fingerprints vary according to species. More importantly, the results suggest that although species share common mechanisms by which miRNAs originate and evolve, mammals may develop a novel mechanism for miRNA origin and evolution. In addition, the fingerprint found in this study can be a predictor of miRNA conservation, and the findings are helpful in achieving a

  12. Probabilistic error correction for RNA sequencing

    PubMed Central

    Le, Hai-Son; Schulz, Marcel H.; McCauley, Brenna M.; Hinman, Veronica F.; Bar-Joseph, Ziv

    2013-01-01

    Sequencing of RNAs (RNA-Seq) has revolutionized the field of transcriptomics, but the reads obtained often contain errors. Read error correction can have a large impact on our ability to accurately assemble transcripts. This is especially true for de novo transcriptome analysis, where a reference genome is not available. Current read error correction methods, developed for DNA sequence data, cannot handle the overlapping effects of non-uniform abundance, polymorphisms and alternative splicing. Here we present SEquencing Error CorrEction in Rna-seq data (SEECER), a hidden Markov Model (HMM)–based method, which is the first to successfully address these problems. SEECER efficiently learns hundreds of thousands of HMMs and uses these to correct sequencing errors. Using human RNA-Seq data, we show that SEECER greatly improves on previous methods in terms of quality of read alignment to the genome and assembly accuracy. To illustrate the usefulness of SEECER for de novo transcriptome studies, we generated new RNA-Seq data to study the development of the sea cucumber Parastichopus parvimensis. Our corrected assembled transcripts shed new light on two important stages in sea cucumber development. Comparison of the assembled transcripts to known transcripts in other species has also revealed novel transcripts that are unique to sea cucumber, some of which we have experimentally validated. Supporting website: http://sb.cs.cmu.edu/seecer/. PMID:23558750

  13. Probabilistic error correction for RNA sequencing.

    PubMed

    Le, Hai-Son; Schulz, Marcel H; McCauley, Brenna M; Hinman, Veronica F; Bar-Joseph, Ziv

    2013-05-01

    Sequencing of RNAs (RNA-Seq) has revolutionized the field of transcriptomics, but the reads obtained often contain errors. Read error correction can have a large impact on our ability to accurately assemble transcripts. This is especially true for de novo transcriptome analysis, where a reference genome is not available. Current read error correction methods, developed for DNA sequence data, cannot handle the overlapping effects of non-uniform abundance, polymorphisms and alternative splicing. Here we present SEquencing Error CorrEction in Rna-seq data (SEECER), a hidden Markov Model (HMM)-based method, which is the first to successfully address these problems. SEECER efficiently learns hundreds of thousands of HMMs and uses these to correct sequencing errors. Using human RNA-Seq data, we show that SEECER greatly improves on previous methods in terms of quality of read alignment to the genome and assembly accuracy. To illustrate the usefulness of SEECER for de novo transcriptome studies, we generated new RNA-Seq data to study the development of the sea cucumber Parastichopus parvimensis. Our corrected assembled transcripts shed new light on two important stages in sea cucumber development. Comparison of the assembled transcripts to known transcripts in other species has also revealed novel transcripts that are unique to sea cucumber, some of which we have experimentally validated. Supporting website: http://sb.cs.cmu.edu/seecer/.

  14. Detection theory in identification of RNA-DNA sequence differences using RNA-sequencing.

    PubMed

    Toung, Jonathan M; Lahens, Nicholas; Hogenesch, John B; Grant, Gregory

    2014-01-01

    Advances in sequencing technology have allowed for detailed analyses of the transcriptome at single-nucleotide resolution, facilitating the study of RNA editing or sequence differences between RNA and DNA genome-wide. In humans, two types of post-transcriptional RNA editing processes are known to occur: A-to-I deamination by ADAR and C-to-U deamination by APOBEC1. In addition to these sequence differences, researchers have reported the existence of all 12 types of RNA-DNA sequence differences (RDDs); however, the validity of these claims is debated, as many studies claim that technical artifacts account for the majority of these non-canonical sequence differences. In this study, we used a detection theory approach to evaluate the performance of RNA-Sequencing (RNA-Seq) and associated aligners in accurately identifying RNA-DNA sequence differences. By generating simulated RNA-Seq datasets containing RDDs, we assessed the effect of alignment artifacts and sequencing error on the sensitivity and false discovery rate of RDD detection. Overall, we found that even in the presence of sequencing errors, false negative and false discovery rates of RDD detection can be contained below 10% with relatively lenient thresholds. We also assessed the ability of various filters to target false positive RDDs and found them to be effective in discriminating between true and false positives. Lastly, we used the optimal thresholds we identified from our simulated analyses to identify RDDs in a human lymphoblastoid cell line. We found approximately 6,000 RDDs, the majority of which are A-to-G edits and likely to be mediated by ADAR. Moreover, we found the majority of non A-to-G RDDs to be associated with poorer alignments and conclude from these results that the evidence for widespread non-canonical RDDs in humans is weak. Overall, we found RNA-Seq to be a powerful technique for surveying RDDs genome-wide when coupled with the appropriate thresholds and filters.
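
    The detection-theory quantities named in this abstract (sensitivity, false negative rate, false discovery rate) can be sketched for a simulated dataset in which the true RDD sites are known. This is a minimal illustration, not the authors' pipeline; the function name, the site encoding and the example sites are hypothetical.

```python
def rdd_detection_metrics(true_sites: set, called_sites: set) -> dict:
    """Compare called RDD sites with the simulated (known) RDD sites.

    Sites are any hashable keys, e.g. (chromosome, position, change).
    """
    tp = len(true_sites & called_sites)   # true RDDs that were called
    fp = len(called_sites - true_sites)   # calls that are not true RDDs
    fn = len(true_sites - called_sites)   # true RDDs that were missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    false_negative_rate = fn / (tp + fn) if (tp + fn) else 0.0
    false_discovery_rate = fp / (tp + fp) if (tp + fp) else 0.0
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "sensitivity": sensitivity,
        "false_negative_rate": false_negative_rate,
        "false_discovery_rate": false_discovery_rate,
    }


# Hypothetical simulated truth set and caller output.
simulated = {("chr1", 100, "A>G"), ("chr1", 250, "C>T"),
             ("chr2", 40, "A>G"), ("chr3", 75, "A>G")}
called = {("chr1", 100, "A>G"), ("chr2", 40, "A>G"),
          ("chr3", 75, "A>G"), ("chr5", 900, "T>C")}

metrics = rdd_detection_metrics(simulated, called)
print(metrics)
```

    With three of four simulated sites recovered and one spurious call, both the false negative rate and the false discovery rate come out at 0.25; tightening the calling thresholds would trade one against the other.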

  15. Dis3- and exosome subunit-responsive 3′ mRNA instability elements

    SciTech Connect

    Kiss, Daniel L.; Hou, Dezhi; Gross, Robert H.; Andrulis, Erik D.

    2012-07-06

    Highlights: ► Successful use of a novel RNA-specific bioinformatic tool, RNA SCOPE. ► Identified novel 3′ UTR cis-acting element that destabilizes a reporter mRNA. ► Show exosome subunits are required for cis-acting element-mediated mRNA instability. ► Define precise sequence requirements of novel cis-acting element. ► Show that microarray-defined exosome subunit-regulated mRNAs have novel element.

    Abstract: Eukaryotic RNA turnover is regulated in part by the exosome, a nuclear and cytoplasmic complex of ribonucleases (RNases) and RNA-binding proteins. The major RNase of the complex is thought to be Dis3, a multi-functional 3′-5′ exoribonuclease and endoribonuclease. Although it is known that Dis3 and core exosome subunits are recruited to transcriptionally active genes and to messenger RNA (mRNA) substrates, this recruitment is thought to occur indirectly. We sought to discover cis-acting elements that recruit Dis3 or other exosome subunits. Using a bioinformatic tool called RNA SCOPE to screen the 3′ untranslated regions of up-regulated transcripts from our published Dis3 depletion-derived transcriptomic data set, we identified several motifs as candidate instability elements. Secondary screening using a luciferase reporter system revealed that one cassette, harboring four elements, destabilized the reporter transcript. RNAi-based depletion of Dis3, Rrp6, Rrp4, Rrp40, or Rrp46 diminished the efficacy of cassette-mediated destabilization. Truncation analysis of the cassette showed that two exosome subunit-sensitive elements (ESSEs) destabilized the reporter. Point-directed mutagenesis of ESSE abrogated the destabilization effect. An examination of the transcriptomic data from exosome subunit depletion-based microarrays revealed that mRNAs with ESSEs are found in every up-regulated mRNA data set but are

  16. Predicting pseudoknotted structures across two RNA sequences

    PubMed Central

    Sperschneider, Jana; Datta, Amitava; Wise, Michael J.

    2012-01-01

    Motivation: Laboratory RNA structure determination is demanding and costly, and thus computational structure prediction is an important task. Single sequence methods for RNA secondary structure prediction are limited by the accuracy of the underlying folding model. If a structure is supported by a family of evolutionarily related sequences, one can be more confident that the prediction is accurate. RNA pseudoknots are functional elements, which have highly conserved structures. However, few comparative structure prediction methods can handle pseudoknots due to the computational complexity. Results: A comparative pseudoknot prediction method called DotKnot-PW is introduced based on structural comparison of secondary structure elements and H-type pseudoknot candidates. DotKnot-PW outperforms other methods from the literature on a hand-curated test set of RNA structures with experimental support. Availability: DotKnot-PW and the RNA structure test set are available at the web site http://dotknot.csse.uwa.edu.au/pw. Contact: janaspe@csse.uwa.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23044552

  17. Ribosomal RNA sequence suggests microsporidia are extremely ancient eukaryotes

    NASA Technical Reports Server (NTRS)

    Vossbrinck, C. R.; Maddox, J. V.; Friedman, S.; Debrunner-Vossbrinck, B. A.; Woese, C. R.

    1987-01-01

    A comparative sequence analysis of the 18S small subunit ribosomal RNA (rRNA) of the microsporidium Vairimorpha necatrix is presented. The results show that this rRNA sequence is more unlike those of other eukaryotes than any known eukaryote rRNA sequence. It is concluded that the lineage leading to microsporidia branched very early from that leading to other eukaryotes.

  18. Advanced Applications of RNA Sequencing and Challenges

    PubMed Central

    Han, Yixing; Gao, Shouguo; Muegge, Kathrin; Zhang, Wei; Zhou, Bing

    2015-01-01

    Next-generation sequencing technologies have revolutionarily advanced sequence-based research with the advantages of high throughput, high sensitivity, and high speed. RNA-seq is now being used widely for uncovering multiple facets of the transcriptome to facilitate biological applications. However, the large-scale data analyses associated with RNA-seq harbor challenges. In this study, we present a detailed overview of the applications of this technology and the challenges that need to be addressed, including data preprocessing, differential gene expression analysis, alternative splicing analysis, variant detection and allele-specific expression, pathway analysis, co-expression network analysis, and applications combining various experimental procedures beyond the achievements that have been made. Specifically, we discuss essential principles of computational methods that are required to meet the key challenges of the RNA-seq data analyses, development of various bioinformatics tools, challenges associated with the RNA-seq applications, and examples that represent the advances made so far in the characterization of the transcriptome. PMID:26609224

  19. Applicability of Next Generation Sequencing Technology in Microsatellite Instability Testing

    PubMed Central

    Gan, Chun; Love, Clare; Beshay, Victoria; Macrae, Finlay; Fox, Stephen; Waring, Paul; Taylor, Graham

    2015-01-01

    Microsatellite instability (MSI) is a useful marker for risk assessment, prediction of chemotherapy responsiveness and prognosis in patients with colorectal cancer. Here, we describe a next generation sequencing approach for MSI testing using the MiSeq platform. Different from other MSI capturing strategies that are based on targeted gene capture, we utilize “deep resequencing”, where we focus the sequencing on only the microsatellite regions of interest. We sequenced a series of 44 colorectal tumours with normal controls for five MSI loci (BAT25, BAT26, BAT34c4, D18S55, D5S346) and a second series of six colorectal tumours (no control) with two mononucleotide loci (BAT25, BAT26). In the first series, we were able to determine 17 MSI-High, 1 MSI-Low and 26 microsatellite stable (MSS) tumours. In the second series, there were three MSI-High and three MSS tumours. Although there was some variation within individual markers, this NGS method produced the same overall MSI status for each tumour, as obtained with the traditional multiplex PCR-based method. PMID:25685876
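
    The overall MSI status reported for the five-locus series can be sketched as a simple rule over per-locus instability calls. The thresholds used here (two or more unstable loci for MSI-High, exactly one for MSI-Low, none for MSS) follow a widely used convention for five-marker panels and are an assumption; the abstract does not state the calling rule, and the function name and per-locus calls below are hypothetical.

```python
def classify_msi(unstable_by_locus: dict[str, bool], high_threshold: int = 2) -> str:
    """Classify a tumour from per-locus instability calls.

    `unstable_by_locus` maps a marker name to True if that locus was
    called unstable in the tumour relative to its normal control.
    """
    n_unstable = sum(unstable_by_locus.values())
    if n_unstable >= high_threshold:
        return "MSI-High"
    if n_unstable >= 1:
        return "MSI-Low"
    return "MSS"


# Hypothetical per-locus calls for one tumour, using the five loci named above.
tumour = {
    "BAT25": True,
    "BAT26": True,
    "BAT34c4": False,
    "D18S55": False,
    "D5S346": True,
}
print(classify_msi(tumour))  # MSI-High
```

    Because the rule depends only on the count of unstable loci, some variation within individual markers can leave the overall status unchanged, consistent with the agreement the abstract reports between the NGS and PCR-based methods.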

  20. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-03-19

    Members of the genus Endornavirus are double-stranded RNA viruses that infect a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data.

  1. RNA-sequencing from single nuclei

    PubMed Central

    Grindberg, Rashel V.; Yee-Greenbaum, Joyclyn L.; McConnell, Michael J.; Novotny, Mark; O’Shaughnessy, Andy L.; Lambert, Georgina M.; Araúzo-Bravo, Marcos J.; Lee, Jun; Fishman, Max; Robbins, Gillian E.; Lin, Xiaoying; Venepally, Pratap; Badger, Jonathan H.; Galbraith, David W.; Gage, Fred H.; Lasken, Roger S.

    2013-01-01

    It has recently been established that synthesis of double-stranded cDNA can be done from a single cell for use in DNA sequencing. Global gene expression can be quantified from the number of reads mapping to each gene, and mutations and mRNA splicing variants determined from the sequence reads. Here we demonstrate that this method of transcriptomic analysis can be done using the extremely low levels of mRNA in a single nucleus, isolated from a mouse neural progenitor cell line and from dissected hippocampal tissue. This method is characterized by excellent coverage and technical reproducibility. On average, more than 16,000 of the 24,057 mouse protein-coding genes were detected from single nuclei, and the amount of gene-expression variation was similar when measured between single nuclei and single cells. Several major advantages of the method exist: first, nuclei, compared with whole cells, have the advantage of being easily isolated from complex tissues and organs, such as those in the CNS. Second, the method can be widely applied to eukaryotic species, including those of different kingdoms. The method also provides insight into regulatory mechanisms specific to the nucleus. Finally, the method enables dissection of regulatory events at the single-cell level; pooling of 10 nuclei or 10 cells obscures some of the variability measured in transcript levels, implying that single nuclei and cells will be extremely useful in revealing the physiological state and interconnectedness of gene regulation in a manner that avoids the masking inherent to conventional transcriptomics using bulk cells or tissues. PMID:24248345

  2. RNA-sequencing from single nuclei.

    PubMed

    Grindberg, Rashel V; Yee-Greenbaum, Joyclyn L; McConnell, Michael J; Novotny, Mark; O'Shaughnessy, Andy L; Lambert, Georgina M; Araúzo-Bravo, Marcos J; Lee, Jun; Fishman, Max; Robbins, Gillian E; Lin, Xiaoying; Venepally, Pratap; Badger, Jonathan H; Galbraith, David W; Gage, Fred H; Lasken, Roger S

    2013-12-03

    It has recently been established that synthesis of double-stranded cDNA can be done from a single cell for use in DNA sequencing. Global gene expression can be quantified from the number of reads mapping to each gene, and mutations and mRNA splicing variants determined from the sequence reads. Here we demonstrate that this method of transcriptomic analysis can be done using the extremely low levels of mRNA in a single nucleus, isolated from a mouse neural progenitor cell line and from dissected hippocampal tissue. This method is characterized by excellent coverage and technical reproducibility. On average, more than 16,000 of the 24,057 mouse protein-coding genes were detected from single nuclei, and the amount of gene-expression variation was similar when measured between single nuclei and single cells. Several major advantages of the method exist: first, nuclei, compared with whole cells, have the advantage of being easily isolated from complex tissues and organs, such as those in the CNS. Second, the method can be widely applied to eukaryotic species, including those of different kingdoms. The method also provides insight into regulatory mechanisms specific to the nucleus. Finally, the method enables dissection of regulatory events at the single-cell level; pooling of 10 nuclei or 10 cells obscures some of the variability measured in transcript levels, implying that single nuclei and cells will be extremely useful in revealing the physiological state and interconnectedness of gene regulation in a manner that avoids the masking inherent to conventional transcriptomics using bulk cells or tissues.

  3. Nucleotide sequence of Neurospora crassa cytoplasmic initiator tRNA.

    PubMed Central

    Gillum, A M; Hecker, L I; Silberklang, M; Schwartzbach, S D; RajBhandary, U L; Barnett, W E

    1977-01-01

    Initiator methionine tRNA from the cytoplasm of Neurospora crassa has been purified and sequenced. The sequence is: pAGCUGCAUm1GGCGCAGCGGAAGCGCM22GCY*GGGCUCAUt6AACCCGGAGm7GU (or D) - CACUCGAUCGm1AAACGAG*UUGCAGCUACCAOH. Similar to initiator tRNAs from the cytoplasm of other eukaryotes, this tRNA also contains the sequence -AUCG- instead of the usual -TphiCG (or A)- found in loop IV of other tRNAs. The sequence of the N. crassa cytoplasmic initiator tRNA is quite different from that of the corresponding mitochondrial initiator tRNA. Comparison of the sequence of N. crassa cytoplasmic initiator tRNA to those of yeast, wheat germ and vertebrate cytoplasmic initiator tRNA indicates that the sequences of the two fungal tRNAs are no more similar to each other than they are to those of other initiator tRNAs. PMID:146192

  4. Concentrations of individual RNA sequences in polyadenylated nuclear and cytoplasmic RNA populations of Drosophila cells.

    PubMed Central

    Biessmann, H

    1980-01-01

    Steady state concentrations of individual RNA sequences in poly(A) nuclear and cytoplasmic RNA populations of Drosophila Kc cells were determined using cloned cDNA fragments. These cDNAs represent poly(A) RNA sequences of different abundance in the cytoplasm of Kc cells, but their steady state concentrations in poly(A) hnRNA were always lower. Of ten different sequences analysed, eight showed some four-fold lower concentration in hnRNA than in mRNA; two were underrepresented in hnRNA relative to the others. The obvious clustering of mRNA/hnRNA ratios is discussed in relation to sequence complexity and turnover rates of these RNA populations. PMID:6162158

  5. Depletion of Ribosomal RNA Sequences from Single-Cell RNA-Sequencing Library.

    PubMed

    Fang, Nan; Akinci-Tolun, Rumeysa

    2016-07-01

    Recent advances in single-cell RNA sequencing technologies have revealed high heterogeneity of gene expression profiles in individual cells. However, most current single-cell RNA-seq methods use oligo-dT priming in the reverse transcription steps and detect only polyA-positive transcripts (which include some polyA-positive noncoding RNAs), not other important RNA species, such as polyA-negative noncoding RNA. Reverse transcription using random oligos enables detection of not only the noncoding RNA species without polyA tails, but also ribosomal RNA (rRNA). rRNA comprises more than 90% of the total RNA and should be depleted from the RNA-seq library to ensure efficient usage of the sequencing capacity. Commonly used hybridization-based rRNA depletion methods can preserve noncoding RNA in the standard RNA-seq library. However, such rRNA depletion methods require high input amounts of total RNA and do not work at the single-cell level or with limited input DNA. This unit describes a novel procedure for RNA-seq library construction from single cells or a minimal amount of RNA. A thermostable duplex-specific nuclease is used in this method to effectively remove ribosomal RNA sequences following whole-transcriptome amplification and sequencing library construction. © 2016 by John Wiley & Sons, Inc.

  6. Approaching marine bioprospecting in hexacorals by RNA deep sequencing.

    PubMed

    Johansen, Steinar D; Emblem, Ase; Karlsen, Bård Ove; Okkenhaug, Siri; Hansen, Hilde; Moum, Truls; Coucheron, Dag H; Seternes, Ole Morten

    2010-07-31

    RNA deep sequencing represents a new complementary approach in marine bioprospecting. Next-generation sequencing platforms have recently been developed for de novo whole transcriptome analysis, small RNA discovery and gene expression profiling. Deep sequencing transcriptomics (sequencing the complete set of cellular transcripts at a specific stage or condition) leads to sequential identification of all expressed genes in a sample. When combined with high-throughput bioinformatics and protein synthesis, RNA deep sequencing represents a powerful new approach in gene product discovery and bioprospecting. Here we summarize recent progress in the analyses of hexacoral transcriptomes with the focus on cold-water sea anemones and related organisms.

  7. Empirical insights into the stochasticity of small RNA sequencing

    NASA Astrophysics Data System (ADS)

    Qin, Li-Xuan; Tuschl, Thomas; Singer, Samuel

    2016-04-01

    The choice of stochasticity distribution for modeling the noise distribution is a fundamental assumption for the analysis of sequencing data and consequently is critical for the accurate assessment of biological heterogeneity and differential expression. The stochasticity of RNA sequencing has been assumed to follow Poisson distributions. We collected microRNA sequencing data and observed that its stochasticity is better approximated by gamma distributions, likely because of the stochastic nature of exponential PCR amplification. We validated our findings with two independent datasets, one for microRNA sequencing and another for RNA sequencing. Motivated by the gamma distributed stochasticity, we provided a simple method for the analysis of RNA sequencing data and showed its superiority to three existing methods for differential expression analysis using three data examples of technical replicate data and biological replicate data.
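
    The contrast between the two noise models can be illustrated with a small calculation. Under a Poisson model the variance of replicate counts equals their mean, so the variance-to-mean ratio is about 1; a gamma model with shape k and scale s has mean k*s and variance k*s^2, so the ratio equals s and can be far from 1. The Python sketch below is only an illustration under these textbook definitions; it is not the method of the paper, and the function names and the method-of-moments fit are assumptions introduced here.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (close to 1 for Poisson noise)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the index is undefined")
    return variance(counts) / m


def gamma_moments_fit(counts):
    """Method-of-moments gamma fit: returns (shape, scale).

    Uses mean = shape * scale and variance = shape * scale**2.
    This is an illustrative estimator, not the one used in the paper.
    """
    m = mean(counts)
    v = variance(counts)
    if m <= 0 or v <= 0:
        raise ValueError("need positive mean and variance")
    return m * m / v, v / m


if __name__ == "__main__":
    # Hypothetical technical-replicate counts for one microRNA.
    replicates = [120, 95, 160, 210, 80]
    print("variance/mean:", dispersion_index(replicates))
    print("gamma (shape, scale):", gamma_moments_fit(replicates))
```

    A ratio well above 1 across many features would suggest that a Poisson assumption understates the noise, which is the situation the abstract describes.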

  8. BS-RNA: An efficient mapping and annotation tool for RNA bisulfite sequencing data.

    PubMed

    Liang, Fang; Hao, Lili; Wang, Jinyue; Shi, Shuo; Xiao, Jingfa; Li, Rujiao

    2016-12-01

    Cytosine methylation is one of the most important RNA epigenetic modifications. With the development of experimental technology, scientists attach more importance to RNA cytosine methylation and find that bisulfite sequencing is an effective experimental method for studying it. However, only a few tools can directly and efficiently handle RNA bisulfite sequencing data. Herein, we developed a specialized tool, BS-RNA, which can analyze cytosine methylation of RNA based on bisulfite sequencing data and supports both paired-end and single-end sequencing reads from directional bisulfite libraries. For paired-end reads, simply removing the biased positions from the 5' end may result in "dovetailing" reads, where one or both reads seem to extend past the start of the mate read. BS-RNA can map "dovetailing" reads successfully. The annotation result of BS-RNA is exported in BED (.bed) format, including locations, sequence context types (CG/CHG/CHH, where H = A, T, or C), reference sequencing depths, cytosine sequencing depths, and methylation levels of covered cytosine sites on both Watson and Crick strands. BS-RNA is an efficient, specialized and highly automated mapping and annotation tool for RNA bisulfite sequencing data. It performs better than the existing program in terms of accuracy and efficiency. BS-RNA is written in Perl, and its source code is freely available from the website: http://bs-rna.big.ac.cn.
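
    The context types and methylation level mentioned above can be sketched in a few lines. The Python example below is not BS-RNA code (BS-RNA itself is a Perl tool); the function names are hypothetical, and it assumes that the methylation level of a site is the number of reads that retained a cytosine divided by the total read depth at that site, which is the usual bisulfite convention but is not spelled out in the abstract.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at 0-based index i as 'CG', 'CHG' or 'CHH'.

    H stands for A, T or C. Returns None when the downstream bases
    are missing or ambiguous. RNA input (U) is treated as T.
    """
    seq = seq.upper().replace("U", "T")
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    downstream = seq[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    if downstream[0] in "ACT" and downstream[1] == "G":
        return "CHG"
    if downstream[0] in "ACT" and downstream[1] in "ACT":
        return "CHH"
    return None


def methylation_level(cytosine_depth, reference_depth):
    """Assumed definition: reads still showing C / all reads covering the site."""
    if reference_depth <= 0:
        raise ValueError("site is not covered")
    return cytosine_depth / reference_depth


if __name__ == "__main__":
    print(cytosine_context("ACAGT", 1))   # CHG
    print(methylation_level(3, 12))       # 0.25
```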

  9. miRBase: the microRNA sequence database.

    PubMed

    Griffiths-Jones, Sam

    2006-01-01

    The miRBase Sequence database is the primary repository for published microRNA (miRNA) sequence and annotation data. miRBase provides a user-friendly web interface for miRNA data, allowing the user to search using key words or sequences, trace links to the primary literature referencing the miRNA discoveries, analyze genomic coordinates and context, and mine relationships between miRNA sequences. miRBase also provides a confidential gene-naming service, assigning official miRNA names to novel genes before their publication. The methods outlined in this chapter describe these functions. miRBase is freely available to all at http://microrna.sanger.ac.uk/.

  10. RNAcentral: A comprehensive database of non-coding RNA sequences

    DOE PAGES

    Williams, Kelly Porter; Lau, Britney Yan

    2016-10-28

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.

  11. RNAcentral: a comprehensive database of non-coding RNA sequences

    PubMed Central

    2017-01-01

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/. PMID:27794554

  12. RNAcentral: A comprehensive database of non-coding RNA sequences

    SciTech Connect

    Williams, Kelly Porter; Lau, Britney Yan

    2016-10-28

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.

  13. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
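
    The first two steps, cleanup and clustering, can be sketched as follows. This Python example is not DSAP code: the adaptor string, the function names, and the choice to discard tags made of a single repeated nucleotide (one possible reading of "removal of poly-A/T/C/G/N nucleotides") are assumptions for illustration. The input follows the format described above, one tag and its copy number per tab-delimited line.

```python
from collections import Counter

# Hypothetical 3' adaptor prefix, used only for this illustration.
ADAPTOR = "TCGTATGCCG"


def clean_tag(seq, adaptor=ADAPTOR):
    """Trim everything from the adaptor onward; drop empty or homopolymer tags."""
    seq = seq.upper()
    pos = seq.find(adaptor)
    if pos != -1:
        seq = seq[:pos]
    if not seq or len(set(seq)) == 1:
        return None  # nothing left, or poly-A/T/C/G/N only
    return seq


def cluster_tags(lines, adaptor=ADAPTOR):
    """Group cleaned tags into unique sequences and sum their copy numbers."""
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, count = line.split("\t")
        cleaned = clean_tag(seq, adaptor)
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return clusters


if __name__ == "__main__":
    example = [
        "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t10",
        "TGAGGTAGTAGGTTGTATAGTT\t5",
        "AAAAAAAAAAAA\t7",
    ]
    for tag, copies in cluster_tags(example).items():
        print(tag, copies)
```

    Steps (iii) and (iv), the matching against Rfam and miRBase, depend on sequence homology searches and are not attempted here.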

  14. RNase P-Mediated Sequence-Specific Cleavage of RNA by Engineered External Guide Sequences

    PubMed Central

    Derksen, Merel; Mertens, Vicky; Pruijn, Ger J.M.

    2015-01-01

    The RNA cleavage activity of RNase P can be employed to decrease the levels of specific RNAs and to study their function or even to eradicate pathogens. Two different technologies have been developed to use RNase P as a tool for RNA knockdown. In one of these, an external guide sequence, which mimics a tRNA precursor, a well-known natural RNase P substrate, is used to target an RNA molecule for cleavage by endogenous RNase P. Alternatively, a guide sequence can be attached to M1 RNA, the (catalytic) RNase P RNA subunit of Escherichia coli. The guide sequence is specific for an RNA target, which is subsequently cleaved by the bacterial M1 RNA moiety. These approaches are applicable in both bacteria and eukaryotes. In this review, we will discuss the two technologies in which RNase P is used to reduce RNA expression levels. PMID:26569326

  15. RNase P-Mediated Sequence-Specific Cleavage of RNA by Engineered External Guide Sequences.

    PubMed

    Derksen, Merel; Mertens, Vicky; Pruijn, Ger J M

    2015-11-09

    The RNA cleavage activity of RNase P can be employed to decrease the levels of specific RNAs and to study their function or even to eradicate pathogens. Two different technologies have been developed to use RNase P as a tool for RNA knockdown. In one of these, an external guide sequence, which mimics a tRNA precursor, a well-known natural RNase P substrate, is used to target an RNA molecule for cleavage by endogenous RNase P. Alternatively, a guide sequence can be attached to M1 RNA, the (catalytic) RNase P RNA subunit of Escherichia coli. The guide sequence is specific for an RNA target, which is subsequently cleaved by the bacterial M1 RNA moiety. These approaches are applicable in both bacteria and eukaryotes. In this review, we will discuss the two technologies in which RNase P is used to reduce RNA expression levels.

  16. Unbiased Deep Sequencing of RNA Viruses from Clinical Samples

    PubMed Central

    Matranga, Christian B.; Gladden-Young, Adrianne; Qu, James; Winnicki, Sarah; Nosamiefan, Dolo; Levin, Joshua Z.; Sabeti, Pardis C.

    2016-01-01

    Here we outline a next-generation RNA sequencing protocol that enables de novo assemblies and intra-host variant calls of viral genomes collected from clinical and biological sources. The method is unbiased and universal; it uses random primers for cDNA synthesis and requires no prior knowledge of the viral sequence content. Before library construction, selective RNase H-based digestion is used to deplete unwanted RNA, including poly(rA) carrier and ribosomal RNA, from the viral RNA sample. Selective depletion improves both the data quality and the number of unique reads in viral RNA sequencing libraries. Moreover, a transposase-based 'tagmentation' step is used in the protocol as it reduces overall library construction time. The protocol has enabled rapid deep sequencing of over 600 Lassa and Ebola virus samples, including collections from both blood and tissue isolates, and is broadly applicable to other microbial genomics studies. PMID:27403729

  17. Noncoding RNA gene detection using comparative sequence analysis

    PubMed Central

    Rivas, Elena; Eddy, Sean R

    2001-01-01

    Background: Noncoding RNA genes produce transcripts that exert their function without ever producing proteins. Noncoding RNA gene sequences do not have strong statistical signals, unlike protein coding genes. A reliable general purpose computational genefinder for noncoding RNA genes has been elusive. Results: We describe a comparative sequence analysis algorithm for detecting novel structural RNA genes. The key idea is to test the pattern of substitutions observed in a pairwise alignment of two homologous sequences. A conserved coding region tends to show a pattern of synonymous substitutions, whereas a conserved structural RNA tends to show a pattern of compensatory mutations consistent with some base-paired secondary structure. We formalize this intuition using three probabilistic "pair-grammars": a pair stochastic context free grammar modeling alignments constrained by structural RNA evolution, a pair hidden Markov model modeling alignments constrained by coding sequence evolution, and a pair hidden Markov model modeling a null hypothesis of position-independent evolution. Given an input pairwise sequence alignment (e.g. from a BLASTN comparison of two related genomes) we classify the alignment into the coding, RNA, or null class according to the posterior probability of each class. Conclusions: We have implemented this approach as a program, QRNA, which we consider to be a prototype structural noncoding RNA genefinder. Tests suggest that this approach detects noncoding RNA genes with a fair degree of reliability. PMID:11801179
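
    The final classification step can be sketched with Bayes' rule. The Python example below is not QRNA code: it assumes that the three models have already produced a log-likelihood for the alignment (scoring with the pair grammars is the hard part and is not shown), it uses equal priors by default, and the model labels and numbers are hypothetical.

```python
import math


def classify_alignment(log_likelihoods, priors=None):
    """Return (best_model, posteriors) for a dict of model -> log P(alignment | model).

    Posteriors follow Bayes' rule; log-sum-exp keeps the sum numerically stable.
    Equal priors are assumed unless a dict of priors is supplied.
    """
    models = list(log_likelihoods)
    if priors is None:
        priors = {m: 1.0 / len(models) for m in models}
    log_joint = {m: log_likelihoods[m] + math.log(priors[m]) for m in models}
    top = max(log_joint.values())
    log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint.values()))
    posteriors = {m: math.exp(v - log_norm) for m, v in log_joint.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one pairwise alignment.
    scores = {"RNA": -100.0, "coding": -103.0, "null": -105.0}
    best, post = classify_alignment(scores)
    print(best, post)
```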

  18. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
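
    A comparable exercise is easy to reproduce today. The sketch below is written in Python rather than the BASIC of the original program, whose listing is not given here; all function names are assumptions, and the tasks covered (random generation, complementary strand, base composition, triplet frequencies) are taken from the description above.

```python
import random
from collections import Counter

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def random_dna(length, seed=None):
    """Generate a random DNA sequence; a seed makes the result repeatable."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def complement(dna):
    """Base-by-base complementary strand (not reversed)."""
    return dna.translate(DNA_COMPLEMENT)


def transcribe(dna):
    """RNA with the same sequence as the given coding strand (T becomes U)."""
    return dna.replace("T", "U")


def base_composition(seq):
    """Fraction of each base present in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in sorted(counts)}


def codon_counts(seq):
    """Count non-overlapping triplets, reading from the first base."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


if __name__ == "__main__":
    dna = random_dna(30, seed=1)
    print(dna)
    print(complement(dna))
    print(transcribe(dna))
    print(base_composition(dna))
    print(codon_counts(dna).most_common(3))
```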

  19. Replication initiation and genome instability: a crossroads for DNA and RNA synthesis.

    PubMed

    Barlow, Jacqueline H; Nussenzweig, André

    2014-12-01

    Nuclear DNA replication requires the concerted action of hundreds of proteins to efficiently unwind and duplicate the entire genome while also retaining epigenetic regulatory information. Initiation of DNA replication is tightly regulated, rapidly firing thousands of origins once the conditions to promote rapid and faithful replication are in place, and defects in replication initiation lead to proliferation defects, genome instability, and a range of developmental abnormalities. Interestingly, DNA replication in metazoans initiates in actively transcribed DNA, meaning that replication initiation occurs in DNA that is co-occupied with tens of thousands of poised and active RNA polymerase complexes. Active transcription can induce genome instability, particularly during DNA replication, as RNA polymerases can induce torsional stress, formation of secondary structures, and act as a physical barrier to other enzymes involved in DNA metabolism. Here we discuss the challenges facing mammalian DNA replication, their impact on genome instability, and the development of cancer.

  20. Quantifying RNA allelic ratios by microfluidic multiplex PCR and sequencing.

    PubMed

    Zhang, Rui; Li, Xin; Ramaswami, Gokul; Smith, Kevin S; Turecki, Gustavo; Montgomery, Stephen B; Li, Jin Billy

    2014-01-01

    We developed a targeted RNA sequencing method that couples microfluidics-based multiplex PCR and deep sequencing (mmPCR-seq) to uniformly and simultaneously amplify up to 960 loci in 48 samples independently of their gene expression levels and to accurately and cost-effectively measure allelic ratios even for low-quantity or low-quality RNA samples. We applied mmPCR-seq to RNA editing and allele-specific expression studies. mmPCR-seq complements RNA-seq for studying allelic variations in the transcriptome.

  1. RNAcentral: an international database of ncRNA sequences

    SciTech Connect

    Williams, Kelly Porter

    2014-10-28

    The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.

  2. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  3. Compilation of 5S rRNA and 5S rRNA gene sequences

    PubMed Central

    Specht, Thomas; Wolters, Jörn; Erdmann, Volker A.

    1990-01-01

    The BERLIN RNA DATABANK as of December 31, 1989, contains a total of 667 sequences of 5S rRNAs or their genes, which is an increase of 114 new sequence entries over the last compilation (1). It covers sequences from 44 archaebacteria, 267 eubacteria, 20 plastids, 6 mitochondria, 319 eukaryotes and 11 eukaryotic pseudogenes. The hardcopy shows only the list (Table 1) of those organisms whose sequences have been determined. The BERLIN RNA DATABANK uses the format of the EMBL Nucleotide Sequence Data Library complemented by a Sequence Alignment (SA) field including secondary structure information. PMID:1692116

  4. Translating RNA sequencing into clinical diagnostics: opportunities and challenges.

    PubMed

    Byron, Sara A; Van Keuren-Jensen, Kendall R; Engelthaler, David M; Carpten, John D; Craig, David W

    2016-05-01

    With the emergence of RNA sequencing (RNA-seq) technologies, RNA-based biomolecules hold expanded promise for their diagnostic, prognostic and therapeutic applicability in various diseases, including cancers and infectious diseases. Detection of gene fusions and differential expression of known disease-causing transcripts by RNA-seq represent some of the most immediate opportunities. However, it is the diversity of RNA species detected through RNA-seq that holds new promise for the multi-faceted clinical applicability of RNA-based measures, including the potential of extracellular RNAs as non-invasive diagnostic indicators of disease. Ongoing efforts towards the establishment of benchmark standards, assay optimization for clinical conditions and demonstration of assay reproducibility are required to expand the clinical utility of RNA-seq.

  5. FLDS: A Comprehensive dsRNA Sequencing Method for Intracellular RNA Virus Surveillance

    PubMed Central

    Urayama, Syun-ichi; Takaki, Yoshihiro; Nunoura, Takuro

    2016-01-01

    Knowledge of the distribution and diversity of RNA viruses is still limited in spite of their possible environmental and epidemiological impacts because RNA virus-specific metagenomic methods have not yet been developed. We herein constructed an effective metagenomic method for RNA viruses by targeting long double-stranded (ds)RNA in cellular organisms, which is a hallmark of infection, or the replication of dsRNA and single-stranded (ss)RNA viruses, except for retroviruses. This novel dsRNA targeting metagenomic method is characterized by an extremely high recovery rate of viral RNA sequences, the retrieval of terminal sequences, and uniform read coverage, which has not previously been reported in other metagenomic methods targeting RNA viruses. This method revealed a previously unidentified viral RNA diversity of more than 20 complete RNA viral genomes including dsRNA and ssRNA viruses associated with an environmental diatom colony. Our approach will be a powerful tool for cataloging RNA viruses associated with organisms of interest. PMID:26877136

  6. The chemical structure of DNA sequence signals for RNA transcription

    NASA Technical Reports Server (NTRS)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  7. TARDIS, a targeted RNA directional sequencing method for rare RNA discovery.

    PubMed

    Portal, Maximiliano M; Pavet, Valeria; Erb, Cathie; Gronemeyer, Hinrich

    2015-12-01

    High-throughput transcriptional analysis has unveiled a myriad of novel RNAs. However, technical constraints in RNA sequencing library preparation and platform performance hamper the identification of rare transcripts contained within the RNA repertoire. Herein we present targeted-RNA directional sequencing (TARDIS), a hybridization-based method that allows subsets of RNAs contained within the transcriptome to be interrogated independently of transcript length, function, the presence or absence of poly-A tracts, or the mechanism of biogenesis. TARDIS is a modular protocol that is subdivided into four main phases, including the generation of random DNA traps covering the region of interest, purification of input RNA material, DNA trap-based RNA capture, and finally RNA-sequencing library construction. Importantly, coupling RNA capture to strand-specific RNA sequencing enables robust identification and reconstruction of novel transcripts, the definition of sense and antisense RNA pairs and, by the concomitant analysis of long and natural small RNA pools, it allows the user to infer potential precursor-product relations. TARDIS takes ∼10 d to implement.

  8. RNA sequencing analysis of the developing chicken retina

    PubMed Central

    Langouet-Astrie, Christophe J.; Meinsen, Annamarie L.; Grunwald, Emily R.; Turner, Stephen D.; Enke, Raymond A.

    2016-01-01

    RNA sequencing transcriptome analysis using massively parallel next generation sequencing technology provides the capability to understand global changes in gene expression throughout a range of tissue samples. Development of the vertebrate retina requires complex temporal orchestration of transcriptional activation and repression. The chicken embryo (Gallus gallus) is a classic model system for studying developmental biology and retinogenesis. Existing retinal transcriptome projects have been critical to the vision research community for studying aspects of murine and human retinogenesis; however, there are currently no publicly available data sets describing the developing chicken retinal transcriptome. Here we used Illumina RNA sequencing (RNA-seq) analysis to characterize the mRNA transcriptome of the developing chicken retina in an effort to identify genes critical for retinal development in this important model organism. These data will be valuable to the vision research community for characterizing global changes in gene expression between ocular tissues and critical developmental time points during retinogenesis in the chicken retina. PMID:27996968

  9. Tuning RNA Flexibility with Helix Length and Junction Sequence

    PubMed Central

    Sutton, Julie L.; Pollack, Lois

    2015-01-01

    The increasing awareness of RNA’s central role in biology calls for a new understanding of how RNAs, like proteins, recognize biological partners. Because RNA is inherently flexible, it assumes a variety of conformations. This conformational flexibility can be a critical aspect of how RNA attracts and binds molecular partners. Structurally, RNA consists of rigid basepaired duplexes, separated by flexible non-basepaired regions. Here, using an RNA system consisting of two short helices, connected by a single-stranded (non-basepaired) junction, we explore the role of helix length and junction sequence in determining the range of conformations available to a model RNA. Single-molecule Förster resonance energy transfer reports on the RNA conformation as a function of either mono- or divalent ion concentration. Electrostatic repulsion between helices dominates at low salt concentration, whereas junction sequence effects determine the conformations at high salt concentration. Near physiological salt concentrations, RNA conformation is sensitive to both helix length and junction sequence, suggesting a means for sensitively tuning RNA conformations. PMID:26682821

  10. Library preparation for highly accurate population sequencing of RNA viruses

    PubMed Central

    Acevedo, Ashley; Andino, Raul

    2015-01-01

    Circular resequencing (CirSeq) is a novel technique for efficient and highly accurate next-generation sequencing (NGS) of RNA virus populations. The foundation of this approach is the circularization of fragmented viral RNAs, which are then redundantly encoded into tandem repeats by ‘rolling-circle’ reverse transcription. When sequenced, the redundant copies within each read are aligned to derive a consensus sequence of their initial RNA template. This process yields sequencing data with error rates far below the variant frequencies observed for RNA viruses, facilitating ultra-rare variant detection and accurate measurement of low-frequency variants. Although library preparation takes ~5 d, the high-quality data generated by CirSeq simplifies downstream data analysis, making this approach substantially more tractable for experimentalists. PMID:24967624
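
    The consensus step can be sketched as a per-position majority vote over the tandem copies within one read. The Python example below is a simplification and not the CirSeq software: it assumes that the repeat length is already known and that the copies start at the first base of the read, whereas in practice the repeat boundaries have to be inferred. The function name is hypothetical.

```python
from collections import Counter


def repeat_consensus(read, repeat_len):
    """Majority-vote consensus over the complete tandem copies in a read.

    Positions without a strict majority are reported as 'N'.
    Assumes a known repeat length and copies aligned to the read start.
    """
    n_copies = len(read) // repeat_len
    if n_copies < 2:
        raise ValueError("need at least two complete copies")
    copies = [read[k * repeat_len:(k + 1) * repeat_len] for k in range(n_copies)]
    consensus = []
    for column in zip(*copies):
        base, votes = Counter(column).most_common(1)[0]
        consensus.append(base if votes * 2 > n_copies else "N")
    return "".join(consensus)


if __name__ == "__main__":
    # Three copies of ACGTAC; the second copy carries one sequencing error.
    print(repeat_consensus("ACGTACACCTACACGTAC", 6))
```

    Because an error must recur at the same position in most copies to survive the vote, independent sequencing errors are largely removed, which is the intuition behind the low error rates reported above.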

  11. Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish

    Technology Transfer Automated Retrieval System (TEKTRAN)

    RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...

  12. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications

    PubMed Central

    Herzog, Michel; Maroteaux, Luc

    1986-01-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage. PMID:16578795

  13. Dinoflagellate 17S rRNA sequence inferred from the gene sequence: Evolutionary implications.

    PubMed

    Herzog, M; Maroteaux, L

    1986-11-01

    We present the complete sequence of the nuclear-encoded small-ribosomal-subunit RNA inferred from the cloned gene sequence of the dinoflagellate Prorocentrum micans. The dinoflagellate 17S rRNA sequence of 1798 nucleotides is contained in a family of 200 tandemly repeated genes per haploid genome. A tentative model of the secondary structure of P. micans 17S rRNA is presented. This sequence is compared with the small-ribosomal-subunit rRNA of Xenopus laevis (Animalia), Saccharomyces cerevisiae (Fungi), Zea mays (Planta), Dictyostelium discoideum (Protoctista), and Halobacterium volcanii (Monera). Although the secondary structure of the dinoflagellate 17S rRNA presents most of the eukaryotic characteristics, it contains sufficient archaeobacterial-like structural features to reinforce the view that dinoflagellates branch off very early from the eukaryotic lineage.

  14. Discovering New Biology through Sequencing of RNA

    PubMed Central

    Weber, Andreas P.M.

    2015-01-01

    Sequencing of RNA (RNA-Seq) was invented approximately 1 decade ago and has since revolutionized biological research. This update provides a brief historic perspective on the development of RNA-Seq and then focuses on the application of RNA-Seq in qualitative and quantitative analyses of transcriptomes. Particular emphasis is given to aspects of data analysis. Since the wet-lab and data analysis aspects of RNA-Seq are still rapidly evolving and novel applications are continuously reported, a printed review will be rapidly outdated and can only serve to provide some examples and general guidelines for planning and conducting RNA-Seq studies. Hence, selected references to frequently updated online resources are given. PMID:26353759

  15. Nucleotide sequences important for translation initiation of enterovirus RNA.

    PubMed Central

    Iizuka, N; Yonekawa, H; Nomoto, A

    1991-01-01

    An infectious cDNA clone was constructed from the genome of coxsackievirus B1 strain. A number of RNA transcripts that have mutations in the 5' noncoding region were synthesized in vitro from the modified cDNA clones and examined for their abilities to act as mRNAs in a cell-free translation system prepared from HeLa S3 cells. RNAs that lack nucleotide sequences at positions 568 to 726 and 565 to 726 were found to be less efficient and inactive mRNAs, respectively. To understand the biological significance of this region of RNA, small deletions and point mutations were introduced in the nucleotide sequence between positions 538 and 601. Except for a nucleotide substitution at 592 (U→C) within the 7-base conserved sequence, mutations introduced in the sequence downstream of position 568 did not affect much, if any, of the ability of RNA to act as mRNA. Except for a point mutation at 558 (C→U), mutations upstream of position 567 appeared to inactivate the mRNA. In the upstream region, a sequence consisting of 21 nucleotides at positions 546 to 566 is perfectly conserved in the 5' noncoding regions of enterovirus and rhinovirus genomes. These results suggest that the 7-base conserved sequence functions to maintain the efficiency of translation initiation and that the nucleotide sequence upstream of position 567, including the 21-base conserved sequence, plays essential roles in translation initiation. A deletion mutant whose genome lacks the nucleotide sequence at positions 568 to 726 showed a small-plaque phenotype and less virulence against suckling mice than the wild-type virus. Thus, reduction of the efficiency of translation initiation may result in the construction of enteroviruses with the lower-virulence phenotype. PMID:1651409

  16. Nucleotide sequence of papaya mosaic virus RNA.

    PubMed

    Sit, T L; Abouhaidar, M G; Holy, S

    1989-09-01

    The RNA genome of papaya mosaic virus is 6656 nucleotides long [excluding the poly(A) tail] with six open reading frames (ORFs) more than 200 nucleotides long. The four nearest the 5' end each overlap with adjacent ORFs and could code for proteins with Mr 176307, 26248, 11949 and 7224 (ORFs 1 to 4). The fifth ORF produces the capsid protein of Mr 23043 and the sixth ORF, located completely within ORF1, could code for a protein with Mr 14113. The translation products of ORFs 1 to 3 show strong similarity with those of other potexviruses but the ORF 4 protein has only limited similarity with the other potexvirus ORF 4 proteins of 7K to 11K.
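
    The ORF count in this abstract depends on a length cutoff (more than 200 nucleotides). As a minimal sketch of how such a scan might work, and not the authors' method, the following Python function looks for ATG-to-stop ORFs in the three forward reading frames. The function name, the choice to count the stop codon in the length, and the toy sequence are all assumptions made for illustration.

```python
# Hypothetical helper: scan the three forward frames for ATG..stop ORFs.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_len=200):
    """Return (start, end, frame) tuples for ORFs longer than min_len nt.

    Coordinates are 0-based, end-exclusive, and include the stop codon
    (an assumption; the abstract does not say how length was measured).
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens an ORF
            elif start is not None and codon in STOPS:
                if i + 3 - start > min_len:    # keep only ORFs above the cutoff
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ORF in this frame
    return orfs

# Toy sequence (made up): 2-nt leader, ATG, 70 GCT codons, TAA stop.
toy = "CC" + "ATG" + "GCT" * 70 + "TAA"
orfs = find_orfs(toy)
```

    A full analysis would also scan the complementary strand where relevant and handle nested start codons; this sketch reports only the longest ORF per stop codon in each forward frame.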

  17. Antisense Transcript and RNA Processing Alterations Suppress Instability of Polyadenylated mRNA in Chlamydomonas Chloroplasts

    PubMed Central

    Nishimura, Yoshiki; Kikis, Elise A.; Zimmer, Sara L.; Komine, Yutaka; Stern, David B.

    2004-01-01

    In chloroplasts, the control of mRNA stability is of critical importance for proper regulation of gene expression. The Chlamydomonas reinhardtii strain Δ26pAtE is engineered such that the atpB mRNA terminates with an mRNA destabilizing polyadenylate tract, resulting in this strain being unable to conduct photosynthesis. A collection of photosynthetic revertants was obtained from Δ26pAtE, and gel blot hybridizations revealed RNA processing alterations in the majority of these suppressor of polyadenylation (spa) strains, resulting in a failure to expose the atpB mRNA 3′ poly(A) tail. Two exceptions were spa19 and spa23, which maintained unusual heteroplasmic chloroplast genomes. One genome type, termed PS+, conferred photosynthetic competence by contributing to the stability of atpB mRNA; the other, termed PS−, was required for viability but could not produce stable atpB transcripts. Based on strand-specific RT-PCR, S1 nuclease protection, and RNA gel blots, evidence was obtained that the PS+ genome stabilizes atpB mRNA by generating an atpB antisense transcript, which attenuates the degradation of the polyadenylated form. The accumulation of double-stranded RNA was confirmed by insensitivity of atpB mRNA from PS+ genome-containing cells to S1 nuclease digestion. To obtain additional evidence for antisense RNA function in chloroplasts, we used strain Δ26, in which atpB mRNA is unstable because of the lack of a 3′ stem-loop structure. In this context, when a 121-nucleotide segment of atpB antisense RNA was expressed from an ectopic site, an elevated accumulation of atpB mRNA resulted. Finally, when spa19 was placed in a genetic background in which expression of the chloroplast exoribonuclease polynucleotide phosphorylase was diminished, the PS+ genome and the antisense transcript were no longer required for photosynthesis. Taken together, our results suggest that antisense RNA in chloroplasts can protect otherwise unstable transcripts from 3′→5

  18. RNAcentral: A vision for an international database of RNA sequences

    PubMed Central

    Bateman, Alex; Agrawal, Shipra; Birney, Ewan; Bruford, Elspeth A.; Bujnicki, Janusz M.; Cochrane, Guy; Cole, James R.; Dinger, Marcel E.; Enright, Anton J.; Gardner, Paul P.; Gautheret, Daniel; Griffiths-Jones, Sam; Harrow, Jen; Herrero, Javier; Holmes, Ian H.; Huang, Hsien-Da; Kelly, Krystyna A.; Kersey, Paul; Kozomara, Ana; Lowe, Todd M.; Marz, Manja; Moxon, Simon; Pruitt, Kim D.; Samuelsson, Tore; Stadler, Peter F.; Vilella, Albert J.; Vogel, Jan-Hinnerk; Williams, Kelly P.; Wright, Mathew W.; Zwieb, Christian

    2011-01-01

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor. PMID:21940779

  19. Evaluation of commercially available RNA amplification kits for RNA sequencing using very low input amounts of total RNA.

    PubMed

    Shanker, Savita; Paulson, Ariel; Edenberg, Howard J; Peak, Allison; Perera, Anoja; Alekseyev, Yuriy O; Beckloff, Nicholas; Bivens, Nathan J; Donnelly, Robert; Gillaspy, Allison F; Grove, Deborah; Gu, Weikuan; Jafari, Nadereh; Kerley-Hamilton, Joanna S; Lyons, Robert H; Tepper, Clifford; Nicolet, Charles M

    2015-04-01

    Multiple recent publications on RNA sequencing (RNA-seq) have demonstrated the power of next-generation sequencing technologies in whole-transcriptome analysis. Vendor-specific protocols used for RNA library construction often require at least 100 ng total RNA. However, under certain conditions, much less RNA is available for library construction. In these cases, effective transcriptome profiling requires amplification of subnanogram amounts of RNA. Several commercial RNA amplification kits are available for amplification prior to library construction for next-generation sequencing, but these kits have not been comprehensively field evaluated for accuracy and performance of RNA-seq for picogram amounts of RNA. To address this, 4 types of amplification kits were tested with 3 different concentrations, from 5 ng to 50 pg, of a commercially available RNA. Kits were tested at multiple sites to assess reproducibility and ease of use. The human total reference RNA used was spiked with a control pool of RNA molecules in order to further evaluate quantitative recovery of input material. Additional control data sets were generated from libraries constructed following polyA selection or ribosomal depletion using established kits and protocols. cDNA was collected from the different sites, and libraries were synthesized at a single site using established protocols. Sequencing runs were carried out on the Illumina platform. Numerous metrics were compared among the kits and dilutions used. Overall, no single kit appeared to meet all the challenges of small input material. However, it is encouraging that excellent data can be recovered with even the 50 pg input total RNA.

  20. Integrated bioinformatics analysis of chromatin regulator EZH2 in regulating mRNA and lncRNA expression by ChIP sequencing and RNA sequencing

    PubMed Central

    Li, Yuan; Luo, Mei; Shi, Xuejiao; Lu, Zhiliang; Sun, Shouguo; Huang, Jianbing; Chen, Zhaoli; He, Jie

    2016-01-01

    Enhancer of zeste homolog 2 (EZH2), a dynamic chromatin regulator in cancer, represents a potential therapeutic target showing early signs of promise in clinical trials. EZH2 ChIP sequencing data in 19 cell lines and RNA sequencing data in ten cancer types were downloaded from GEO and TCGA, respectively. Integrated ChIP sequencing analysis and co-expression analysis were conducted and both mRNA and long noncoding RNA (lncRNA) targets were detected. We detected a median of 4,672 mRNA targets and 4,024 lncRNA targets regulated by EZH2 in 19 cell lines. 20 mRNA targets and 27 lncRNA targets were found in all 19 cell lines. These mRNA targets were enriched in pathways in cancer, Hippo, Wnt, MAPK and PI3K-Akt pathways. Co-expression analysis confirmed numerous targets: mRNA genes (RRAS, TGFBR2, NUF2 and PRC1) and lncRNA genes (lncRNA LINC00261, DIO3OS, RP11-307C12.11 and RP11-98D18.9) were potential targets and were significantly correlated with EZH2. We predicted genome-wide potential targets of EZH2 and its role as a transcriptional suppressor or activator, which could pave the way for mechanism studies and the targeted therapy of EZH2 in cancer. PMID:27835578

  1. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  2. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  3. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  4. Single-cell sequencing of the small-RNA transcriptome.

    PubMed

    Faridani, Omid R; Abdullayev, Ilgar; Hagemann-Jensen, Michael; Schell, John P; Lanner, Fredrik; Sandberg, Rickard

    2016-12-01

    Little is known about the heterogeneity of small-RNA expression as small-RNA profiling has so far required large numbers of cells. Here we present a single-cell method for small-RNA sequencing and apply it to naive and primed human embryonic stem cells and cancer cells. Analysis of microRNAs and fragments of tRNAs and small nucleolar RNAs (snoRNAs) reveals the potential of microRNAs as markers for different cell types and states.

  5. Probing dimensionality beyond the linear sequence of mRNA.

    PubMed

    Del Campo, Cristian; Ignatova, Zoya

    2016-05-01

    mRNA is a nexus entity between DNA and translating ribosomes. Recent developments in deep sequencing technologies coupled with structural probing have revealed new insights beyond the classic role of mRNA and place it more centrally as a direct effector of a variety of processes, including translation, cellular localization, and mRNA degradation. Here, we highlight emerging approaches to probe mRNA secondary structure on a global transcriptome-wide level and compare their potential and resolution. Combined approaches deliver a richer and more complex picture. While our understanding of the effect of secondary structure on various cellular processes is quite advanced, the next challenge is to unravel more complex mRNA architectures and tertiary interactions.

  6. Comparative Analysis of Single-Cell RNA Sequencing Methods.

    PubMed

    Ziegenhain, Christoph; Vieth, Beate; Parekh, Swati; Reinius, Björn; Guillaumet-Adkins, Amy; Smets, Martha; Leonhardt, Heinrich; Heyn, Holger; Hellmann, Ines; Enard, Wolfgang

    2017-02-16

    Single-cell RNA sequencing (scRNA-seq) offers new possibilities to address biological and medical questions. However, systematic comparisons of the performance of diverse scRNA-seq protocols are lacking. We generated data from 583 mouse embryonic stem cells to evaluate six prominent scRNA-seq methods: CEL-seq2, Drop-seq, MARS-seq, SCRB-seq, Smart-seq, and Smart-seq2. While Smart-seq2 detected the most genes per cell and across cells, CEL-seq2, Drop-seq, MARS-seq, and SCRB-seq quantified mRNA levels with less amplification noise due to the use of unique molecular identifiers (UMIs). Power simulations at different sequencing depths showed that Drop-seq is more cost-efficient for transcriptome quantification of large numbers of cells, while MARS-seq, SCRB-seq, and Smart-seq2 are more efficient when analyzing fewer cells. Our quantitative comparison offers the basis for an informed choice among six prominent scRNA-seq methods, and it provides a framework for benchmarking further improvements of scRNA-seq protocols.
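
    The reduction in amplification noise attributed to unique molecular identifiers (UMIs) can be illustrated with a minimal Python sketch: reads that share a cell, gene, and UMI are treated as copies of one original molecule and counted once. The function name and the (cell, gene, UMI) tuple layout are assumptions for illustration, not part of any protocol compared in the study, and the sketch omits UMI sequencing-error correction.

```python
from collections import defaultdict

def count_umis(reads):
    """Count distinct UMIs per (cell, gene) pair.

    `reads` is an iterable of (cell_barcode, gene, umi) tuples; this
    layout is a hypothetical simplification of aligned, annotated reads.
    """
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        seen[(cell, gene)].add(umi)  # duplicates of the same UMI collapse here
    return {key: len(umis) for key, umis in seen.items()}

# Made-up example reads.
reads = [
    ("cellA", "Actb", "AACGT"),
    ("cellA", "Actb", "AACGT"),  # amplification duplicate of the read above
    ("cellA", "Actb", "GGTCA"),
    ("cellB", "Actb", "AACGT"),
]
counts = count_umis(reads)
```

    Here four reads yield three counted molecules, because the duplicated UMI in cellA is counted only once.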

  7. Replication and packaging of Turnip yellow mosaic virus RNA containing Flock house virus RNA1 sequence.

    PubMed

    Kim, Hui-Bae; Kim, Do-Yeong; Cho, Tae-Ju

    2014-06-01

    Turnip yellow mosaic virus (TYMV) is a spherical plant virus that has a single 6.3 kb positive strand RNA as a genome. In this study, RNA1 sequence of Flock house virus (FHV) was inserted into the TYMV genome to test whether TYMV can accommodate and express another viral entity. In the resulting construct, designated TY-FHV, the FHV RNA1 sequence was expressed as a TYMV subgenomic RNA. Northern analysis of the Nicotiana benthamiana leaves agroinfiltrated with the TY-FHV showed that both genomic and subgenomic FHV RNAs were abundantly produced. This indicates that the FHV RNA1 sequence was correctly expressed and translated to produce a functional FHV replicase. Although these FHV RNAs were not encapsidated, the FHV RNA having a TYMV CP sequence at the 3'-end was efficiently encapsidated. When an eGFP gene was inserted into the B2 ORF of the FHV sequence, a fusion protein of B2-eGFP was produced as expected.

  8. MicroRNA Expression Profile in Penile Cancer Revealed by Next-Generation Small RNA Sequencing

    PubMed Central

    Zhang, Yuanwei; Xu, Bo; Zhou, Jun; Fan, Song; Hao, Zongyao; Shi, Haoqiang; Zhang, Xiansheng; Kong, Rui; Xu, Lingfan; Gao, Jingjing; Zou, Duohong; Liang, Chaozhao

    2015-01-01

    Penile cancer (PeCa) is a relatively rare tumor entity but has higher morbidity and mortality rates, especially in developing countries. To date, the concrete pathogenic signaling pathways and core machineries involved in tumorigenesis and progression of PeCa remain to be elucidated. Several studies have suggested that miRNAs, which modulate gene expression at the posttranscriptional level, are frequently mis-regulated and aberrantly expressed in human cancers. However, the miRNA profile in human PeCa has not been reported before. In the present study, the miRNA profile was obtained from 10 fresh penile cancerous tissues and matched adjacent non-cancerous tissues via next-generation sequencing. As a result, a total of 751 and 806 annotated miRNAs were identified in normal and cancerous penile tissues, respectively. Among these, 56 miRNAs with significantly different expression levels between paired tissues were identified. Subsequently, several annotated miRNAs were selected randomly and validated using quantitative real-time PCR. Compared with previous publications on altered miRNA expression in various cancers, especially genitourinary (prostate, bladder, kidney, testis) cancers, the vast majority of deregulated miRNAs showed a similar expression pattern in penile cancer. Moreover, the bioinformatics analyses suggested that the putative target genes of differentially expressed miRNAs between cancerous and matched normal penile tissues were tightly associated with cell junction, proliferation, and growth as well as genomic instability, by modulating Wnt, MAPK, p53, PI3K-Akt, Notch and TGF-β signaling pathways, which are all well established to participate in cancer initiation and progression. Our work presents a global view of the differentially expressed miRNAs and potentially regulatory networks of their target genes for clarifying the pathogenic transformation of normal penis to PeCa, which research resource also provides new insights

  9. Studying RNA homology and conservation with Infernal: from single sequences to RNA families

    PubMed Central

    Barquist, Lars; Burge, Sarah W.; Gardner, Paul P.

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remains difficult. This protocol introduces methods developed by the Rfam database for identifying “families” of homologous ncRNAs starting from single “seed” sequences using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs, then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. PMID:27322404

  10. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-06-20

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.

  11. High Throughput Sequencing of Extracellular RNA from Human Plasma

    PubMed Central

    Danielson, Kirsty M.; Rubio, Renee; Abderazzaq, Fieda; Das, Saumya; Wang, Yaoyu E.

    2017-01-01

    The presence and relative stability of extracellular RNAs (exRNAs) in biofluids has led to an emerging recognition of their promise as ‘liquid biopsies’ for diseases. Most prior studies on discovery of exRNAs as disease-specific biomarkers have focused on microRNAs (miRNAs) using technologies such as qRT-PCR and microarrays. The recent application of next-generation sequencing to discovery of exRNA biomarkers has revealed the presence of potential novel miRNAs as well as other RNA species such as tRNAs, snoRNAs, piRNAs and lncRNAs in biofluids. At the same time, the use of RNA sequencing for biofluids poses unique challenges, including low amounts of input RNAs, the presence of exRNAs in different compartments with varying degrees of vulnerability to isolation techniques, and the high abundance of specific RNA species (thereby limiting the sensitivity of detection of less abundant species). Moreover, discovery in human diseases often relies on archival biospecimens of varying age and limiting amounts of samples. In this study, we have tested RNA isolation methods to optimize profiling exRNAs by RNA sequencing in individuals without any known diseases. Our findings are consistent with other recent studies that detect microRNAs and ribosomal RNAs as the major exRNA species in plasma. Similar to other recent studies, we found that the landscape of biofluid microRNA transcriptome is dominated by several abundant microRNAs that appear to comprise conserved extracellular miRNAs. There is reasonable correlation of sets of conserved miRNAs across biological replicates, and even across other data sets obtained at different investigative sites. Conversely, the detection of less abundant miRNAs is far more dependent on the exact methodology of RNA isolation and profiling. This study highlights the challenges in detecting and quantifying less abundant plasma miRNAs in health and disease using RNA sequencing platforms. PMID:28060806

  12. miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data.

    PubMed

    An, Jiyuan; Lai, John; Lehman, Melanie L; Nelson, Colleen C

    2013-01-01

    miRDeep and its varieties are widely used to quantify known and novel micro RNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled off miRDeep, but the precision of detecting novel miRNAs is improved by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application where sequence alignment, pre-miRNA secondary structure calculation and graphical display are purely Java coded. This application tool can be executed using a normal personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools using our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star.

  13. Prediction and prioritization of neoantigens: integration of RNA sequencing data with whole-exome sequencing.

    PubMed

    Karasaki, Takahiro; Nagayama, Kazuhiro; Kuwano, Hideki; Nitadori, Jun-Ichi; Sato, Masaaki; Anraku, Masaki; Hosoi, Akihiro; Matsushita, Hirokazu; Takazawa, Masaki; Ohara, Osamu; Nakajima, Jun; Kakimi, Kazuhiro

    2017-02-01

    The importance of neoantigens for cancer immunity is now well-acknowledged. However, there are diverse strategies for predicting and prioritizing candidate neoantigens, and thus reported neoantigen loads vary a great deal. To clarify this issue, we compared the numbers of neoantigen candidates predicted by four currently utilized strategies. Whole-exome sequencing and RNA sequencing (RNA-Seq) of four non-small-cell lung cancer patients was carried out. We identified 361 somatic missense mutations from which 224 candidate neoantigens were predicted using MHC class I binding affinity prediction software (strategy I). Of these, 207 exceeded the set threshold of gene expression (fragments per kilobase of transcript per million fragments mapped ≥1), resulting in 124 candidate neoantigens (strategy II). To verify mutant mRNA expression, sequencing of amplicons from tumor cDNA including each mutation was undertaken; 204 of the 207 mutations were successfully sequenced, yielding 121 mutant mRNA sequences, resulting in 75 candidate neoantigens (strategy III). Sequence information was extracted from RNA-Seq to confirm the presence of mutated mRNA. Variant allele frequencies ≥0.04 in RNA-Seq were found for 117 of the 207 mutations and regarded as expressed in the tumor, and finally, 72 candidate neoantigens were predicted (strategy IV). Without additional amplicon sequencing of cDNA, strategy IV was comparable to strategy III. We therefore propose strategy IV as a practical and appropriate strategy to predict candidate neoantigens fully utilizing currently available information. It is of note that different neoantigen loads were deduced from the same tumors depending on the strategies applied.
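
    The filtering logic behind strategy IV can be sketched in a few lines of Python. The two thresholds (FPKM of at least 1 and an RNA-Seq variant allele frequency of at least 0.04) come from the abstract; everything else is an assumption for illustration, including the record layout, the gene labels, and the simplification of one binding flag per mutation (in the study a single mutation can yield several candidate peptides). The MHC class I binding prediction is treated as a precomputed input, since the abstract does not name the software.

```python
from dataclasses import dataclass

FPKM_MIN = 1.0      # expression threshold stated in the abstract
RNA_VAF_MIN = 0.04  # RNA-Seq variant allele frequency threshold from the abstract

@dataclass
class Mutation:
    gene: str        # hypothetical label
    fpkm: float      # gene expression from RNA-Seq
    rna_vaf: float   # variant allele frequency observed in RNA-Seq
    binds_mhc: bool  # assumed output of an external MHC class I predictor

def strategy_iv(mutations):
    """Keep mutations that are predicted binders, expressed, and seen in RNA."""
    return [
        m for m in mutations
        if m.binds_mhc and m.fpkm >= FPKM_MIN and m.rna_vaf >= RNA_VAF_MIN
    ]

# Made-up records, one failing each criterion and one passing all three.
mutations = [
    Mutation("GENE_A", fpkm=12.0, rna_vaf=0.30, binds_mhc=True),
    Mutation("GENE_B", fpkm=0.4, rna_vaf=0.25, binds_mhc=True),   # low expression
    Mutation("GENE_C", fpkm=8.0, rna_vaf=0.01, binds_mhc=True),   # variant not seen in RNA
    Mutation("GENE_D", fpkm=20.0, rna_vaf=0.50, binds_mhc=False), # not a predicted binder
]
kept = [m.gene for m in strategy_iv(mutations)]
```

    Dropping the `rna_vaf` condition gives the equivalent of strategy II, which is why the reported candidate counts shrink as criteria are added.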

  14. Structurally complex and highly active RNA ligases derived from random RNA sequences

    NASA Technical Reports Server (NTRS)

    Ekland, E. H.; Szostak, J. W.; Bartel, D. P.

    1995-01-01

    Seven families of RNA ligases, previously isolated from random RNA sequences, fall into three classes on the basis of secondary structure and regiospecificity of ligation. Two of the three classes of ribozymes have been engineered to act as true enzymes, catalyzing the multiple-turnover transformation of substrates into products. The most complex of these ribozymes has a minimal catalytic domain of 93 nucleotides. An optimized version of this ribozyme has a kcat exceeding one per second, a value far greater than that of most natural RNA catalysts and approaching that of comparable protein enzymes. The fact that such a large and complex ligase emerged from a very limited sampling of sequence space implies the existence of a large number of distinct RNA structures of equivalent complexity and activity.

  15. Simultaneous rapid sequencing of multiple RNA virus genomes.

    PubMed

    Neill, John D; Bayles, Darrell O; Ridpath, Julia F

    2014-06-01

    Comparing sequences of archived viruses collected over many years to the present allows the study of viral evolution and contributes to the design of new vaccines. However, the difficulty, time and expense of generating full-length sequences individually from each archived sample have hampered these studies. Next generation sequencing technologies have been utilized for analysis of clinical and environmental samples to identify viral pathogens that may be present. This has led to the discovery of many new, uncharacterized viruses from a number of viral families. Use of these sequencing technologies would be advantageous in examining viral evolution. In this study, a sequencing procedure was used to sequence simultaneously and rapidly multiple archived samples using a single standard protocol. This procedure utilized primers composed of 20 bases of known sequence with 8 random bases at the 3'-end that also served as an identifying barcode that allowed the differentiation of each viral library following pooling and sequencing. This conferred sequence independence by random priming both first and second strand cDNA synthesis. Viral stocks were treated with a nuclease cocktail to reduce the presence of host nucleic acids. Viral RNA was extracted, followed by single tube random-primed double-stranded cDNA synthesis. The resultant cDNAs were amplified by primer-specific PCR, pooled, size fractionated and sequenced on the Ion Torrent PGM platform. The individual virus genomes were readily assembled by both de novo and template-assisted assembly methods. This procedure consistently resulted in near full length, if not full-length, genomic sequences and was used to sequence multiple bovine pestivirus and coronavirus isolates simultaneously.
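
    Separating pooled reads by primer barcode, as described above, might look like the following Python sketch. It assumes that the 20-base known portion of each primer identifies the library, that each read begins with that portion followed by the 8 random bases, and that matching is exact; the abstract does not specify these details, and the primer sequences and function name below are made up.

```python
PRIMER_LEN = 20  # known primer portion, per the abstract
RANDOM_LEN = 8   # random bases at the primer 3'-end, per the abstract

def demultiplex(reads, primers):
    """Assign reads to libraries by exact match of the leading 20 bases.

    `primers` maps a library name to its 20-base known sequence
    (hypothetical values). Matched reads are trimmed of the primer and
    random bases; unmatched reads are kept untrimmed under "unassigned".
    """
    by_prefix = {seq: name for name, seq in primers.items()}
    bins = {name: [] for name in primers}
    bins["unassigned"] = []
    for read in reads:
        name = by_prefix.get(read[:PRIMER_LEN])
        if name is None:
            bins["unassigned"].append(read)
        else:
            bins[name].append(read[PRIMER_LEN + RANDOM_LEN:])
    return bins

# Made-up primers and reads.
primers = {
    "virus1": "ACGTACGTACGTACGTACGT",
    "virus2": "TTGGCCAATTGGCCAATTGG",
}
reads = [
    "ACGTACGTACGTACGTACGT" + "AAAAAAAA" + "GATTACA",
    "TTGGCCAATTGGCCAATTGG" + "CCCCCCCC" + "TTAGGC",
    "GGGGGGGGGGGGGGGGGGGG" + "TTTTTTTT" + "ACGT",
]
bins = demultiplex(reads, primers)
```

    Real demultiplexing tools usually tolerate a small number of mismatches in the barcode; exact matching is used here only to keep the sketch short.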

  16. Using Small RNA Deep Sequencing Data to Detect Human Viruses

    PubMed Central

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F.; Fei, ZhangJun; Zhu, Xiao

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans. PMID:27066498

  17. Using Small RNA Deep Sequencing Data to Detect Human Viruses.

    PubMed

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F; Fei, ZhangJun; Zhu, Xiao; Gao, Shan

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans.

  18. Power analysis of single-cell RNA-sequencing experiments.

    PubMed

    Svensson, Valentine; Natarajan, Kedar Nath; Ly, Lam-Ha; Miragaia, Ricardo J; Labalette, Charlotte; Macaulay, Iain C; Cvejic, Ana; Teichmann, Sarah A

    2017-04-01

    Single-cell RNA sequencing (scRNA-seq) has become an established and powerful method to investigate transcriptomic cell-to-cell variation, thereby revealing new cell types and providing insights into developmental processes and transcriptional stochasticity. A key question is how the variety of available protocols compare in terms of their ability to detect and accurately quantify gene expression. Here, we assessed the protocol sensitivity and accuracy of many published data sets, on the basis of spike-in standards and uniform data processing. For our workflow, we developed a flexible tool for counting the number of unique molecular identifiers (https://github.com/vals/umis/). We compared 15 protocols computationally and 4 protocols experimentally for batch-matched cell populations, in addition to investigating the effects of spike-in molecular degradation. Our analysis provides an integrated framework for comparing scRNA-seq protocols.
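
    The idea of counting unique molecular identifiers (UMIs) can be illustrated with a minimal sketch. This is not the implementation of the tool linked in the abstract; the function name, the record layout (cell barcode, gene, UMI) and the example reads are all hypothetical, and the sketch ignores sequencing errors within UMIs.

```python
from collections import defaultdict

def count_umis(records):
    """Count distinct UMIs per (cell, gene) pair.

    `records` is an iterable of (cell_barcode, gene, umi) tuples,
    one per aligned read (a hypothetical input layout).
    """
    seen = defaultdict(set)
    for cell, gene, umi in records:
        # Reads sharing cell, gene and UMI collapse into a single molecule.
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}

# Made-up example reads: the second read is a PCR duplicate of the first.
reads = [
    ("cellA", "Actb", "AACG"),
    ("cellA", "Actb", "AACG"),
    ("cellA", "Actb", "TTGC"),
    ("cellB", "Actb", "AACG"),
]
umi_counts = count_umis(reads)
```

    Collapsing reads on the UMI rather than counting raw reads is what removes amplification duplicates from the expression estimate.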

  19. Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments

    PubMed Central

    2011-01-01

    Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biases that affect sRNA measurements and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations. PMID:21356093

  20. RNA sequencing of archived neonatal dried blood spots.

    PubMed

    Bybjerg-Grauholm, Jonas; Hagen, Christian Munch; Khoo, Sok Kean; Johannesen, Maria Louise; Hansen, Christine Søholm; Bækvad-Hansen, Marie; Christiansen, Michael; Hougaard, David Michael; Hollegaard, Mads V

    2017-03-01

    Neonatal dried blood spots (DBS) are routinely collected on standard Guthrie cards for all-comprising national newborn screening programs for inborn errors of metabolism, hypothyroidism and other diseases. In Denmark, the Guthrie cards are stored at -20 °C in the Danish Neonatal Screening Biobank and each sample is linked to elaborate social and medical registries. This provides a unique biospecimen repository to enable large population research at a perinatal level. Here, we demonstrate the feasibility to obtain gene expression data from DBS using next-generation RNA sequencing (RNA-seq). RNA-seq was performed on five males and five females. Sequencing results have an average of > 30 million reads per sample. 26,799 annotated features can be identified with 64% features detectable without fragments per kilobase of transcript per million mapped reads (FPKM) cutoff; number of detectable features dropped to 18% when FPKM ≥ 1. Sex can be discriminated using blood-based sex-specific gene set identified by the Genotype-Tissue Expression consortium. Here, we demonstrate the feasibility to acquire biologically-relevant gene expression from DBS using RNA-seq, which provides a new avenue to investigate perinatal diseases in a high throughput manner.
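
    The FPKM measure and the cutoff described above can be sketched as follows. The formula is the standard definition of fragments per kilobase of transcript per million mapped reads; the function names, gene names and counts are made up for illustration and do not come from the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    # Dividing by (length / 1e3) and by (total / 1e6) gives the 1e9 factor.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

def detectable(fpkm_by_feature, cutoff=1.0):
    """Return the names of features whose FPKM meets the cutoff."""
    return {name for name, value in fpkm_by_feature.items() if value >= cutoff}

# Hypothetical features in a library of 30 million mapped fragments.
total = 30_000_000
values = {
    "geneA": fpkm(600, 2000, total),  # well expressed
    "geneB": fpkm(3, 1500, total),    # a few stray fragments
}
kept = detectable(values, cutoff=1.0)
```

    A feature with only a handful of fragments counts as detected when no cutoff is applied but falls out once FPKM ≥ 1 is required, which is the kind of drop the abstract reports.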

  1. The distribution of RNA motifs in natural sequences.

    PubMed

    Bourdeau, V; Ferbeyre, G; Pageau, M; Paquin, B; Cedergren, R

    1999-11-15

    Functional analysis of genome sequences has largely ignored RNA genes and their structures. We introduce here the notion of 'ribonomics' to describe the search for the distribution of and eventually the determination of the physiological roles of these RNA structures found in the sequence databases. The utility of this approach is illustrated here by the identification in the GenBank database of RNA motifs having known binding or chemical activity. The frequency of these motifs indicates that most have originated from evolutionary drift and are selectively neutral. On the other hand, their distribution among species and their location within genes suggest that the destiny of these motifs may be more elaborate. For example, the hammerhead motif has a skewed organismal presence, is phylogenetically stable and recent work on a schistosome version confirms its in vivo biological activity. The under-representation of the valine-binding motif and the Rev-binding element in GenBank hints at a detrimental effect on cell growth or viability. Data on the presence and the location of these motifs may provide critical guidance in the design of experiments directed towards the understanding and the manipulation of RNA complexes and activities in vivo.

  2. Statistical mechanics of secondary structures formed by random RNA sequences.

    PubMed

    Bundschuh, R; Hwa, T

    2002-03-01

    The formation of secondary structures by a random RNA sequence is studied as a model system for the sequence-structure problem omnipresent in biopolymers. Several toy energy models are introduced to allow detailed analytical and numerical studies. First, a two-replica calculation is performed. By mapping the two-replica problem to the denaturation of a single homogeneous RNA molecule in six-dimensional embedding space, we show that sequence disorder is perturbatively irrelevant, i.e., an RNA molecule with weak sequence disorder is in a molten phase where many secondary structures with comparable total energy coexist. A numerical study of various models at high temperature reproduces behaviors characteristic of the molten phase. On the other hand, a scaling argument based on the extremal statistics of rare regions can be constructed to show that the low-temperature phase is unstable to sequence disorder. We performed a detailed numerical study of the low-temperature phase using the droplet theory as a guide, and characterized the statistics of large-scale, low-energy excitations of the secondary structures from the ground state structure. We find the excitation energy to grow very slowly (i.e., logarithmically) with the length scale of the excitation, suggesting the existence of a marginal glass phase. The transition between the low-temperature glass phase and the high-temperature molten phase is also characterized numerically. It is revealed by a change in the coefficient of the logarithmic excitation energy, from being disorder dominated to being entropy dominated.
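
    As an illustration of what a toy energy model for secondary structure can look like, the sketch below uses the classic base-pair maximization recursion (Nussinov-style), in which every Watson-Crick pair contributes the same energy. This is an assumption chosen for illustration and is not necessarily one of the models used in the study; the function name and the minimum hairpin loop length are likewise illustrative.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested Watson-Crick pairs in an RNA sequence.

    Toy model: each pair counts equally, pairs may not cross, and at
    least `min_loop` unpaired bases must separate paired positions.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

hairpin_pairs = max_base_pairs("GGGAAACCC")
```

    In such a model the ground state of a random sequence is found by dynamic programming in cubic time, which is what makes large-scale numerical studies of this kind feasible.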

  3. Recombination regulator PRDM9 influences the instability of its own coding sequence in humans.

    PubMed

    Jeffreys, Alec J; Cotton, Victoria E; Neumann, Rita; Lam, Kwan-Wood Gabriel

    2013-01-08

    PRDM9 plays a key role in specifying meiotic recombination hotspot locations in humans and mice via recognition of hotspot sequence motifs by a variable tandem-repeat zinc finger domain in the protein. We now explore germ-line instability of this domain in humans. We show that repeat turnover is driven by mitotic and meiotic mutation pathways, the latter frequently resulting in substantial remodeling of zinc fingers. Turnover dynamics predict frequent allele switches in populations with correspondingly fast changes of the recombination landscape, fully consistent with the known rapid evolution of hotspot locations. We found variation in meiotic instability between men that correlated with PRDM9 status. One particular "destabilizer" variant caused hyperinstability not only of itself but also of otherwise-stable alleles in heterozygotes. PRDM9 protein thus appears to regulate the instability of its own coding sequence. However, destabilizer variants are strongly self-limiting in populations and probably have little impact on the evolution of the recombination landscape.

  4. tRNA-Related Sequences Trigger Systemic mRNA Transport in Plants

    PubMed Central

    Zhang, Wenna; Kollwig, Gregor; Apelt, Federico; Walther, Dirk

    2016-01-01

    In plants, protein-coding mRNAs can move via the phloem vasculature to distant tissues, where they may act as non-cell-autonomous signals. Emerging work has identified many phloem-mobile mRNAs, but little is known regarding RNA motifs triggering mobility, the extent of mRNA transport, and the potential of transported mRNAs to be translated into functional proteins after transport. To address these aspects, we produced reporter transcripts harboring tRNA-like structures (TLSs) that were found to be enriched in the phloem stream and in mRNAs moving over chimeric graft junctions. Phenotypic and enzymatic assays on grafted plants indicated that mRNAs harboring a distinctive TLS can move from transgenic roots into wild-type leaves and from transgenic leaves into wild-type flowers or roots; these mRNAs can also be translated into proteins after transport. In addition, we provide evidence that dicistronic mRNA:tRNA transcripts are frequently produced in Arabidopsis thaliana and are enriched in the population of graft-mobile mRNAs. Our results suggest that tRNA-derived sequences with predicted stem-bulge-stem-loop structures are sufficient to mediate mRNA transport and seem to be necessary for the mobility of a large number of endogenous transcripts that can move through graft junctions. PMID:27268430

  5. Sequence and expression of ferredoxin mRNA in barley

    SciTech Connect

    Zielinski, R.; Funder, P.M.; Ling, V.

    1990-05-01

    We have isolated and structurally characterized a full-length cDNA clone encoding ferredoxin from a λgt10 cDNA library prepared from barley leaf mRNA. The ferredoxin clone (pBFD-1) was fused head-to-head with a partial-length cDNA clone encoding calmodulin, and was fortuitously isolated by screening the library with a calmodulin-specific oligonucleotide probe. The mRNA sequence from which pBFD-1 was derived is expressed exclusively in the leaf tissues of 7-d old barley seedlings. Barley pre-ferredoxin has a predicted size of 15.3 kDal, of which 4.6 kDal are accounted for by the transit peptide. The polypeptide encoded by pBFD-1 is identical to wheat ferredoxin, and shares slightly more amino acid sequence similarity with spinach ferredoxin I than with ferredoxin II. Ferredoxin mRNA levels are rapidly increased 10-fold by white light in etiolated barley leaves.

  6. How to analyze gene expression using RNA-sequencing data.

    PubMed

    Ramsköld, Daniel; Kavak, Ersen; Sandberg, Rickard

    2012-01-01

    RNA-Seq is emerging as a powerful method for transcriptome analyses that will eventually make microarrays obsolete for gene expression analyses. Improvements in high-throughput sequencing and efficient sample barcoding are now enabling tens of samples to be run in a cost-effective manner, competing with microarrays in price and excelling in performance. Still, most studies use microarrays, partly due to the ease of data analyses using programs and modules that quickly turn raw microarray data into spreadsheets of gene expression values and significant differentially expressed genes. In contrast, RNA-Seq data analysis is still in its infancy, and researchers face new challenges and have to combine different tools to carry out an analysis. In this chapter, we provide a tutorial on RNA-Seq data analysis to enable researchers to quantify gene expression, identify splice junctions, and find novel transcripts using publicly available software. We focus on the analyses performed in organisms where a reference genome is available and discuss issues with current methodology that have to be solved before RNA-Seq data can reach its full potential.

  7. Optimizing RNA structures by sequence extensions using RNAcop

    PubMed Central

    Hecker, Nikolai; Christensen-Dalsgaard, Mikkel; Seemann, Stefan E.; Havgaard, Jakob H.; Stadler, Peter F.; Hofacker, Ivo L.; Nielsen, Henrik; Gorodkin, Jan

    2015-01-01

    A key aspect of RNA secondary structure prediction is the identification of novel functional elements. This is a challenging task because these elements typically are embedded in longer transcripts where the borders between the element and flanking regions have to be defined. The flanking sequences impact the folding of the functional elements both at the level of computational analyses and when the element is extracted as a transcript for experimental analysis. Here, we analyze how different flanking region lengths impact folding into a constrained structure by computing probabilities of folding for different sizes of flanking regions. Our method, RNAcop (RNA context optimization by probability), is tested on known and de novo predicted structures. In vitro experiments support the computational analysis and suggest that for a number of structures, choosing proper lengths of flanking regions is critical. RNAcop is available as web server and stand-alone software via http://rth.dk/resources/rnacop. PMID:26283181

  8. Chaining sequence/structure seeds for computing RNA similarity.

    PubMed

    Bourgeade, Laetitia; Chauve, Cédric; Allali, Julien

    2015-03-01

    We describe a new method to compare a query RNA with a static set of target RNAs. Our method is based on (i) a static indexing of the sequence/structure seeds of the target RNAs; (ii) searching the target RNAs by detecting seeds of the query present in the target, chaining these seeds in promising candidate homologs; and then (iii) completing the alignment using an anchor-based exact alignment algorithm. We apply our method on the benchmark Bralibase2.1 and compare its accuracy and efficiency with the exact method LocARNA and its recent seeds-based speed-up ExpLoc-P. Our pipeline RNA-unchained greatly improves computation time of LocARNA and is comparable to the one of ExpLoc-P, while improving the overall accuracy of the final alignments.

  9. Adenylylation of small RNA sequencing adapters using the TS2126 RNA ligase I.

    PubMed

    Lama, Lodoe; Ryan, Kevin

    2016-01-01

    Many high-throughput small RNA next-generation sequencing protocols use 5' preadenylylated DNA oligonucleotide adapters during cDNA library preparation. Preadenylylation of the DNA adapter's 5' end frees from ATP-dependence the ligation of the adapter to RNA collections, thereby avoiding ATP-dependent side reactions. However, preadenylylation of the DNA adapters can be costly and difficult. The currently available method for chemical adenylylation of DNA adapters is inefficient and uses techniques not typically practiced in laboratories profiling cellular RNA expression. An alternative enzymatic method using a commercial RNA ligase was recently introduced, but this enzyme works best as a stoichiometric adenylylating reagent rather than a catalyst and can therefore prove costly when several variant adapters are needed or during scale-up or high-throughput adenylylation procedures. Here, we describe a simple, scalable, and highly efficient method for the 5' adenylylation of DNA oligonucleotides using the thermostable RNA ligase 1 from bacteriophage TS2126. Adapters with 3' blocking groups are adenylylated at >95% yield at catalytic enzyme-to-adapter ratios and need not be gel purified before ligation to RNA acceptors. Experimental conditions are also reported that enable DNA adapters with free 3' ends to be 5' adenylylated at >90% efficiency.

  10. Mapping of DNA instability at the fragile X to a trinucleotide repeat sequence p(CCG)n

    SciTech Connect

    Kremer, E.J.; Pritchard, M.; Lynch, M.; Yu, S.; Holman, K.; Baker, E.; Sutherland, G.R.; Richards, R.I.; Warren, S.T.; Schlessinger, D.

    1991-06-21

    The sequence of a Pst I restriction fragment was determined that demonstrates instability in fragile X syndrome pedigrees. The region of instability was localized to a trinucleotide repeat p(CCG)n. The sequences flanking this repeat were identical in normal and affected individuals. The breakpoints in two somatic cell hybrids constructed to break at the fragile sites also mapped to this repeat sequence. The repeat exhibits instability both when cloned in a nonhomologous host and after amplification by the polymerase chain reaction. These results suggest variation in the trinucleotide repeat copy number as the molecular basis for the instability and possibly the fragile site. This would account for the observed properties of this region in vivo and in vitro.
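
    Variation in trinucleotide repeat copy number, as described above, can be measured in a sequence with a short sketch like the following. The function name and the example sequences are hypothetical and the flanking bases are made up; real alleles would come from sequencing data.

```python
import re

def longest_repeat_run(sequence, unit="CCG"):
    """Return the copy number of the longest uninterrupted run of `unit`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [m.group(0) for m in pattern.finditer(sequence.upper())]
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(unit)

# Two made-up alleles with identical flanks but different copy numbers.
flank_5, flank_3 = "ATG", "TTA"
short_allele = flank_5 + "CCG" * 5 + flank_3
long_allele = flank_5 + "CCG" * 12 + flank_3

short_copies = longest_repeat_run(short_allele)
long_copies = longest_repeat_run(long_allele)
```

    Because the flanks are the same in both alleles, the only difference the function reports is the repeat copy number, mirroring the observation that flanking sequences were identical in normal and affected individuals.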

  11. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  12. Assessing long-distance RNA sequence connectivity via RNA-templated DNA–DNA ligation

    PubMed Central

    Roy, Christian K; Olson, Sara; Graveley, Brenton R; Zamore, Phillip D; Moore, Melissa J

    2015-01-01

    Many RNAs, including pre-mRNAs and long non-coding RNAs, can be thousands of nucleotides long and undergo complex post-transcriptional processing. Multiple sites of alternative splicing within a single gene exponentially increase the number of possible spliced isoforms, with most human genes currently estimated to express at least ten. To understand the mechanisms underlying these complex isoform expression patterns, methods are needed that faithfully maintain long-range exon connectivity information in individual RNA molecules. In this study, we describe SeqZip, a methodology that uses RNA-templated DNA–DNA ligation to retain and compress connectivity between distant sequences within single RNA molecules. Using this assay, we test proposed coordination between distant sites of alternative exon utilization in mouse Fn1, and we characterize the extraordinary exon diversity of Drosophila melanogaster Dscam1. DOI: http://dx.doi.org/10.7554/eLife.03700.001 PMID:25866926

  13. Gene regulation: ancient microRNA target sequences in plants.

    PubMed

    Floyd, Sandra K; Bowman, John L

    2004-04-01

    MicroRNAs are an abundant class of small RNAs that are thought to regulate the expression of protein-coding genes in plants and animals. Here we show that the target sequence of two microRNAs, known to regulate genes in the class-III homeodomain-leucine zipper (HD-Zip) gene family of the flowering plant Arabidopsis, is conserved in homologous sequences from all lineages of land plants, including bryophytes, lycopods, ferns and seed plants. We also find that the messenger RNAs from these genes are cleaved within the same microRNA-binding site in representatives of each land-plant group, as they are in Arabidopsis. Our results indicate not only that microRNAs mediate gene regulation in non-flowering as well as flowering plants, but also that the regulation of this class of plant genes dates back more than 400 million years.

  14. Approaches to sequence analysis of 125I-labeled RNA.

    PubMed Central

    Dickson, E; Pape, L K; Robertson, H D

    1979-01-01

    A method is described for the initial steps of sequence analysis of RNase T1- and pancreatic RNase-resistant oligonucleotides of RNA containing cytidylate residues labeled in vitro with 125I. In many cases an oligonucleotide sequence can be deduced from a consideration of (i) its relative position in the two-dimensional fingerprint (with DEAE thin layer homochromatographic second dimension), (ii) its electrophoretic mobility on DEAE paper at pH 1.9, and (iii) identification of its products of further enzymatic digestion by comparison with a set of marker oligonucleotides. Additional methods including analysis of oligonucleotides following chemical blocking of uridylate residues with CMCT and analysis of products of incomplete enzymatic digestion are also discussed. PMID:106369

  15. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    PubMed Central

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by increase in neurodegenerative diseases incidence, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that rapidly becomes a commonplace. However, there is lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected controls brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  16. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences

    PubMed Central

    Laslett, Dean; Canback, Bjorn

    2004-01-01

    A computer program, ARAGORN, identifies tRNA and tmRNA genes. The program employs heuristic algorithms to predict tRNA secondary structure, based on homology with recognized tRNA consensus sequences and ability to form a base-paired cloverleaf. tmRNA genes are identified using a modified version of the BRUCE program. ARAGORN achieves a detection sensitivity of 99% from a set of 1290 eubacterial, eukaryotic and archaeal tRNA genes and detects all complete tmRNA sequences in the tmRNA database, improving on the performance of the BRUCE program. Recently discovered tmRNA genes in the chloroplasts of two species from the ‘green’ algae lineage are detected. The output of the program reports the proposed tRNA secondary structure and, for tmRNA genes, the secondary structure of the tRNA domain, the tmRNA gene sequence, the tag peptide and a list of organisms with matching tmRNA peptide tags. PMID:14704338

  17. Nascent RNA sequencing reveals distinct features in plant transcription

    PubMed Central

    Hetzel, Jonathan; Duttke, Sascha H.; Benner, Christopher; Chory, Joanne

    2016-01-01

    Transcriptional regulation of gene expression is a major mechanism used by plants to confer phenotypic plasticity, and yet compared with other eukaryotes or bacteria, little is known about the design principles. We generated an extensive catalog of nascent and steady-state transcripts in Arabidopsis thaliana seedlings using global nuclear run-on sequencing (GRO-seq), 5′GRO-seq, and RNA-seq and reanalyzed published maize data to capture characteristics of plant transcription. De novo annotation of nascent transcripts accurately mapped start sites and unstable transcripts. Examining the promoters of coding and noncoding transcripts identified comparable chromatin signatures, a conserved “TGT” core promoter motif and unreported transcription factor-binding sites. Mapping of engaged RNA polymerases showed a lack of enhancer RNAs, promoter-proximal pausing, and divergent transcription in Arabidopsis seedlings and maize, which are commonly present in yeast and humans. In contrast, Arabidopsis and maize genes accumulate RNA polymerases in proximity of the polyadenylation site, a trend that coincided with longer genes and CpG hypomethylation. Lack of promoter-proximal pausing and a higher correlation of nascent and steady-state transcripts indicate Arabidopsis may regulate transcription predominantly at the level of initiation. Our findings provide insight into plant transcription and eukaryotic gene expression as a whole. PMID:27729530

  18. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    PubMed

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulations, where the identification of protein-RNA binding is a major and fast-developing sub-area, which is in turn benefited by the sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alter the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo. By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulations.

  19. Use of Unamplified RNA/cDNA–Hybrid Nanopore Sequencing for Rapid Detection and Characterization of RNA Viruses

    PubMed Central

    Kilianski, Andy; Roth, Pierce A.; Liem, Alvin T.; Hill, Jessica M.; Willis, Kristen L.; Rossmaier, Rebecca D.; Marinich, Andrew V.; Maughan, Michele N.; Karavis, Mark A.; Kuhn, Jens H.; Honko, Anna N.

    2016-01-01

    Nanopore sequencing, a novel genomics technology, has potential applications for routine biosurveillance, clinical diagnosis, and outbreak investigation of virus infections. Using rapid sequencing of unamplified RNA/cDNA hybrids, we identified Venezuelan equine encephalitis virus and Ebola virus in 3 hours from sample receipt to data acquisition, demonstrating a fieldable technique for RNA virus characterization. PMID:27191483

  20. Somatic instability of the DNA sequences encoding the polymorphic polyglutamine tract of the AIB1 gene

    PubMed Central

    Dai, P; Wong, L

    2003-01-01

    Background: AIB1 contains a polymorphic polyglutamine tract (poly Q) that is encoded by a trinucleotide CAG repeat. Previously there have been conflicting results regarding the effect of the poly Q tract length on breast cancer. Since poly Q is not encoded by a perfect CAG repeat, the heterozygous polymorphic alleles need to be resolved, to understand the exact DNA sequences encoding poly Q. Methods: Poly Q encoding sequences of AIB1 from 107 DNA samples, including breast cancer cell lines, sporadic primary breast tumours, and blood samples from BRCA1/BRCA2 mutation carriers and the general population, were resolved by PCR/cloning followed by sequencing of each individual clone. Results: 25 distinct poly Q encoding sequence patterns were found. More than two distinct sequence patterns were found in a significantly higher proportion of tumours and cell lines than that of the general population, suggesting somatic instability. A significantly higher proportion of cancer cell lines or primary breast tumours than that of the general population contained rare sequence patterns. The proportion of sporadic breast tumours having at least one allele ⩽27 repeats is significantly higher than that in the blood of BRCA1/BRCA2 mutation carrier breast cancer patients or the general population. Conclusion: The poly Q encoding DNA sequences are somatically unstable in tumour tissues and cell lines. A missense mutation and a very short glutamine repeat in primary tumours suggests that AIB1 activity may be modulated through poly Q, which in turn plays a role in the cotransactivation of gene expressions in breast cancers. PMID:14684685

  1. In vitro DNA dependent synthesis of globin RNA sequences from erythroleukemic cell chromatin.

    PubMed

    Reff, M E; Davidson, R L

    1979-01-01

    Murine erythroleukemic cells in culture accumulate cytoplasmic globin mRNA during differentiation induced by dimethyl sulfoxide (DMSO). Chromatin was prepared from DMSO-induced erythroleukemic cells that were transcribing globin RNA in order to determine whether in vitro synthesis of globin RNA sequences was possible from chromatin. RNA was synthesized in vitro using 5-mercuriuridine triphosphate and exogenous Escherichia coli RNA polymerase. Newly synthesized mercurated RNA was purified from endogenous chromatin-associated RNA by affinity chromatography on a sepharose sulfhydryl column, and the globin RNA sequence content of the mercurated RNA was assayed by hybridization to globin cDNA. The synthesis of globin RNA sequences was shown to occur and to be sensitive to actinomycin and rifampicin and insensitive to alpha-amanitin. In contrast, synthesis of globin RNA sequences was not detected in significant amounts from chromatin prepared from uninduced erythroleukemic cells, nor from uninduced cell chromatin to which globin RNA was added prior to transcription. Isolated RNA:cDNA globin hybrids were shown to contain mercurated RNA by affinity chromatography. These results indicated that synthesis of globin RNA sequences from chromatin can be performed by E. coli RNA polymerase.

  2. Deciphering Poxvirus Gene Expression by RNA Sequencing and Ribosome Profiling

    PubMed Central

    Cao, Shuai; Martens, Craig A.; Porcella, Stephen F.; Xie, Zhi; Ma, Ming; Shen, Ben

    2015-01-01

    ABSTRACT The more than 200 closely spaced annotated open reading frames, extensive transcriptional read-through, and numerous unpredicted RNA start sites have made the analysis of vaccinia virus gene expression challenging. Genome-wide ribosome profiling provided an unprecedented assessment of poxvirus gene expression. By 4 h after infection, approximately 80% of the ribosome-associated mRNA was viral. Ribosome-associated mRNAs were detected for most annotated early genes at 2 h and for most intermediate and late genes at 4 and 8 h. Cluster analysis identified a subset of early mRNAs that continued to be translated at the later times. At 2 h, there was excellent correlation between the abundance of individual mRNAs and the numbers of associated ribosomes, indicating that expression was primarily transcriptionally regulated. However, extensive transcriptional read-through invalidated similar correlations at later times. The mRNAs with the highest density of ribosomes had host response, DNA replication, and transcription roles at early times and were virion components at late times. Translation inhibitors were used to map initiation sites at single-nucleotide resolution at the start of most annotated open reading frames although in some cases a downstream methionine was used instead. Additional putative translational initiation sites with AUG or alternative codons occurred mostly within open reading frames, and fewer occurred in untranslated leader sequences, antisense strands, and intergenic regions. However, most open reading frames associated with these additional translation initiation sites were short, raising questions regarding their biological roles. The data were used to construct a high-resolution genome-wide map of the vaccinia virus translatome. IMPORTANCE This report contains the first genome-wide, high-resolution analysis of poxvirus gene expression at both transcriptional and translational levels. The study was made possible by recent methodological

  3. RNA expression profile of calcified bicuspid, tricuspid, and normal human aortic valves by RNA sequencing.

    PubMed

    Guauque-Olarte, Sandra; Droit, Arnaud; Tremblay-Marchand, Joël; Gaudreault, Nathalie; Kalavrouziotis, Dimitri; Dagenais, Francois; Seidman, Jonathan G; Body, Simon C; Pibarot, Philippe; Mathieu, Patrick; Bossé, Yohan

    2016-10-01

    The molecular mechanisms leading to premature development of aortic valve stenosis (AS) in individuals with a bicuspid aortic valve are unknown. The objective of this study was to identify genes differentially expressed between calcified bicuspid aortic valves (BAVc) and tricuspid valves with (TAVc) and without (TAVn) AS using RNA sequencing (RNA-Seq). We collected 10 human BAVc and nine TAVc from men who underwent primary aortic valve replacement. Eight TAVn were obtained from men who underwent heart transplantation. mRNA levels were measured by RNA-Seq and compared between valve groups. Two genes were upregulated, and none were downregulated in BAVc compared with TAVc, suggesting a similar gene expression response to AS in individuals with bicuspid and tricuspid valves. There were 462 genes upregulated and 282 downregulated in BAVc compared with TAVn. In TAVc compared with TAVn, 329 genes were up- and 170 were downregulated. A total of 273 upregulated and 147 downregulated genes were concordantly altered between BAVc vs. TAVn and TAVc vs. TAVn, which represent 56 and 84% of significant genes in the first and second comparisons, respectively. This indicates that extra genes and pathways were altered in BAVc. Shared pathways between calcified (BAVc and TAVc) and normal (TAVn) aortic valves were also more extensively altered in BAVc. The top pathway enriched for genes differentially expressed in calcified compared with normal valves was fibrosis, which supports the remodeling process as a therapeutic target. These findings are relevant to understanding the molecular basis of AS in patients with bicuspid and tricuspid valves.
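
    The 56% and 84% figures quoted in this abstract follow directly from the reported gene counts. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative only and do not come from the study.

```python
# Differentially expressed gene counts as reported in the abstract: (up, down).
bavc_vs_tavn = (462, 282)   # calcified bicuspid vs. normal tricuspid
tavc_vs_tavn = (329, 170)   # calcified tricuspid vs. normal tricuspid
concordant = (273, 147)     # altered in the same direction in both comparisons


def percent_concordant(shared, comparison):
    """Share of a comparison's significant genes that are concordant, in percent."""
    return 100 * sum(shared) / sum(comparison)


pct_first = round(percent_concordant(concordant, bavc_vs_tavn))
pct_second = round(percent_concordant(concordant, tavc_vs_tavn))
print(pct_first, pct_second)
```

    With 420 concordant genes out of 744 and 499 significant genes, the two rounded values match the percentages stated in the abstract.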

  4. A Mammalian microRNA Expression Atlas Based on Small RNA Library Sequencing

    PubMed Central

    Landgraf, Pablo; Rusu, Mirabela; Sheridan, Robert; Sewer, Alain; Iovino, Nicola; Aravin, Alexei; Pfeffer, Sébastien; Rice, Amanda; Kamphorst, Alice O.; Landthaler, Markus; Lin, Carolina; Socci, Nicholas D.; Hermida, Leandro; Fulci, Valerio; Chiaretti, Sabina; Foà, Robin; Schliwka, Julia; Fuchs, Uta; Novosel, Astrid; Müller, Roman-Ulrich; Schermer, Bernhard; Bissels, Ute; Inman, Jason; Phan, Quang; Chien, Minchen; Weir, David B.; Choksi, Ruchi; De Vita, Gabriella; Frezzetti, Daniela; Trompeter, Hans-Ingo; Hornung, Veit; Teng, Grace; Hartmann, Gunther; Palkovits, Miklos; Di Lauro, Roberto; Wernet, Peter; Macino, Giuseppe; Rogler, Charles E.; Nagle, James W.; Ju, Jingyue; Papavasiliou, F. Nina; Benzing, Thomas; Lichter, Peter; Tam, Wayne; Brownstein, Michael J.; Bosio, Andreas; Borkhardt, Arndt; Russo, James J.; Sander, Chris; Zavolan, Mihaela; Tuschl, Thomas

    2007-01-01

    Summary MicroRNAs (miRNAs) are small non-coding regulatory RNAs that reduce stability and/or translation of fully or partially sequence-complementary target mRNAs. In order to identify miRNAs and to assess their expression patterns, we sequenced over 250 small RNA libraries from 26 different organ systems and cell types of human and rodents, enriched in neuronal as well as normal and malignant hematopoietic cells and tissues. We present expression profiles derived from clone count data and provide novel computational tools for their analysis. Unexpectedly, a relatively small set of miRNAs, many of which are ubiquitously expressed, account for most of the difference in miRNA profiles between cell lineages and tissues. This broad survey also provides detailed and accurate information about mature sequences, precursors, genome locations, maturation processes, inferred transcriptional units and conservation patterns. We also propose a subclassification scheme for miRNAs for assisting future experimental and computational functional analyses. PMID:17604727

  5. Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA

    PubMed Central

    2013-01-01

    Background Epstein-Barr virus (EBV) is a human herpesvirus implicated in cancer and autoimmune disorders. Little is known concerning the roles of RNA structure in this important human pathogen. This study provides the first comprehensive genome-wide survey of RNA and RNA structure in EBV. Results Novel EBV RNAs and RNA structures were identified by computational modeling and RNA-Seq analyses of EBV. Scans of the genomic sequences of four EBV strains (EBV-1, EBV-2, GD1, and GD2) and of the closely related Macacine herpesvirus 4 using the RNAz program discovered 265 regions with high probability of forming conserved RNA structures. Secondary structure models are proposed for these regions based on a combination of free energy minimization and comparative sequence analysis. The analysis of RNA-Seq data uncovered the first observation of a stable intronic sequence RNA (sisRNA) in EBV. The abundance of this sisRNA rivals that of the well-known and highly expressed EBV-encoded non-coding RNAs (EBERs). Conclusion This work identifies regions of the EBV genome likely to generate functional RNAs and RNA structures, provides structural models for these regions, and discusses potential functions suggested by the modeled structures. Enhanced understanding of the EBV transcriptome will guide future experimental analyses of the discovered RNAs and RNA structures. PMID:23937650

  6. Optimization of shRNA inhibitors by variation of the terminal loop sequence.

    PubMed

    Schopman, Nick C T; Liu, Ying Poi; Konstantinova, Pavlina; ter Brake, Olivier; Berkhout, Ben

    2010-05-01

    Gene silencing by RNA interference (RNAi) can be achieved by intracellular expression of a short hairpin RNA (shRNA) that is processed into the effective small interfering RNA (siRNA) inhibitor by the RNAi machinery. Previous studies indicate that shRNA molecules do not always reflect the activity of corresponding synthetic siRNAs that attack the same target sequence. One obvious difference between these two effector molecules is the hairpin loop of the shRNA. Most studies use the original shRNA design of the pSuper system, but no extensive study regarding optimization of the shRNA loop sequence has been performed. We tested the impact of different hairpin loop sequences, varying in size and structure, on the activity of a set of shRNAs targeting HIV-1. We were able to transform weak inhibitors into intermediate or even strong shRNA inhibitors by replacing the loop sequence. We demonstrate that the efficacy of these optimized shRNA inhibitors is improved significantly in different cell types due to increased siRNA production. These results indicate that the loop sequence is an essential part of the shRNA design. The optimized shRNA loop sequence is generally applicable for RNAi knockdown studies, and will allow us to develop a more potent gene therapy against HIV-1.

  7. Comparative RNA sequencing reveals substantial genetic variation in endangered primates.

    PubMed

    Perry, George H; Melsted, Páll; Marioni, John C; Wang, Ying; Bainer, Russell; Pickrell, Joseph K; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D; Stephens, Matthew; Pritchard, Jonathan K; Gilad, Yoav

    2012-04-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success.

  8. Modulations of RNA sequences by cytokinin in pumpkin cotyledons

    SciTech Connect

    Chang, C.; Ertl, J.; Chen, C.

    1987-04-01

    Polyadenylated mRNAs from excised pumpkin cotyledons treated with or without 10⁻⁴ M benzyladenine (BA) for various time periods in suspension culture were assayed by in vitro translation in the presence of [³⁵S]methionine. The radioactive polypeptides were analyzed by one- and two-dimensional polyacrylamide gel electrophoresis. Specific sequences of mRNAs were enhanced, reduced, induced, or suppressed by the hormone within 60 min of the application of BA to the cotyledons. Four independent cDNA clones of cytokinin-modulated mRNAs have been selected and characterized. RNA blot hybridization using the four cDNA probes also indicates that the levels of specific mRNAs are modulated upward or downward by the hormone.

  9. A minimal ribosomal RNA: sequence and secondary structure of the 9S kinetoplast ribosomal RNA from Leishmania tarentolae.

    PubMed Central

    de la Cruz, V F; Lake, J A; Simpson, A M; Simpson, L

    1985-01-01

    The portion of the Leishmania tarentolae kinetoplast maxicircle DNA encoding the 9S RNA gene was sequenced, and the 5' and 3' ends of the transcript were determined. A secondary structure for the 9S RNA was determined based on the Escherichia coli 16S model. The 610-nucleotide 9S RNA exhibits a minimal secondary structure in which all four domains of the E. coli 16S structure are preserved. Within domains, however, some stems and loops have been greatly reduced or eliminated entirely. It is presumed that these reduced domains represent the minimal essential small ribosomal RNA secondary structures necessary for a functional ribosome. Alignment of the L. tarentolae 9S rRNA sequence with the published Trypanosoma brucei 9S rRNA sequence shows a nucleotide similarity of 84% and a transversion/transition ratio of 1.66. PMID:3856267
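
    Nucleotide similarity and a transversion/transition ratio, the two statistics quoted in this abstract, can be computed from any pairwise alignment. The Python sketch below is a minimal illustration under stated assumptions: the inputs are two equal-length aligned strings over A, C, G, U with "-" for gaps, gap columns are skipped, and the toy sequences are invented. How the original authors treated gaps is not stated in the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def compare_aligned(seq_a, seq_b):
    """Return (percent identity, transversion/transition ratio).

    Assumes two aligned sequences of equal length; columns containing
    a gap character '-' are ignored (an assumption of this sketch).
    """
    matches = transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    identity = 100 * matches / compared if compared else 0.0
    ratio = transversions / transitions if transitions else float("inf")
    return identity, ratio


# Hypothetical toy alignment, not the 9S rRNA sequences.
identity, tv_ts = compare_aligned("GAUCGAUCGA", "GGUCCAUAGA")
print(identity, tv_ts)
```

    In the toy alignment, seven of ten columns match, one mismatch is a transition (A/G) and two are transversions (G/C and C/A).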

  10. Frequency distribution of pre-messenger RNA sequences in polyadenylated and non-polyadenylated nuclear RNA from Friend cells.

    PubMed Central

    Balmain, A; Minty, A J; Birnie, G D

    1980-01-01

    Hybridisation of cDNA probes for abundant and rare polysomal polyadenylated RNAs with polyadenylated and non-polyadenylated nuclear RNA from Friend cells indicated that the abundant polysomal polyadenylated RNA sequences were present at a higher concentration in the nucleus than rare polysomal sequences, but at a reduced range of concentrations. The ratio of the concentrations of abundant and rare sequences was about 3 in non-polyadenylated nuclear RNA, 9 in polyadenylated nuclear RNA and 13 in polysomal polyadenylated RNA. This suggests that polyadenylation may play a role in the quantitative selection of sequences for transport to the cytoplasm. Polyadenylation cannot be the only signal for transport, since a highly complex population of nucleus-confined polyadenylated molecules exists, each of which is present on average at less than one copy per cell. PMID:7433127

  11. Control of the rescue and replication of Semliki Forest virus recombinants by the insertion of miRNA target sequences.

    PubMed

    Ratnik, Kaspar; Viru, Liane; Merits, Andres

    2013-01-01

    Due to their broad cell- and tissue-tropism, alphavirus-based replication-competent vectors are of particular interest for anti-cancer therapy. These properties may, however, be potentially hazardous unless the virus infection is controlled. While the RNA genome of alphaviruses precludes the standard control techniques, host miRNAs can be used to down-regulate viral replication. In this study, target sites from ubiquitous miRNAs and those of miRNAs under-represented in cervical cancer cells were inserted into replication-competent DNA/RNA layered vectors of Semliki Forest virus. It was found that in order to achieve the most efficient suppression of recombinant virus rescue, the introduced target sequences must be fully complementary to those of the corresponding miRNAs. Target sites of ubiquitous miRNAs, introduced into the 3' untranslated region of the viral vector, profoundly reduced the rescue of recombinant viruses. Insertion of the same miRNA targets into coding region of the viral vector was approximately 300-fold less effective. Viruses carrying these miRNAs were genetically unstable and rapidly lost the target sequences. This process was delayed, but not completely prevented, by miRNA inhibitors. Target sites of miRNA under-represented in cervical cancer cells had much smaller but still significant effects on recombinant virus rescue in cervical cancer-derived HeLa cells. Over-expression of miR-214, one of these miRNAs, reduced replication of the targeted virus. Though the majority of rescued viruses maintained the introduced miRNA target sequences, genomes with deletions of these sequences were also detected. Thus, the low-level repression of rescue and replication of targeted virus in HeLa cells was still sufficient to cause genetic instability.

  12. Sequence of the 16S ribosomal RNA from Halobacterium volcanii, an archaebacterium

    NASA Technical Reports Server (NTRS)

    Gupta, R.; Lanter, J. M.; Woese, C. R.

    1983-01-01

    The sequence of the 16S ribosomal RNA (rRNA) from the archaebacterium Halobacterium volcanii has been determined by DNA sequencing methods. The archaebacterial rRNA is similar to its eubacterial counterpart in secondary structure. Although it is closer in sequence to the eubacterial 16S rRNA than to the eukaryotic 16S-like rRNA, the H. volcanii sequence also shows certain points of specific similarity to its eukaryotic counterpart. Since the H. volcanii sequence is closer to both the eubacterial and the eukaryotic sequences than these two are to one another, it follows that the archaebacterial sequence resembles their common ancestral sequence more closely than does either of the other two versions.

  13. High-throughput illumina strand-specific RNA sequencing library preparation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Conventional Illumina RNA-Seq does not have the resolution to decode the complex eukaryote transcriptome due to the lack of RNA polarity information. Strand-specific RNA sequencing (ssRNA-Seq) can overcome these limitations and as such is better suited for genome annotation, de novo transcriptome as...

  14. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  15. Presence of tadpole and adult globin RNA sequences in oocytes of Xenopus laevis

    PubMed Central

    Perlman, S. M.; Ford, P. J.; Rosbash, M. M.

    1977-01-01

    Complementary DNA transcribed from adult Xenopus laevis globin mRNA was used to assay ovary RNA from Xenopus for the presence of globin sequences by RNA·cDNA hybridization. These sequences are present at approximately the same concentration as the majority of poly(A)-containing ovary sequences. The sequences are also found at approximately 200,000 copies per cell in poly(A)-containing RNA extracted from mature oocytes. To rule out contamination of the oocytes with somatic cells, two additional experiments were performed. First, RNA isolated from ovulated unfertilized eggs, which are devoid of somatic cells, was also shown to contain the globin sequences. Second, globin mRNA was isolated from Xenopus tadpoles. Adult globin mRNA is free of the tadpole sequence and no homology was detected between adult and tadpole globin RNA. The ovary was shown to contain tadpole globin RNA at nearly the same concentration as the adult sequences. Thus, the results cannot be explained by contamination with erythroid cells, which should contain only the adult sequence. The swimming tadpole, which possesses an active circulatory system, was also assayed for the tadpole and adult globin sequences. Whereas the adult sequences are present at approximately the same concentration as in the mature oocyte, the concentration of the tadpole sequences increases at least 300-fold in the first 3 days following fertilization. PMID:269434

  16. Single-Cell RNA Sequencing of the Bronchial Epithelium in Smokers With Lung Cancer

    DTIC Science & Technology

    2015-07-01

    AWARD NUMBER: W81XWH-14-1-0234. TITLE: Single-Cell RNA Sequencing of the Bronchial Epithelium in Smokers With Lung Cancer. ...single cell RNA sequencing on airway epithelial cells obtained from smokers with and without lung cancer to identify cell-type dependent gene expression

  17. Mutations in the yeast RNA14 and RNA15 genes result in an abnormal mRNA decay rate; sequence analysis reveals an RNA-binding domain in the RNA15 protein.

    PubMed Central

    Minvielle-Sebastia, L; Winsor, B; Bonneaud, N; Lacroute, F

    1991-01-01

    In Saccharomyces cerevisiae, temperature-sensitive mutations in the genes RNA14 and RNA15 correlate with a reduction of mRNA stability and poly(A) tail length. Although mRNA transcription is not abolished in these mutants, the transcripts are rapidly deadenylated as in a strain carrying an RNA polymerase B(II) temperature-sensitive mutation. This suggests that the primary defect could be in the control of the poly(A) status of the mRNAs and that the fast decay rate may be due to the loss of this control. By complementation of their temperature-sensitive phenotype, we have cloned the wild-type genes. They are essential for cell viability and are unique in the haploid genome. The RNA14 gene, located on chromosome H, is transcribed as three mRNAs, one major and two minor, which are 2.2, 1.5, and 1.1 kb in length. The RNA15 gene gives rise to a single 1.2-kb transcript and maps to chromosome XVI. Sequence analysis indicates that RNA14 encodes a 636-amino-acid protein with a calculated molecular weight of 75,295. No homology was found between RNA14 and RNA15 or between RNA14 and other proteins contained in data banks. The RNA15 DNA sequence predicts a protein of 296 amino acids with a molecular weight of 32,770. Sequence comparison reveals an N-terminal putative RNA-binding domain in the RNA15-encoded protein, followed by a glutamine and asparagine stretch similar to the opa sequences. Both RNA14 and RNA15 wild-type genes, when cloned on a multicopy plasmid, are able to suppress the temperature-sensitive phenotype of strains bearing either the rna14 or the rna15 mutation, suggesting that the encoded proteins could interact with each other. PMID:1674817

  18. Depletion of tRNA-halves enables effective small RNA sequencing of low-input murine serum samples

    PubMed Central

    Van Goethem, Alan; Yigit, Nurten; Everaert, Celine; Moreno-Smith, Myrthala; Mus, Liselot M.; Barbieri, Eveline; Speleman, Frank; Mestdagh, Pieter; Shohet, Jason; Van Maerken, Tom; Vandesompele, Jo

    2016-01-01

    The ongoing ascent of sequencing technologies has enabled researchers to gain unprecedented insights into the RNA content of biological samples. MiRNAs, a class of small non-coding RNAs, play a pivotal role in regulating gene expression. The discovery that miRNAs are stably present in circulation has sparked interest in their potential use as minimally-invasive biomarkers. However, sequencing of blood-derived samples (serum, plasma) is challenging due to the often low RNA concentration, poor RNA quality and the presence of highly abundant RNAs that dominate sequencing libraries. In murine serum for example, the high abundance of tRNA-derived small RNAs called 5′ tRNA halves hampers the detection of other small RNAs, like miRNAs. We therefore evaluated two complementary approaches for targeted depletion of 5′ tRNA halves in murine serum samples. Using a protocol based on biotinylated DNA probes and streptavidin coated magnetic beads we were able to selectively deplete 95% of the targeted 5′ tRNA half molecules. This allowed an unbiased enrichment of the miRNA fraction resulting in a 6-fold increase of mapped miRNA reads and 60% more unique miRNAs detected. Moreover, when comparing miRNA levels in tumor-carrying versus tumor-free mice, we observed a three-fold increase in differentially expressed miRNAs. PMID:27901112

  19. Rational experiment design for sequencing-based RNA structure mapping.

    PubMed

    Aviran, Sharon; Pachter, Lior

    2014-12-01

    Structure mapping is a classic experimental approach for determining nucleic acid structure that has gained renewed interest in recent years following advances in chemistry, genomics, and informatics. The approach encompasses numerous techniques that use different means to introduce nucleotide-level modifications in a structure-dependent manner. Modifications are assayed via cDNA fragment analysis, using electrophoresis or next-generation sequencing (NGS). The recent advent of NGS has dramatically increased the throughput, multiplexing capacity, and scope of RNA structure mapping assays, thereby opening new possibilities for genome-scale, de novo, and in vivo studies. From an informatics standpoint, NGS is more informative than prior technologies by virtue of delivering direct molecular measurements in the form of digital sequence counts. Motivated by these new capabilities, we introduce a novel model-based in silico approach for quantitative design of large-scale multiplexed NGS structure mapping assays, which takes advantage of the direct and digital nature of NGS readouts. We use it to characterize the relationship between controllable experimental parameters and the precision of mapping measurements. Our results highlight the complexity of these dependencies and shed light on relevant tradeoffs and pitfalls, which can be difficult to discern by intuition alone. We demonstrate our approach by quantitatively assessing the robustness of SHAPE-Seq measurements, obtained by multiplexing SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) chemistry in conjunction with NGS. We then utilize it to elucidate design considerations in advanced genome-wide approaches for probing the transcriptome, which recently obtained in vivo information using dimethyl sulfate (DMS) chemistry.

  20. Escherichia coli 16S rRNA 3'-end formation requires a distal transfer RNA sequence at a proper distance.

    PubMed Central

    Srivastava, A K; Schlessinger, D

    1989-01-01

    The 16S rRNA species in bacterial precursor rRNAs is followed by two evolutionarily conserved features: (i) a double-stranded stem formed by complementary sequences adjacent to the 5' and 3' ends of the 16S rRNA; and (ii) a 3'-transfer RNA sequence. To assess the possible role of these features, plasmid constructs with precursor-specific features deleted were tested for their capacity to form mature rRNA. Stem-forming sequences were dispensable for both 5' and 3' terminus formation, whereas an intact spacer tRNA positioned more than 24 nucleotides downstream of the 16S RNA sequence was required for correct 3'-end maturation. These results suggest that spacer tRNA at an appropriate location helps form a conformation obligate for pre-rRNA processing, perhaps by binding to a nascent binding site in preribosomes. Thus, spacer tRNAs may be obligate participants in ribosome formation. PMID:2684637

  1. Equally parsimonious pathways through an RNA sequence space are not equally likely

    NASA Technical Reports Server (NTRS)

    Lee, Y. H.; DSouza, L. M.; Fox, G. E.

    1997-01-01

    An experimental system for determining the potential ability of sequences resembling 5S ribosomal RNA (rRNA) to perform as functional 5S rRNAs in vivo in the Escherichia coli cellular environment was devised previously. Presumably, the only 5S rRNA sequences that would have been fixed by ancestral populations are ones that were functionally valid, and hence the actual historical paths taken through RNA sequence space during 5S rRNA evolution would have most likely utilized valid sequences. Herein, we examine the potential validity of all sequence intermediates along alternative equally parsimonious trajectories through RNA sequence space which connect two pairs of sequences that had previously been shown to behave as valid 5S rRNAs in E. coli. The first trajectory requires a total of four changes. The 14 sequence intermediates provide 24 apparently equally parsimonious paths by which the transition could occur. The second trajectory involves three changes, six intermediate sequences, and six potentially equally parsimonious paths. In total, only eight of the 20 sequence intermediates were found to be clearly invalid. As a consequence of the position of these invalid intermediates in the sequence space, seven of the 30 possible paths consisted of exclusively valid sequences. In several cases, the apparent validity/invalidity of the intermediate sequences could not be anticipated on the basis of current knowledge of the 5S rRNA structure. This suggests that the interdependencies in RNA sequence space may be more complex than currently appreciated. If ancestral sequences predicted by parsimony are to be regarded as actual historical sequences, then the present results would suggest that they should also satisfy a validity requirement and that, in at least limited cases, this conjecture can be tested experimentally.
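
    The counts quoted above follow from simple combinatorics: two sequences that differ at k sites are joined by k! shortest paths passing through 2^k - 2 intermediates (24 paths and 14 intermediates for k = 4; 6 and 6 for k = 3). A minimal sketch that enumerates these paths and counts those avoiding a set of invalid intermediates is given below. The function names and the example invalid set are illustrative assumptions, not the authors' data or code.

```python
from itertools import permutations


def parsimonious_paths(k):
    """Enumerate all shortest paths between two sequences differing at k sites.

    Each intermediate is encoded as a frozenset of the sites already changed;
    the two endpoints themselves are not included in a path.
    """
    paths = []
    for order in permutations(range(k)):
        state = frozenset()
        path = []
        for site in order[:-1]:  # the final change reaches the endpoint
            state = state | {site}
            path.append(state)
        paths.append(tuple(path))
    return paths


def intermediates(k):
    """Return the set of distinct intermediates visited by any shortest path."""
    return {state for path in parsimonious_paths(k) for state in path}


def count_valid_paths(k, invalid):
    """Count shortest paths that never pass through an invalid intermediate."""
    return sum(
        1
        for path in parsimonious_paths(k)
        if not any(state in invalid for state in path)
    )


if __name__ == "__main__":
    print(len(parsimonious_paths(4)), len(intermediates(4)))
    print(len(parsimonious_paths(3)), len(intermediates(3)))
    # Hypothetical example: the single-change intermediate at site 0 is invalid.
    print(count_valid_paths(3, {frozenset({0})}))
```

    A single invalid intermediate removes every path that passes through it, which is why a modest number of invalid sequences can eliminate most of the equally parsimonious paths.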

  2. Identification and characterization of microRNA sequences from bovine mammary epithelial cells.

    PubMed

    Bu, D P; Nan, X M; Wang, F; Loor, J J; Wang, J Q

    2015-03-01

    The bovine mammary gland is composed of various cell types including bovine mammary epithelial cells (BMEC). The use of BMEC to uncover the microRNA (miRNA) profile would allow us to obtain a more specific profile of miRNA sequences that could be associated with lactation and avoid interference from other cell types. The objective of this study was to characterize the miRNA sequences expressed in isolated BMEC. The miRNA were identified by Solexa sequencing technology (Illumina Inc., San Diego, CA). Furthermore, novel miRNA were uncovered by stem-loop reverse transcription-PCR and sequencing of PCR products. To detect tissue specificity, expression of novel miRNA sequences was measured by stem-loop RT-PCR and sequencing of PCR products in mammary gland, liver, adipose, ileum, spleen and kidney tissue from 3 lactating Holstein cows (50±10 d postpartum). After bioinformatics analysis, 12,323,451 reads were obtained by Solexa sequencing, of which 11,979,706 were clean reads, matching the bovine genome. Among clean reads, 9,428,122 belonged to miRNA sequences. Further analysis revealed that the miRNA bta-mir-184 had the most abundant expression, and 388 loci possessed the typical stem-loop structures matching known miRNA hairpins. In total, 38 loci with novel hairpins were identified as novel miRNA and were numbered from bta-U1 to bta-U38. One novel miRNA (bta-U21) was specific to mammary gland. Seven novel miRNA, including bta-U21, had tissue-restricted distribution. Uncovering the specific roles of these novel miRNA during lactation appears warranted.

  3. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    PubMed

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA.

  4. Evaluating Methods for Isolating Total RNA and Predicting the Success of Sequencing Phylogenetically Diverse Plant Transcriptomes

    PubMed Central

    Bruskiewich, Richard; Burris, Jason N.; Carrigan, Charlotte T.; Chase, Mark W.; Clarke, Neil D.; Covshoff, Sarah; dePamphilis, Claude W.; Edger, Patrick P.; Goh, Falicia; Graham, Sean; Greiner, Stephan; Hibberd, Julian M.; Jordon-Thaden, Ingrid; Kutchan, Toni M.; Leebens-Mack, James; Melkonian, Michael; Miles, Nicholas; Myburg, Henrietta; Patterson, Jordan; Pires, J. Chris; Ralph, Paula; Rolf, Megan; Sage, Rowan F.; Soltis, Douglas; Soltis, Pamela; Stevenson, Dennis; Stewart, C. Neal; Surek, Barbara; Thomsen, Christina J. M.; Villarreal, Juan Carlos; Wu, Xiaolei; Zhang, Yong; Deyholos, Michael K.; Wong, Gane Ka-Shu

    2012-01-01

    Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate total RNA from 1115 samples from 695 plant species in 324 families, which represents >900 million years of phylogenetic diversity from green algae through flowering plants, including many plants of economic importance. We then sequenced 629 of these samples on Illumina GAIIx and HiSeq platforms and performed a large comparative analysis to identify predictors of RNA quality and the diversity of putative genes (scaffolds) expressed within samples. Tissue types (e.g., leaf vs. flower) varied in RNA quality, sequencing depth and the number of scaffolds. Tissue age also influenced RNA quality but not the number of scaffolds ≥1000 bp. Overall, 36% of the variation in the number of scaffolds was explained by metrics of RNA integrity (RIN score), RNA purity (OD 260/230), sequencing platform (GAIIx vs HiSeq) and the amount of total RNA used for sequencing. However, our results show that the most commonly used measures of RNA quality (e.g., RIN) are weak predictors of the number of scaffolds because Illumina sequencing is robust to variation in RNA quality. These results provide novel insight into the methods that are most important in isolating high quality RNA for sequencing and assembling plant transcriptomes. The methods and recommendations provided here could increase the efficiency and decrease the cost of RNA sequencing for individual labs and genome centers. PMID:23185583

  5. Distinct tmRNA sequence elements facilitate RNase R engagement on rescued ribosomes for selective nonstop mRNA decay.

    PubMed

    Venkataraman, Krithika; Zafar, Hina; Karzai, A Wali

    2014-01-01

    trans-Translation, orchestrated by SmpB and tmRNA, is the principal eubacterial pathway for resolving stalled translation complexes. RNase R, the leading nonstop mRNA surveillance factor, is recruited to stalled ribosomes in a trans-translation dependent process. To elucidate the contributions of SmpB and tmRNA to RNase R recruitment, we evaluated Escherichia coli-Francisella tularensis chimeric variants of tmRNA and SmpB. This evaluation showed that while the hybrid tmRNA supported nascent polypeptide tagging and ribosome rescue, it suffered defects in facilitating RNase R recruitment to stalled ribosomes. To gain further insights, we used established tmRNA and SmpB variants that impact distinct stages of the trans-translation process. Analysis of select tmRNA variants revealed that the sequence composition and positioning of the ultimate and penultimate codons of the tmRNA ORF play a crucial role in recruiting RNase R to rescued ribosomes. Evaluation of defined SmpB C-terminal tail variants highlighted the importance of establishing the tmRNA reading frame, and provided valuable clues into the timing of RNase R recruitment to rescued ribosomes. Taken together, these studies demonstrate that productive RNase R-ribosome engagement requires active trans-translation, and suggest that RNase R captures the emerging nonstop mRNA at an early stage after establishment of the tmRNA ORF as the surrogate mRNA template.

  6. Characterising the Canine Oral Microbiome by Direct Sequencing of Reverse-Transcribed rRNA Molecules

    PubMed Central

    McDonald, James E.; Larsen, Niels; Pennington, Andrea; Connolly, John; Wallis, Corrin; Rooks, David J.; Hall, Neil; McCarthy, Alan J.; Allison, Heather E.

    2016-01-01

    PCR amplification and sequencing of phylogenetic markers, primarily Small Sub-Unit ribosomal RNA (SSU rRNA) genes, has been the paradigm for defining the taxonomic composition of microbiomes. However, ‘universal’ SSU rRNA gene PCR primer sets are likely to miss much of the diversity therein. We sequenced a library comprising purified and reverse-transcribed SSU rRNA (RT-SSU rRNA) molecules from the canine oral microbiome and compared it to a general bacterial 16S rRNA gene PCR amplicon library generated from the same biological sample. In addition, we have developed BIONmeta, a novel, open-source, computer package for the processing and taxonomic classification of the randomly fragmented RT-SSU rRNA reads produced. Direct RT-SSU rRNA sequencing revealed that 16S rRNA molecules belonging to the bacterial phyla Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria and Spirochaetes, were most abundant in the canine oral microbiome (92.5% of total bacterial SSU rRNA). The direct rRNA sequencing approach detected greater taxonomic diversity (1 additional phylum, 2 classes, 1 order, 10 families and 61 genera) when compared with general bacterial 16S rRNA amplicons from the same sample, simultaneously provided SSU rRNA gene inventories of Bacteria, Archaea and Eukarya, and detected significant numbers of sequences not recognised by ‘universal’ primer sets. Proteobacteria and Spirochaetes were found to be under-represented by PCR-based analysis of the microbiome, and this was due to primer mismatches and taxon-specific variations in amplification efficiency, validated by qPCR analysis of 16S rRNA amplicons from a mock community. This demonstrated the veracity of direct RT-SSU rRNA sequencing for molecular microbial ecology. PMID:27276347

  7. In silico detection of tRNA sequence features characteristic to aminoacyl-tRNA synthetase class membership

    PubMed Central

    Jakó, Éena; Ittzés, Péter; Szenes, Áron; Kun, Ádám; Szathmáry, Eörs; Pál, Gábor

    2007-01-01

    Aminoacyl tRNA synthetases (aaRS) are grouped into Class I and II based on primary and tertiary structure and enzyme properties suggesting two independent phylogenetic lineages. Analogously, tRNA molecules can also form two respective classes, based on the class membership of their corresponding aaRS. Although some aaRS–tRNA interactions are not extremely specific and require editing mechanisms to avoid misaminoacylation, most aaRS–tRNA interactions are rather stereospecific. Thus, class-specific aaRS features could be mirrored by class-specific tRNA features. However, previous investigations failed to detect conserved class-specific nucleotides. Here we introduce a discrete mathematical approach that evaluates not only class-specific ‘strictly present’, but also ‘strictly absent’ nucleotides. The disjoint subsets of these elements compose a unique partition, named extended consensus partition (ECP). By analyzing the ECP for both Class I and II tDNA sets from 50 (13 archaeal, 30 bacterial and 7 eukaryotic) species, we could demonstrate that class-specific tRNA sequence features do exist, although not in terms of strictly conserved nucleotides as it had previously been anticipated. This finding demonstrates that important information was hidden in tRNA sequences inaccessible for traditional statistical methods. The ECP analysis might contribute to the understanding of tRNA evolution and could enrich the sequence analysis tool repertoire. PMID:17704131
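
    The idea of tracking 'strictly absent' as well as 'strictly present' nucleotides can be illustrated with a small sketch. The code below is a simplified reading of the abstract, not the published ECP algorithm: it assumes pre-aligned, equal-length tDNA sequences, and the function names are hypothetical.

```python
def column_features(aligned, alphabet="ACGT"):
    """For each alignment column, return (strictly_present, strictly_absent).

    strictly_present is the single nucleotide shared by every sequence in the
    column, or None if the column is variable; strictly_absent is the set of
    nucleotides never observed in that column.
    """
    features = []
    for column in zip(*aligned):
        seen = set(column)
        present = next(iter(seen)) if len(seen) == 1 else None
        absent = set(alphabet) - seen
        features.append((present, absent))
    return features


def class_specific_absent(class_a, class_b, alphabet="ACGT"):
    """Per column, nucleotides never seen in class_a but seen in class_b."""
    result = []
    pairs = zip(column_features(class_a, alphabet),
                column_features(class_b, alphabet))
    for (_, absent_a), (_, absent_b) in pairs:
        result.append(absent_a - absent_b)
    return result


if __name__ == "__main__":
    # Toy alignments, invented purely for illustration.
    class_one = ["ACG", "ACT"]
    class_two = ["AGG", "ATG"]
    print(column_features(class_one))
    print(class_specific_absent(class_one, class_two))
```

    In this toy case the second column has no nucleotide strictly conserved across both classes, yet G and T are absent from one class only, which is the kind of signal a search for strictly conserved nucleotides alone would miss.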

  8. Rapid Amplification of cDNA Ends for RNA Transcript Sequencing in Staphylococcus.

    PubMed

    Miller, Eric

    2016-01-01

    Rapid amplification of cDNA ends (RACE) is a technique that was developed to swiftly and efficiently amplify full-length RNA molecules in which the terminal ends have not been characterized. Current usage of this procedure has been more focused on sequencing and characterizing RNA 5' and 3' untranslated regions. Herein is described an adapted RACE protocol to amplify bacterial RNA transcripts.

  9. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  10. Evaluating Quality of Aged Archival Formalin-Fixed Paraffin-Embedded Samples for RNA-Sequencing

    EPA Science Inventory

    Archival formalin-fixed paraffin-embedded (FFPE) samples offer a vast, untapped source of genomic data for biomarker discovery. However, the quality of FFPE samples is often highly variable, and conventional methods to assess RNA quality for RNA-sequencing (RNA-seq) are not infor...

  11. Analysis of long noncoding RNA and mRNA using RNA sequencing during the differentiation of intramuscular preadipocytes in chicken

    PubMed Central

    Zhang, Tao; Zhang, Xiangqian; Han, Kunpeng; Zhang, Genxi; Wang, Jinyu; Xie, Kaizhou; Xue, Qian; Fan, Xiaomei

    2017-01-01

    Long noncoding RNAs (lncRNAs) regulate metabolic tissue development and function, including adipogenesis. However, little is known about the function and profile of lncRNAs in intramuscular preadipocyte differentiation in chicken. Here, we identified lncRNAs in chicken intramuscular preadipocytes at different differentiation stages using RNA sequencing. A total of 1,311,382,604 clean reads and 25,435 lncRNAs were obtained from 12 samples. In total, 7,433 differentially expressed genes (4,698 lncRNAs and 2,735 mRNAs) were identified by pairwise comparison. These 7,433 differentially expressed genes were grouped into 11 clusters based on their expression patterns by K-means clustering. Using Weighted Gene Coexpression Network Analysis, we identified four stage-specific modules positively related to I0, I2, I4, and I6 stages and two stage-specific modules negatively related to I0 and I2 stages, respectively. Many well-known and novel pathways associated with intramuscular preadipocyte differentiation were identified. We also identified hub genes in each stage-specific module and visualized them in Cytoscape. Our analysis revealed many highly-connected genes, including XLOC_058593, BMP3, MYOD1, and LAMP3. This study provides a valuable resource for chicken lncRNA study and improves our understanding of the biology of preadipocyte differentiation in chicken. PMID:28199418

  12. Analysis of long noncoding RNA and mRNA using RNA sequencing during the differentiation of intramuscular preadipocytes in chicken.

    PubMed

    Zhang, Tao; Zhang, Xiangqian; Han, Kunpeng; Zhang, Genxi; Wang, Jinyu; Xie, Kaizhou; Xue, Qian; Fan, Xiaomei

    2017-01-01

    Long noncoding RNAs (lncRNAs) regulate metabolic tissue development and function, including adipogenesis. However, little is known about the function and profile of lncRNAs in intramuscular preadipocyte differentiation in chicken. Here, we identified lncRNAs in chicken intramuscular preadipocytes at different differentiation stages using RNA sequencing. A total of 1,311,382,604 clean reads and 25,435 lncRNAs were obtained from 12 samples. In total, 7,433 differentially expressed genes (4,698 lncRNAs and 2,735 mRNAs) were identified by pairwise comparison. These 7,433 differentially expressed genes were grouped into 11 clusters based on their expression patterns by K-means clustering. Using Weighted Gene Coexpression Network Analysis, we identified four stage-specific modules positively related to I0, I2, I4, and I6 stages and two stage-specific modules negatively related to I0 and I2 stages, respectively. Many well-known and novel pathways associated with intramuscular preadipocyte differentiation were identified. We also identified hub genes in each stage-specific module and visualized them in Cytoscape. Our analysis revealed many highly-connected genes, including XLOC_058593, BMP3, MYOD1, and LAMP3. This study provides a valuable resource for chicken lncRNA study and improves our understanding of the biology of preadipocyte differentiation in chicken.

  13. Predicting RNA-binding residues from evolutionary information and sequence conservation

    PubMed Central

    2010-01-01

    Background: RNA-binding proteins (RBPs) play crucial roles in post-transcriptional control of RNA. RBPs are designed to efficiently recognize specific RNA sequences after the RNA is derived from the DNA sequence. To satisfy diverse functional requirements, RNA-binding proteins are composed of multiple blocks of RNA-binding domains (RBDs) presented in various structural arrangements to provide versatile functions. The ability to computationally predict RNA-binding residues in an RNA-binding protein can help biologists guide site-directed mutagenesis in wet-lab experiments. Results: The proposed prediction framework, named “ProteRNA”, combines an SVM-based classifier with conserved residue discovery by WildSpan to identify the residues that interact with RNA in an RNA-binding protein. Although these conserved residues can be either functionally or structurally conserved, they provide clues to the important residues in a protein sequence. On the independent testing dataset, ProteRNA delivered an overall accuracy of 89.78%, MCC of 0.2628, F-score of 0.3075, and F0.5-score of 0.3546. Conclusions: This article presents the design of a sequence-based predictor that aims to identify the RNA-binding residues in an RNA-binding protein by combining machine learning and pattern mining approaches. RNA-binding proteins have diverse functions while interacting with different categories of RNAs because these proteins are composed of multiple copies of RNA-binding domains presented in various structural arrangements to expand their functional repertoire. Furthermore, predicting RNA-binding residues in an RNA-binding protein can help biologists guide site-directed mutagenesis in wet-lab experiments. PMID:21143803
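
    The reported metrics combine a high accuracy with a modest MCC and F-score, a pattern that typically arises when the positive class (here, binding residues) is a small minority. The sketch below shows the standard definitions of these metrics computed from a confusion matrix; the example counts are invented for illustration and are not taken from the ProteRNA evaluation.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical, imbalanced confusion matrix (not from the paper).
    tp, tn, fp, fn = 30, 870, 40, 60
    print("accuracy:", accuracy(tp, tn, fp, fn))
    print("MCC:", mcc(tp, tn, fp, fn))
    print("F1:", f_beta(tp, fp, fn))
    print("F0.5:", f_beta(tp, fp, fn, beta=0.5))
```

    Because true negatives dominate an imbalanced dataset, accuracy alone says little about how well binding residues are recovered, which is presumably why the MCC and F-scores are reported alongside it.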

  14. Sequences controlling histone H4 mRNA abundance.

    PubMed Central

    Capasso, O; Bleecker, G C; Heintz, N

    1987-01-01

    The post-transcriptional regulation of histone mRNA abundance is manifest both by accumulation of histone mRNA during the S phase, and by the rapid degradation of mature histone mRNA following the inhibition of DNA synthesis. We have constructed a comprehensive series of substitution mutants within a human H4 histone gene, introduced them into the mouse L cell genome, and analyzed their effects on the post-transcriptional control of the H4 mRNA. Our results demonstrate that most of the H4 mRNA is dispensable for proper regulation of histone mRNA abundance. However, recognition of the 3' terminus of the mature H4 mRNA is critically important for regulating its cytoplasmic half-life. Thus, this region of the mRNA functions both in the nucleus as a signal for proper processing of the mRNA terminus, and in the cytoplasm as an essential element in the control of mRNA stability. PMID:3608993

  15. RNA sequencing of the exercise transcriptome in equine athletes.

    PubMed

    Capomaccio, Stefano; Vitulo, Nicola; Verini-Supplizi, Andrea; Barcaccia, Gianni; Albiero, Alessandro; D'Angelo, Michela; Campagna, Davide; Valle, Giorgio; Felicetti, Michela; Silvestrelli, Maurizio; Cappelli, Katia

    2013-01-01

    The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of unannotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream) of the genes, suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases), as well as "nucleic acid binding" and "signal transduction activity" functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress. In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise-induced stress, including

  16. RNA2DNAlign: nucleotide resolution allele asymmetries through quantitative assessment of RNA and DNA paired sequencing data.

    PubMed

    Movassagh, Mercedeh; Alomran, Nawaf; Mudvari, Prakriti; Dede, Merve; Dede, Cem; Kowsari, Kamran; Restrepo, Paula; Cauley, Edmund; Bahl, Sonali; Li, Muzi; Waterhouse, Wesley; Tsaneva-Atanasova, Krasimira; Edwards, Nathan; Horvath, Anelia

    2016-12-15

    We introduce RNA2DNAlign, a computational framework for quantitative assessment of allele counts across paired RNA and DNA sequencing datasets. RNA2DNAlign is based on quantitation of the relative abundance of variant and reference read counts, followed by binomial tests for genotype and allelic status at SNV positions between compatible sequences. RNA2DNAlign detects positions with differential allele distribution, suggesting asymmetries due to regulatory/structural events. Based on the type of asymmetry, RNA2DNAlign outlines positions likely to be implicated in RNA editing, allele-specific expression or loss, somatic mutagenesis or loss-of-heterozygosity (the first three also in a tumor-specific setting). We applied RNA2DNAlign on 360 matching normal and tumor exomes and transcriptomes from 90 breast cancer patients from TCGA. Under high-confidence settings, RNA2DNAlign identified 2038 distinct SNV sites associated with one of the aforementioned asymmetries, the majority of which have not been linked to functionality before. The performance assessment shows very high specificity and sensitivity, due to the corroboration of signals across multiple matching datasets. RNA2DNAlign is freely available from http://github.com/HorvathLab/NGS as a self-contained binary package for 64-bit Linux systems.
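
    The kind of binomial test mentioned above can be sketched as follows. This is a generic exact two-sided binomial test on variant versus reference read counts, not RNA2DNAlign's actual implementation; the expected variant fraction of 0.5 (a heterozygous site expressed from both alleles) and the function name are assumptions made for illustration.

```python
from math import comb


def binom_test_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test.

    Returns the probability, under Binomial(n, p), of any outcome that is
    no more likely than observing k successes.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))


if __name__ == "__main__":
    # Hypothetical read counts at one SNV position in an RNA sample.
    variant_reads, reference_reads = 27, 3
    total = variant_reads + reference_reads
    p_value = binom_test_two_sided(variant_reads, total, p=0.5)
    print(f"variant fraction = {variant_reads / total:.2f}, p = {p_value:.3g}")
```

    A small p-value at a site that is heterozygous in the matched DNA would flag a candidate allelic asymmetry in the RNA; in practice, a multiple-testing correction across all SNV positions would also be needed.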

  17. Globin mRNA reduction for whole-blood transcriptome sequencing

    PubMed Central

    Krjutškov, Kaarel; Koel, Mariann; Roost, Anne Mari; Katayama, Shintaro; Einarsdottir, Elisabet; Jouhilahti, Eeva-Mari; Söderhäll, Cilla; Jaakma, Ülle; Plaas, Mario; Vesterlund, Liselotte; Lohi, Hannes; Salumets, Andres; Kere, Juha

    2016-01-01

    The transcriptome analysis of whole-blood RNA by sequencing holds promise for the identification and tracking of biomarkers; however, the high globin mRNA (gmRNA) content of erythrocytes hampers whole-blood and buffy coat analyses. We introduce a novel gmRNA locking assay (GlobinLock, GL) as a robust and simple gmRNA reduction tool that preserves RNA quality and saves time and cost. GL consists of a pair of gmRNA-specific oligonucleotides in RNA initial denaturation buffer that is effective immediately after RNA denaturation and adds only ten minutes of incubation to the whole cDNA synthesis procedure when compared to non-blood RNA analysis. We show that GL is fully effective not only for human samples but also for mouse and rat, and for the so far incompletely studied cow, dog and zebrafish. PMID:27515369

  18. Sequence-specific cleavage of dsRNA by Mini-III RNase.

    PubMed

    Głów, Dawid; Pianka, Dariusz; Sulej, Agata A; Kozłowski, Łukasz P; Czarnecka, Justyna; Chojnowski, Grzegorz; Skowronek, Krzysztof J; Bujnicki, Janusz M

    2015-03-11

    Ribonucleases (RNases) play a critical role in RNA processing and degradation by hydrolyzing phosphodiester bonds (exo- or endonucleolytically). Many RNases that cut RNA internally exhibit substrate specificity, but their target sites are usually limited to one or a few specific nucleotides in single-stranded RNA and often in a context of a particular three-dimensional structure of the substrate. Thus far, no RNase counterparts of restriction enzymes have been identified which could cleave double-stranded RNA (dsRNA) in a sequence-specific manner. Here, we present evidence for a sequence-dependent cleavage of long dsRNA by RNase Mini-III from Bacillus subtilis (BsMiniIII). Analysis of the sites cleaved by this enzyme in limited digest of bacteriophage Φ6 dsRNA led to the identification of a consensus target sequence. We defined nucleotide residues within the preferred cleavage site that affected the efficiency of the cleavage and were essential for the discrimination of cleavable versus non-cleavable dsRNA sequences. We have also determined that the loop α5b-α6, a distinctive structural element in Mini-III RNases, is crucial for the specific cleavage, but not for dsRNA binding. Our results suggest that BsMiniIII may serve as a prototype of a sequence-specific dsRNase that could possibly be used for targeted cleavage of dsRNA.

  19. Discriminative Prediction of A-To-I RNA Editing Events from DNA Sequence

    PubMed Central

    Sun, Jiangming; Singh, Pratibha; Bagge, Annika; Valtat, Bérengère; Vikman, Petter; Spégel, Peter; Mulder, Hindrik

    2016-01-01

    RNA editing is a post-transcriptional alteration of RNA sequences that, via insertions, deletions or base substitutions, can affect protein structure as well as RNA and protein expression. Recently, it has been suggested that RNA editing may be more frequent than previously thought. A great impediment, however, to a deeper understanding of this process is the paramount sequencing effort that needs to be undertaken to identify RNA editing events. Here, we describe an in silico approach, based on machine learning, that ameliorates this problem. Using 41-nucleotide-long DNA sequences, we show that novel A-to-I RNA editing events can be predicted from known A-to-I RNA editing events intra- and interspecies. The validity of the proposed method was verified in an independent experimental dataset. Using our approach, 203,202 putative A-to-I RNA editing events were predicted in the whole human genome. Out of these, 9% were previously reported. The remaining sites require further validation, e.g., by targeted deep sequencing. In conclusion, the approach described here is a useful tool to identify potential A-to-I RNA editing events without the requirement of extensive RNA sequencing. PMID:27764195
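
    As an illustration of how fixed-length inputs for such a predictor might be prepared, the sketch below extracts 41-nucleotide windows around every adenosine in a DNA string. The abstract states only the window length; centering the window on the candidate adenosine (20 flanking nucleotides on each side) is an assumption, as is the function name.

```python
def adenosine_windows(dna, flank=20):
    """Yield (position, window) for each 'A' with a full flank on both sides.

    With flank=20 every window is 41 nucleotides long and has the candidate
    adenosine at its center. Positions are 0-based.
    """
    dna = dna.upper()
    for i, base in enumerate(dna):
        if base == "A" and i >= flank and i + flank < len(dna):
            yield i, dna[i - flank : i + flank + 1]


if __name__ == "__main__":
    # Toy sequence, invented for illustration.
    sequence = "C" * 20 + "A" + "G" * 20
    for position, window in adenosine_windows(sequence):
        print(position, len(window), window)
```

    Each window could then be encoded (for example, one-hot per position) and passed to whichever classifier is being trained; the encoding and model are outside the scope of this sketch.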

  20. Selecting effective siRNA target sequences by using Bayes' theorem.

    PubMed

    Takasaki, Shigeru

    2009-10-01

    Short interfering RNA (siRNA) has been widely used for studying gene functions in mammalian cells but varies markedly in its gene silencing efficacy. Although many design rules/guidelines for effective siRNAs based on various criteria have been reported recently, there are few consistencies among them. This makes it difficult to select effective siRNA sequences in mammalian genes. Another shortcoming of most previously reported methods is that they cannot estimate the probability that a candidate sequence will silence the target gene. The analytical prediction method proposed in the present study uses Bayes' theorem to select effective siRNA target sequences from many possible candidate sequences. It is quite different from the previous score-based siRNA design techniques and can predict the probability that a candidate siRNA sequence will be effective. The results of evaluating it by applying it to recently reported effective and ineffective siRNA sequences for various genes indicate that it would be useful for many other genes. It should therefore be useful for selecting siRNA sequences effective for mammalian genes.
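
    To make the probabilistic idea concrete, the sketch below applies Bayes' theorem to estimate the probability that a candidate sequence is effective, given sets of known effective and ineffective sequences. The abstract does not describe the features or likelihood model used, so the position-wise nucleotide frequencies, the independence between positions (a naive Bayes assumption), the pseudocount, and all names here are illustrative assumptions rather than the published method.

```python
from collections import Counter

ALPHABET = "ACGU"


def position_likelihoods(seqs, pseudocount=1.0):
    """Per-position nucleotide probabilities with additive smoothing."""
    length = len(seqs[0])
    total = len(seqs) + pseudocount * len(ALPHABET)
    tables = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        tables.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return tables


def posterior_effective(candidate, effective, ineffective, prior=None):
    """P(effective | candidate) via Bayes' theorem with independent positions."""
    if prior is None:
        prior = len(effective) / (len(effective) + len(ineffective))
    joint_eff = prior
    joint_ineff = 1.0 - prior
    tables_eff = position_likelihoods(effective)
    tables_ineff = position_likelihoods(ineffective)
    for base, t_eff, t_ineff in zip(candidate, tables_eff, tables_ineff):
        joint_eff *= t_eff[base]
        joint_ineff *= t_ineff[base]
    return joint_eff / (joint_eff + joint_ineff)


if __name__ == "__main__":
    # Toy two-nucleotide "sequences", invented for illustration only.
    effective = ["AU", "AU"]
    ineffective = ["GC", "GC"]
    for candidate in ("AU", "AC", "GC"):
        print(candidate, posterior_effective(candidate, effective, ineffective))
```

    Unlike a score threshold, the returned value is directly interpretable as a probability, which is the advantage the abstract claims over score-based design rules.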

  1. A structural and primary sequence comparison of the viral RNA-dependent RNA polymerases

    PubMed Central

    Bruenn, Jeremy A.

    2003-01-01

    A systematic bioinformatic approach to identifying the evolutionarily conserved regions of proteins has verified the universality of a newly described conserved motif in RNA-dependent RNA polymerases (motif F). In combination with structural comparisons, this approach has defined two regions that may be involved in unwinding double-stranded RNA (dsRNA) for transcription. One of these is the N-terminal portion of motif F and the second is a large insertion in motif F present in the RNA-dependent RNA polymerases of some dsRNA viruses. PMID:12654997

  2. RNA Secondary Structures Having a Compatible Sequence of Certain Nucleotide Ratios.

    PubMed

    Barrett, Christopher L; Li, Thomas J X; Reidys, Christian M

    2016-11-01

    Given a random RNA secondary structure, S, we study RNA sequences having fixed ratios of nucleotides that are compatible with S. We perform this analysis for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. Our main result reads as follows: in the simplex of nucleotide ratios, there exists a convex region, in which, in the limit of long sequences, a random structure asymptotically almost surely (a.a.s.) has compatible sequence with these ratios and outside of which a.a.s. a random structure has no such compatible sequence. We localize this region for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. In particular, for GC-sequences (GC denoting the nucleotides guanine and cytosine, respectively) having a ratio of G nucleotides smaller than 1/3, a random RNA secondary structure without any minimum arc- and stack-length restrictions has a.a.s. no such compatible sequence. For sequences having a ratio of G nucleotides larger than 1/3, a random RNA secondary structure has a.a.s. such compatible sequences. We discuss our results in the context of various families of RNA structures.
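
    The notion of a sequence being compatible with a structure can be made concrete with a short sketch. Here a structure is written in dot-bracket notation (a representation chosen for this example, not taken from the abstract), and a sequence is compatible if every base pair joins two nucleotides permitted by the pairing rules. The helper names are hypothetical. For GC-sequences with only G-C pairs allowed, each base pair needs exactly one G and one C, which bounds the achievable G ratio.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def base_pairs(structure):
    """Return the list of (i, j) base pairs in a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(sequence, structure, allowed=WATSON_CRICK):
    """True if every base pair of the structure is an allowed nucleotide pair."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in allowed
               for i, j in base_pairs(structure))


def gc_g_ratio_bounds(structure):
    """(min, max) G ratio of a compatible GC-sequence with G-C pairs only.

    Each pair contributes exactly one G; unpaired positions may be all C
    (minimum) or all G (maximum).
    """
    n, p = len(structure), len(base_pairs(structure))
    return p / n, (n - p) / n


if __name__ == "__main__":
    print(is_compatible("GGGAAACCC", "(((...)))"))
    print(is_compatible("GGGAAAGCC", "(((...)))"))
    print(gc_g_ratio_bounds("((..))"))
```

    The bounds show why the fraction of paired positions in a structure constrains which nucleotide ratios admit a compatible sequence.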

  3. Use of 16S Ribosomal RNA Sequences to Infer Relationships among Archaebacteria.

    DTIC Science & Technology

    1987-04-16

    Archaebacteria; Eubacteria; Eukaryotes; 16S Ribosomal RNA; Phylogeny; rRNA; RNA Sequencing; Molecular Clock; Urkingdoms; r...16S rRNA data were used to infer the relationships among the archaebacteria, and of the archaebacteria to the eubacteria and eukaryotes. ur programs for...been published (1, 2, 16, 18). The analyses render untenable the suggestions of Lake and colleagues (Lake et al., 1985) that the eubacteria derive from

  4. Autocatalytic cyclization of an excised intervening sequence RNA is a cleavage-ligation reaction.

    PubMed

    Zaug, A J; Grabowski, P J; Cech, T R

    The intervening sequence (IVS) of the Tetrahymena ribosomal RNA precursor is excised as a linear RNA molecule which subsequently cyclizes itself in a protein-independent reaction. Cyclization involves cleavage of the linear IVS RNA 15 nucleotides from its 5' end and formation of a phosphodiester bond between the new 5' phosphate and the original 3'-hydroxyl terminus of the IVS. This recombination mechanism is analogous to that by which splicing of the precursor RNA is achieved. The circular molecules appear to have no direct function in RNA splicing, and we propose the cyclization serves to prevent unwanted RNA from driving the splicing reactions backwards.

  5. Melting temperature highlights functionally important RNA structure and sequence elements in yeast mRNA coding regions.

    PubMed

    Qi, Fei; Frishman, Dmitrij

    2017-03-07

    Secondary structure elements in the coding regions of mRNAs play an important role in gene expression and regulation, but distinguishing functional from non-functional structures remains challenging. Here we investigate the dependence of sequence-structure relationships in the coding regions on temperature based on the recent PARTE data by Wan et al. Our main finding is that the regions with high and low thermostability (high Tm and low Tm regions) are under evolutionary pressure to preserve RNA secondary structure and primary sequence, respectively. Sequences of low Tm regions display a higher degree of evolutionary conservation compared to high Tm regions. Low Tm regions are under strong synonymous constraint, while high Tm regions are not. These findings imply that high Tm regions contain thermo-stable functionally important RNA structures, which impose relaxed evolutionary constraint on sequence as long as the base-pairing patterns remain intact. By contrast, low thermostability regions contain single-stranded functionally important conserved RNA sequence elements accessible for binding by other molecules. We also find that theoretically predicted structures of paralogous mRNA pairs become more similar with growing temperature, while experimentally measured structures tend to diverge, which implies that the melting pathways of RNA structures cannot be fully captured by current computational approaches.

  6. High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases.

    PubMed

    Qin, Yidan; Yao, Jun; Wu, Douglas C; Nottingham, Ryan M; Mohr, Sabine; Hunicke-Smith, Scott; Lambowitz, Alan M

    2016-01-01

    Next-generation RNA-sequencing (RNA-seq) has revolutionized transcriptome profiling, gene expression analysis, and RNA-based diagnostics. Here, we developed a new RNA-seq method that exploits thermostable group II intron reverse transcriptases (TGIRTs) and used it to profile human plasma RNAs. TGIRTs have higher thermostability, processivity, and fidelity than conventional reverse transcriptases, plus a novel template-switching activity that can efficiently attach RNA-seq adapters to target RNA sequences without RNA ligation. The new TGIRT-seq method enabled construction of RNA-seq libraries from <1 ng of plasma RNA in <5 h. TGIRT-seq of RNA in 1-mL plasma samples from a healthy individual revealed RNA fragments mapping to a diverse population of protein-coding genes and long ncRNAs, which are enriched in intron and antisense sequences, as well as nearly all known classes of small ncRNAs, some of which have never before been seen in plasma. Surprisingly, many of the small ncRNA species were present as full-length transcripts, suggesting that they are protected from plasma RNases in ribonucleoprotein (RNP) complexes and/or exosomes. This TGIRT-seq method is readily adaptable for profiling of whole-cell, exosomal, and miRNAs, and for related procedures, such as HITS-CLIP and ribosome profiling.

  7. Identification of Dirofilaria immitis miRNA using illumina deep sequencing

    PubMed Central

    2013-01-01

    The heartworm Dirofilaria immitis is the causative agent of cardiopulmonary dirofilariosis in dogs and cats, which also infects a wide range of wild mammals and humans. The complex life cycle of D. immitis with several developmental stages in its invertebrate mosquito vectors and its vertebrate hosts indicates the importance of miRNA in growth and development, and their ability to regulate infection of mammalian hosts. This study identified the miRNA profiles of D. immitis of zoonotic significance by deep sequencing. A total of 1063 conserved miRNA candidates, including 68 anti-sense miRNA (miRNA*) sequences, were predicted by computational methods and could be grouped into 808 miRNA families. A significant bias towards family members, family abundance and sequence nucleotides was observed. Thirteen novel miRNA candidates were predicted by alignment with the Brugia malayi genome. Eleven out of 13 predicted miRNA candidates were verified by using a PCR-based method. Target genes of the novel miRNA candidates were predicted by using the heartworm transcriptome dataset. To our knowledge, this is the first report of miRNA profiles in D. immitis, which will contribute to a better understanding of the complex biology of this zoonotic filarial nematode and the molecular regulation roles of miRNA involved. Our findings may also become a useful resource for small RNA studies in other filarial parasitic nematodes. PMID:23331513

  8. Deep Sequencing of RNA from Ancient Maize Kernels

    PubMed Central

    Rasmussen, Morten; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Alquezar-Planas, David E.; Penfield, Steven; Brown, Terence A.; Vielle-Calzada, Jean-Philippe; Montiel, Rafael; Jørgensen, Tina; Odegaard, Nancy; Jacobs, Michael; Arriaza, Bernardo; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Willerslev, Eske; Gilbert, M. Thomas P.

    2013-01-01

    The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited – perhaps due to dogma associated with the fragility of RNA. We hypothesize that seeds offer a plausible refuge for long-term RNA survival, due to the fundamental role of RNA during seed germination. Using RNA-Seq on cDNA synthesized from nucleic acid extracts, we validate this hypothesis through demonstration of partial transcriptomal recovery from two sources of ancient maize kernels. The results suggest that ancient seed transcriptomics may offer a powerful new tool with which to study plant domestication. PMID:23326310

  9. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    PubMed Central

    Jenior, Matthew L.; Koumpouras, Charles C.; Westcott, Sarah L.; Highlander, Sarah K.

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting. PMID:27069806

  10. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    PubMed

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  11. The impact of RNA secondary structure on read start locations on the Illumina sequencing platform.

    PubMed

    Price, Adam; Garhyan, Jaishree; Gibas, Cynthia

    2017-01-01

    High-throughput sequencing is subject to sequence-dependent bias, which must be accounted for if researchers are to make precise measurements and draw accurate conclusions from their data. A widely studied source of bias in sequencing is the GC content bias, in which levels of GC content in a genomic region affect the number of reads produced during sequencing. Although some research has been performed on methods to correct for GC bias, there has been little effort to understand the underlying mechanism. The availability of sequencing protocols that target the specific location of structure in nucleic acid molecules enables us to investigate the underlying molecular origin of observed GC bias in sequencing. By applying a parallel analysis of RNA structure (PARS) protocol to bacterial genomes of varying GC content, we are able to observe the relationship between local RNA secondary structure and sequencing outcome, and to establish RNA secondary structure as the significant contributing factor to observed GC bias.

  12. The impact of RNA secondary structure on read start locations on the Illumina sequencing platform

    PubMed Central

    Price, Adam; Garhyan, Jaishree

    2017-01-01

    High-throughput sequencing is subject to sequence-dependent bias, which must be accounted for if researchers are to make precise measurements and draw accurate conclusions from their data. A widely studied source of bias in sequencing is the GC content bias, in which levels of GC content in a genomic region affect the number of reads produced during sequencing. Although some research has been performed on methods to correct for GC bias, there has been little effort to understand the underlying mechanism. The availability of sequencing protocols that target the specific location of structure in nucleic acid molecules enables us to investigate the underlying molecular origin of observed GC bias in sequencing. By applying a parallel analysis of RNA structure (PARS) protocol to bacterial genomes of varying GC content, we are able to observe the relationship between local RNA secondary structure and sequencing outcome, and to establish RNA secondary structure as the significant contributing factor to observed GC bias. PMID:28245230

  13. Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues.

    PubMed

    Lee, Je Hyuk; Daugharthy, Evan R; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C; Terry, Richard; Turczyk, Brian M; Yang, Joyce L; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M

    2015-03-01

    RNA-sequencing (RNA-seq) measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. In contrast, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq, our method enriches for context-specific transcripts over housekeeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d.

  14. Genome-Wide Probing of RNA Structures In Vitro Using Nucleases and Deep Sequencing.

    PubMed

    Wan, Yue; Qu, Kun; Ouyang, Zhengqing; Chang, Howard Y

    2016-01-01

    RNA structure probing is an important technique that studies the secondary and tertiary conformations of an RNA. While it was traditionally performed on one RNA at a time, recent advances in deep sequencing have enabled the secondary structure mapping of thousands of RNAs simultaneously. Here, we describe the method Parallel Analysis for RNA Structures (PARS), which couples double- and single-strand-specific nuclease probing to high-throughput sequencing. Upon cloning of the cleavage sites into a cDNA library, deep sequencing and mapping of reads to the transcriptome, the position of paired and unpaired bases along cellular RNAs can be identified. PARS can be performed under diverse solution conditions and on different organismal RNAs to provide genome-wide RNA structural information. This information can also be further used to constrain computational predictions to provide better RNA structure models under different conditions.
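
    The way paired and unpaired positions are read off from the two nuclease libraries can be sketched as a per-base log-ratio score. The sketch below assumes the commonly described form of the PARS score, log2 of double-strand-specific over single-strand-specific read counts at each position. The pseudocount of 1 and the function name are illustrative assumptions, not details taken from the abstract, and no normalization for sequencing depth is attempted.

```python
import math


def pars_scores(ds_counts, ss_counts, pseudocount=1.0):
    """Per-base log2 ratio of double-strand to single-strand nuclease reads.

    ds_counts[i] and ss_counts[i] are the numbers of reads starting at
    position i in the double-strand-specific and single-strand-specific
    libraries. The pseudocount (an assumption) avoids division by zero.
    Positive scores suggest a paired base, negative scores an unpaired one.
    """
    if len(ds_counts) != len(ss_counts):
        raise ValueError("count vectors must have the same length")
    return [
        math.log2((ds + pseudocount) / (ss + pseudocount))
        for ds, ss in zip(ds_counts, ss_counts)
    ]
```

    In practice the two libraries would first be scaled to comparable sequencing depth, which this sketch leaves out.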

  15. Sequence-specific RNA Photocleavage by Single-stranded DNA in Presence of Riboflavin

    NASA Astrophysics Data System (ADS)

    Zhao, Yongyun; Chen, Gangyi; Yuan, Yi; Li, Na; Dong, Juan; Huang, Xin; Cui, Xin; Tang, Zhuo

    2015-10-01

    Constant efforts have been made to develop new methods to realize sequence-specific RNA degradation, which could inhibit the expression of a targeted gene. Herein, by using an unmodified short DNA oligonucleotide for sequence recognition and an endogenous small molecule, vitamin B2 (riboflavin), as photosensitizer, we report a simple strategy to realize the sequence-specific photocleavage of targeted RNA. The DNA strand is complementary to the target sequence to form a DNA/RNA duplex containing a G•U wobble in the middle. The cleavage reaction goes through an oxidative elimination mechanism at the nucleoside downstream of the U of the G•U wobble in the duplex to obtain an unnatural RNA terminal, and the whole process is under tight control by using light as a switch, which means the cleavage could be carried out according to specific spatial and temporal requirements. The biocompatibility of this method makes the DNA strand in combination with riboflavin a promising molecular tool for RNA manipulation.

  16. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. 

  17. Nucleotide sequence from the coding region of rabbit β-globin messenger RNA

    PubMed Central

    Proudfoot, N.J.

    1976-01-01

    A sequence of 89 nucleotides from rabbit β-globin mRNA has been determined and is shown to code for residues 107 to 137 of the β-globin protein. In addition, a sequence heterogeneity has been identified within this 89 nucleotide long sequence which corresponds to a known polymorphic variant of rabbit β-globin. PMID:61580

  18. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences

    PubMed Central

    Xu, Zhenjiang; Mathews, David H.

    2011-01-01

    Motivation: With recent advances in sequencing, structural and functional studies of RNA lag behind the discovery of sequences. Computational analysis of RNA is increasingly important to reveal structure–function relationships with low cost and speed. The purpose of this study is to use multiple homologous sequences to infer a conserved RNA structure. Results: A new algorithm, called Multilign, is presented to find the lowest free energy RNA secondary structure common to multiple sequences. Multilign is based on Dynalign, which is a program that simultaneously aligns and folds two sequences to find the lowest free energy conserved structure. For Multilign, Dynalign is used to progressively construct a conserved structure from multiple pairwise calculations, with one sequence used in all pairwise calculations. A base pair is predicted only if it is contained in the set of low free energy structures predicted by all Dynalign calculations. In this way, Multilign improves prediction accuracy by keeping the genuine base pairs and excluding competing false base pairs. Multilign has computational complexity that scales linearly in the number of sequences. Multilign was tested on extensive datasets of sequences with known structure and its prediction accuracy is among the best of available algorithms. Multilign can run on long sequences (> 1500 nt) and an arbitrarily large number of sequences. Availability: The algorithm is implemented in ANSI C++ and can be downloaded as part of the RNAstructure package at: http://rna.urmc.rochester.edu Contact: david_mathews@urmc.rochester.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21193521
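
    The filtering rule stated above, that a base pair is kept only if every pairwise calculation predicts it, can be sketched as a set intersection. This is a minimal illustration and not the Multilign implementation: it assumes each pairwise calculation has already been reduced to a set of (i, j) base pairs in the coordinates of the shared sequence, and it performs no folding or alignment itself. The function name is hypothetical.

```python
def conserved_pairs(pair_sets):
    """Return the base pairs (i, j) present in every pairwise prediction.

    pair_sets is an iterable of collections of (i, j) tuples, one per
    pairwise calculation, all indexed on the shared sequence (an
    assumption of this sketch).
    """
    sets = [set(pairs) for pairs in pair_sets]
    if not sets:
        return set()
    # Keep only pairs supported by all calculations; pairs that appear in
    # just some of them are treated as competing false pairs and dropped.
    return set.intersection(*sets)
```

    For example, if three pairwise predictions all contain (1, 20) and (2, 19) but disagree on a third pair, only the two shared pairs survive. Because each additional sequence adds one more pairwise calculation, the number of sets grows linearly with the number of sequences, in line with the scaling described in the abstract.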

  19. RNA metabolism in isolated nuclei: processing and transport of immunoglobulin light chain sequences.

    PubMed Central

    Otegui, C; Patterson, R J

    1981-01-01

    Transport of prelabeled RNA from isolated myeloma nuclei is studied using conditions that permit RNA synthesis. Cytosol and spermidine are not required to maintain nuclear stability and inhibit RNA release. Omission of ATP or GTP decreases release by 25 to 40%. The stimulatory effect of ATP or GTP is not due to hydrolysis of the triphosphates by the nuclear envelope NTPase, since addition of quercetin (an inhibitor of this NTPase) has no effect on the quantity of RNA released. The size distribution and percentage of poly A-containing species released from nuclei incubated with or without ATP or the other rNTPs are identical. Hybridization analysis of nuclear RNA before the transport assay revealed mature and precursor κ light chain mRNA sequences. Following the transport assay, a significant fraction of κ mRNA precursors is chased into mature κ mRNA, which is found in both nuclear-retained and released RNA. PMID:6795596

  20. Excess of Yra1 RNA-Binding Factor Causes Transcription-Dependent Genome Instability, Replication Impairment and Telomere Shortening.

    PubMed

    Gavaldá, Sandra; Santos-Pereira, José M; García-Rubio, María L; Luna, Rosa; Aguilera, Andrés

    2016-04-01

    Yra1 is an essential nuclear factor of the evolutionarily conserved family of hnRNP-like export factors that when overexpressed impairs mRNA export and cell growth. To investigate further the relevance of proper Yra1 stoichiometry in the cell, we overexpressed Yra1 by transforming yeast cells with YRA1 intron-less constructs and analyzed its effect on gene expression and genome integrity. We found that YRA1 overexpression induces DNA damage and leads to a transcription-associated hyperrecombination phenotype that is mediated by RNA:DNA hybrids. In addition, it confers a genome-wide replication retardation as seen by reduced BrdU incorporation and accumulation of the Rrm3 helicase. In addition, YRA1 overexpression causes a cell senescence-like phenotype and telomere shortening. ChIP-chip analysis shows that overexpressed Yra1 is loaded to transcribed chromatin along the genome and to Y' telomeric regions, where Rrm3 is also accumulated, suggesting an impairment of telomere replication. Our work not only demonstrates that a proper stoichiometry of the Yra1 mRNA binding and export factor is required to maintain genome integrity and telomere homeostasis, but suggests that the cellular imbalance between transcribed RNA and specific RNA-binding factors may become a major cause of genome instability mediated by co-transcriptional replication impairment.

  1. Excess of Yra1 RNA-Binding Factor Causes Transcription-Dependent Genome Instability, Replication Impairment and Telomere Shortening

    PubMed Central

    Gavaldá, Sandra; Santos-Pereira, José M.; García-Rubio, María L.; Luna, Rosa; Aguilera, Andrés

    2016-01-01

    Yra1 is an essential nuclear factor of the evolutionarily conserved family of hnRNP-like export factors that when overexpressed impairs mRNA export and cell growth. To investigate further the relevance of proper Yra1 stoichiometry in the cell, we overexpressed Yra1 by transforming yeast cells with YRA1 intron-less constructs and analyzed its effect on gene expression and genome integrity. We found that YRA1 overexpression induces DNA damage and leads to a transcription-associated hyperrecombination phenotype that is mediated by RNA:DNA hybrids. In addition, it confers a genome-wide replication retardation as seen by reduced BrdU incorporation and accumulation of the Rrm3 helicase. In addition, YRA1 overexpression causes a cell senescence-like phenotype and telomere shortening. ChIP-chip analysis shows that overexpressed Yra1 is loaded to transcribed chromatin along the genome and to Y’ telomeric regions, where Rrm3 is also accumulated, suggesting an impairment of telomere replication. Our work not only demonstrates that a proper stoichiometry of the Yra1 mRNA binding and export factor is required to maintain genome integrity and telomere homeostasis, but suggests that the cellular imbalance between transcribed RNA and specific RNA-binding factors may become a major cause of genome instability mediated by co-transcriptional replication impairment. PMID:27035147

  2. A Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences

    NASA Astrophysics Data System (ADS)

    Regoli, Massimo

    2009-02-01

    The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for this new algorithm starts from the observation of nature, in particular from the observation of RNA behavior and some of its properties. RNA sequences have some sections called introns. Introns, a term derived from "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in pre-mRNA are not clear and are the subject of extensive research by biologists but, in our case, we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and an unnecessary behaviour in the access to the secret key to code the messages. In the RNA-Crypto System algorithm, the introns are sections of the ciphered message containing non-coding information, just as in the precursor mRNA.

  3. Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2.

    PubMed Central

    Brault, V; Hibrand, L; Candresse, T; Le Gall, O; Dunez, J

    1989-01-01

    The complete nucleotide sequence of Hungarian grapevine chrome mosaic nepovirus (GCMV) RNA2 has been determined. The RNA sequence is 4441 nucleotides in length, excluding the poly(A) tail. A polyprotein of 1324 amino acids with a calculated molecular weight of 146 kDa is encoded in a single long open reading frame extending from nucleotides 218 to 4190. This polyprotein is homologous with the protein encoded by the S strain of tomato black ring virus (TBRV) RNA2, the only other nepovirus sequenced so far. Direct sequencing of the viral coat protein and in vitro translation of transcripts derived from cDNA sequences demonstrate that, as for comoviruses, the coat protein is located at the carboxy terminus of the polyprotein. A model for the expression of GCMV RNA2 is presented. PMID:2798129

  4. Alterations of microRNA and microRNA-regulated messenger RNA expression in germinal center B-cell lymphomas determined by integrative sequencing analysis.

    PubMed

    Hezaveh, Kebria; Kloetgen, Andreas; Bernhart, Stephan H; Mahapatra, Kunal Das; Lenze, Dido; Richter, Julia; Haake, Andrea; Bergmann, Anke K; Brors, Benedikt; Burkhardt, Birgit; Claviez, Alexander; Drexler, Hans G; Eils, Roland; Haas, Siegfried; Hoffmann, Steve; Karsch, Dennis; Klapper, Wolfram; Kleinheinz, Kortine; Korbel, Jan; Kretzmer, Helene; Kreuz, Markus; Küppers, Ralf; Lawerenz, Chris; Leich, Ellen; Loeffler, Markus; Mantovani-Loeffler, Luisa; López, Cristina; McHardy, Alice C; Möller, Peter; Rohde, Marius; Rosenstiel, Philip; Rosenwald, Andreas; Schilhabel, Markus; Schlesner, Matthias; Scholz, Ingrid; Stadler, Peter F; Stilgenbauer, Stephan; Sungalee, Stéphanie; Szczepanowski, Monika; Trümper, Lorenz; Weniger, Marc A; Siebert, Reiner; Borkhardt, Arndt; Hummel, Michael; Hoell, Jessica I

    2016-11-01

    MicroRNA are well-established players in post-transcriptional gene regulation. However, information on the effects of microRNA deregulation mainly relies on bioinformatic prediction of potential targets, whereas proof of the direct physical microRNA/target messenger RNA interaction is mostly lacking. Within the International Cancer Genome Consortium Project "Determining Molecular Mechanisms in Malignant Lymphoma by Sequencing", we performed miRnome sequencing from 16 Burkitt lymphomas, 19 diffuse large B-cell lymphomas, and 21 follicular lymphomas. Twenty-two miRNA separated Burkitt lymphomas from diffuse large B-cell lymphomas/follicular lymphomas, of which 13 have shown regulation by MYC. Moreover, we found expression of three hitherto unreported microRNA. Additionally, we detected recurrent mutations of hsa-miR-142 in diffuse large B-cell lymphomas and follicular lymphomas, and editing of the hsa-miR-376 cluster, providing evidence for microRNA editing in lymphomagenesis. To interrogate the direct physical interactions of microRNA with messenger RNA, we performed Argonaute-2 photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation experiments. MicroRNA directly targeted 208 messenger RNA in the Burkitt lymphomas and 328 messenger RNA in the non-Burkitt lymphoma models. This integrative analysis discovered several regulatory pathways of relevance in lymphomagenesis including Ras, PI3K-Akt and MAPK signaling pathways, also recurrently deregulated in lymphomas by mutations. Our dataset reveals that messenger RNA deregulation through microRNA is a highly relevant mechanism in lymphomagenesis.

  5. Combined DECS Analysis and Next-Generation Sequencing Enable Efficient Detection of Novel Plant RNA Viruses.

    PubMed

    Yanagisawa, Hironobu; Tomita, Reiko; Katsu, Koji; Uehara, Takuya; Atsumi, Go; Tateda, Chika; Kobayashi, Kappei; Sekine, Ken-Taro

    2016-03-07

    The presence of high molecular weight double-stranded RNA (dsRNA) within plant cells is an indicator of infection with RNA viruses as these possess genomic or replicative dsRNA. DECS (dsRNA isolation, exhaustive amplification, cloning, and sequencing) analysis has been shown to be capable of detecting unknown viruses. We postulated that a combination of DECS analysis and next-generation sequencing (NGS) would improve detection efficiency and usability of the technique. Here, we describe a model case in which we efficiently detected the presumed genome sequence of Blueberry shoestring virus (BSSV), a member of the genus Sobemovirus, which has not so far been reported. dsRNAs were isolated from BSSV-infected blueberry plants using the dsRNA-binding protein, reverse-transcribed, amplified, and sequenced using NGS. A contig of 4,020 nucleotides (nt) that shared similarities with sequences from other Sobemovirus species was obtained as a candidate of the BSSV genomic sequence. Reverse transcription (RT)-PCR primer sets based on sequences from this contig enabled the detection of BSSV in all BSSV-infected plants tested but not in healthy controls. A recombinant protein encoded by the putative coat protein gene was bound by the BSSV-antibody, indicating that the candidate sequence was that of BSSV itself. Our results suggest that a combination of DECS analysis and NGS, designated here as "DECS-C," is a powerful method for detecting novel plant viruses.

  6. Chromosomal Instability Estimation Based on Next Generation Sequencing and Single Cell Genome Wide Copy Number Variation Analysis

    PubMed Central

    Dago, Angel E.; Leitz, Laura J.; Wang, Yipeng; Lee, Jerry; Werner, Shannon L.; Gendreau, Steven; Patel, Premal; Jia, Shidong; Zhang, Liangxuan; Tucker, Eric K.; Malchiodi, Michael; Graf, Ryon P.; Dittamore, Ryan; Marrinucci, Dena; Landers, Mark

    2016-01-01

    Genomic instability is a hallmark of cancer often associated with poor patient outcome and resistance to targeted therapy. Assessment of genomic instability in bulk tumor or biopsy can be complicated due to sample availability, surrounding tissue contamination, or tumor heterogeneity. The Epic Sciences circulating tumor cell (CTC) platform utilizes a non-enrichment based approach for the detection and characterization of rare tumor cells in clinical blood samples. Genomic profiling of individual CTCs could provide a portrait of cancer heterogeneity, identify clonal and sub-clonal drivers, and monitor disease progression. To that end, we developed a single cell Copy Number Variation (CNV) Assay to evaluate genomic instability and CNVs in patient CTCs. For proof of concept, prostate cancer cell lines, LNCaP, PC3 and VCaP, were spiked into healthy donor blood to create mock patient-like samples for downstream single cell genomic analysis. In addition, samples from seven metastatic castration resistant prostate cancer (mCRPC) patients were included to evaluate clinical feasibility. CTCs were enumerated and characterized using the Epic Sciences CTC Platform. Identified single CTCs were recovered, whole genome amplified, and sequenced using an Illumina NextSeq 500. CTCs were then analyzed for genome-wide copy number variations, followed by genomic instability analyses. Large-scale state transitions (LSTs) were measured as surrogates of genomic instability. Genomic instability scores were determined reproducibly for LNCaP, PC3, and VCaP, and were higher than white blood cell (WBC) controls from healthy donors. A wide range of LST scores were observed within and among the seven mCRPC patient samples. On the gene level, loss of the PTEN tumor suppressor was observed in PC3 and 5/7 (71%) patients. Amplification of the androgen receptor (AR) gene was observed in VCaP cells and 5/7 (71%) mCRPC patients. Using an in silico down-sampling approach, we determined that DNA copy

  7. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407
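
    The distinction drawn in this abstract, between equal editing in every cell and an average over cells that edit differently, can be illustrated with a small sketch. This is not the hierarchical Bayesian model the authors describe; it only computes a pooled rate and the between-cell variance of per-cell rates. The function names and the read counts are hypothetical.

```python
from statistics import pvariance


def editing_rates(cells):
    """Per-cell editing rate at one site.

    cells: list of (edited_reads, total_reads) tuples, one per cell.
    Cells with no coverage are skipped.
    """
    return [edited / total for edited, total in cells if total > 0]


def bulk_rate(cells):
    """Editing rate obtained by pooling reads from all cells, as a bulk sample would."""
    edited = sum(e for e, _ in cells)
    total = sum(t for _, t in cells)
    return edited / total


# Hypothetical counts: both populations give the same pooled rate.
uniform = [(5, 10)] * 4                          # every cell edits half its transcripts
mixed = [(10, 10), (0, 10), (10, 10), (0, 10)]   # cells are either fully edited or unedited

uniform_var = pvariance(editing_rates(uniform))
mixed_var = pvariance(editing_rates(mixed))
```

    Both populations look identical in bulk, and only the per-cell variance separates them. A real analysis would also have to account for sampling noise at low read depth, which this sketch ignores.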

  8. Knowledge discovery in RNA sequence families of HIV using scalable computers

    SciTech Connect

    Hofacker, I.L.; Huynen, M.A.; Stadler, P.F.; Stolorz, P.E.

    1996-12-31

    The prediction of RNA secondary structure on the basis of sequence information is an important tool in biosequence analysis. However, it has typically been restricted to molecules containing no more than 4000 nucleotides due to the computational complexity of the underlying dynamic programming algorithm used. We describe here an approach to RNA sequence analysis based upon scalable computers, which enables molecules containing up to 20,000 nucleotides to be analyzed. We apply the approach to investigation of the entire HIV genome, illustrating the power of these methods to perform knowledge discovery by identification of important secondary structure motifs within RNA sequence families.
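
    The abstract does not name the dynamic programming algorithm it relies on. As a minimal sketch of why the cost grows quickly with sequence length, the following uses the textbook base-pair maximization recurrence (Nussinov style), which is simpler than the energy-based methods normally used for real predictions. The function names and the `min_loop` parameter are assumptions made for illustration.

```python
def can_pair(a, b):
    """Return True for Watson-Crick (A-U, G-C) and wobble (G-U) pairs."""
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})


def max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    The table has O(n^2) cells and each cell scans O(n) split points,
    so the running time is O(n^3).
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):  # case: k pairs with j
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

    With cubic time and quadratic memory, going from 4,000 to 20,000 nucleotides multiplies the work by roughly 125 and the table size by 25, which is consistent with the abstract's motivation for scalable computers.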

  9. Small RNA Library Cloning Procedure for Deep Sequencing of Specific Endogenous siRNA Classes in Caenorhabditis elegans

    PubMed Central

    Ow, Maria C.; Lau, Nelson C.; Hall, Sarah E.

    2017-01-01

    In recent years, distinct classes of small RNAs ranging in size from ~21 to 26 nucleotides have been discovered and shown to play important roles in a wide array of cellular functions. Because of the abundance of these small RNAs, library preparation from an RNA sample followed by deep sequencing provides the identity and quantity of a particular class of small RNAs. In this chapter we describe a detailed protocol for preparing small RNA libraries for deep sequencing on the Illumina platform from the nematode C. elegans. PMID:24920360

  10. Identification of Symmetrical RNA Editing Events in the Mitochondria of Salvia miltiorrhiza by Strand-specific RNA Sequencing.

    PubMed

    Wu, Bin; Chen, Haimei; Shao, Junjie; Zhang, Hui; Wu, Kai; Liu, Chang

    2017-02-10

    Salvia miltiorrhiza is one of the most widely-used medicinal plants. Here, we systematically analyzed the RNA editing events in its mitochondria. We developed a pipeline using REDItools to predict RNA editing events from strand-specific RNA-Seq data. The predictions were validated using reverse transcription, RT-PCR amplification and Sanger sequencing experiments. Putative sequence motifs were characterized. Comparative analyses were carried out between S. miltiorrhiza, Arabidopsis thaliana and Oryza sativa. We discovered 1123 editing sites, including 225 "C to U" sites in the protein-coding regions. Fourteen of sixteen (87.5%) sites were validated. Three putative DNA motifs were identified around the predicted sites. The nucleotides on both strands at 115 of the 225 sites had undergone RNA editing, which we called symmetrical RNA editing (SRE). Four of six of these SRE sites (66.7%) were experimentally confirmed. Re-examination of strand-specific RNA-Seq data from A. thaliana and O. sativa identified 327 and 369 SRE sites respectively. 78, 20 and 13 SRE sites were found to be conserved among A. thaliana, O. sativa and S. miltiorrhiza respectively. This study provides a comprehensive picture of RNA editing events in the mitochondrial genome of S. miltiorrhiza. We identified SREs for the first time, which may represent a universal phenomenon.

  11. RNA sequencing of Sleeping Beauty transposon-induced tumors detects transposon-RNA fusions in forward genetic cancer screens

    PubMed Central

    Temiz, Nuri A.; Moriarity, Branden S.; Wolf, Natalie K.; Riordan, Jesse D.; Dupuy, Adam J.; Largaespada, David A.; Sarver, Aaron L.

    2016-01-01

    Forward genetic screens using Sleeping Beauty (SB)-mobilized T2/Onc transposons have been used to identify common insertion sites (CISs) associated with tumor formation. Recurrent sites of transposon insertion are commonly identified using ligation-mediated PCR (LM-PCR). Here, we use RNA sequencing (RNA-seq) data to directly identify transcriptional events mediated by T2/Onc. Surprisingly, the majority (∼80%) of LM-PCR identified junction fragments do not lead to observable changes in RNA transcripts. However, in CIS regions, direct transcriptional effects of transposon insertions are observed. We developed an automated method to systematically identify T2/Onc-genome RNA fusion sequences in RNA-seq data. RNA fusion-based CISs were identified corresponding to both DNA-based CISs (Cdkn2a, Mycl1, Nf2, Pten, Sema6d, and Rere) and additional regions strongly associated with cancer that were not observed by LM-PCR (Myc, Akt1, Pth, Csf1r, Fgfr2, Wisp1, Map3k5, and Map4k3). In addition to calculating recurrent CISs, we also present complementary methods to identify potential driver events via determination of strongly supported fusions and fusions with large transcript level changes in the absence of multitumor recurrence. These methods independently identify CIS regions and also point to cancer-associated genes like Braf. We anticipate RNA-seq analyses of tumors from forward genetic screens will become an efficient tool to identify causal events. PMID:26553456

  12. Identification of Symmetrical RNA Editing Events in the Mitochondria of Salvia miltiorrhiza by Strand-specific RNA Sequencing

    PubMed Central

    Wu, Bin; Chen, Haimei; Shao, Junjie; Zhang, Hui; Wu, Kai; Liu, Chang

    2017-01-01

    Salvia miltiorrhiza is one of the most widely-used medicinal plants. Here, we systematically analyzed the RNA editing events in its mitochondria. We developed a pipeline using REDItools to predict RNA editing events from strand-specific RNA-Seq data. The predictions were validated using reverse transcription, RT-PCR amplification and Sanger sequencing experiments. Putative sequence motifs were characterized. Comparative analyses were carried out between S. miltiorrhiza, Arabidopsis thaliana and Oryza sativa. We discovered 1123 editing sites, including 225 “C to U” sites in the protein-coding regions. Fourteen of sixteen (87.5%) sites were validated. Three putative DNA motifs were identified around the predicted sites. The nucleotides on both strands at 115 of the 225 sites had undergone RNA editing, which we called symmetrical RNA editing (SRE). Four of six of these SRE sites (66.7%) were experimentally confirmed. Re-examination of strand-specific RNA-Seq data from A. thaliana and O. sativa identified 327 and 369 SRE sites respectively. 78, 20 and 13 SRE sites were found to be conserved among A. thaliana, O. sativa and S. miltiorrhiza respectively. This study provides a comprehensive picture of RNA editing events in the mitochondrial genome of S. miltiorrhiza. We identified SREs for the first time, which may represent a universal phenomenon. PMID:28186130

  13. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    PubMed Central

    2013-01-01

    Background Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  14. miRBase: integrating microRNA annotation and deep-sequencing data.

    PubMed

    Kozomara, Ana; Griffiths-Jones, Sam

    2011-01-01

    miRBase is the primary online repository for all microRNA sequences and annotation. The current release (miRBase 16) contains over 15,000 microRNA gene loci in over 140 species, and over 17,000 distinct mature microRNA sequences. Deep-sequencing technologies have delivered a sharp rise in the rate of novel microRNA discovery. We have mapped reads from short RNA deep-sequencing experiments to microRNAs in miRBase and developed web interfaces to view these mappings. The user can view all read data associated with a given microRNA annotation, filter reads by experiment and count, and search for microRNAs by tissue- and stage-specific expression. These data can be used as a proxy for relative expression levels of microRNA sequences, provide detailed evidence for microRNA annotations and alternative isoforms of mature microRNAs, and allow us to revisit previous annotations. miRBase is available online at: http://www.mirbase.org/.

  15. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    PubMed Central

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A.C.T; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  16. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    NASA Astrophysics Data System (ADS)

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M.; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A. C. T.; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M.; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  17. Artificial small RNA for sequence specific cleavage of target RNA through RNase III endonuclease Dicer

    PubMed Central

    Liu, Yali; Liu, Li; Zhan, Yonghao; Zhuang, Chengle; Lin, Junhao; Chen, Mingwei; Li, Jianfa; Cai, Zhiming; Huang, Weiren; Zhang, Yong

    2016-01-01

    The CRISPR-Cas9 system uses a guide RNA which functions in conjunction with Cas9 proteins to target DNA and cleave double-stranded DNA. This phenomenon raises the question of whether an artificial small RNA (asRNA), composed of a Dicer-binding RNA element and an antisense RNA, could also be used to induce Dicer to process and degrade a specific RNA. If so, we could develop a new method, named DICERi, for gene silencing or RNA editing. To prove the feasibility of asRNA, we selected MALAT-1 as the target and used HeLa and MDA-MB-231 cells as experimental models. The results of qRT-PCR showed that the introduction of asRNA significantly decreased the relative expression level of the target gene. Next, we analyzed cell proliferation using CCK-8 and EdU staining assays, and then cell migration using wound scratch and Transwell invasion assays. We found that cell proliferation and cell migration were both suppressed remarkably after asRNA was expressed in HeLa and MDA-MB-231 cells. Cell apoptosis was also detected through Hoechst staining and ELISA assays, and the data indicated that the numbers of apoptotic cells in experimental groups significantly increased compared with negative controls. In order to prove that the gene silencing effects were caused by Dicer, we co-transfected shRNA silencing Dicer and asRNA. The relative expression levels of Dicer and MALAT-1 were both detected, and the results indicated that when the cleavage role of Dicer was silenced, the relative expression level of MALAT-1 was not affected after the introduction of asRNA. All the above results demonstrated that these devices directed by Dicer effectively excised target RNA and repressed the target genes, thus causing phenotypic changes. Our work adds a new dimension to gene regulating technologies and may have broad applications in construction of gene circuits. PMID:27231846

  18. Analysis of a marine picoplankton community by 16S rRNA gene cloning and sequencing.

    PubMed Central

    Schmidt, T M; DeLong, E F; Pace, N R

    1991-01-01

    The phylogenetic diversity of an oligotrophic marine picoplankton community was examined by analyzing the sequences of cloned ribosomal genes. This strategy does not rely on cultivation of the resident microorganisms. Bulk genomic DNA was isolated from picoplankton collected in the north central Pacific Ocean by tangential flow filtration. The mixed-population DNA was fragmented, size fractionated, and cloned into bacteriophage lambda. Thirty-eight clones containing 16S rRNA genes were identified in a screen of 3.2 × 10^4 recombinant phage, and portions of the rRNA gene were amplified by polymerase chain reaction and sequenced. The resulting sequences were used to establish the identities of the picoplankton by comparison with an established data base of rRNA sequences. Fifteen unique eubacterial sequences were obtained, including four from cyanobacteria and eleven from proteobacteria. A single eucaryote related to dinoflagellates was identified; no archaebacterial sequences were detected. The cyanobacterial sequences are all closely related to sequences from cultivated marine Synechococcus strains and with cyanobacterial sequences obtained from the Atlantic Ocean (Sargasso Sea). Several sequences were related to common marine isolates of the gamma subdivision of proteobacteria. In addition to sequences closely related to those of described bacteria, sequences were obtained from two phylogenetic groups of organisms that are not closely related to any known rRNA sequences from cultivated organisms. Both of these novel phylogenetic clusters are proteobacteria, one group within the alpha subdivision and the other distinct from known proteobacterial subdivisions. The rRNA sequences of the alpha-related group are nearly identical to those of some Sargasso Sea picoplankton, suggesting a global distribution of these organisms. PMID:2066334

  19. Phylogenetic analysis of oryx species using partial sequences of mitochondrial rRNA genes.

    PubMed

    Khan, H A; Arif, I A; Al Farhan, A H; Al Homaidan, A A

    2008-10-28

    We conducted a comparative evaluation of 12S rRNA and 16S rRNA genes of the mitochondrial genome for molecular differentiation among three oryx species (Oryx leucoryx, Oryx dammah and Oryx gazella) with respect to two closely related outgroups, addax and roan. Our findings showed the failure of 12S rRNA gene to differentiate between the genus Oryx and addax, whereas a 342-bp partial sequence of 16S rRNA accurately grouped all five taxa studied, suggesting the utility of 16S rRNA segment for molecular phylogeny of oryx at the genus and possibly species levels.

  20. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud

    PubMed Central

    Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.

    2015-01-01

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053

  1. The zinc fingers of YY1 bind single-stranded RNA with low sequence specificity.

    PubMed

    Wai, Dorothy C C; Shihab, Manar; Low, Jason K K; Mackay, Joel P

    2016-11-02

    Classical zinc fingers (ZFs) are traditionally considered to act as sequence-specific DNA-binding domains. More recently, classical ZFs have been recognised as potential RNA-binding modules, raising the intriguing possibility that classical-ZF transcription factors are involved in post-transcriptional gene regulation via direct RNA binding. To date, however, only one classical ZF-RNA complex, that involving TFIIIA, has been structurally characterised. Yin Yang-1 (YY1) is a multi-functional transcription factor involved in many regulatory processes, and binds DNA via four classical ZFs. Recent evidence suggests that YY1 also interacts with RNA, but the molecular nature of the interaction remains unknown. In the present work, we directly assess the ability of YY1 to bind RNA using in vitro assays. Systematic Evolution of Ligands by EXponential enrichment (SELEX) was used to identify preferred RNA sequences bound by the YY1 ZFs from a randomised library over multiple rounds of selection. However, a strong motif was not consistently recovered, suggesting that the RNA sequence selectivity of these domains is modest. YY1 ZF residues involved in binding to single-stranded RNA were identified by NMR spectroscopy and found to be largely distinct from the set of residues involved in DNA binding, suggesting that interactions between YY1 and ssRNA constitute a separate mode of nucleic acid binding. Our data are consistent with recent reports that YY1 can bind to RNA in a low-specificity, yet physiologically relevant manner.

  2. The zinc fingers of YY1 bind single-stranded RNA with low sequence specificity

    PubMed Central

    Wai, Dorothy C.C.; Shihab, Manar; Low, Jason K.K.; Mackay, Joel P.

    2016-01-01

    Classical zinc fingers (ZFs) are traditionally considered to act as sequence-specific DNA-binding domains. More recently, classical ZFs have been recognised as potential RNA-binding modules, raising the intriguing possibility that classical-ZF transcription factors are involved in post-transcriptional gene regulation via direct RNA binding. To date, however, only one classical ZF-RNA complex, that involving TFIIIA, has been structurally characterised. Yin Yang-1 (YY1) is a multi-functional transcription factor involved in many regulatory processes, and binds DNA via four classical ZFs. Recent evidence suggests that YY1 also interacts with RNA, but the molecular nature of the interaction remains unknown. In the present work, we directly assess the ability of YY1 to bind RNA using in vitro assays. Systematic Evolution of Ligands by EXponential enrichment (SELEX) was used to identify preferred RNA sequences bound by the YY1 ZFs from a randomised library over multiple rounds of selection. However, a strong motif was not consistently recovered, suggesting that the RNA sequence selectivity of these domains is modest. YY1 ZF residues involved in binding to single-stranded RNA were identified by NMR spectroscopy and found to be largely distinct from the set of residues involved in DNA binding, suggesting that interactions between YY1 and ssRNA constitute a separate mode of nucleic acid binding. Our data are consistent with recent reports that YY1 can bind to RNA in a low-specificity, yet physiologically relevant manner. PMID:27369384

  3. New perspectives on the diversification of the RNA interference system: insights from comparative genomics and small RNA sequencing

    PubMed Central

    Burroughs, Alexander Maxwell; Ando, Yoshinari; Aravind, L

    2014-01-01

    Our understanding of the pervasive involvement of small RNAs in regulating diverse biological processes has been greatly augmented by recent application of deep-sequencing technologies to small RNA across diverse eukaryotes. We review the currently-known small RNA classes and place them in context of the reconstructed evolutionary history of the RNAi protein machinery. This synthesis indicates the earliest versions of eukaryotic RNAi systems likely utilized small RNA processed from three types of precursors: 1) sense-antisense transcriptional products, 2) genome-encoded, imperfectly-complementary hairpin sequences, and 3) larger non-coding RNA precursor sequences. Structural dissection of PIWI proteins along with recent discovery of novel families (including Med13 of the Mediator complex) suggest that emergence of a distinct architecture with the N-terminal domains (also occurring separately fused to endoDNases in prokaryotes) formed via duplication of an ancestral unit was key to their recruitment as primary RNAi effectors and use of small RNAs of certain preferred lengths. Prokaryotic PIWI proteins are typically components of several RNA-directed DNA restriction or CRISPR/Cas systems. However, eukaryotic versions appear to have emerged from a subset that evolved RNA-directed RNA interference. They were recruited alongside RNaseIII domains and RdRP domains, also from prokaryotic systems, to form the core eukaryotic RNAi system. Like certain regulatory systems, RNAi diversified into two distinct but linked arms concomitant with eukaryotic nucleo-cytoplasmic compartmentalization. Subsequent elaboration of RNAi proceeded via diversification of the core protein machinery through lineage-specific expansions and recruitment of new components from prokaryotes (nucleases and small RNA-modifying enzymes), allowing for diversification of associating small RNAs. PMID:24311560

  4. Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues

    PubMed Central

    Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.

    2014-01-01

    RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209

  5. RNA sequencing using fluorescent-labeled dideoxynucleotides and automated fluorescence detection.

    PubMed Central

    Bauer, G J

    1990-01-01

    Although dideoxy-terminated sequencing of RNA, using reverse transcriptase and oligodeoxynucleotide primers, is now a well-established method, its accuracy is limited by sequence ambiguities due to unspecific chain termination events. A protocol is described which circumvents these ambiguities by using fluorescence labels tagged to dideoxynucleotides. Only chain terminations caused by dideoxynucleotides are detected, while prematurely terminated cDNAs remain undetectable. In addition, the remaining multiple signals at nucleotide positions can be assigned to sequence heterogeneities within the RNA sequence to be determined. PMID:1690393

  6. RNA sequence and transcriptional properties of the 3' end of the Newcastle disease virus genome

    SciTech Connect

    Kurilla, M.G.; Stone, H.O.; Keene, J.D.

    1985-09-01

    The 3' end of the genomic RNA of Newcastle disease virus (NDV) has been sequenced and the leader RNA defined. Using hybridization to a 3'-end-labeled genome, leader RNA species from in vitro transcription reactions and from infected cell extracts were found to be 47 and 53 nucleotides long. In addition, the start site of the 3'-proximal mRNA was determined by sequence analysis of in vitro (beta-32P)GTP-labeled transcription products. The genomic sequence extending beyond the leader region demonstrated an open reading frame for at least 42 amino acids and probably represents the amino terminus of the nucleocapsid protein (NP). The terminal 8 nucleotides of the NDV genome were identical to those of measles virus and Sendai virus while the sequence of the distal half of the leader region was more similar to that of vesicular stomatitis virus. These data argue for strong evolutionary relatedness between the paramyxovirus and rhabdovirus groups.

  7. RNA Sequencing Identifies New RNase III Cleavage Sites in Escherichia coli and Reveals Increased Regulation of mRNA.

    PubMed

    Gordon, Gina C; Cameron, Jeffrey C; Pfleger, Brian F

    2017-03-28

    Ribonucleases facilitate rapid turnover of RNA, providing cells with another mechanism to adjust transcript and protein levels in response to environmental conditions. While many examples have been documented, a comprehensive list of RNase targets is not available. To address this knowledge gap, we compared levels of RNA sequencing coverage of Escherichia coli and a corresponding RNase III mutant to expand the list of known RNase III targets. RNase III is a widespread endoribonuclease that binds and cleaves double-stranded RNA in many critical transcripts. RNase III cleavage at novel sites found in aceEF, proP, tnaC, dctA, pheM, sdhC, yhhQ, glpT, aceK, and gluQ accelerated RNA decay, consistent with previously described targets wherein RNase III cleavage initiates rapid degradation of secondary messages by other RNases. In contrast, cleavage at three novel sites in the ahpF, pflB, and yajQ transcripts led to stabilized secondary transcripts. Two other novel sites in hisL and pheM overlapped with transcriptional attenuators that likely serve to ensure turnover of these highly structured RNAs. Many of the new RNase III target sites are located on transcripts encoding metabolic enzymes. For instance, two novel RNase III sites are located within transcripts encoding enzymes near a key metabolic node connecting glycolysis and the tricarboxylic acid (TCA) cycle. Pyruvate dehydrogenase activity was increased in an rnc deletion mutant compared to the wild-type (WT) strain in early stationary phase, confirming the novel link between RNA turnover and regulation of pathway activity. Identification of these novel sites suggests that mRNA turnover may be an underappreciated mode of regulating metabolism. IMPORTANCE The concerted action and overlapping functions of endoribonucleases, exoribonucleases, and RNA processing enzymes complicate the study of global RNA turnover and recycling of specific transcripts. More information about RNase specificity and activity is needed to make

  8. Species identification and profiling of complex microbial communities using shotgun Illumina sequencing of 16S rRNA amplicon sequences.

    PubMed

    Ong, Swee Hoe; Kukkillaya, Vinutha Uppoor; Wilm, Andreas; Lay, Christophe; Ho, Eliza Xin Pei; Low, Louie; Hibberd, Martin Lloyd; Nagarajan, Niranjan

    2013-01-01

    The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to amplify more than 90% of sequences in the Greengenes database and with the ability to distinguish nearly twice as many species-level OTUs compared to existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the constituents of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification thereby opening up potential application of this approach for clinical microbial characterization.

  9. Species Identification and Profiling of Complex Microbial Communities Using Shotgun Illumina Sequencing of 16S rRNA Amplicon Sequences

    PubMed Central

    Lay, Christophe; Ho, Eliza Xin Pei; Low, Louie; Hibberd, Martin Lloyd; Nagarajan, Niranjan

    2013-01-01

    The high throughput and cost-effectiveness afforded by short-read sequencing technologies, in principle, enable researchers to perform 16S rRNA profiling of complex microbial communities at unprecedented depth and resolution. Existing Illumina sequencing protocols are, however, limited by the fraction of the 16S rRNA gene that is interrogated and therefore limit the resolution and quality of the profiling. To address this, we present the design of a novel protocol for shotgun Illumina sequencing of the bacterial 16S rRNA gene, optimized to amplify more than 90% of sequences in the Greengenes database and with the ability to distinguish nearly twice as many species-level OTUs compared to existing protocols. Using several in silico and experimental datasets, we demonstrate that despite the presence of multiple variable and conserved regions, the resulting shotgun sequences can be used to accurately quantify the constituents of complex microbial communities. The reconstruction of a significant fraction of the 16S rRNA gene also enabled high precision (>90%) in species-level identification thereby opening up potential application of this approach for clinical microbial characterization. PMID:23579286

  10. Small RNA and RNA-IP Sequencing Identifies and Validates Novel MicroRNAs in Human Mesenchymal Stem Cells.

    PubMed

    Tsai, Chin-Han; Liao, Ko-Hsun; Shih, Chuan-Chi; Chan, Chia-Hao; Hsieh, Jui-Yu; Tsai, Cheng-Fong; Wang, Hsei-Wei; Chang, Shing-Jyh

    2016-03-01

    Organ regeneration therapies using multipotent mesenchymal stem cells (MSCs) are currently being investigated for a variety of common complex diseases. Understanding the molecular regulation of MSC biology will benefit regenerative medicine. MicroRNAs (miRNAs) act as regulators in MSC stemness. There are approximately 2500 currently known human miRNAs that have been recorded in the miRBase v21 database. In the present study, we identified novel microRNAs involved in MSC stemness and differentiation by obtaining the global microRNA expression profiles (miRNomes) of MSCs from two anatomical locations, bone marrow (BM-MSCs) and umbilical cord Wharton's jelly (WJ-MSCs), and from osteogenically and adipogenically differentiated progenies of BM-MSCs. Small RNA sequencing (smRNA-seq) and bioinformatics analyses predicted that 49 uncharacterized miRNA candidates had high cellular expression values in MSCs. Another independent batch of Ago1/2-based RNA immunoprecipitation (RNA-IP) sequencing datasets validated the existence of 40 unreported miRNAs in cells and their associations with the RNA-induced silencing complex (RISC). Nine of these 40 new miRNAs were universally overexpressed in both MSC types; nine others were overexpressed in differentiated cells. A novel miRNA (UNI-118-3p) was specifically expressed in BM-MSCs, as verified using RT-qPCR. Taken together, this report offers comprehensive miRNome profiles for two MSC types, as well as cells differentiated from BM-MSCs. MSC transplantation has the potential to ameliorate degenerative disorders and repair damaged tissues. Interventions involving the above 40 new microRNA members in transplanted MSCs may potentially guide future clinical applications.

  11. Production of Viral mRNA in Adenovirus-Transformed Cells by the Post-Transcriptional Processing of Heterogeneous Nuclear RNA Containing Viral and Cell Sequences

    PubMed Central

    Wall, R.; Weber, J.; Gage, Z.; Darnell, J. E.

    1973-01-01

    Adenovirus 2-transformed cells contain virus-specific sequences which are covalently linked to cell-specific RNA sequences in heterogeneous nuclear RNA (HnRNA) molecules larger than 45S. Virus sequences are identified by hybridization to viral DNA, and the cell sequences are detected by hybridization to cellular DNA under conditions where hybridization only occurs to reiterated sites in cell DNA. Such large composite viral-cell HnRNA molecules presumably arise through the uninterrupted transcription of host sequences and integrated viral DNA. Adenovirus-specific polysomal RNA from these cells sediments as three discrete species at 16, 20, and 26S. These specific classes of viral mRNA do not contain rapidly hybridizing host-specific RNA sequences. Both virus-specific HnRNA and mRNA contain polyadenylic acid sequences since they bind to polyU columns at levels characteristic of other polyA-terminated HnRNA and mRNA. Thus, the discrete species of virus-specific mRNA in adenovirus 2-transformed cells appear to be derived from high-molecular-weight virus-specific HnRNA through a series of post-transcriptional modifications involving polyA addition. Subsequently the HnRNA is cleaved so that the cell-specific RNA sequences that originate from the reiterated sites in cell DNA do not accompany the adenovirus mRNA to the cytoplasm. These events for the adenovirus-specific mRNA appear, therefore, to be similar to the stages in the biogenesis of the majority of mRNA in eukaryotic cells. PMID:4736534

  12. Complete nucleotide sequence of the 23S rRNA gene of the Cyanobacterium, Anacystis nidulans.

    PubMed Central

    Douglas, S E; Doolittle, W F

    1984-01-01

    The nucleotide sequence of the Anacystis nidulans 23S rRNA gene, including the 5'- and 3'-flanking regions has been determined. The gene is 2876 nucleotides long and shows higher primary sequence homology to the 23S rRNAs of plastids (84.5%) than to that of E. coli (79%). The predicted rRNA transcript also shares many secondary structural features with those of plastids, reinforcing the endosymbiont hypothesis for the origin of these organelles. PMID:6326060

  13. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Posner, R.G.; Marrone, B.L.; Hammond, M.L.; Simpson, D.J.

    1995-04-11

    A method is described for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand. 4 figures.

  14. Cloud-scale RNA-sequencing differential expression analysis with Myrna

    PubMed Central

    2010-01-01

    As sequencing throughput approaches dozens of gigabases per day, there is a growing need for efficient software for analysis of transcriptome sequencing (RNA-Seq) data. Myrna is a cloud-computing pipeline for calculating differential gene expression in large RNA-Seq datasets. We apply Myrna to the analysis of publicly available data sets and assess the goodness of fit of standard statistical models. Myrna is available from http://bowtie-bio.sf.net/myrna. PMID:20701754

  15. Method for rapid base sequencing in DNA and RNA with two base labeling

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Posner, Richard G.; Marrone, Babetta L.; Hammond, Mark L.; Simpson, Daniel J.

    1995-01-01

    Method for rapid-base sequencing in DNA and RNA with two-base labeling and employing fluorescent detection of single molecules at two wavelengths. Bases modified to accept fluorescent labels are used to replicate a single DNA or RNA strand to be sequenced. The bases are then sequentially cleaved from the replicated strand, excited with a chosen spectrum of electromagnetic radiation, and the fluorescence from individual, tagged bases detected in the order of cleavage from the strand.

  16. Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

    PubMed Central

    Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

    2016-01-01

    Because RNA-protein interactions play a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240

  17. Human cellular CYBA UTR sequences increase mRNA translation without affecting the half-life of recombinant RNA transcripts.

    PubMed

    Ferizi, Mehrije; Aneja, Manish K; Balmayor, Elizabeth R; Badieyan, Zohreh Sadat; Mykhaylyk, Olga; Rudolph, Carsten; Plank, Christian

    2016-12-15

    Modified nucleotide chemistries that increase the half-life (T1/2) of transfected recombinant mRNA and the use of non-native 5'- and 3'-untranslated region (UTR) sequences that enhance protein translation are advancing the prospects of transcript therapy. To this end, a set of UTR sequences that are present in mRNAs with long cellular T1/2 were synthesized and cloned as five different recombinant sequence set combinations as upstream 5'-UTR and/or downstream 3'-UTR regions flanking a reporter gene. Initial screening in two different cell systems in vitro revealed that cytochrome b-245 alpha chain (CYBA) combinations performed the best among all other UTR combinations and were characterized in detail. The presence or absence of CYBA UTRs had no impact on the mRNA stability of transfected mRNAs, but appeared to enhance the productivity of transfected transcripts based on the measurement of mRNA and protein levels in cells. When CYBA UTRs were fused to human bone morphogenetic protein 2 (hBMP2) coding sequence, the recombinant mRNA transcripts upon transfection produced higher levels of protein as compared to control transcripts. Moreover, transfection of human adipose mesenchymal stem cells with recombinant hBMP2-CYBA UTR transcripts induced bone differentiation demonstrating the osteogenic and therapeutic potential for transcript therapy based on hybrid UTR designs.

  18. Human cellular CYBA UTR sequences increase mRNA translation without affecting the half-life of recombinant RNA transcripts

    PubMed Central

    Ferizi, Mehrije; Aneja, Manish K.; Balmayor, Elizabeth R.; Badieyan, Zohreh Sadat; Mykhaylyk, Olga; Rudolph, Carsten; Plank, Christian

    2016-01-01

    Modified nucleotide chemistries that increase the half-life (T1/2) of transfected recombinant mRNA and the use of non-native 5′- and 3′-untranslated region (UTR) sequences that enhance protein translation are advancing the prospects of transcript therapy. To this end, a set of UTR sequences that are present in mRNAs with long cellular T1/2 were synthesized and cloned as five different recombinant sequence set combinations as upstream 5′-UTR and/or downstream 3′-UTR regions flanking a reporter gene. Initial screening in two different cell systems in vitro revealed that cytochrome b-245 alpha chain (CYBA) combinations performed the best among all other UTR combinations and were characterized in detail. The presence or absence of CYBA UTRs had no impact on the mRNA stability of transfected mRNAs, but appeared to enhance the productivity of transfected transcripts based on the measurement of mRNA and protein levels in cells. When CYBA UTRs were fused to human bone morphogenetic protein 2 (hBMP2) coding sequence, the recombinant mRNA transcripts upon transfection produced higher levels of protein as compared to control transcripts. Moreover, transfection of human adipose mesenchymal stem cells with recombinant hBMP2-CYBA UTR transcripts induced bone differentiation demonstrating the osteogenic and therapeutic potential for transcript therapy based on hybrid UTR designs. PMID:27974853

  19. ARM-Seq: AlkB-facilitated RNA methylation sequencing reveals a complex landscape of modified tRNA fragments

    PubMed Central

    Cozen, Aaron E.; Quartley, Erin; Holmes, Andrew D.; Robinson, Eva H.; Phizicky, Eric M.; Lowe, Todd M.

    2015-01-01

    High throughput RNA sequencing has accelerated discovery of the complex regulatory roles of small RNAs, but RNAs containing modified nucleosides may escape detection when those modifications interfere with reverse transcription during RNA-seq library preparation. Here we describe AlkB-facilitated RNA Methylation sequencing (ARM-Seq) which uses pre-treatment with Escherichia coli AlkB to demethylate 1-methyladenosine, 3-methylcytidine, and 1-methylguanosine, all commonly found in transfer RNAs. Comparative methylation analysis using ARM-Seq provides the first detailed, transcriptome-scale map of these modifications, and reveals an abundance of previously undetected, methylated small RNAs derived from tRNAs. ARM-Seq demonstrates that tRNA-derived small RNAs accurately recapitulate the m1A modification state for well-characterized yeast tRNAs, and generates new predictions for a large number of human tRNAs, including tRNA precursors and mitochondrial tRNAs. Thus, ARM-Seq provides broad utility for identifying previously overlooked methyl-modified RNAs, can efficiently monitor methylation state, and may reveal new roles for tRNA-derived RNAs as biomarkers or signaling molecules. PMID:26237225

  20. Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers

    PubMed Central

    Liu, Zongzhi; DeSantis, Todd Z.; Andersen, Gary L.; Knight, Rob

    2008-01-01

    The recent introduction of massively parallel pyrosequencers allows rapid, inexpensive analysis of microbial community composition using 16S ribosomal RNA (rRNA) sequences. However, a major challenge is to design a workflow so that taxonomic information can be accurately and rapidly assigned to each read, so that the composition of each community can be linked back to likely ecological roles played by members of each species, genus, family or phylum. Here, we use three large 16S rRNA datasets to test whether taxonomic information based on the full-length sequences can be recaptured by short reads that simulate the pyrosequencer outputs. We find that different taxonomic assignment methods vary radically in their ability to recapture the taxonomic information in full-length 16S rRNA sequences: most methods are sensitive to the region of the 16S rRNA gene that is targeted for sequencing, but many combinations of methods and rRNA regions produce consistent and accurate results. To process large datasets of partial 16S rRNA sequences obtained from surveys of various microbial communities, including those from human body habitats, we recommend the use of Greengenes or RDP classifier with fragments of at least 250 bases, starting from one of the primers R357, R534, R798, F343 or F517. PMID:18723574

  1. Processing of Escherichia coli 16S rRNA with bacteriophage lambda leader sequences.

    PubMed Central

    Krych, M; Sirdeshmukh, R; Gourse, R; Schlessinger, D

    1987-01-01

    To test whether any specific 5' precursor sequences are required for the processing of pre-16S rRNA, constructs were studied in which large parts of the 5' leader sequence were replaced by the coliphage lambda pL promoter and adjacent sequences. Unexpectedly, few full-length transcripts of the rRNA were detected after the pL promoter was induced, implying that either transcription was poor or most of the rRNA chains with lambda leader sequences were unstable. Nevertheless, sufficient transcription occurred to permit the detection of processing by S1 nuclease analysis. RNA transcripts in which 2/3 of the normal rRNA leader was deleted (from the promoter up to the normal RNase III cleavage site) were processed to form the normal 5' terminus. Thus, most of the double-stranded stem that forms from sequences bracketing wild-type 16S pre-rRNA is apparently not required for proper processing; the expression of such modified transcripts, however, must be increased before the efficiency of processing of the 16S rRNA formed can be assessed. PMID:2445728

  2. Accurate identification of A-to-I RNA editing in human by transcriptome sequencing.

    PubMed

    Bahn, Jae Hoon; Lee, Jae-Hyung; Li, Gang; Greer, Christopher; Peng, Guangdun; Xiao, Xinshu

    2012-01-01

    RNA editing enhances the diversity of gene products at the post-transcriptional level. Approaches for genome-wide identification of RNA editing face two main challenges: separating true editing sites from false discoveries and accurate estimation of editing levels. We developed an approach to analyze transcriptome sequencing data (RNA-seq) for global identification of RNA editing in cells for which whole-genome sequencing data are available. We applied the method to analyze RNA-seq data of a human glioblastoma cell line, U87MG. Around 10,000 DNA-RNA differences were identified, the majority being putative A-to-I editing sites. These predicted A-to-I events were associated with a low false-discovery rate (∼5%). Moreover, the estimated editing levels from RNA-seq correlated well with those based on traditional clonal sequencing. Our results further facilitated unbiased characterization of the sequence and evolutionary features flanking predicted A-to-I editing sites and discovery of a conserved RNA structural motif that may be functionally relevant to editing. Genes with predicted A-to-I editing were significantly enriched with those known to be involved in cancer, supporting the potential importance of cancer-specific RNA editing. A similar profile of DNA-RNA differences as in U87MG was predicted for another RNA-seq data set obtained from primary breast cancer samples. Remarkably, significant overlap exists between the putative editing sites of the two transcriptomes despite their difference in cell type, cancer type, and genomic backgrounds. Our approach enabled de novo identification of the RNA editome, which sets the stage for further mechanistic studies of this important step of post-transcriptional regulation.
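    The editing-level estimate mentioned above can be illustrated with a minimal sketch. It relies on the fact that inosine is read as guanosine during reverse transcription, so at a genomic A site the edited fraction is approximated by G reads over A plus G reads. The function name, the min_coverage threshold, and the plus-strand-only handling are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
def editing_level(ref_base, rna_counts, min_coverage=10):
    """Estimate the A-to-I editing level at one plus-strand site.

    ref_base     -- genomic base at the site (from whole-genome sequencing)
    rna_counts   -- dict of RNA-seq base counts at the site, e.g. {"A": 30, "G": 10}
    min_coverage -- hypothetical threshold; sites with fewer A+G reads are skipped

    Returns G / (A + G), or None when the site is not a callable A site.
    """
    if ref_base.upper() != "A":
        return None  # only genomic A sites can show A-to-I (read as A-to-G)
    a_reads = rna_counts.get("A", 0)
    g_reads = rna_counts.get("G", 0)
    if a_reads + g_reads < min_coverage:
        return None  # too few reads for a stable estimate
    return g_reads / (a_reads + g_reads)


# Illustrative counts only, not data from the study.
example_level = editing_level("A", {"A": 30, "G": 10})
print(example_level)
```

    A real analysis would also need strand handling (T-to-C on the minus strand) and filters for mapping artifacts and genomic variants, which is where separating true sites from false discoveries becomes the harder problem.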

  3. Research Techniques Made Simple: Bacterial 16S Ribosomal RNA Gene Sequencing in Cutaneous Research.

    PubMed

    Jo, Jay-Hyun; Kennedy, Elizabeth A; Kong, Heidi H

    2016-03-01

    Skin serves as a protective barrier and also harbors numerous microorganisms collectively comprising the skin microbiome. As a result of recent advances in sequencing (next-generation sequencing), our understanding of microbial communities on skin has advanced substantially. In particular, the 16S ribosomal RNA gene sequencing technique has played an important role in efforts to identify the global communities of bacteria in healthy individuals and patients with various disorders in multiple topographical regions over the skin surface. Here, we describe basic principles, study design, and a workflow of 16S ribosomal RNA gene sequencing methodology, primarily for investigators who are not familiar with this approach. This article will also discuss some applications and challenges of 16S ribosomal RNA sequencing as well as directions for future development.

  4. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    PubMed

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-07

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits.

  5. Short RNA duplexes guide sequence-dependent cleavage by human Dicer.

    PubMed

    Bergeron, Lucien; Perreault, Jean-Pierre; Abou Elela, Sherif

    2010-12-01

    Dicer is a member of the double-stranded (ds) RNA-specific ribonuclease III (RNase III) family that is required for RNA processing and degradation. Like most members of the RNase III family, Dicer possesses a dsRNA binding domain and cleaves long RNA duplexes in vitro. In this study, Dicer substrate selectivity was examined using bipartite substrates. These experiments revealed that an RNA helix possessing a 2-nucleotide (nt) 3'-overhang may bind and direct sequence-specific Dicer-mediated cleavage in trans at a fixed distance from the 3'-end overhang. Chemical modifications of the substrate indicate that the presence of the ribose 2'-hydroxyl group is not required for Dicer binding, but some located near the scissile bonds are needed for RNA cleavage. This suggests a flexible mechanism for substrate selectivity that recognizes the overall shape of an RNA helix. Examination of the structure of natural pre-microRNAs (pre-miRNAs) suggests that they may form bipartite substrates with complementary mRNA sequences, and thus induce seed-independent Dicer cleavage. Indeed, in vitro, natural pre-miRNA directed sequence-specific Dicer-mediated cleavage in trans by supporting the formation of a substrate mimic.

  6. RNA internal standard synthesis by nucleic acid sequence-based amplification for competitive quantitative amplification reactions.

    PubMed

    Lo, Wan-Yu; Baeumner, Antje J

    2007-02-15

    Nucleic acid sequence-based amplification (NASBA) reactions have been demonstrated to successfully synthesize new sequences based on deletion and insertion reactions. Two RNA internal standards were synthesized for use in competitive amplification reactions in which quantitative analysis can be achieved by coamplifying the internal standard with the wild type sample. The sequences were created in two consecutive NASBA reactions using the E. coli clpB mRNA sequence as model analyte. The primer sequences of the wild type sequence were maintained, and a 20-nt-long segment inside the amplicon region was exchanged for a new segment of similar GC content and melting temperature. The new RNA sequence was thus amplifiable using the wild type primers and detectable via a new inserted sequence. In the first reaction, the forwarding primer and an additional 20-nt-long sequence was deleted and replaced by a new 20-nt-long sequence. In the second reaction, a forwarding primer containing as 5' overhang sequence the wild type primer sequence was used. The presence of pure internal standard was verified using electrochemiluminescence and RNA lateral-flow biosensor analysis. Additional sequence deletion in order to shorten the internal standard amplicons and thus generate higher detection signals was found not to be required. Finally, a competitive NASBA reaction between one internal standard and the wild type sequence was carried out proving its functionality. This new rapid construction method via NASBA provides advantages over the traditional techniques since it requires no traditional cloning procedures, no thermocyclers, and can be completed in less than 4 h.

  7. Finding sRNA generative locales from high-throughput sequencing data with NiBLS

    PubMed Central

    2010-01-01

    Background: Next-generation sequencing technologies allow researchers to obtain millions of sequence reads in a single experiment. One important use of the technology is the sequencing of small non-coding regulatory RNAs and the identification of the genomic locales from which they originate. Currently, there is a paucity of methods for finding small RNA generative locales. Results: We describe and implement an algorithm that can determine small RNA generative locales from high-throughput sequencing data. The algorithm creates a network, or graph, of the small RNAs by creating links between them depending on their proximity on the target genome. For each of the sub-networks in the resulting graph the clustering coefficient, a measure of the interconnectedness of the subnetwork, is used to identify the generative locales. We test the algorithm over a wide range of parameters using RFAM sequences as positive controls and demonstrate that the algorithm has good sensitivity and specificity in a range of Arabidopsis and mouse small RNA sequence sets and that the locales it generates are robust to differences in the choice of parameters. Conclusions: NiBLS is a fast, reliable and sensitive method for determining small RNA locales in high-throughput sequence data that is generally applicable to all classes of small RNA. PMID:20167070
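    The graph-based idea in this abstract can be sketched roughly as follows: link reads that map close together, split the graph into connected sub-networks, and keep the sub-networks whose clustering coefficient is high. This is an illustration only, not the NiBLS implementation. The proximity rule (start positions within max_gap), the use of the average local clustering coefficient, and the parameters max_gap, min_clustering and min_reads are all assumptions.

```python
from itertools import combinations


def build_graph(reads, max_gap):
    """Link reads on the same chromosome whose starts lie within max_gap."""
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        (chrom_i, start_i), (chrom_j, start_j) = reads[i], reads[j]
        if chrom_i == chrom_j and abs(start_i - start_j) <= max_gap:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Return the connected sub-networks of the graph as sets of node ids."""
    seen, result = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            current = stack.pop()
            if current in comp:
                continue
            comp.add(current)
            stack.extend(adj[current] - comp)
        seen |= comp
        result.append(comp)
    return result


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def candidate_locales(reads, max_gap=50, min_clustering=0.6, min_reads=3):
    """Report (chrom, first_start, last_start, score) for dense sub-networks."""
    adj = build_graph(reads, max_gap)
    locales = []
    for comp in components(adj):
        if len(comp) < min_reads:
            continue
        score = sum(local_clustering(adj, n) for n in comp) / len(comp)
        if score >= min_clustering:
            chrom = reads[next(iter(comp))][0]
            starts = [reads[n][1] for n in comp]
            locales.append((chrom, min(starts), max(starts), round(score, 3)))
    return sorted(locales)


# Illustrative (chromosome, start) pairs, not data from the paper.
reads = [("chr1", 100), ("chr1", 110), ("chr1", 120), ("chr1", 5000), ("chr2", 105)]
print(candidate_locales(reads))
```

    The pairwise loop in build_graph is quadratic in the number of reads, so a real data set would call for sorting reads by position first; the sketch keeps the simple form to make the graph construction easy to follow.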

  8. Complete genome of Hainan papaya ringspot virus using small RNA deep sequencing.

    PubMed

    Zhang, Yuliang; Yu, Naitong; Huang, Qixing; Yin, Guohua; Guo, Anping; Wang, Xiangfeng; Xiong, Zhongguo; Liu, Zhixin

    2014-06-01

    Small RNA deep sequencing allows for virus identification, virus genome assembly, and strain differentiation. In this study, papaya plants with virus-like symptoms collected in Hainan province were used for deep sequencing and small RNA library construction. After in silico subtraction of the papaya sRNAs, small RNA reads were used in the viral genome assembly using a reference-guided, iterative assembly approach. A nearly complete genome was assembled for a Hainan isolate of papaya ringspot virus (PRSV-HN-2). The complete PRSV-HN-2 genome (accession no.: KF734962) was obtained after a 15-nucleotide gap was filled by direct sequencing of the amplified genomic region. Direct sequencing of several random genomic regions of the PRSV isolate did not find any sequence discrepancy with the sRNA-assembled genome. The newly sequenced PRSV-HN-2 genome shared nucleotide identities of 96% and 94% with the PRSV-HN (EF183499) and PRSV-HN-1 (HQ424465) isolates, respectively, and together with these two isolates formed a new PRSV clade. These data demonstrate that the small RNA deep sequencing technology provides a viable and rapid means to assemble complete viral genomes in plants.

  9. The complete nucleotide sequence of bean yellow mosaic potyvirus RNA.

    PubMed

    Guyatt, K J; Proll, D F; Menssen, A; Davidson, A D

    1996-01-01

    The complete nucleotide sequence of an Australian strain of bean yellow mosaic virus (BYMV-S) has been determined from cloned viral cDNAs. The BYMV-S genome is 9 547 nucleotides in length excluding a poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9168 nucleotides, commencing at position 206 and terminating with UAG at position 9374-6. The ORF potentially encodes a polyprotein of 3056 amino acids with a deduced Mr of 347 409. The 5' and 3' untranslated regions are 205 and 174 nucleotides in length respectively. Alignment of the amino acid sequence of the BYMV-S polyprotein with those of other potyviruses identified nine putative proteolytic cleavage sites. The predicted consensus cleavage site of the BYMV NIa protease was found to differ from that described for other potyviruses. Processing of the BYMV polyprotein at the designated proteolytic cleavage sites would result in a typical potyviral genome arrangement. The amino acid sequences of the putative BYMV encoded proteins were compared to the homologous gene products of twelve individual potyviruses to identify overall and specific regions of amino acid sequence homology.
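    The coordinates quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes 1-based inclusive positions and that the 9168-nucleotide ORF length excludes the UAG stop codon. Under that reading the 3' untranslated figure of 174 counts the stop codon, which is an inference from the numbers rather than a statement in the abstract. The function and variable names are illustrative.

```python
# Values quoted in the abstract (1-based, inclusive coordinates assumed).
GENOME_LENGTH = 9547  # nucleotides, excluding the poly(A) tail
ORF_START = 206       # first nucleotide of the open reading frame
ORF_LENGTH = 9168     # coding nucleotides, stop codon assumed excluded


def orf_summary(genome_length, orf_start, orf_length):
    """Derive ORF end, flanking-region lengths and polyprotein length."""
    if orf_length % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    orf_end = orf_start + orf_length - 1  # last coding nucleotide
    return {
        "orf_end": orf_end,
        "stop_codon_start": orf_end + 1,            # first base of UAG
        "five_prime_utr": orf_start - 1,
        "three_prime_region": genome_length - orf_end,  # includes the stop codon
        "amino_acids": orf_length // 3,
    }


summary = orf_summary(GENOME_LENGTH, ORF_START, ORF_LENGTH)
print(summary)
```

    With these inputs the derived stop codon start (9374), 5' untranslated length (205), 3' region length (174) and polyprotein length (3056 amino acids) agree with the figures reported in the abstract.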

  10. Molecular Diagnosis of Actinomadura madurae Infection by 16S rRNA Deep Sequencing

    PubMed Central

    SenGupta, Dhruba J.; Hoogestraat, Daniel R.; Cummings, Lisa A.; Bryant, Bronwyn H.; Natividad, Catherine; Thielges, Stephanie; Monsaas, Peter W.; Chau, Mimosa; Barbee, Lindley A.; Rosenthal, Christopher; Cookson, Brad T.; Hoffman, Noah G.

    2013-01-01

    Next-generation DNA sequencing can be used to catalog individual organisms within complex, polymicrobial specimens. Here, we utilized deep sequencing of 16S rRNA to implicate Actinomadura madurae as the cause of mycetoma in a diabetic patient when culture and conventional molecular methods were overwhelmed by overgrowth of other organisms. PMID:24108607

  11. Molecular diagnosis of Actinomadura madurae infection by 16S rRNA deep sequencing.

    PubMed

    Salipante, Stephen J; Sengupta, Dhruba J; Hoogestraat, Daniel R; Cummings, Lisa A; Bryant, Bronwyn H; Natividad, Catherine; Thielges, Stephanie; Monsaas, Peter W; Chau, Mimosa; Barbee, Lindley A; Rosenthal, Christopher; Cookson, Brad T; Hoffman, Noah G

    2013-12-01

    Next-generation DNA sequencing can be used to catalog individual organisms within complex, polymicrobial specimens. Here, we utilized deep sequencing of 16S rRNA to implicate Actinomadura madurae as the cause of mycetoma in a diabetic patient when culture and conventional molecular methods were overwhelmed by overgrowth of other organisms.

  12. CCAT2, a novel noncoding RNA mapping to 8q24, underlies metastatic progression and chromosomal instability in colon cancer.

    PubMed

    Ling, Hui; Spizzo, Riccardo; Atlasi, Yaser; Nicoloso, Milena; Shimizu, Masayoshi; Redis, Roxana S; Nishida, Naohiro; Gafà, Roberta; Song, Jian; Guo, Zhiyi; Ivan, Cristina; Barbarotto, Elisa; De Vries, Ingrid; Zhang, Xinna; Ferracin, Manuela; Churchman, Mike; van Galen, Janneke F; Beverloo, Berna H; Shariati, Maryam; Haderk, Franziska; Estecio, Marcos R; Garcia-Manero, Guillermo; Patijn, Gijs A; Gotley, David C; Bhardwaj, Vikas; Shureiqi, Imad; Sen, Subrata; Multani, Asha S; Welsh, James; Yamamoto, Ken; Taniguchi, Itsuki; Song, Min-Ae; Gallinger, Steven; Casey, Graham; Thibodeau, Stephen N; Le Marchand, Loïc; Tiirikainen, Maarit; Mani, Sendurai A; Zhang, Wei; Davuluri, Ramana V; Mimori, Koshi; Mori, Masaki; Sieuwerts, Anieta M; Martens, John W M; Tomlinson, Ian; Negrini, Massimo; Berindan-Neagoe, Ioana; Foekens, John A; Hamilton, Stanley R; Lanza, Giovanni; Kopetz, Scott; Fodde, Riccardo; Calin, George A

    2013-09-01

    The functional roles of SNPs within the 8q24 gene desert in the cancer phenotype are not yet well understood. Here, we report that CCAT2, a novel long noncoding RNA transcript (lncRNA) encompassing the rs6983267 SNP, is highly overexpressed in microsatellite-stable colorectal cancer and promotes tumor growth, metastasis, and chromosomal instability. We demonstrate that MYC, miR-17-5p, and miR-20a are up-regulated by CCAT2 through TCF7L2-mediated transcriptional regulation. We further identify the physical interaction between CCAT2 and TCF7L2 resulting in an enhancement of WNT signaling activity. We show that CCAT2 is itself a WNT downstream target, which suggests the existence of a feedback loop. Finally, we demonstrate that the SNP status affects CCAT2 expression and the risk allele G produces more CCAT2 transcript. Our results support a new mechanism of MYC and WNT regulation by the novel lncRNA CCAT2 in colorectal cancer pathogenesis, and provide an alternative explanation of the SNP-conferred cancer risk.

  13. CCAT2, a novel noncoding RNA mapping to 8q24, underlies metastatic progression and chromosomal instability in colon cancer

    PubMed Central

    Ling, Hui; Spizzo, Riccardo; Atlasi, Yaser; Nicoloso, Milena; Shimizu, Masayoshi; Redis, Roxana S.; Nishida, Naohiro; Gafà, Roberta; Song, Jian; Guo, Zhiyi; Ivan, Cristina; Barbarotto, Elisa; De Vries, Ingrid; Zhang, Xinna; Ferracin, Manuela; Churchman, Mike; van Galen, Janneke F.; Beverloo, Berna H.; Shariati, Maryam; Haderk, Franziska; Estecio, Marcos R.; Garcia-Manero, Guillermo; Patijn, Gijs A.; Gotley, David C.; Bhardwaj, Vikas; Shureiqi, Imad; Sen, Subrata; Multani, Asha S.; Welsh, James; Yamamoto, Ken; Taniguchi, Itsuki; Song, Min-Ae; Gallinger, Steven; Casey, Graham; Thibodeau, Stephen N.; Le Marchand, Loïc; Tiirikainen, Maarit; Mani, Sendurai A.; Zhang, Wei; Davuluri, Ramana V.; Mimori, Koshi; Mori, Masaki; Sieuwerts, Anieta M.; Martens, John W.M.; Tomlinson, Ian; Negrini, Massimo; Berindan-Neagoe, Ioana; Foekens, John A.; Hamilton, Stanley R.; Lanza, Giovanni; Kopetz, Scott; Fodde, Riccardo; Calin, George A.

    2013-01-01

    The functional roles of SNPs within the 8q24 gene desert in the cancer phenotype are not yet well understood. Here, we report that CCAT2, a novel long noncoding RNA transcript (lncRNA) encompassing the rs6983267 SNP, is highly overexpressed in microsatellite-stable colorectal cancer and promotes tumor growth, metastasis, and chromosomal instability. We demonstrate that MYC, miR-17-5p, and miR-20a are up-regulated by CCAT2 through TCF7L2-mediated transcriptional regulation. We further identify the physical interaction between CCAT2 and TCF7L2 resulting in an enhancement of WNT signaling activity. We show that CCAT2 is itself a WNT downstream target, which suggests the existence of a feedback loop. Finally, we demonstrate that the SNP status affects CCAT2 expression and the risk allele G produces more CCAT2 transcript. Our results support a new mechanism of MYC and WNT regulation by the novel lncRNA CCAT2 in colorectal cancer pathogenesis, and provide an alternative explanation of the SNP-conferred cancer risk. PMID:23796952

  14. Towards next-generation sequencing analytics for foodborne RNA viruses: Examining the effect of RNA input quantity and viral RNA purity.

    PubMed

    Yang, Zhihui; Leonard, Susan R; Mammel, Mark K; Elkins, Christopher A; Kulka, Michael

    2016-10-01

    Detection and identification of viruses in food samples are technically challenging due largely to the low viral copy number in contaminated food items, and the lack of effective culture enrichment methods that are amenable to regulatory applications for many of the common foodborne viruses. Using an Illumina MiSeq platform and two hepatitis A virus (HAV) cell-culture adapted strains as a representative enteric virus species, this study examined the limits of single-stranded RNA (ssRNA) viral detection following next-generation sequencing without pre-amplification of the viral genome. Complete viral genome sequences were obtained from HAV samples of varying purities and with an input as low as 2 ng total RNA containing 1.4×10^5 copies of viral RNA. In addition, single nucleotide variations were reproducibly detected over the range of concentrations examined, and their identity confirmed by alternate sequencing technology. In summary, next-generation sequencing technology has the potential for sensitive detection/identification of a viral genome at a low copy number. This study provides a benchmark for metagenomic sequencing application as is required for virus detection in complex food matrices using a culture-independent diagnostic approach.

  15. Taxonomic Assessment of Rumen Microbiota Using Total RNA and Targeted Amplicon Sequencing Approaches

    PubMed Central

    Li, Fuyong; Henderson, Gemma; Sun, Xu; Cox, Faith; Janssen, Peter H.; Guan, Le Luo

    2016-01-01

    Taxonomic characterization of active gastrointestinal microbiota is essential to detect shifts in microbial communities and functions under various conditions. This study aimed to identify and quantify potentially active rumen microbiota using total RNA sequencing and to compare the outcomes of this approach with the widely used targeted RNA/DNA amplicon sequencing technique. Total RNA isolated from rumen digesta samples from five beef steers was subjected to Illumina paired-end sequencing (RNA-seq), and bacterial and archaeal amplicons of partial 16S rRNA/rDNA were subjected to 454 pyrosequencing (RNA/DNA Amplicon-seq). Taxonomic assessments of the RNA-seq, RNA Amplicon-seq, and DNA Amplicon-seq datasets were performed using a pipeline developed in house. The detected major microbial phylotypes were common among the three datasets, with seven bacterial phyla, fifteen bacterial families, and five archaeal taxa commonly identified across all datasets. There were also unique microbial taxa detected in each dataset. Elusimicrobia and Verrucomicrobia phyla; Desulfovibrionaceae, Elusimicrobiaceae, and Sphaerochaetaceae families; and Methanobrevibacter woesei were only detected in the RNA-Seq and RNA Amplicon-seq datasets, whereas Streptococcaceae was only detected in the DNA Amplicon-seq dataset. In addition, the relative abundances of four bacterial phyla, eight bacterial families and one archaeal taxon were different among the three datasets. This is the first study to compare the outcomes of rumen microbiota profiling between RNA-seq and RNA/DNA Amplicon-seq datasets. Our results illustrate the differences between these methods in characterizing microbiota both qualitatively and quantitatively for the same sample, and so caution must be exercised when comparing data. PMID:27446027

  16. Organization and nucleotide sequence analysis of a ribosomal RNA gene cluster from Streptomyces ambofaciens.

    PubMed

    Pernodet, J L; Boccard, F; Alegre, M T; Gagnat, J; Guérineau, M

    1989-06-30

    The Streptomyces ambofaciens genome contains four rRNA gene clusters. These copies are called rrnA, B, C and D. The complete nucleotide (nt) sequence of rrnD has been determined. These genes possess striking similarity with other eubacterial rRNA genes. Comparison with other rRNA sequences allowed the putative localization of the sequences encoding mature rRNAs. The structural genes are arranged in the order 16S-23S-5S and are tightly linked. The mature rRNAs are predicted to contain 1528, 3120 and 120 nt, for the 16S, 23S and 5S rRNAs, respectively. The 23S rRNA is, to our knowledge, the longest of all sequenced prokaryotic 23S rRNAs. When compared to other large rRNAs it shows insertions at positions where they are also present in archaebacterial and in eukaryotic large rRNAs. Secondary structure models of S. ambofaciens rRNAs are proposed, based upon those existing for other bacterial rRNAs. Positions of putative transcription start points and of a termination signal are suggested. The corresponding putative primary transcript, containing the 16S, 23S and 5S rRNAs plus flanking regions, was folded into a secondary structure, and sequences possibly involved in rRNA maturation are described. The G + C content of the rRNA gene cluster is low (57%) compared with the overall G + C content of Streptomyces DNA (73%).
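
    The G + C comparison at the end of this abstract is a simple base-composition statistic. A minimal sketch of how such a percentage is computed follows; the function name and the toy sequence are hypothetical and are not taken from the paper.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    # Count only unambiguous nucleotides so that N or gap characters
    # do not dilute the estimate.
    total = gc + seq.count("A") + seq.count("T") + seq.count("U")
    if total == 0:
        raise ValueError("sequence contains no unambiguous nucleotides")
    return gc / total


# Toy example sequence (hypothetical, for illustration only).
toy = "GGCCATGCNN"
print(f"G + C content: {100 * gc_fraction(toy):.1f}%")
```

    Applied to the rRNA gene cluster and to bulk genomic DNA separately, the same calculation would yield the two percentages that the abstract contrasts.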

  17. Sequence characterization of 5S ribosomal RNA from eight gram positive procaryotes

    NASA Technical Reports Server (NTRS)

    Woese, C. R.; Luehrsen, K. R.; Pribula, C. D.; Fox, G. E.

    1976-01-01

    Complete nucleotide sequences are presented for 5S rRNA from Bacillus subtilis, B. firmus, B. pasteurii, B. brevis, Lactobacillus brevis, and Streptococcus faecalis, and 5S rRNA oligonucleotide catalogs and partial sequence data are given for B. cereus and Sporosarcina ureae. These data demonstrate a striking consistency of 5S rRNA primary and secondary structure within a given bacterial grouping. An exception is B. brevis, in which the 5S rRNA sequence varies significantly from that of other bacilli in the tuned helix and the procaryotic loop. The localization of these variations suggests that B. brevis occupies an ecological niche that selects such changes. It is noted that this organism produces antibiotics which affect ribosome function.

  18. Small RNA Deep Sequencing and the Effects of microRNA408 on Root Gravitropic Bending in Arabidopsis

    NASA Astrophysics Data System (ADS)

    Li, Huasheng; Lu, Jinying; Sun, Qiao; Chen, Yu; He, Dacheng; Liu, Min

    2015-11-01

    MicroRNA (miRNA) is a non-coding small RNA composed of 20 to 24 nucleotides that influences plant root development. This study analyzed miRNA expression in Arabidopsis root tip cells using Illumina sequencing and real-time PCR before (sample 0) and 15 min after (sample 15) a 3-D clinostat rotational treatment was administered. After stimulation, the expression levels of seven miRNA genes, including Arabidopsis miR160, miR161, miR394, miR402, miR403, miR408, and miR823, were significantly upregulated. Illumina sequencing results also revealed two novel miRNAs that have not been previously reported. The target genes of these miRNAs included pentatricopeptide repeat-containing protein and diadenosine tetraphosphate hydrolase. An overexpression vector of Arabidopsis miR408 was constructed and transferred to Arabidopsis plants. The roots of plants overexpressing miR408 exhibited slower reorientation upon gravistimulation than those of the wild type. This result indicates that miR408 could play a role in the root gravitropic response.

  19. From Sequences to Shapes and Back: A Case Study in RNA Secondary Structures

    NASA Astrophysics Data System (ADS)

    Schuster, Peter; Fontana, Walter; Stadler, Peter F.; Hofacker, Ivo L.

    1994-03-01

    RNA folding is viewed here as a map assigning secondary structures to sequences. At fixed chain length the number of sequences far exceeds the number of structures. Frequencies of structures are highly non-uniform and follow a generalized form of Zipf's law: we find relatively few common and many rare ones. By using an algorithm for inverse folding, we show that sequences sharing the same structure are distributed randomly over sequence space. All common structures can be accessed from an arbitrary sequence by a number of mutations much smaller than the chain length. The sequence space is percolated by extensive neutral networks connecting nearest neighbours folding into identical structures. Implications for evolutionary adaptation and for applied molecular evolution are evident: finding a particular structure by mutation and selection is much simpler than expected and, even if catalytic activity should turn out to be sparse in the space of RNA structures, it can hardly be missed by evolutionary processes.
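
    The statement that sequences far outnumber structures at fixed chain length can be illustrated by counting. The sketch below uses the standard combinatorial recurrence for secondary structures (the last base is either unpaired or paired with an earlier base, subject to a minimum hairpin loop size). It counts all formally possible structures without checking base-pairing rules, so it is an upper bound on the shapes any sequence can adopt; it is not the folding algorithm used by the authors, and the function name and default loop size are assumptions.

```python
def count_structures(n: int, min_loop: int = 3) -> int:
    """Count secondary structures on a chain of n bases.

    min_loop is the minimum number of unpaired bases enclosed by a
    hairpin (3 is a common steric choice; an assumption here).
    """
    s = [1] * (n + 1)  # s[0] = 1: the empty chain has one structure
    for length in range(1, n + 1):
        total = s[length - 1]  # last base unpaired
        # Last base pairs with base k; at least min_loop bases lie between.
        for k in range(1, length - min_loop):
            total += s[k - 1] * s[length - k - 1]
        s[length] = total
    return s[n]


for n in (10, 30, 100):
    structures = count_structures(n)
    sequences = 4 ** n
    print(n, structures, f"sequences per structure ~ {sequences / structures:.3g}")
```

    Because the structure count grows more slowly than 4^n, the ratio of sequences to structures increases rapidly with chain length, which is the precondition for the extensive neutral networks described in the abstract.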

  20. Sequence organization of the Acanthamoeba rRNA intergenic spacer: identification of transcriptional enhancers.

    PubMed Central

    Yang, Q; Zwick, M G; Paule, M R

    1994-01-01

    The primary sequence of the entire 2330 bp intergenic spacer of the A. castellanii ribosomal RNA gene was determined. Repeated sequence elements averaging 140 bp were identified and found to bind a protein required for optimum initiation at the core promoter. These repeated elements were shown to stimulate rRNA transcription by RNA polymerase I in vitro. The repeats inhibited transcription when placed in trans, and stimulated transcription when in cis, in either orientation, but only when upstream of the core promoter. Thus, these repeated elements have characteristics similar to polymerase I enhancers found in higher eukaryotes. The number of rRNA repeats in Acanthamoeba cells was determined to be 24 per haploid genome, the lowest number so far identified in any eukaryote. However, because Acanthamoeba is polyploid, each cell contains approximately 600 rRNA genes. PMID:7984432

  1. The RNA sequence context defines the mechanistic routes by which yeast arginyl-tRNA synthetase charges tRNA.

    PubMed Central

    Sissler, M; Giegé, R; Florentz, C

    1998-01-01

    Arginylation of tRNA transcripts by yeast arginyl-tRNA synthetase can be triggered by two alternate recognition sets in anticodon loops: C35 and U36 or G36 in tRNA(Arg) and C36 and G37 in tRNA(Asp) (Sissler M, Giegé R, Florentz C, 1996, EMBO J 15:5069-5076). Kinetic studies on tRNA variants were done to explore the mechanisms by which these sets are expressed. Although the synthetase interacts in a similar manner with tRNA(Arg) and tRNA(Asp), the details of the interaction patterns are idiosyncratic, especially in anticodon loops (Sissler M, Eriani G, Martin F, Giegé R, Florentz C, 1997, Nucleic Acids Res 25:4899-4906). Exchange of individual recognition elements between arginine and aspartate tRNA frameworks strongly blocks arginylation of the mutated tRNAs, whereas full exchange of the recognition sets leads to efficient arginine acceptance of the transplanted tRNAs. Unpredictably, the similar catalytic efficiencies of native and transplanted tRNAs originate from different k(cat) and Km combinations. A closer analysis reveals that efficient arginylation results from strong anticooperative effects between individual recognition elements. Nonrecognition nucleotides as well as the tRNA architecture are additional factors that tune efficiency. Altogether, arginyl-tRNA synthetase is able to utilize different context-dependent mechanistic routes to be activated. This confers biological advantages to the arginine aminoacylation system and sheds light on its evolutionary relationship with the aspartate system. PMID:9622124

  2. Recombinant human MDM2 oncoprotein shows sequence composition selectivity for binding to both RNA and DNA.

    PubMed

    Challen, Christine; Anderson, John J; Chrzanowska-Lightowlers, Zofia M A; Lightowlers, Robert N; Lunec, John

    2012-03-01

    MDM2 is a 90 kDa nucleo-phosphoprotein that binds p53 and other proteins contributing to its oncogenic properties. Its structure includes an amino proximal p53 binding site, a central acidic domain and a carboxy region which incorporates Zinc and Ring Finger domains suggestive of nucleic acid binding or transcription factor function. It has previously been reported that a baculovirus-expressed MDM2 protein binds RNA in a sequence-specific manner through the Ring Finger domain; however, its ability to bind DNA has yet to be examined. We report here that a bacterially expressed human MDM2 protein binds both DNA and the previously defined RNA consensus sequence. DNA binding appears selective and involves the carboxy-terminal domain of the molecule. RNA binding is inhibited by an MDM2 specific antibody, which recognises an epitope within the carboxy region of the protein. Selection cloning and sequence analysis of MDM2 DNA binding sequences, unlike RNA binding sequences, revealed no obvious DNA binding consensus sequence, but preferential binding to oligopurine:pyrimidine-rich stretches. Our results suggest that the observed preferential DNA binding may occur through the Zinc Finger or in a charge-charge interaction through the Ring Finger, thereby implying potentially different mechanisms for DNA and RNA MDM2 binding.

  3. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    PubMed

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes, enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules with complementarity to human genes in plant tissue from crops that have a long history of safe consumption, such as lettuce, tomato, corn, soy and rice, support the conclusion that long dsRNAs do not present a significant dietary risk.
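
    The 21-nucleotide perfect-complementarity criterion mentioned in this abstract can be expressed as a k-mer lookup. The following is a minimal sketch, assuming RNA sequences over A, C, G, U and strict Watson-Crick pairing (no G:U wobble); the function names are hypothetical and this is not the pipeline used in the study.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def complementary_sites(dsrna_strand: str, target: str, k: int = 21) -> list[int]:
    """Return 0-based start positions in target of every k-mer that is
    perfectly complementary to some k-mer of dsrna_strand."""
    # A target k-mer pairs perfectly with a strand k-mer exactly when it
    # equals that strand k-mer's reverse complement.
    wanted = {
        reverse_complement(dsrna_strand[i:i + k])
        for i in range(len(dsrna_strand) - k + 1)
    }
    return [j for j in range(len(target) - k + 1) if target[j:j + k] in wanted]


# Toy data (hypothetical): build a strand that pairs with target[3:24].
target = "AUGGCUAGCUAGGAUCCGAUCGAUUACGGAUC"
strand = "GG" + reverse_complement(target[3:24]) + "CC"
print(complementary_sites(strand, target))
```

    Storing the strand's k-mers in a set keeps each lookup constant time, so the scan scales linearly with the length of the target transcript.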

  4. Identification of characteristic oligonucleotides in the bacterial 16S ribosomal RNA sequence dataset

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; Willson, Richard C.; Fox, George E.

    2002-01-01

    MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. The extent to which such signature sequences exist in the 16S rRNA dataset is not clear, however. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
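
    The idea of a "perfect" signature oligonucleotide can be sketched with set operations. The abstract does not define Q(s), so the code below does not attempt to reproduce it; instead it implements a simplified stand-in in which a k-mer counts as a signature only if it occurs in every member of a group and in no sequence outside it. The function names and toy sequences are hypothetical.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def perfect_signatures(in_group: list[str], out_group: list[str], k: int) -> set[str]:
    """k-mers found in every in-group sequence and in no out-group sequence."""
    if not in_group:
        raise ValueError("in_group must contain at least one sequence")
    shared = set.intersection(*(kmers(s, k) for s in in_group))
    background = set().union(*(kmers(s, k) for s in out_group))
    return shared - background


# Toy sequences (hypothetical, far shorter than real 16S rRNA).
cluster = ["ACGUACGGA", "UUACGGAUC"]
others = ["ACGUACGUA", "GGAUCCUAG"]
print(sorted(perfect_signatures(cluster, others, k=5)))
```

    A graded quality score would relax both conditions, for example by allowing a signature to be missing from a few group members or present in a few outsiders, which is the kind of false-positive check the abstract describes.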

  5. Identification of conserved and novel microRNAs in Aquilaria sinensis based on small RNA sequencing and transcriptome sequence data.

    PubMed

    Gao, Zhi-Hui; Wei, Jian-He; Yang, Yun; Zhang, Zheng; Xiong, Huan-Ying; Zhao, Wen-Ting

    2012-08-15

    Agarwood is in great demand for its high value in medicine, incense, and perfume across Asia, the Middle East, and Europe. As agarwood is formed only when Aquilaria trees are wounded or infected by certain microbes, overharvesting and habitat loss are threatening some populations of agarwood-producing species. Aquilaria sinensis is one such economically significant tree species. To improve production efficiency and protect A. sinensis as a resource, it is critical to reveal the regulatory mechanisms of stress-induced agarwood formation. MicroRNAs (miRNAs), key gene expression regulators involved in various plant stress responses and metabolic processes, might function in agarwood formation, but no report concerning miRNAs in Aquilaria is available. In this study, small RNA high-throughput sequencing and 454 transcriptome data were used to identify both conserved and novel miRNAs in A. sinensis. Deep sequencing showed that the small RNA (sRNA) population of A. sinensis was complex and that the length of sRNAs varied. By in silico analysis of the small RNA deep sequencing data and transcriptome data, we discovered 27 novel miRNAs in A. sinensis. Based on mature miRNA sequence conservation, we identified 74 putative conserved miRNAs from A. sinensis, and 10 of them were confirmed with a hairpin-forming precursor. Interestingly, a novel miRNA sequence was determined to be the miRNA* of asi-miR408, but with accumulation much higher than that of asi-miR408. The expression levels of ten stress-responsive miRNAs were examined during a time course after wound treatment. Eight were shown to be wound-responsive. This not only shows the existence of miRNAs in this economically significant Asian tree species but also indicates their critical role in stress-induced agarwood formation. The highly accumulated miRNA* of asi-miR408 implies that miRNA*s can be functional, as miRNAs are, in plants.

  6. Accuracy of RNA-Seq and its dependence on sequencing depth

    PubMed Central

    2012-01-01

    Background: The cost of DNA sequencing has undergone a dramatic reduction in the past decade. As a result, sequencing technologies have been increasingly applied to genomic research. RNA-Seq is becoming a common technique for surveying gene expression based on DNA sequencing. As it is not clear how increased sequencing capacity has affected the measurement accuracy of mRNA, we sought to investigate that relationship. Results: We empirically evaluated the accuracy of repeated gene expression measurements using RNA-Seq. We identified library preparation steps prior to DNA sequencing as the main source of error in this process. Studying three datasets, we showed that the accuracy indeed improves with sequencing depth. However, the rate of improvement as a function of sequence reads is generally slower than predicted by the binomial distribution. We therefore used the beta-binomial distribution to model the overdispersion. The overdispersion parameters we introduced depend explicitly on the number of reads, so that the resulting statistical uncertainty is consistent with the empirical observation that measurement accuracy increases with sequencing depth. The overdispersion parameters were determined by maximizing the likelihood. We showed that our modified beta-binomial model had a lower false discovery rate than the binomial or the pure beta-binomial models. Conclusion: We proposed a novel form of overdispersion guaranteeing that the accuracy improves with sequencing depth. We demonstrated that the new form provides a better fit to the data. PMID:23320920
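
    The contrast between the binomial and the pure beta-binomial models can be seen from their textbook variance formulas. The sketch below compares the variance of an estimated read proportion under each model as depth grows. It uses the standard beta-binomial parameterisation with a fixed intra-class correlation rho = 1/(alpha + beta + 1); it does not implement the authors' read-dependent overdispersion, whose functional form is not given in the abstract, and the numerical values are hypothetical.

```python
def binomial_proportion_variance(p: float, n: int) -> float:
    """Variance of X/n when X ~ Binomial(n, p)."""
    return p * (1 - p) / n


def beta_binomial_proportion_variance(p: float, n: int, rho: float) -> float:
    """Variance of X/n when X ~ BetaBinomial with mean proportion p and
    fixed intra-class correlation rho = 1 / (alpha + beta + 1)."""
    return p * (1 - p) / n * (1 + (n - 1) * rho)


p, rho = 0.001, 0.01  # hypothetical gene fraction and overdispersion
for n in (10**4, 10**6, 10**8):
    b = binomial_proportion_variance(p, n)
    bb = beta_binomial_proportion_variance(p, n, rho)
    print(f"n={n:>9}  binomial={b:.3e}  beta-binomial={bb:.3e}")
```

    With rho held fixed, the beta-binomial variance levels off near p(1 - p)rho instead of shrinking toward zero, so a pure beta-binomial model predicts no further gain in accuracy at high depth. That plateau is consistent with the abstract's motivation for letting the overdispersion parameters depend on the number of reads.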

  7. The phylogenetic utility and functional constraint of microRNA flanking sequences

    PubMed Central

    Kenny, Nathan J.; Sin, Yung Wa; Hayward, Alexander; Paps, Jordi; Chu, Ka Hou; Hui, Jerome H. L.

    2015-01-01

    MicroRNAs (miRNAs) have recently risen to prominence as novel factors responsible for post-transcriptional regulation of gene expression. miRNA genes have been posited as highly conserved in the clades in which they exist. Consequently, miRNAs have been used as rare genome change characters to estimate phylogeny by tracking their gain and loss. However, their short length (21–23 bp) has limited their perceived utility in sequence-based phylogenetic inference. Here, using reference taxa with established phylogenetic relationships, we demonstrate that miRNA sequences are of high utility in quantitative, rather than in qualitative, phylogenetic analysis. The clear orthology among miRNA genes from different species makes it straightforward to identify and align these sequences from even fragmentary datasets. We also identify significant sequence conservation in the regions directly flanking miRNA genes, and show that this too is of utility in phylogenetic analysis, as well as highlighting conserved regions that will be of interest to other fields. Employing miRNA sequences from 12 sequenced drosophilid genomes, together with a Tribolium castaneum outgroup, we demonstrate that this approach is robust using Bayesian and maximum-likelihood methods. The utility of these characters is further demonstrated in the rhabditid nematodes and primates. As next-generation sequencing makes it more cost-effective to sequence genomes and small RNA libraries, this methodology provides an alternative data source for phylogenetic analysis. The approach allows rapid resolution of relationships between both closely related and rapidly evolving species, and provides an additional tool for investigation of relationships within the tree of life. PMID:25694624

  8. Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing.

    PubMed

    Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping

    2015-08-01

    Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution.

  9. The 3' sequences required for incorporation of an engineered ssRNA into the Reovirus genome

    PubMed Central

    Roner, Michael R; Roehr, Joanne

    2006-01-01

    Background: Understanding how an organism replicates and assembles a multi-segmented genome with fidelity previously measured at 100% presents a model system for exploring questions involving genome assortment and RNA/protein interactions in general. The virus family Reoviridae, containing nine genera and more than 200 members, is unique in that its members possess a segmented double-stranded (ds) RNA genome. Using reovirus as a model member of this family, we have developed the only functional reverse genetics system for a member of this family with ten or more genome segments. Using this system, we have previously identified the flanking 5' sequences required by an engineered s2 ssRNA for efficient incorporation into the genome of reovirus. The minimum 5' sequence retains 96 nucleotides and contains a predicted sequence/structure element. Within these 96 nucleotides, we have identified three nucleotides A-U-U at positions 79–81 that are essential for the incorporation of in vitro generated ssRNAs into new reovirus progeny viral particles. The work presented here builds on these findings and presents the results of an analysis of the required 3' flanking sequences of the s2 ssRNA. Results: The minimum 3' sequence we localized retains 98 nucleotides of the wild type s2 ssRNA. These sequences do not interact with the 5' sequences, and modifications of the 5' sequences do not result in a change in the sequences required at the 3' end of the engineered s2 ssRNA. Within the 3' sequence we discovered three regions that when mutated prevent the ssRNA from being replicated to dsRNA and subsequently incorporated into progeny virions. Using a series of substitutions we were able to obtain additional information about the sequences in these regions. We demonstrate that the individual nucleotides from 98 to 84, 68 to 59, and 28 to 1 are required in addition to the total length of 98 nucleotides to direct an engineered reovirus ssRNA to be replicated to dsRNA and incorporated

  10. StarScan: a web server for scanning small RNA targets from degradome sequencing data

    PubMed Central

    Liu, Shun; Li, Jun-Hao; Wu, Jie; Zhou, Ke-Ren; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2015-01-01

    Endogenous small non-coding RNAs (sRNAs), including microRNAs, PIWI-interacting RNAs and small interfering RNAs, play important gene regulatory roles in animals and plants by pairing to protein-coding and non-coding transcripts. However, computationally assigning these various sRNAs to their regulatory target genes remains technically challenging. Recently, a high-throughput degradome sequencing method was applied to identify biologically relevant sRNA cleavage sites. In this study, an integrated web-based tool, StarScan (sRNA target Scan), was developed for scanning sRNA targets using degradome sequencing data from 20 species. Given an sRNA sequence from plants or animals, our web server performs an ultrafast and exhaustive search for potential sRNA–target interactions in annotated and unannotated genomic regions. The interactions between small RNAs and target transcripts were further evaluated using a novel tool, alignScore. A second novel tool, degradomeBinomTest, was developed to quantify the abundance of degradome fragments located at the 9–11th nucleotide from the sRNA 5′ end. This is the first web server for discovering potential sRNA-mediated RNA cleavage events in plants and animals, which affords mechanistic insights into the regulatory roles of sRNAs. The StarScan web server is available at http://mirlab.sysu.edu.cn/starscan/. PMID:25990732
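
    The abstract does not describe how degradomeBinomTest is computed, so the sketch below is only a generic one-sided binomial test of the kind its name suggests: it asks how surprising it is to see at least k degradome read ends in the cleavage window if each read end landed there with a background probability p0. The function name, the read counts, and the background probability are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p0), by direct summation."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 40 degradome read ends map along a 20-nt
# sRNA-target duplex, 18 of them within the 3-nt window at positions 9-11.
n_reads = 40
k_in_window = 18
p_background = 3 / 20  # assumption: read ends spread uniformly over 20 positions

p_value = binomial_upper_tail(k_in_window, n_reads, p_background)
print(f"one-sided p-value: {p_value:.3g}")
```

    A small p-value under such a null would indicate that fragment ends pile up at the expected cleavage position more often than chance alone would explain.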

  11. Uncultivated microbial eukaryotic diversity: a method to link ssu rRNA gene sequences with morphology.

    PubMed

    Hirst, Marissa B; Kita, Kelley N; Dawson, Scott C

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA "phylotypes" from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages

  12. De novo prediction of RNA-protein interactions from sequence information.

    PubMed

    Wang, Ying; Chen, Xiaowei; Liu, Zhi-Ping; Huang, Qiang; Wang, Yong; Xu, Derong; Zhang, Xiang-Sun; Chen, Runsheng; Chen, Luonan

    2013-01-27

    Protein-RNA interactions are fundamentally important in understanding cellular processes. In particular, non-coding RNA-protein interactions play an important role to facilitate biological functions in signalling, transcriptional regulation, and even the progression of complex diseases. However, experimental determination of protein-RNA interactions remains time-consuming and labour-intensive. Here, we develop a novel extended naïve-Bayes-classifier for de novo prediction of protein-RNA interactions, only using protein and RNA sequence information. Specifically, we first collect a set of known protein-RNA interactions as gold-standard positives and extract sequence-based features to represent each protein-RNA pair. To fill the gap between high dimensional features and scarcity of gold-standard positives, we select effective features by cutting a likelihood ratio score, which not only reduces the computational complexity but also allows transparent feature integration during prediction. An extended naïve Bayes classifier is then constructed using these effective features to train a protein-RNA interaction prediction model. Numerical experiments show that our method can achieve the prediction accuracy of 0.77 even though only a small number of protein-RNA interaction data are available. In particular, we demonstrate that the extended naïve-Bayes-classifier is superior to the naïve-Bayes-classifier by fully considering the dependences among features. Importantly, we conduct ncRNA pull-down experiments to validate the predicted novel protein-RNA interactions and identify the interacting proteins of sbRNA CeN72 in C. elegans, which further demonstrates the effectiveness of our method.
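
    The feature-selection step described above can be sketched with a plain naive Bayes scorer over binary features. This is not the authors' extended classifier, which additionally models dependences among features; the function names, the Laplace smoothing, the cutoff value, and the toy data are assumptions for illustration only.

```python
from math import log


def log_likelihood_ratios(positives, negatives, n_features, smoothing=1.0):
    """Per-feature log likelihood ratio log[P(f=1 | pos) / P(f=1 | neg)],
    with Laplace smoothing so that unseen features do not produce log(0)."""
    llr = []
    for j in range(n_features):
        p_pos = (sum(x[j] for x in positives) + smoothing) / (len(positives) + 2 * smoothing)
        p_neg = (sum(x[j] for x in negatives) + smoothing) / (len(negatives) + 2 * smoothing)
        llr.append(log(p_pos / p_neg))
    return llr


def select_features(llr, cutoff):
    """Keep only the features whose absolute log ratio reaches the cutoff."""
    return [j for j, value in enumerate(llr) if abs(value) >= cutoff]


def score(x, llr, selected):
    """Naive Bayes style score: sum of log ratios over selected features
    that are present in x (evidence from absent features is ignored here,
    which is a simplification)."""
    return sum(llr[j] for j in selected if x[j])


# Toy binary feature vectors for protein-RNA pairs (invented for the example).
positives = [[1, 0, 1], [1, 0, 0], [1, 1, 1]]
negatives = [[0, 0, 1], [0, 1, 0], [0, 0, 1]]
llr = log_likelihood_ratios(positives, negatives, n_features=3)
selected = select_features(llr, cutoff=0.5)
```

    Because each retained feature contributes its own additive term, the contribution of every feature to a prediction can be read off directly, which is one way to interpret the "transparent feature integration" mentioned in the abstract.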

  13. [16S rRNA gene sequence analysis for bacterial identification in the clinical laboratory].

    PubMed

    Matsumoto, Takehisa; Sugano, Mitsutoshi

    2013-12-01

    The traditional identification of bacteria on the basis of phenotypic characteristics is generally not as accurate as identification based on genotypic methods. For many years, sequencing of the 16S ribosomal RNA (rRNA) gene has served as an important tool for determining phylogenetic relationships between bacteria. The features of this molecular target that make it a useful phylogenetic tool also make it useful for bacterial detection and identification in the clinical laboratory. 16S rRNA gene sequence analysis can better identify poorly described, rarely isolated, or phenotypically aberrant strains, and can lead to the recognition of novel pathogens and noncultured bacteria. In clinical microbiology, molecular identification based on 16S rDNA sequencing is applied fundamentally to bacteria whose identification by means of other types of techniques is impossible or difficult. However, there are some cases in which 16S rRNA gene sequence analysis can not differentiate closely related bacteria such as Shigella spp. and Escherichia coli at the species level. Thus, it is important to understand the advantages and disadvantages of 16S rRNA gene sequence analysis.

  14. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server

    PubMed Central

    Cannone, Jamie J.; Sweeney, Blake A.; Petrov, Anton I.; Gutell, Robin R.; Zirbel, Craig L.; Leontis, Neocles

    2015-01-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. PMID:26048960

  15. R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

    PubMed

    Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

    2015-07-01

    The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa.

  16. RNA editing in plant mitochondria—connecting RNA target sequences and acting proteins.

    PubMed

    Takenaka, Mizuki; Verbitskiy, Daniil; Zehrmann, Anja; Härtel, Barbara; Bayer-Császár, Eszter; Glass, Franziska; Brennicke, Axel

    2014-11-01

    RNA editing changes several hundred cytidines to uridines in the mRNAs of mitochondria in flowering plants. The target cytidines are identified by a subtype of PPR proteins characterized by tandem modules, each of which binds a specific upstream nucleotide. Recent progress in correlating repeat structures with nucleotide identities allows target sites in mitochondrial RNAs to be predicted and identified. Additional proteins have been found to play a role in RNA editing; their precise function still needs to be elucidated. The enzymatic activity performing the C to U reaction may reside in the C-terminal DYW extensions of the PPR proteins; however, this still needs to be proven. Here we update recent progress in understanding RNA editing in flowering plant mitochondria.

  17. Sequence walkers: a graphical method to display how binding proteins interact with DNA or RNA sequences.

    PubMed Central

    Schneider, T D

    1997-01-01

    A graphical method is presented for displaying how binding proteins and other macromolecules interact with individual bases of nucleotide sequences. Characters representing the sequence are either oriented normally and placed above a line indicating favorable contact, or upside-down and placed below the line indicating unfavorable contact. The positive or negative height of each letter shows the contribution of that base to the average sequence conservation of the binding site, as represented by a sequence logo. These sequence 'walkers' can be stepped along raw sequence data to visually search for binding sites. Many walkers, for the same or different proteins, can be simultaneously placed next to a sequence to create a quantitative map of a complex genetic region. One can alter the sequence to quantitatively engineer binding sites. Database anomalies can be visualized by placing a walker at the recorded positions of a binding molecule and by comparing this to locations found by scanning the nearby sequences. The sequence can also be altered to predict whether a change is a polymorphism or a mutation for the recognizer being modeled. PMID:9336476
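
    The letter heights of a walker can be illustrated numerically. The sketch below assumes the individual-information weight 2 + log2 f(b, l) bits for base b at position l, where f is the base frequency in a set of aligned binding sites. It omits the small-sample correction and uses a pseudocount so that unseen bases get a finite negative height. The function names, the pseudocount, and the example sites are illustrative assumptions, not taken from the paper.

```python
from math import log2

BASES = "ACGT"


def weight_matrix(sites, pseudocount=1.0):
    """Per-position weights 2 + log2(frequency), in bits, from aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        matrix.append({b: 2.0 + log2((column.count(b) + pseudocount) / total)
                       for b in BASES})
    return matrix


def walker(matrix, sequence, start):
    """Letter heights when the walker is placed at `start`:
    positive means favorable contact, negative means unfavorable."""
    window = sequence[start:start + len(matrix)]
    if len(window) < len(matrix):
        raise ValueError("walker runs off the end of the sequence")
    return [matrix[i][base] for i, base in enumerate(window)]


def scan(matrix, sequence):
    """Step the walker along the sequence; return (start, total bits) pairs."""
    return [(start, sum(walker(matrix, sequence, start)))
            for start in range(len(sequence) - len(matrix) + 1)]


# Invented aligned sites and a raw sequence to scan.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
matrix = weight_matrix(sites)
best_start, best_bits = max(scan(matrix, "GGTATAATCC"), key=lambda hit: hit[1])
```

    Summing the heights at each placement gives a single score per position, so stepping the walker along a sequence amounts to looking for placements where the total is clearly positive.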

  18. High-quality RNA extraction from copepods for Next Generation Sequencing: A comparative study.

    PubMed

    Asai, Sneha; Ianora, Adrianna; Lauritano, Chiara; Lindeque, Penelope K; Carotenuto, Ylenia

    2015-12-01

    Despite the ecological importance of copepods, few Next Generation Sequencing (NGS) studies have been performed on small crustaceans, and a standard method for RNA extraction is lacking. In this study, we compared three commonly-used methods: TRIzol®, Aurum Total RNA Mini Kit and Qiagen RNeasy Micro Kit, in combination with preservation reagents TRIzol® or RNAlater®, to obtain RNA of high quality and quantity from copepods for NGS. Total RNA was extracted from the copepods Calanus helgolandicus, Centropages typicus and Temora stylifera and its quantity and quality were evaluated using NanoDrop, agarose gel electrophoresis and Agilent Bioanalyzer. Our results demonstrate that preservation of copepods in RNAlater® and extraction with Qiagen RNeasy Micro Kit was the optimal isolation method for obtaining RNA of high quality and quantity for NGS studies of C. helgolandicus. Intriguingly, C. helgolandicus 28S rRNA is formed by two subunits that separate after heat-denaturation and migrate along with 18S rRNA. This unique property of protostome RNA has never been reported in copepods. Overall, our comparative study on RNA extraction protocols will help increase gene expression studies on copepods using high-throughput applications, such as RNA-Seq and microarrays.

  19. The landscape of fusion transcripts in spitzoid melanoma and biologically indeterminate spitzoid tumors by RNA sequencing

    PubMed Central

    Wu, Gang; Barnhill, Raymond L.; Lee, Seungjae; Li, Yongjin; Shao, Ying; Easton, John; Dalton, James; Zhang, Jinghui; Pappo, Alberto; Bahrami, Armita

    2016-01-01

    Kinase activation by chromosomal translocations is a common mechanism that drives tumorigenesis in spitzoid neoplasms. To explore the landscape of fusion transcripts in these tumors, we performed whole-transcriptome sequencing using formalin-fixed paraffin-embedded tissues in malignant or biologically indeterminate spitzoid tumors from 7 patients (age 2–14 years). RNA sequence libraries enriched for coding regions were prepared and the sequencing was analyzed by a novel assembly-based algorithm designed for detecting complex fusions. In addition, tumor samples were screened for hotspot TERT promoter mutations, and telomerase expression was assessed by TERT mRNA in situ hybridization (ISH). Two patients had widespread metastasis and subsequently died of disease, and 5 patients had a benign clinical course on limited follow-up (mean: 30 months). RNA sequencing and TERT mRNA ISH were successful in 6 tumors and unsuccessful in 1 disseminating tumor due to low RNA quality. RNA sequencing identified a kinase fusion in 5 of the 6 sequenced tumors: TPM3–NTRK1 (2 tumors), complex rearrangements involving TPM3, ALK, and IL6R (1 tumor), BAIAP2L1–BRAF (1 tumor), and EML4–BRAF (1 disseminating tumor). All predicted chimeric transcripts were expressed at high levels and contained the intact kinase domain. In addition, 2 tumors each contained a second fusion gene, ARID1B-SNX9 or PTPRZ1-NFAM1. The detected chimeric genes were validated by home-brew break-apart or fusion fluorescence in situ hybridization. The 2 disseminating tumors each harbored the TERT promoter −124C>T (Chr 5:1,295,228 hg19 coordinate) mutation whereas the remaining 5 tumors retained the wild-type gene. The presence of the −124C>T mutation correlated with telomerase expression by TERT mRNA ISH. In summary, we demonstrated complex fusion transcripts and novel partner genes for BRAF by RNA sequencing of FFPE samples. The diversity of gene fusions demonstrated by RNA sequencing defines the molecular

  20. Investigation of molluscan phylogeny on the basis of 18S rRNA sequences.

    PubMed

    Winnepenninckx, B; Backeljau, T; De Wachter, R

    1996-12-01

    The 18S rRNA sequences of 12 molluscs, representing the extant classes Gastropoda, Bivalvia, Polyplacophora, Scaphopoda, and Caudofoveata, were determined and compared with selected known 18S rRNA sequences of Metazoa, including other Mollusca. These data do not provide support for a close relationship between Platyhelminthes (Turbellaria) and Mollusca, but rather suggest that the latter group belongs to a clade of eutrochozoan coelomates. The 18S rRNA data fail to recover molluscan, bivalve, or gastropod monophyly. However, the branching pattern of the eutrochozoan phyla and classes is unstable, probably due to the explosive Cambrian radiation during which these groups arose. Similarly, the 18S rRNA data do not provide a reliable signal for the molluscan interclass relationships. Nevertheless, we obtained strong preliminary support for phylogenetic inferences at more restricted taxonomic levels, such as the monophyly of Polyplacophora, Caenogastropoda, Euthyneura, Heterodonta, and Arcoida.

  1. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters.

    PubMed

    Core, Leighton J; Waterfall, Joshua J; Lis, John T

    2008-12-19

    RNA polymerases are highly regulated molecular machines. We present a method (global run-on sequencing, GRO-seq) that maps the position, amount, and orientation of transcriptionally engaged RNA polymerases genome-wide. In this method, nuclear run-on RNA molecules are subjected to large-scale parallel sequencing and mapped to the genome. We show that peaks of promoter-proximal polymerase reside on approximately 30% of human genes, transcription extends beyond pre-messenger RNA 3' cleavage, and antisense transcription is prevalent. Additionally, most promoters have an engaged polymerase upstream and in an orientation opposite to the annotated gene. This divergent polymerase is associated with active genes but does not elongate effectively beyond the promoter. These results imply that the interplay between polymerases and regulators over broad promoter regions dictates the orientation and efficiency of productive transcription.

  2. Sequence- and structure-specific RNA processing by a CRISPR endonuclease.

    PubMed

    Haurwitz, Rachel E; Jinek, Martin; Wiedenheft, Blake; Zhou, Kaihong; Doudna, Jennifer A

    2010-09-10

    Many bacteria and archaea contain clustered regularly interspaced short palindromic repeats (CRISPRs) that confer resistance to invasive genetic elements. Central to this immune system is the production of CRISPR-derived RNAs (crRNAs) after transcription of the CRISPR locus. Here, we identify the endoribonuclease (Csy4) responsible for CRISPR transcript (pre-crRNA) processing in Pseudomonas aeruginosa. A 1.8 angstrom crystal structure of Csy4 bound to its cognate RNA reveals that Csy4 makes sequence-specific interactions in the major groove of the crRNA repeat stem-loop. Together with electrostatic contacts to the phosphate backbone, these enable Csy4 to bind selectively and cleave pre-crRNAs using phylogenetically conserved serine and histidine residues in the active site. The RNA recognition mechanism identified here explains sequence- and structure-specific processing by a large family of CRISPR-specific endoribonucleases.

  3. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions

    PubMed Central

    Nolte-’t Hoen, Esther N. M.; Buermans, Henk P. J.; Waasdorp, Maaike; Stoorvogel, Willem; Wauben, Marca H. M.; ’t Hoen, Peter A. C.

    2012-01-01

    Cells release RNA-carrying vesicles and membrane-free RNA/protein complexes into the extracellular milieu. Horizontal vesicle-mediated transfer of such shuttle RNA between cells allows dissemination of genetically encoded messages, which may modify the function of target cells. Other studies used array analysis to establish the presence of microRNAs and mRNA in cell-derived vesicles from many sources. Here, we used an unbiased approach by deep sequencing of small RNA released by immune cells. We found a large variety of small non-coding RNA species representing pervasive transcripts or RNA cleavage products overlapping with protein coding regions, repeat sequences or structural RNAs. Many of these RNAs were enriched relative to cellular RNA, indicating that cells destine specific RNAs for extracellular release. Among the most abundant small RNAs in shuttle RNA were sequences derived from vault RNA, Y-RNA and specific tRNAs. Many of the highly abundant small non-coding transcripts in shuttle RNA are evolutionarily well conserved and have previously been associated with gene regulatory functions. These findings allude to a wider range of biological effects that could be mediated by shuttle RNA than previously expected. Moreover, the data present leads for unraveling how cells modify the function of other cells via transfer of specific non-coding RNA species. PMID:22821563

  4. Nucleotide sequence of a satellite RNA associated with carrot motley dwarf in parsley and carrot.

    PubMed

    Menzel, Wulf; Maiss, Edgar; Vetten, H Josef

    2009-02-01

    Carrot motley dwarf (CMD) is known to result from a mixed infection by two viruses, the polerovirus Carrot red leaf virus and one of the umbraviruses Carrot mottle mimic virus or Carrot mottle virus. Some umbraviruses have been shown to be associated with small satellite (sat) RNAs, but none have been reported for the latter two. A CMD-affected parsley plant was used for sap transmission to test plants, which were used for dsRNA isolation. The presence of a 0.8-kbp dsRNA indicated the occurrence of a hitherto unrecognized satRNA associated with CMD. The satRNAs of the CMD isolate from parsley and an isolate from carrot have been sequenced and showed 94% sequence identity. Nucleotide sequences and putative translation products had no significant similarities to GenBank entries. To our knowledge, this is the first report of satRNAs associated with CMD.

  5. Preparation of Single-Cell RNA-Seq Libraries for Next Generation Sequencing.

    PubMed

    Trombetta, John J; Gennert, David; Lu, Diana; Satija, Rahul; Shalek, Alex K; Regev, Aviv

    2014-07-01

    For the past several decades, due to technical limitations, the field of transcriptomics has focused on population-level measurements that can mask significant differences between individual cells. With the advent of single-cell RNA-Seq, it is now possible to profile the responses of individual cells at unprecedented depth and thereby uncover, transcriptome-wide, the heterogeneity that exists within these populations. This unit describes a method that merges several important technologies to produce, in high-throughput, single-cell RNA-Seq libraries. Complementary DNA (cDNA) is made from full-length mRNA transcripts using a reverse transcriptase that has terminal transferase activity. This, when combined with a second "template-switch" primer, allows for cDNAs to be constructed that have two universal priming sequences. Following preamplification from these common sequences, Nextera XT is used to prepare a pool of 96 uniquely indexed samples ready for Illumina sequencing.

  6. Telomerase RNA stem terminus element affects template boundary element function, telomere sequence, and shelterin binding.

    PubMed

    Webb, Christopher J; Zakian, Virginia A

    2015-09-08

    The stem terminus element (STE), which was discovered 13 y ago in human telomerase RNA, is required for telomerase activity, yet its mode of action is unknown. We report that the Schizosaccharomyces pombe telomerase RNA, TER1 (telomerase RNA 1), also contains a STE, which is essential for telomere maintenance. Cells expressing a partial loss-of-function TER1 STE allele maintained short stable telomeres by a recombination-independent mechanism. Remarkably, the mutant telomere sequence was different from that of wild-type cells. Generation of the altered sequence is explained by reverse transcription into the template boundary element, demonstrating that the STE helps maintain template boundary element function. The altered telomeres bound less Pot1 (protection of telomeres 1) and Taz1 (telomere-associated in Schizosaccharomyces pombe 1) in vivo. Thus, the S. pombe STE, although distant from the template, ensures proper telomere sequence, which in turn promotes proper assembly of the shelterin complex.

  7. Massive microRNA sequence conservation and prevalence in human and chimpanzee introns.

    PubMed

    Hill, Aubrey E; Sorscher, Eric J

    2013-06-01

    Human and chimpanzee introns contain numerous sequences strongly related to known microRNA hairpin structures. The relative frequency is precisely maintained across all chromosomes, suggesting the possible co-evolution of gene networks dependent upon microRNA regulation and with origins corresponding to the advent of primate transposable elements (TEs). While the motifs are known to be derived from transposable elements, the most common are far more numerous than expected from the number of TEs and their paralogous sequences, and exhibit striking conservation in comparison to the surrounding TE sequence context. Several of these motifs also exhibit structural complementarity to each other, suggesting a pairing function at the level of DNA or RNA. These "pseudomicroRNAs," in semblance to pseudogenes, include hundreds of thousands of vestigial paralogs of primate microRNAs, many of which may have functioned historically or remain active today.

  8. Preparation of Single-Cell RNA-Seq Libraries for Next Generation Sequencing

    PubMed Central

    Trombetta, John J.; Gennert, David; Lu, Diana; Satija, Rahul; Shalek, Alex K.; Regev, Aviv

    2014-01-01

    For the past several decades, due to technical limitations, the field of transcriptomics has focused on population-level measurements that can mask significant differences between individual cells. With the advent of single-cell RNA-Seq, it is now possible to profile the responses of individual cells at unprecedented depth and thereby uncover, transcriptome-wide, the heterogeneity that exists within these populations. Here, we describe a method that merges several important technologies to produce, in high-throughput, single-cell RNA-Seq libraries. Complementary DNA (cDNA) is made from full-length mRNA transcripts using a reverse transcriptase that has terminal transferase activity. This, when combined with a second “template-switch” primer, allows for cDNAs to be constructed that have two universal priming sequences. Following preamplification from these common sequences, Nextera XT is used to prepare a pool of 96 uniquely indexed samples ready for Illumina sequencing. PMID:24984854

  9. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    SciTech Connect

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
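
    One simple way to use a control-derived error rate when screening for variants is sketched below. It is not the author's method: it assumes a single per-base error rate shared by all sites, approximates the number of erroneous reads at a site by a Poisson distribution, and uses invented function names, counts, and significance threshold.

```python
from math import exp


def estimate_error_rate(mismatches: int, bases: int) -> float:
    """Per-base error rate from a control of known sequence:
    mismatching base calls divided by all aligned base calls."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    return mismatches / bases


def poisson_upper_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), as one minus the lower tail.
    Very small tails lose precision to rounding, and exp(-lam) underflows
    for lam above roughly 745, which makes the result default to 1.0."""
    term = exp(-lam)  # P(X = 0)
    lower = 0.0
    for i in range(k):
        lower += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - lower)


def is_candidate_variant(alt_reads: int, depth: int, error_rate: float,
                         alpha: float = 1e-6) -> bool:
    """Flag a site when its non-reference read count is unlikely to arise
    from sequencing error alone (alpha is an arbitrary example threshold)."""
    return poisson_upper_tail(alt_reads, depth * error_rate) < alpha


# Made-up control data: 50 mismatches in 100,000 aligned control bases.
rate = estimate_error_rate(50, 100_000)
```

    With this made-up rate, about 50 erroneous reads are expected at a site covered 100,000 times, so a site with 55 non-reference reads would not be flagged while one with 200 would.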

  10. Prediction of effective RNA interference targets and pathway-related genes in lepidopteran insects by RNA sequencing analysis.

    PubMed

    Guan, Ruo-Bing; Li, Hai-Chao; Miao, Xue-Xia

    2017-01-06

    When using RNA interference (RNAi) to study gene functions in lepidopteran insects, we discovered that some genes could not be suppressed; instead, their expression levels could be up-regulated by double-stranded RNA (dsRNA). To predict which genes could be easily silenced, we treated the Asian corn borer (Ostrinia furnacalis) with dsGFP (green fluorescent protein) and dsMLP (muscle lim protein). A transcriptome sequence analysis was conducted using the cDNAs 6 h after treatment with dsRNA. The results indicated that 160 genes were up-regulated and 44 genes were down-regulated by the two dsRNAs. Then, 50 co-up-regulated, 25 co-down-regulated and 43 unaffected genes were selected to determine their RNAi responses. All the 25 down-regulated genes were knocked down by their corresponding dsRNA. However, several of the up-regulated and unaffected genes were up-regulated when treated with their corresponding dsRNAs instead of being knocked down. The genes up-regulated by the dsGFP treatment may be involved in insect immune responses or the RNAi pathway. When the immune-related genes were excluded, only seven genes were induced by dsGFP, including ago-2 and dicer-2. These results not only provide a reference for efficient RNAi target predictions, but also provide some potential RNAi pathway-related genes for further study.

  11. YM500v2: a small RNA sequencing (smRNA-seq) database for human cancer miRNome research.

    PubMed

    Cheng, Wei-Chung; Chung, I-Fang; Tsai, Cheng-Fong; Huang, Tse-Shun; Chen, Chen-Yang; Wang, Shao-Chuan; Chang, Ting-Yu; Sun, Hsing-Jen; Chao, Jeffrey Yung-Chuan; Cheng, Cheng-Chung; Wu, Cheng-Wen; Wang, Hsei-Wei

    2015-01-01

    We previously presented YM500, which is an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from 468 human smRNA-seq datasets. Here in this updated YM500v2 database (http://ngs.ym.edu.tw/ym500/), we focus on the cancer miRNome to make the database more disease-orientated. New miRNA-related algorithms developed after YM500 were included in YM500v2, and, more significantly, more than 8000 cancer-related smRNA-seq datasets (including those of primary tumors, paired normal tissues, PBMC, recurrent tumors, and metastatic tumors) were incorporated into YM500v2. Novel miRNAs (miRNAs not included in miRBase R21) were not only predicted by three independent algorithms but also cleaned by a new in silico filtration strategy and validated by wetlab data such as Cross-Linked ImmunoPrecipitation sequencing (CLIP-seq) to reduce the false-positive rate. A new function, 'Meta-analysis', is additionally provided to allow users to identify differentially expressed miRNAs and arm-switching events in real time according to user-defined sample groups and dozens of clinical criteria organized by proficient clinicians. Cancer miRNAs identified hold the potential for both basic research and biotech applications.

  12. Cell-SELEX Identifies a “Sticky” RNA Aptamer Sequence

    PubMed Central

    2017-01-01

    Cell-SELEX is performed to select for cell-binding aptamers. We employed an additional selection pressure by using RNase to remove surface-binding aptamers and select for cell-internalizing aptamers. A common RNA sequence was identified from independent cell-SELEX procedures against two different pancreatic cancer cell lines, indicating a strong selection pressure towards this sequence from the large pool of other available sequences present in the aptamer library. The aptamer is not specific for the pancreatic cancer cell lines, and a similar sequence motif is present in previously published internalizing aptamers. The identified sequence forms a structural motif that binds to a surface protein, which either is highly abundant or has strong affinity for the selected aptamer sequence. Deselecting (removing) this sequence during cell-SELEX may increase the probability of identifying aptamers against cell type-specific targets on the cell surface. PMID:28194280

  13. Prediction of Immunomodulatory potential of an RNA sequence for designing non-toxic siRNAs and RNA-based vaccine adjuvants

    PubMed Central

    Chaudhary, Kumardeep; Nagpal, Gandharva; Dhanda, Sandeep Kumar; Raghava, Gajendra P. S.

    2016-01-01

    Our innate immune system recognizes a foreign RNA sequence of a pathogen and activates the immune system to eliminate the pathogen from our body. This immunomodulatory potential of RNA can be used to design RNA-based immunotherapy and vaccine adjuvants. In the case of siRNA-based therapy, the immunomodulatory effect of an RNA sequence is unwanted as it may cause immunotoxicity. Thus, we developed a method for designing a single-stranded RNA (ssRNA) sequence with desired immunomodulatory potentials, for designing RNA-based therapeutics, immunotherapy and vaccine adjuvants. The dataset used for training and testing our models consists of 602 experimentally verified immunomodulatory oligoribonucleotides (IMORNs) that are ssRNA sequences of length 17 to 27 nucleotides and 520 circulating miRNAs as non-immunomodulatory sequences. We developed prediction models using various features that include composition-based features, binary profile, selected features, and hybrid features. All models were evaluated using five-fold cross-validation and external validation techniques, achieving a maximum mean Matthews Correlation Coefficient (MCC) of 0.86 with 93% accuracy. We identified motifs using MERCI software and observed the abundance of adenine (A) in motifs. Based on the above study, we developed a web server, imRNA, comprising various modules important for designing RNA-based therapeutics (http://crdd.osdd.net/raghava/imrna/). PMID:26861761
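
    For readers unfamiliar with the two metrics quoted above, the sketch below shows how the Matthews Correlation Coefficient and accuracy are computed from a binary confusion matrix. The confusion-matrix counts are invented for illustration and are not the study's results; returning 0.0 when the denominator is zero is a common convention, adopted here as an assumption.

```python
from math import sqrt


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Invented counts for one hypothetical cross-validation fold.
fold_mcc = matthews_corrcoef(tp=110, tn=98, fp=6, fn=10)
fold_acc = accuracy(tp=110, tn=98, fp=6, fn=10)
```

    Unlike accuracy, MCC uses all four cells of the confusion matrix and ranges from -1 to 1, which makes it informative when the positive and negative classes differ in size, as they do in a dataset of 602 positives and 520 negatives.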

  14. Yersinia spp. Identification Using Copy Diversity in the Chromosomal 16S rRNA Gene Sequence.

    PubMed

    Hao, Huijing; Liang, Junrong; Duan, Ran; Chen, Yuhuang; Liu, Chang; Xiao, Yuchun; Li, Xu; Su, Mingming; Jing, Huaiqi; Wang, Xin

    2016-01-01

    The API 20E strip test, the standard for Enterobacteriaceae identification, is not sufficient to discriminate some Yersinia species because some biochemical reactions are unstable and some species present the same biochemical profile, e.g. Yersinia frederiksenii and Yersinia intermedia, so a variety of molecular biology methods are needed as auxiliaries for identification. The 16S rRNA gene is considered a valuable tool for assigning bacterial strains to species. However, the resolution of the 16S rRNA gene may be insufficient for discrimination because of the high similarity of sequences between some species and heterogeneity within copies at the intra-genomic level. In this study, for each strain we randomly selected five 16S rRNA gene clones from 768 Yersinia strains, and collected 3,840 sequences of the 16S rRNA gene from 10 species, which were divided into 439 patterns. The similarity among the five clones of 16S rRNA gene is over 99% for most strains. Identical sequences were found in strains of different species. A phylogenetic tree was constructed using the five 16S rRNA gene sequences for each strain where the phylogenetic classifications are consistent with biochemical tests; and species that are difficult to identify by biochemical phenotype can be differentiated. Most Yersinia strains form distinct groups within each species. However, Yersinia kristensenii, a heterogeneous species, clusters with some Yersinia enterocolitica and Yersinia frederiksenii/intermedia strains, although this does not affect the overall efficiency of the species classification. In conclusion, through analysis derived from integrated information from multiple 16S rRNA gene sequences, the discrimination ability of Yersinia species is improved using our method.

  15. Analysis options for high-throughput sequencing in miRNA expression profiling

    PubMed Central

    2014-01-01

    Background: Recently, high-throughput sequencing (HTS) using next generation sequencing techniques became useful in digital gene expression profiling. Our study introduces analysis options for HTS data based on mapping to miRBase or counting and grouping of identical sequence reads. Those approaches allow a hypothesis-free detection of miRNA differential expression. Methods: We compare our results to microarray and qPCR data from one set of RNA samples. We use Illumina platforms for microarray analysis and miRNA sequencing of 20 samples from benign follicular thyroid adenoma and malignant follicular thyroid carcinoma. Furthermore, we use three strategies for HTS data analysis to evaluate miRNA biomarkers for malignant versus benign follicular thyroid tumors. Results: High correlation of qPCR and HTS data was observed for the proposed analysis methods. However, qPCR is limited in the differential detection of miRNA isoforms. Moreover, we illustrate a much broader dynamic range of HTS compared to microarrays for small RNA studies. Finally, our data confirm hsa-miR-197-3p, hsa-miR-221-3p, hsa-miR-222-3p and both hsa-miR-144-3p and hsa-miR-144-5p as potential follicular thyroid cancer biomarkers. Conclusions: Compared to microarrays, HTS provides a global profile of miRNA expression with higher specificity and in more detail. Summarizing HTS reads as isoform groups (analysis pipeline B) or according to functional criteria (seed analysis pipeline C) correlates better with qPCR results, and both are promising new options for HTS analysis. Finally, the data open future miRNA research perspectives for HTS and indicate that qPCR might be limited in validating HTS data in detail. PMID:24625073
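
    The mapping-free options described above rest on counting identical reads and then grouping them. The sketch below illustrates that idea only and is not the authors' pipeline: `collapse_reads` and `group_by_seed` are hypothetical helper names, and defining the seed as nucleotides 2 to 8 of each read is an assumption made here for illustration.

```python
from collections import Counter, defaultdict


def collapse_reads(reads):
    """Count identical sequences after trimming whitespace and upper-casing."""
    return Counter(r.strip().upper() for r in reads if r.strip())


def group_by_seed(counts, start=1, end=8):
    """Sum read counts over sequences that share the same seed.

    The seed is taken as the 0-based slice [start:end], i.e. nucleotides
    2-8 by default (an assumption, not taken from the abstract).
    Sequences shorter than `end` are skipped.
    """
    groups = defaultdict(int)
    for seq, n in counts.items():
        if len(seq) >= end:
            groups[seq[start:end]] += n
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical reads: one sequence seen three times plus a 3' isoform.
    reads = [
        "UAGCUUAUCAGACUGAUGUUGA",
        "UAGCUUAUCAGACUGAUGUUGA",
        "uagcuuaucagacugauguuga",
        "UAGCUUAUCAGACUGAUGUUGAC",
    ]
    counts = collapse_reads(reads)
    print(counts.most_common())
    print(group_by_seed(counts))
```

    Grouping by seed merges isoforms that differ only outside the seed region, which is why seed-level totals can differ from per-sequence counts.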

  16. Genome Sequence of a Novel Iflavirus from mRNA Sequencing of the Butterfly Heliconius erato

    PubMed Central

    Macias-Muñoz, Aide; Briscoe, Adriana D.

    2014-01-01

    Here, we report the genome sequence of a novel iflavirus strain recovered from the neotropical butterfly Heliconius erato. The coding DNA sequence (CDS) of the iflavirus genome was 8,895 nucleotides in length, encoding a polyprotein that was 2,965 amino acids long. PMID:24831145

  17. Size and distribution of polyadenylic acid sequences in Drosophila polytene DNA and RNA.

    PubMed

    Alonso, C; Pages, M; García, M L

    1977-12-02

    [3H]Poly(U) hybridizes very rapidly to polytene DNA from Drosophila hydei. When hybridization is performed at 30 degrees C in 2 X SSC to a large excess of DNA, 95% of the poly(U) becomes ribonuclease resistant. Also, complementary RNA transcribed in vitro from polytene DNA hybridizes to poly(U). 0.23--0.25% of the DNA is composed of (dA)-rich sequences and 0.23--0.31% of cRNA hybridizes to [3H]poly(U). The length of the (dA)-rich sequences on the DNA and cRNA is 40 nucleotides. The Tm value of the hybrids formed between DNA or cRNA and poly(U) is 45 degrees C. The poly(A) fragments from cytoplasmic RNA ranged from 80 to 170 nucleotides in length, and migrated in polyacrylamide gels as a broad peak. The average sizes of the poly(A) fragments from the poly(A)-containing RNA transcribed by nuclei isolated from salivary glands in vivo or in vitro were 40, 70, 170 and 70 nucleotides, respectively. Hybridization in situ of [3H]-poly(U) to chromosome squashes indicated that the (dA)-rich sequences are randomly distributed over the whole genome.

  18. A Census of rRNA Genes and Linked Genomic Sequences within a Soil Metagenomic Library

    PubMed Central

    Liles, Mark R.; Manske, Brian F.; Bintrim, Scott B.; Handelsman, Jo; Goodman, Robert M.

    2003-01-01

    We have analyzed the diversity of microbial genomes represented in a library of metagenomic DNA from soil. A total of 24,400 bacterial artificial chromosome (BAC) clones were screened for 16S rRNA genes. The sequences obtained from BAC clones were compared with a collection generated by direct PCR amplification and cloning of 16S rRNA genes from the same soil. The results indicated that the BAC library had substantially lower representation of bacteria among the Bacillus, α-Proteobacteria, and CFB groups; greater representation among the β- and γ-Proteobacteria, and OP10 divisions; and no rRNA genes from the domains Eukaryota and Archaea. In addition to rRNA genes recovered from the bacterial divisions Proteobacteria, Verrucomicrobia, Firmicutes, Cytophagales, and OP11, we identified many rRNA genes from the BAC library affiliated with the bacterial division Acidobacterium; all of these sequences were affiliated with subdivisions that lack cultured representatives. The complete sequence of one BAC clone derived from a member of the Acidobacterium division revealed a complete rRNA operon and 20 other open reading frames, including predicted gene products involved in cell division, cell cycling, folic acid biosynthesis, substrate metabolism, amino acid uptake, DNA repair, and transcriptional regulation. This study is the first step in using genomics to reveal the physiology of as-yet-uncultured members of the Acidobacterium division. PMID:12732537

  19. Sequence and analysis of the gene for bacteriophage T3 RNA polymerase.

    PubMed Central

    McGraw, N J; Bailey, J N; Cleaves, G R; Dembinski, D R; Gocke, C R; Joliffe, L K; MacWright, R S; McAllister, W T

    1985-01-01

    The RNA polymerases encoded by bacteriophages T3 and T7 have similar structures, but exhibit nearly exclusive template specificities. We have determined the nucleotide sequence of the region of T3 DNA that encodes the T3 RNA polymerase (the gene 1.0 region), and have compared this sequence with the corresponding region of T7 DNA. The predicted amino acid sequence of the T3 RNA polymerase exhibits very few changes when compared to the T7 enzyme (82% of the residues are identical). Significant differences appear to cluster in three distinct regions in the amino-terminal half of the protein. Analysis of the data from both enzymes suggests features that may be important for polymerase function. In particular, a region that differs between the T3 and T7 enzymes exhibits significant homology to the bi-helical domain that is common to many sequence-specific DNA binding proteins. The region that flanks the structural gene contains a number of regulatory elements including: a promoter for the E. coli RNA polymerase, a potential processing site for RNase III and a promoter for the T3 polymerase. The promoter for the T3 RNA polymerase is located only 12 base pairs distal to the stop codon for the structural gene. PMID:3903658

  20. Total chemical synthesis of a 77-nucleotide-long RNA sequence having methionine-acceptance activity.

    PubMed Central

    Ogilvie, K K; Usman, N; Nicoghosian, K; Cedergren, R J

    1988-01-01

    Chemical synthesis is described of a 77-nucleotide-long RNA molecule that has the sequence of an Escherichia coli Ado-47-containing tRNA(fMet) species in which the modified nucleosides have been substituted by their unmodified parent nucleosides. The sequence was assembled on a solid-phase, controlled-pore glass support in a stepwise manner with an automated DNA synthesizer. The ribonucleotide building blocks used were fully protected 5'-monomethoxytrityl-2'-silyl-3'-N,N-diisopropylaminophosphoramidites. p-Nitro-phenylethyl groups were used to protect the O6 of guanine residues. The fully deprotected tRNA analogue was characterized by polyacrylamide gel electrophoresis (sizing), terminal nucleotide analysis, sequencing, and total enzyme degradation, all of which indicated that the sequence was correct and contained only 3'-5' linkages. The 77-mer was then assayed for amino acid acceptor activity by using E. coli methionyl-tRNA synthetase. The results indicated that the synthetic product, lacking modified bases, is a substrate for the enzyme and has an amino acid acceptance 11% of that of the major native species, tRNA(fMet) containing 7-methylguanosine at position 47. PMID:3413059

  1. Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

    2016-03-01

    Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.

  2. incaRNAfbinv: a web server for the fragment-based design of RNA sequences.

    PubMed

    Drory Retwitzer, Matan; Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme; Barash, Danny

    2016-07-08

    In recent years, new methods for computational RNA design have been developed and applied to various problems in synthetic biology and nanotechnology. Lately, there is considerable interest in incorporating essential biological information when solving the inverse RNA folding problem. Correspondingly, RNAfbinv aims at including biologically meaningful constraints and is the only program to date that performs a fragment-based design of RNA sequences. In doing so it allows the design of sequences that do not necessarily exactly fold into the target, as long as the overall coarse-grained tree graph shape is preserved. Augmented by the weighted sampling algorithm of incaRNAtion, our web server called incaRNAfbinv implements the method devised in RNAfbinv and offers an interactive environment for the inverse folding of RNA using a fragment-based design approach. It takes as input: a target RNA secondary structure; optional sequence and motif constraints; optional target minimum free energy, neutrality and GC content. In addition to the design of synthetic regulatory sequences, it can be used as a pre-processing step for the detection of novel naturally occurring RNAs. The two complementary methodologies RNAfbinv and incaRNAtion are merged together and fully implemented in our web server incaRNAfbinv, available at http://www.cs.bgu.ac.il/incaRNAfbinv.
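
    The inputs listed above (a target secondary structure plus an optional GC-content target) can be made concrete with a small check. The sketch below is not part of incaRNAfbinv or RNAfbinv and does no folding: it assumes the target structure is given in dot-bracket notation, and the helper names are hypothetical. It only tests whether a candidate sequence is compatible with the target's base pairs and falls within a requested GC range.

```python
# Watson-Crick pairs plus the G-U wobble pair.
CANONICAL_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                   ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure: str):
    """Return sorted (i, j) index pairs from a dot-bracket string.

    Raises ValueError on unbalanced brackets or unknown symbols.
    """
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def gc_content(seq: str) -> float:
    """Fraction of G and C in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def is_compatible(seq: str, structure: str, gc_min=0.0, gc_max=1.0) -> bool:
    """True if lengths match, every paired position can form a canonical
    pair, and the GC content lies within [gc_min, gc_max]."""
    seq = seq.upper()
    if len(seq) != len(structure):
        return False
    if not gc_min <= gc_content(seq) <= gc_max:
        return False
    return all((seq[i], seq[j]) in CANONICAL_PAIRS
               for i, j in base_pairs(structure))


if __name__ == "__main__":
    print(is_compatible("GGGAAACCC", "(((...)))", gc_min=0.4, gc_max=0.8))
```

    A compatibility check of this kind is a necessary condition only; whether a sequence actually folds into the target has to be decided by a folding program.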

  3. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  4. High-Throughput Sequencing Reveals Circular Substrates for an Archaeal RNA ligase.

    PubMed

    Becker, Hubert F; Heliou, Alice; Djaout, Kamel; Lestini, Roxane; Regnier, Mireille; Myllykallio, Hannu

    2017-03-09

    It is only recently that the abundant presence of circular RNAs (circRNAs) in all kingdoms of Life, including the hyperthermophilic archaeon Pyrococcus abyssi, has emerged. This led us to investigate the physiological significance of a previously observed weak intramolecular ligation activity of Pab1020 RNA ligase. Here we demonstrate that this enzyme, despite sharing significant sequence similarity with DNA ligases, is indeed an RNA-specific polynucleotide ligase efficiently acting on physiologically significant substrates. Using a combination of RNA immunoprecipitation assays and RNA-seq, our genome-wide studies revealed 133 individual circRNA loci in P. abyssi. The large majority of these loci interacted with Pab1020 in cells and circularization of selected C/D Box and 5S rRNA transcripts was confirmed biochemically. Altogether these studies revealed that Pab1020 is required for RNA circularization. Our results further suggest the functional speciation of an ancestral NTase domain and/or DNA ligase towards RNA ligase activity and prompt for further characterization of the widespread functions of circular RNAs in prokaryotes. Detailed insight into the cellular substrates of Pab1020 may facilitate the development of new biotechnological applications e.g. in ligation of preadenylated adaptors to RNA molecules.

  5. Sequence selective recognition of double-stranded RNA using triple helix-forming peptide nucleic acids.

    PubMed

    Zengeya, Thomas; Gupta, Pankaj; Rozners, Eriks

    2014-01-01

    Noncoding RNAs are attractive targets for molecular recognition because of the central role they play in gene expression. Since most noncoding RNAs are in a double-helical conformation, recognition of such structures is a formidable problem. Herein, we describe a method for sequence-selective recognition of biologically relevant double-helical RNA (illustrated on ribosomal A-site RNA) using peptide nucleic acids (PNA) that form a triple helix in the major groove of RNA under physiologically relevant conditions. Protocols for PNA preparation and binding studies using isothermal titration calorimetry are described in detail.

  6. Nucleotide sequence of the rrnG ribosomal RNA promoter region of Escherichia coli.

    PubMed Central

    Shen, W F; Squires, C; Squires, C L

    1982-01-01

    The primary structure of the promoter region for a ribosomal RNA transcription unit (rrnG) of Escherichia coli K12 has been determined. The sequence was obtained from a 1.5 kbp EcoRI fragment derived from the hybrid plasmid pLC23-30. This fragment contains 455 bp preceding P1 of the rrnG promoter region and 674 bp of the rrnG 16S RNA gene. The sequence before the rrnG promoter region contains an open reading frame (ORF-BG) followed by a possible hairpin structure that resembles other known transcription terminators. The sequence of the rrnG promoter region is similar but not identical to that of rrnA and rrnB. Several minor differences between the sequences of the 16S RNA genes of rrnG and rrnB were also noted. In addition, sequences were found that could generate special structures involving the promoter regions of rrn loci. Such structures are described and their possible involvement in the regulation of ribosomal RNA synthesis is discussed. PMID:6285294

  7. An improved and validated RNA HLA class I SBT approach for obtaining full length coding sequences.

    PubMed

    Gerritsen, K E H; Olieslagers, T I; Groeneweg, M; Voorter, C E M; Tilanus, M G J

    2014-11-01

    The functional relevance of human leukocyte antigen (HLA) class I allele polymorphism beyond exons 2 and 3 is difficult to address because more than 70% of the HLA class I alleles are defined by exons 2 and 3 sequences only. For routine application on clinical samples we improved and validated the HLA sequence-based typing (SBT) approach based on RNA templates, using either a single locus-specific or two overlapping group-specific polymerase chain reaction (PCR) amplifications, with three forward and three reverse sequencing reactions for full length sequencing. Locus-specific HLA typing with RNA SBT of a reference panel, representing the major antigen groups, showed identical results compared to DNA SBT typing. Alleles encountered with unknown exons in the IMGT/HLA database and three samples, two with Null and one with a Low expressed allele, have been addressed by the group-specific RNA SBT approach to obtain full length coding sequences. This RNA SBT approach has proven its value in our routine full length definition of alleles.

  8. Novel Transcription Factor Variants through RNA-Sequencing: The Importance of Being “Alternative”

    PubMed Central

    Scarpato, Margherita; Federico, Antonio; Ciccodicola, Alfredo; Costa, Valerio

    2015-01-01

    Alternative splicing is a pervasive mechanism of RNA maturation in higher eukaryotes, which increases proteomic diversity and biological complexity. It has a key regulatory role in several physiological and pathological states. The diffusion of Next Generation Sequencing, particularly of RNA-Sequencing, has exponentially empowered the identification of novel transcripts revealing that more than 95% of human genes undergo alternative splicing. The highest rate of alternative splicing occurs in genes encoding transcription factors, mostly in Krüppel-associated box domains of zinc finger proteins. Since these molecules are responsible for gene expression, alternative splicing is a crucial mechanism to “regulate the regulators”. Indeed, different transcription factor isoforms may have different or even opposite functions. In this work, through a targeted re-analysis of our previously published RNA-Sequencing datasets, we identified nine novel transcripts in seven transcription factor genes. In silico analysis, combined with RT-PCR, cloning and Sanger sequencing, allowed us to experimentally validate these new variants. Through computational approaches we also predicted their novel structural and functional properties. Our findings indicate that alternative splicing is a major determinant of transcription factor diversity, confirming that accurate analysis of RNA-Sequencing data can reliably lead to the identification of novel transcripts, with potentially new functions. PMID:25590302

  9. Identification of extracellular miRNA in archived serum samples by next-generation sequencing from RNA extracted using multiple methods.

    PubMed

    Gautam, Aarti; Kumar, Raina; Dimitrov, George; Hoke, Allison; Hammamieh, Rasha; Jett, Marti

    2016-10-01

    miRNAs act as important regulators of gene expression by promoting mRNA degradation or by attenuating protein translation. Since miRNAs are stably expressed in bodily fluids, there is growing interest in profiling these miRNAs, as it is minimally invasive and cost-effective as a diagnostic matrix. A technical hurdle in studying miRNA dynamics is reliably extracting miRNA, as small sample volumes and low RNA abundance create challenges for extraction and downstream applications. The purpose of this study was to develop a pipeline for the recovery of miRNA using small volumes of archived serum samples. The RNA was extracted employing several widely utilized RNA isolation kits/methods with and without addition of a carrier. The small RNA library preparation was carried out using the Illumina TruSeq small RNA kit and sequencing was carried out using an Illumina platform. A fraction of five microliters of total RNA was used for library preparation as quantification is below the detection limit. We were able to profile miRNA levels in serum from all the methods tested. We found that adding nucleic acid-based carrier molecules yielded higher numbers of processed reads but did not enhance the mapping of any miRBase-annotated sequences. However, some of the extraction procedures offer certain advantages: RNA extracted by TRIzol seemed to align to the miRBase best; extractions using TRIzol with carrier yielded higher miRNA-to-small RNA ratios. Nuclease-free glycogen can be the carrier of choice for miRNA sequencing. Our findings illustrate that miRNA extraction and quantification are influenced by the choice of methodologies. Addition of nucleic acid-based carrier molecules during the extraction procedure is not a good choice when assaying miRNA using sequencing. The careful selection of an extraction method permits the archived serum samples to become valuable resources for high-throughput applications.

  10. Combined sequencing of mRNA and DNA from human embryonic stem cells.

    PubMed

    Mertes, Florian; Kuhl, Heiner; Wruck, Wasco; Lehrach, Hans; Adjaye, James

    2016-06-01

    Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells. Sequencing data are available in the Gene Expression Omnibus (GEO) database under accession number GSE69471.

  11. High-Throughput Sequencing of RNA Silencing-Associated Small RNAs in Olive (Olea europaea L.)

    PubMed Central

    Donaire, Livia; Pedrola, Laia; de la Rosa, Raúl; Llave, César

    2011-01-01

    Small RNAs (sRNAs) of 20 to 25 nucleotides (nt) in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Although RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.). sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA) regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive. PMID:22140484

  12. Different organisms associated with heartwater as shown by analysis of 16S ribosomal RNA gene sequences.

    PubMed

    Allsopp, M; Visser, E S; du Plessis, J L; Vogel, S W; Allsopp, B A

    1997-08-01

    Cowdria ruminantium is a rickettsial parasite which causes heartwater, an economically important disease of domestic and wild ruminants in tropical and subtropical Africa and parts of the Caribbean. Because existing diagnostic methods are unreliable, we investigated the small-subunit ribosomal RNA (srRNA) gene from heartwater-infected material to characterise the organisms present and to develop specific oligonucleotide probes for polymerase chain reaction (PCR) based diagnosis. DNA was obtained from ticks and ruminants from heartwater-free and heartwater-endemic areas and from Cowdria in tissue culture. PCR was carried out using primers designed to amplify only rickettsial srRNA genes, the target region being the highly variable V1 loop. Amplicons were cloned and sequenced; 51% were C. ruminantium sequences corresponding to four genotypes, two of which were identical to previously reported C. ruminantium sequences while the other two were new. The four different Cowdria genotypes can be correlated with different phenotypes. Tissue-culture samples yielded only Cowdria genotype sequences, but an extraordinary heterogeneity of 16S sequences was obtained from field samples. In addition to Cowdria genotypes we found sequences from previously unknown Ehrlichia spp., sequences showing homology to other Rickettsiales and a variety of Pseudomonadaceae. One Ehrlichia sequence was phylogenetically closely related to Ehrlichia platys (Group II Ehrlichia) and one to Ehrlichia canis (Group III Ehrlichia). This latter sequence was from an isolate (Germishuys) made from a naturally infected sheep which, from brain smear examination and pathology, appeared to be suffering from heartwater; nevertheless no Cowdria genotype sequences were found in this isolate. In addition no Cowdria sequences were obtained from uninfected ticks. Complete 16S rRNA gene sequences were determined for two C. ruminantium genotypes and for two previously uncharacterised heartwater-associated Ehrlichia spp.

  13. Identification of genetic variation between obligate plant pathogens Pseudoperonospora cubensis and P. humuli using RNA sequencing and genotyping-by-sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    RNA sequencing (RNA-seq) and genotyping-by-sequencing (GBS) were used for single nucleotide polymorphism (SNP) identification from two economically important obligate plant pathogens, Pseudoperonospora cubensis and P. humuli. Twenty isolates of P. cubensis and 19 isolates of P. humuli were genotyped...

  14. Evaluating the impact of sequencing error correction for RNA-seq data with ERCC RNA spike-in controls.

    PubMed

    Tong, Li; Yang, Cheng; Wu, Po-Yen; Wang, May D

    2016-02-01

    Sequencing errors are a major issue for several next-generation sequencing-based applications such as de novo assembly and single nucleotide polymorphism detection. Several error-correction methods have been developed to improve raw data quality. However, error-correction performance is hard to evaluate because of the lack of a ground truth. In this study, we propose a novel approach that uses ERCC RNA spike-in controls as the ground truth to facilitate error-correction performance evaluation. After aligning raw and corrected RNA-seq data, we characterized the quality of reads by three metrics: mismatch patterns (i.e., the substitution rate of A to C) of reads aligned with one mismatch, mismatch patterns of reads aligned with two mismatches and the percentage increase of reads aligned to reference. We observed that the mismatch patterns for reads aligned with one mismatch are significantly correlated between ERCC spike-ins and real RNA samples. Based on such observations, we conclude that ERCC spike-ins can serve as ground truths for error correction beyond their previous applications for validation of dynamic range and fold-change response. Also, the mismatch patterns for ERCC reads aligned with one mismatch can serve as a novel and reliable metric to evaluate the performance of error-correction tools.
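
    The first metric above, the mismatch pattern of reads aligned with exactly one mismatch, can be sketched as follows. This is an illustration under simplifying assumptions rather than the authors' code: alignments are assumed to be ungapped and supplied as (reference, read) string pairs of equal length, and the function names and toy data are hypothetical.

```python
from collections import Counter


def mismatches(ref: str, read: str):
    """List (reference_base, read_base) for every mismatching position."""
    if len(ref) != len(read):
        raise ValueError("ungapped alignment expected: lengths differ")
    return [(r, q) for r, q in zip(ref.upper(), read.upper()) if r != q]


def substitution_rates(alignments, n_mismatches=1):
    """Relative frequency of each substitution type (e.g. 'A>C') among
    reads that align with exactly `n_mismatches` mismatches."""
    counts = Counter()
    for ref, read in alignments:
        mm = mismatches(ref, read)
        if len(mm) == n_mismatches:
            counts.update(f"{r}>{q}" for r, q in mm)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


if __name__ == "__main__":
    # Toy alignments against a hypothetical spike-in reference segment.
    alignments = [
        ("ACGT", "CCGT"),  # one mismatch: A>C
        ("ACGT", "CCGT"),  # one mismatch: A>C
        ("ACGT", "ACGA"),  # one mismatch: T>A
        ("ACGT", "ACGT"),  # perfect match, ignored
        ("ACGT", "CCGA"),  # two mismatches, ignored when n_mismatches=1
    ]
    print(substitution_rates(alignments, n_mismatches=1))
```

    Computing the same dictionary for spike-in reads and for sample reads gives two vectors of substitution rates that can then be compared, for example by correlation.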

  15. RNA Sequencing Identifies New RNase III Cleavage Sites in Escherichia coli and Reveals Increased Regulation of mRNA

    PubMed Central

    Gordon, Gina C.; Cameron, Jeffrey C.

    2017-01-01

    Ribonucleases facilitate rapid turnover of RNA, providing cells with another mechanism to adjust transcript and protein levels in response to environmental conditions. While many examples have been documented, a comprehensive list of RNase targets is not available. To address this knowledge gap, we compared levels of RNA sequencing coverage of Escherichia coli and a corresponding RNase III mutant to expand the list of known RNase III targets. RNase III is a widespread endoribonuclease that binds and cleaves double-stranded RNA in many critical transcripts. RNase III cleavage at novel sites found in aceEF, proP, tnaC, dctA, pheM, sdhC, yhhQ, glpT, aceK, and gluQ accelerated RNA decay, consistent with previously described targets wherein RNase III cleavage initiates rapid degradation of secondary messages by other RNases. In contrast, cleavage at three novel sites in the ahpF, pflB, and yajQ transcripts led to stabilized secondary transcripts. Two other novel sites in hisL and pheM overlapped with transcriptional attenuators that likely serve to ensure turnover of these highly structured RNAs. Many of the new RNase III target sites are located on transcripts encoding metabolic enzymes. For instance, two novel RNase III sites are located within transcripts encoding enzymes near a key metabolic node connecting glycolysis and the tricarboxylic acid (TCA) cycle. Pyruvate dehydrogenase activity was increased in an rnc deletion mutant compared to the wild-type (WT) strain in early stationary phase, confirming the novel link between RNA turnover and regulation of pathway activity. Identification of these novel sites suggests that mRNA turnover may be an underappreciated mode of regulating metabolism. PMID:28351917

  16. RNA deep sequencing reveals differential microRNA expression during development of sea urchin and sea star.

    PubMed

    Kadri, Sabah; Hinman, Veronica F; Benos, Panayiotis V

    2011-01-01

    microRNAs (miRNAs) are small (20-23 nt), non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin) and Patiria miniata (sea star) are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc.) to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads). Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common). We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html.

  17. mirTools: microRNA profiling and discovery based on high-throughput sequencing.

    PubMed

    Zhu, Erle; Zhao, Fangqing; Xu, Gang; Hou, Huabin; Zhou, Linglin; Li, Xiaokun; Sun, Zhongsheng; Wu, Jinyu

    2010-07-01

    miRNAs are small, non-coding RNAs that negatively regulate gene expression at the post-transcriptional level and play crucial roles in various physiological and pathological processes, such as development and tumorigenesis. Although deep sequencing technologies have been applied to investigate various small RNA transcriptomes, the associated computational methods are far less mature than microarray-based approaches. In this study, a comprehensive web server, mirTools, was developed to allow researchers to comprehensively characterize the small RNA transcriptome. With the aid of mirTools, users can: (i) filter low-quality reads and 3'/5' adapters from raw sequenced data; (ii) align large-scale short reads to the reference genome and explore their length distribution; (iii) classify small RNA candidates into known categories, such as known miRNAs, non-coding RNA, genomic repeats and coding sequences; (iv) obtain detailed annotation information for known miRNAs, such as miRNA/miRNA*, absolute/relative read counts and the most abundant tag; (v) predict novel miRNAs that have not been characterized before; and (vi) identify differentially expressed miRNAs between samples based on two different counting strategies: total read tag counts and the most abundant tag counts. We believe that the integration of multiple computational approaches in mirTools will greatly facilitate current microRNA research in multiple ways. mirTools can be accessed at http://centre.bioinformatics.zj.cn/mirtools/ and http://59.79.168.90/mirtools.
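    The two counting strategies named in step (vi) can be illustrated with a minimal sketch. It assumes that each sequenced tag has already been assigned to a miRNA; the function name `count_mirna_tags`, the input layout, and the reading of "most abundant tag count" as the read count of the single most frequent tag per miRNA are assumptions for illustration, not the mirTools implementation.

```python
from collections import defaultdict

def count_mirna_tags(tag_counts):
    """Summarize reads per miRNA under two counting strategies.

    tag_counts: iterable of (mirna_id, tag_sequence, read_count) tuples,
    a hypothetical input layout chosen for this sketch.
    Returns (total, top) where
      total[mirna] = sum of read counts over all tags of that miRNA
      top[mirna]   = (tag_sequence, read_count) of its most abundant tag
    """
    total = defaultdict(int)
    top = {}
    for mirna, tag, count in tag_counts:
        total[mirna] += count
        # Keep the first tag seen when counts tie.
        if mirna not in top or count > top[mirna][1]:
            top[mirna] = (tag, count)
    return dict(total), top
```

    Either summary could then be normalized and compared between samples; which one is preferable depends on how much isoform (tag) heterogeneity the analyst wants to include.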

  18. Phylogenetic analysis of Mexican Babesia bovis isolates using msa and ssrRNA gene sequences.

    PubMed

    Genis, Alma D; Mosqueda, Juan J; Borgonio, Verónica M; Falcón, Alfonso; Alvarez, Antonio; Camacho, Minerva; de Lourdes Muñoz, Maria; Figueroa, Julio V

    2008-12-01

    Variable merozoite surface antigens of Babesia bovis are exposed glycoproteins having a role in erythrocyte invasion. Members of this gene family include msa-1 and msa-2 (msa-2c, msa-2a(1), msa-2a(2), and msa-2b). Small subunit ribosomal (ssr)RNA gene is subject to evolutive pressure and has been used in phylogenetic studies. To determine the phylogenetic relationship among B. bovis Mexican isolates using different genetic markers, PCR amplicons, corresponding to msa-1, msa-2c, msa-2b, and ssrRNA genes, were cloned and plasmids carrying the corresponding inserts were sequenced. Comparative analysis of nucleotide and deduced amino acid sequences revealed distinct degrees of variability and identity among the coding gene sequences obtained from 12 geographically different B. bovis isolates and a reference strain. Overall sequence identities of 47.7%, 72.3%, 87.7%, and 94% were determined for msa-1, msa-2b, msa-2c, and ssrRNA, respectively. A robust phylogenetic tree was obtained with msa-2b sequences. The phylogenetic analysis suggests that Mexican B. bovis isolates group in clades not concordant with the Mexican geography. However, the Mexican isolates group together in an American clade separated from the Australian clade. Sequence heterogeneity in msa-1, msa-2b, and msa-2c coding regions of Mexican B. bovis isolates present in different geographical regions can be a result of either differential evolutive pressure or cattle movement from commercial trade.

  19. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    NASA Technical Reports Server (NTRS)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and a black cyanobacterium of as-yet-unknown identity ("Black"), since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacterium. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene sequence can be used to identify different bacterial species. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  20. Identification of microRNAs by small RNA deep sequencing for synthetic microRNA mimics to control Spodoptera exigua.

    PubMed

    Zhang, Yu Liang; Huang, Qi Xing; Yin, Guo Hua; Lee, Samantha; Jia, Rui Zong; Liu, Zhi Xin; Yu, Nai Tong; Pennerman, Kayla K; Chen, Xin; Guo, An Ping

    2015-02-25

    Beet armyworm, Spodoptera exigua, is a major pest of cotton around the world. With the increase of resistance to Bacillus thuringiensis (Bt) toxin in transgenic cotton plants, there is a need to develop an alternative control approach that can be used in combination with Bt transgenic crops as part of resistance management strategies. MicroRNAs (miRNAs), a non-coding small RNA family (18-25 nt), play crucial roles in various biological processes and over-expression of miRNAs has been shown to interfere with the normal development of insects. In this study, we identified 127 conserved miRNAs in S. exigua by using small RNA deep sequencing technology. From this, we tested the effects of 11 miRNAs on larval development. We found three miRNAs, Sex-miR-10-1a, Sex-miR-4924, and Sex-miR-9, to be differentially expressed during larval stages of S. exigua. Oral feeding experiments using synthetic miRNA mimics of Sex-miR-10-1a, Sex-miR-4924, and Sex-miR-9 resulted in suppressed growth of S. exigua and mortality. Over-expression of Sex-miR-4924 caused a significant reduction in the expression level of chitinase 1 and caused abortive molting in the insects. Therefore, we demonstrated a novel approach of using miRNA mimics to control S. exigua development.

  1. The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

    SciTech Connect

    Wang, Ruiying; Zheng, Han; Preamplume, Gan; Shao, Yaming; Li, Hong

    2012-03-15

    The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

  2. YM500v3: a database for small RNA sequencing in human cancer research

    PubMed Central

    Chung, I-Fang; Chang, Shing-Jyh; Chen, Chen-Yang; Liu, Shu-Hsuan; Li, Chia-Yang; Chan, Chia-Hao; Shih, Chuan-Chi; Cheng, Wei-Chung

    2017-01-01

    We previously presented the YM500 database, which contains >8000 small RNA sequencing (smRNA-seq) data sets and integrated analysis results for various cancer miRNome studies. In the updated YM500v3 database (http://ngs.ym.edu.tw/ym500/) presented herein, we not only focus on miRNAs but also on other functional small non-coding RNAs (sncRNAs), such as PIWI-interacting RNAs (piRNAs), tRNA-derived fragments (tRFs), small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs). There is growing knowledge of the role of sncRNAs in gene regulation and tumorigenesis. We have also incorporated >10 000 cancer-related RNA-seq and >3000 more smRNA-seq data sets into the YM500v3 database. Furthermore, there are two main new sections, ‘Survival’ and ‘Cancer’, in this updated version. The ‘Survival’ section provides the survival analysis results in all cancer types or in a user-defined group of samples for a specific sncRNA. The ‘Cancer’ section provides the results of differential expression analyses, miRNA–gene interactions and cancer miRNA-related pathways. In the ‘Expression’ section, sncRNA expression profiles across cancer and sample types are newly provided. Cancer-related sncRNAs hold potential for both biotech applications and basic research. PMID:27899625

  3. Small RNA Sequencing Based Identification of MiRNAs in Daphnia magna.

    PubMed

    Ünlü, Ercan Selçuk; Gordon, Donna M; Telli, Murat

    2015-01-01

    Small RNA molecules are short, non-coding RNAs identified for their crucial role in post-transcriptional regulation. A well-studied example is the miRNAs (microRNAs), which have been identified in several model organisms including the freshwater flea and planktonic crustacean Daphnia. Although Daphnia is a model for epigenetic-based studies with an available genome database, the identification of miRNAs and their potential role in regulating Daphnia gene expression has only recently garnered interest. Computational work using Daphnia pulex has indicated the existence of 45 miRNAs, 14 of which have been experimentally verified. To extend this study, we took a sequencing approach towards identifying miRNAs present in a small RNA library isolated from Daphnia magna. Using Perl codes designed for comparative genomic analysis, 815,699 reads were obtained from 4 million raw reads and run against a database file of known miRNA sequences. Using this approach, we have identified 205 putative mature miRNA sequences belonging to 188 distinct miRNA families. Data from this study provide critical information necessary to begin an investigation into a role for these transcripts in the epigenetic regulation of Daphnia magna.

  4. Messenger RNA sequence and the translation process -- a particle transport perspective

    NASA Astrophysics Data System (ADS)

    Dong, Jiajia; Schmittmann, Beate; Zia, Royce K. P.

    2008-03-01

    The translation process in bacteria has been under intensive study. A key question concerns the quantitative effect of different elongation rates, associated with different codons, on the overall translation efficiency. Starting with a simple particle transport model, the totally asymmetric simple exclusion process (TASEP), we incorporate the essential components of the translation process: Ribosomes, cognate tRNA concentrations, and messenger RNA (mRNA) templates correspond to particles, hopping rates, and the underlying lattice, respectively. Using simulations and mean-field approximations to obtain the stationary currents (the protein production rates) associated with different mRNA sequences, we are especially interested in the effect of slow codons, i.e., codons which are associated with rare tRNAs and are therefore translated very slowly. As the first step, we look at a "designed sequence" with one and two slow codons and quantify the marked impact of their spatial distribution on the currents. Extending the results to several mRNA sequences taken from real genes, we argue that an effective translation rate including the information from the vicinity of each codon needs to be taken into consideration when seeking an efficient strategy to optimize the protein production.
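    The mapping described above (ribosomes as particles, codon-dependent hopping rates, mRNA as the lattice) can be sketched as a small Monte Carlo simulation. This is an illustrative sketch only: it assumes point-like ribosomes that cover a single codon, random sequential updates, and arbitrary rate values; the function name `simulate_tasep` and its parameters are hypothetical and are not taken from the work summarized here.

```python
import random

def simulate_tasep(rates, alpha=0.5, beta=0.5, sweeps=2000, burn_in=500, seed=0):
    """Estimate the stationary current of an open-boundary TASEP.

    rates[i] : hopping probability from site i to site i + 1 (a stand-in
               for the tRNA availability of codon i); the last entry is
               unused because exit from the last site is governed by beta.
    alpha    : entry (initiation) probability at the first site.
    beta     : exit (termination) probability at the last site.
    All rates are assumed to lie in [0, 1].
    Returns the number of exits per sweep after the burn-in period.
    """
    rng = random.Random(seed)
    n = len(rates)
    lattice = [0] * n          # 1 = site occupied by a ribosome
    exits = 0
    for sweep in range(sweeps):
        # One sweep = n + 1 random attempts (n sites plus the entry move).
        for _ in range(n + 1):
            i = rng.randrange(-1, n)   # -1 denotes an entry attempt
            if i == -1:
                if lattice[0] == 0 and rng.random() < alpha:
                    lattice[0] = 1
            elif lattice[i] == 1:
                if i == n - 1:
                    if rng.random() < beta:
                        lattice[i] = 0
                        if sweep >= burn_in:
                            exits += 1
                elif lattice[i + 1] == 0 and rng.random() < rates[i]:
                    lattice[i], lattice[i + 1] = 0, 1
    return exits / (sweeps - burn_in)

def with_slow_codons(length, positions, slow_rate=0.1):
    """Build a uniform rate profile with slow codons at the given positions."""
    rates = [1.0] * length
    for p in positions:
        rates[p] = slow_rate
    return rates
```

    Comparing `simulate_tasep(with_slow_codons(50, []))` with `simulate_tasep(with_slow_codons(50, [25]))` shows the bottleneck effect of a single slow codon; moving a second slow codon closer to or farther from the first is one way to explore the spatial-distribution effect mentioned in the abstract.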

  5. Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

    PubMed

    Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

    2013-05-01

    Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized methods. Each step of searching for inversions, chunking, and predictions can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structures.
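    The notion of an inversion used above (a stretch of nucleotides followed closely by its inverse complementary sequence) can be made concrete with a brute-force search. The sketch below assumes exact Watson-Crick matches only (no G-U wobble pairs) and fixed stem length; `find_inversions`, `stem_len`, and `max_gap` are hypothetical names, and the excursion scoring and the centered and optimized cutting methods from the paper are not reproduced here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the inverse complementary sequence of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def find_inversions(rna, stem_len=4, max_gap=8):
    """List (start, gap) pairs where rna[start:start + stem_len] is followed,
    after `gap` unpaired nucleotides, by its reverse complement."""
    hits = []
    n = len(rna)
    for start in range(n - 2 * stem_len + 1):
        target = reverse_complement(rna[start:start + stem_len])
        for gap in range(max_gap + 1):
            j = start + stem_len + gap
            if j + stem_len > n:
                break  # the right arm would run past the sequence end
            if rna[j:j + stem_len] == target:
                hits.append((start, gap))
    return hits
```

    Because each start position is examined independently, the outer loop could be split across workers, which is in the spirit of the parallel search the abstract describes.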

  6. Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing.

    PubMed

    Hansey, Candice N; Vaillancourt, Brieanne; Sekhon, Rajandeep S; de Leon, Natalia; Kaeppler, Shawn M; Buell, C Robin

    2012-01-01

    Maize is rich in genetic and phenotypic diversity. Understanding the sequence, structural, and expression variation that contributes to phenotypic diversity would facilitate more efficient varietal improvement. RNA based sequencing (RNA-seq) is a powerful approach for transcriptional analysis, assessing sequence variation, and identifying novel transcript sequences, particularly in large, complex, repetitive genomes such as maize. In this study, we sequenced RNA from whole seedlings of 21 maize inbred lines representing diverse North American and exotic germplasm. Single nucleotide polymorphism (SNP) detection identified 351,710 polymorphic loci distributed throughout the genome covering 22,830 annotated genes. Tight clustering of two distinct heterotic groups and exotic lines was evident using these SNPs as genetic markers. Transcript abundance analysis revealed minimal variation in the total number of genes expressed across these 21 lines (57.1% to 66.0%). However, the transcribed gene set among the 21 lines varied, with 48.7% expressed in all of the lines, 27.9% expressed in one to 20 lines, and 23.4% expressed in none of the lines. De novo assembly of RNA-seq reads that did not map to the reference B73 genome sequence revealed 1,321 high confidence novel transcripts, of which, 564 loci were present in all 21 lines, including B73, and 757 loci were restricted to a subset of the lines. RT-PCR validation demonstrated 87.5% concordance with the computational prediction of these expressed novel transcripts. Intriguingly, 145 of the novel de novo assembled loci were present in lines from only one of the two heterotic groups consistent with the hypothesis that, in addition to sequence polymorphisms and transcript abundance, transcript presence/absence variation is present and, thereby, may be a mechanism contributing to the genetic basis of heterosis.

  7. DNA sequencing analysis of ITS and 28S rRNA of Poria cocos.

    PubMed

    Atsumi, Toshiyuki; Kakiuchi, Nobuko; Mikage, Masayuki

    2007-08-01

    We determined the DNA sequences of the internal transcribed spacer 1 and 2 (ITS 1 and 2), the 5.8S rRNA gene and most of the 28S rRNA gene of Poria cocos for the first time, and conducted analysis of 20 samples including cultured mycelia and crude drug materials obtained from various localities and markets. Direct sequencing of the ITS 1 and 2 regions of the samples, except for four wild samples, showed that they had identical DNA sequences for ITS 1 and 2 with nucleotide lengths of 997 bp and 460 bp, respectively. By cloning, the four wild samples were found to have combined sequences of common ITS sequences with 1 or 2-base-pair insertions. Altogether, both ITS 1 and 2 sequences were substantially longer than those of other fungal crude drugs such as Ganoderma lucidum and Polyporus umbellatus. Thus, Poria cocos could be distinguished from these crude drugs and fakes by comparing the nucleotide length of PCR products of ITS 1 and 2. Contrary to the basic homogeneity in ITS 1 and 2, three types (Group 1, 2, 3) of the 28S rRNA gene with distinctive differences in length and sequence were found. Furthermore, Group 1 could be divided into three subgroups depending on differences at nucleotide position 690. Products with different types of 28S rRNA gene were found in crude drugs from Yunnan and Anhui Provinces as well as the Korean Peninsula, suggesting that the locality of the crude drugs does not guarantee genetic uniformity. The result of DNA typing of Poria cocos may help discriminate the quality of the crude drug by genotype.

  8. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

    PubMed

    Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

    2014-09-10

    Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals.

  9. Predicting candidate genomic sequences that correspond to synthetic functional RNA motifs

    PubMed Central

    Laserson, Uri; Gan, Hin Hark; Schlick, Tamar

    2005-01-01

    Riboswitches and RNA interference are important emerging mechanisms found in many organisms to control gene expression. To enhance our understanding of such RNA roles, finding small regulatory motifs in genomes presents a challenge on a wide scale. Many simple functional RNA motifs have been found by in vitro selection experiments, which produce synthetic target-binding aptamers as well as catalytic RNAs, including the hammerhead ribozyme. Motivated by the prediction of Piganeau and Schroeder [(2003) Chem. Biol., 10, 103–104] that synthetic RNAs may have natural counterparts, we develop and apply an efficient computational protocol for identifying aptamer-like motifs in genomes. We define motifs from the sequence and structural information of synthetic aptamers, search for sequences in genomes that will produce motif matches, and then evaluate the structural stability and statistical significance of the potential hits. Our application to aptamers for streptomycin, chloramphenicol, neomycin B and ATP identifies 37 candidate sequences (in coding and non-coding regions) that fold to the target aptamer structures in bacterial and archaeal genomes. Further energetic screening reveals that several candidates exhibit energetic properties and sequence conservation patterns that are characteristic of functional motifs. Besides providing candidates for experimental testing, our computational protocol offers an avenue for expanding natural RNA's functional repertoire. PMID:16254081

  10. Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

    PubMed Central

    Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

    2014-01-01

    Listeria monocytogenes, a gram-positive pathogen and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and was additionally size-fractionated into three size ranges: <40 nt, 40–150 nt and >150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259

  11. Molecular phylogeny of Stentor (Ciliophora: Heterotrichea) based on small subunit ribosomal RNA sequences.

    PubMed

    Gong, Ying-Chun; Yu, Yu-He; Zhu, Fei-Yun; Feng, Wei-Song

    2007-01-01

    To determine the phylogenetic position of Stentor within the Class Heterotrichea, the complete small subunit rRNA genes of three Stentor species, namely Stentor polymorphus, Stentor coeruleus, and Stentor roeseli, were sequenced and used to construct phylogenetic trees using maximum parsimony, neighbor joining, and Bayesian analysis. With all phylogenetic methods, the genus Stentor was monophyletic, with S. roeseli branching basally.

  12. A library screening approach identifies naturally occurring RNA sequences for a G-quadruplex binding ligand.

    PubMed

    Mirihana Arachchilage, Gayan; Morris, Mark J; Basu, Soumitra

    2014-02-07

    An RNA G-quadruplex library was synthesised and screened against kanamycin A as the ligand. Naturally occurring G-quadruplex forming sequences that differentially bind to kanamycin A were identified and characterized. This provides a simple and effective strategy for identification of potential intracellular G-quadruplex targets for a ligand.

  13. RNA sequencing reveals small RNAs differentially expressed between incipient Japanese threespine sticklebacks

    PubMed Central

    2013-01-01

    Background Non-coding small RNAs, ranging from 20 to 30 nucleotides in length, mediate the regulation of gene expression and play important roles in many biological processes. One class of small RNAs, microRNAs (miRNAs), are highly conserved across taxa and mediate the regulation of the chromatin state and the post-transcriptional regulation of messenger RNA (mRNA). Another class of small RNAs is the Piwi-interacting RNAs, which play important roles in the silencing of transposons and other functional genes. Although the biological functions of the different small RNAs have been elucidated in several laboratory animals, little is known regarding naturally occurring variation in small RNA transcriptomes among closely related species. Results We employed next-generation sequencing technology to compare the expression profiles of brain small RNAs between sympatric species of the Japanese threespine stickleback (Gasterosteus aculeatus). We identified several small RNAs that were differentially expressed between sympatric Pacific Ocean and Japan Sea sticklebacks. Potential targets of several small RNAs were identified as repetitive sequences. Female-biased miRNA expression from the old X chromosome was also observed, and it was attributed to the degeneration of the Y chromosome. Conclusions Our results suggest that expression patterns of small RNA can differ between incipient species and may be a potential mechanism underlying differential mRNA expression and transposon activity. PMID:23547919

  14. RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes

    PubMed Central

    Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle

    2016-01-01

    ABSTRACT Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233

  15. Cleavage of tRNA within the mature tRNA sequence by the catalytic RNA of RNase P: implication for the formation of the primer tRNA fragment for reverse transcription in copia retrovirus-like particles.

    PubMed Central

    Kikuchi, Y; Sasaki, N; Ando-Yamagami, Y

    1990-01-01

    The retrovirus-like particles of Drosophila are intermediates of retrotransposition of the transposable element copia. In these particles, a 39-nucleotide-long fragment from the 5' region of Drosophila initiator methionine tRNA (tRNA(iMet)) is used as the primer for copia minus-strand reverse transcription. To function as primer for this reverse transcription, the Drosophila tRNA(iMet) must be cleaved in vivo at the site between nucleotides 39 and 40. When a synthetic Drosophila tRNA(iMet) precursor was incubated with M1RNA, the catalytic RNA of Escherichia coli RNase P, other cleavages within the mature tRNA sequence were detected in addition to the efficient removal of the 5' leader sequence of this tRNA precursor. One of these cleavage sites is between nucleotides 39 and 40 of Drosophila tRNA(iMet). Based on this result, we propose a model for formation of the primer tRNA fragment for reverse transcription in copia retrovirus-like particles. PMID:1700426

  16. Infective Arthritis: Bacterial 23S rRNA Gene Sequencing as a Supplementary Diagnostic Method

    PubMed Central

    Moser, Claus; Andresen, Keld; Kjerulf, Anne; Salamon, Suheil; Kemp, Michael; Christensen, Jens Jørgen

    2008-01-01

    Consecutively collected synovial fluids were examined for the presence of bacterial DNA (a 700-bp fragment of the bacterial 23S rRNA gene) followed by DNA sequencing of amplicons, and by conventional bacteriological methods. One or more microorganisms were identified in 22 of the 227 synovial fluids (9.7%) originating from 17 patients. Sixteen of the patients had clinical signs of arthritis. For 11 patients molecular and conventional bacterial examinations were in agreement. Staphylococcus aureus, Streptococcus dysgalactiae subspecies equisimilis and Streptococcus pneumoniae were detected in synovial fluids from 6, 2 and 2 patients, respectively. In 3 patients only 23S rRNA analysis was positive; 2 synovial fluids contained S. dysgalactiae subspecies equisimilis and 1 S. pneumoniae. The present study indicates a significant contribution by PCR with subsequent DNA sequencing of the 23S rRNA gene in recognizing and identifying microorganisms from synovial fluids. PMID:19088916

  17. Exploring PTX3 expression in Sus scrofa cardiac tissue using RNA sequencing.

    PubMed

    Cabiati, Manuela; Caselli, Chiara; Savelli, Sara; Prescimone, Tommaso; Lionetti, Vincenzo; Giannessi, Daniela; Del Ry, Silvia

    2012-02-10

    The prototypic long pentraxin PTX3 is a novel vascular inflammatory marker sharing similarities with the classic short pentraxin (C-reactive protein). PTX3 is rapidly produced and released by several cell types in response to local inflammation of the cardiovascular system. Plasma PTX3 levels are very low in normal conditions and increase in heart failure (HF) patients with advancing NYHA functional class, but its exact role during HF pathogenetic mechanisms is not yet established. No data about PTX3 cardiac expression in normal and pathological conditions are currently available, either in humans or in large-size animals. Of the latter, the pig has a central role in "in vivo" clinical settings but its genome has not been completely sequenced and the PTX3 gene sequence is still lacking. The aim of this study was to sequence PTX3 in Sus scrofa, whose sequence is not yet present in GenBank. Utilizing our knowledge of this sequence, PTX3 mRNA expression was evaluated in cardiac tissue of normal (n=6) and HF pigs (n=5), obtained from the four chambers. To sequence the PTX3 gene in S. scrofa, the high homology between Homo sapiens and S. scrofa was exploited. Pig PTX3 mRNA was sequenced using polymerase chain reaction primers designed from human consensus sequences. The DNA, obtained from different RT-PCR reactions, was sequenced using the Sanger method. S. scrofa PTX3 mRNA, 1-336 bp, was submitted to GenBank (ID: GQ412351). The sequence obtained from pig cardiac tissue shared 84% sequence identity with the human homolog. PTX3 mRNA expression was detected in all the cardiac chambers, showing an increase after 3 weeks of pacing compared to controls (p=0.036, HF right atrium vs. N; p=0.022, HF left ventricle vs. N). Knowledge of the PTX3 sequence could be a useful starting point for future studies devoted to better understanding the specific role of this molecule in the pathogenesis of cardiovascular diseases.

  18. Sequences more than 500 base pairs upstream of the human U3 small nuclear RNA gene stimulate the synthesis of U3 RNA in frog oocytes

    SciTech Connect

    Suh, D.; Reddy, R.; Wright, D.

    1991-06-04

    Small nuclear RNA (snRNA) genes contain strong promoters capable of initiating transcription once every 4 s. Studies on the human U1 snRNA gene, carried out in other laboratories, showed that sequences within 400 bp of the 5' flanking region are sufficient for maximal levels of transcription both in vivo and in frog oocytes (reviewed in Dahlberg and Lund (1988)). The authors studied the expression of a human U3 snRNA gene by injecting 5' deletion mutants into frog oocytes. The results show that sequences more than 500 bp upstream of the U3 snRNA gene have a 2-3-fold stimulatory effect on U3 snRNA synthesis. These results indicate that the human U3 snRNA gene differs from the human U1 snRNA gene in containing regulatory elements more than 500 bp upstream. The U3 snRNA gene upstream sequences contain an AluI homologous sequence in the -1,200 region; these AluI sequences were transcribed in vitro and in frog oocytes but were not detectable in HeLa cells.

  19. Discovery and Validation of Barrett's Esophagus MicroRNA Transcriptome by Next Generation Sequencing

    PubMed Central

    Bansal, Ajay; Mathur, Sharad C.; Tawfik, Ossama; Rastogi, Amit; Buttar, Navtej; Visvanathan, Mahesh; Sharma, Prateek; Christenson, Lane K.

    2013-01-01

    Objective: Barrett's esophagus (BE) is a transition from squamous to columnar mucosa as a result of gastroesophageal reflux disease (GERD). The role of microRNA during this transition has not been systematically studied. Design: For initial screening, total RNA from 5 GERD and 6 BE patients was size fractionated. RNA <70 nucleotides was subjected to SOLiD 3 library preparation and next generation sequencing (NGS). Bioinformatics analysis was performed using the R package "DEseq". A p value <0.05 adjusted for a false discovery rate of 5% was considered significant. NGS-identified miRNA were validated using qRT-PCR in an independent group of 40 GERD and 27 BE patients. MicroRNA expression of human BE tissues was also compared with three BE cell lines. Results: NGS detected 19.6 million raw reads per sample. 53.1% of filtered reads mapped to miRBase version 18. NGS analysis followed by qRT-PCR validation found 10 differentially expressed miRNA; several are novel (-708-5p, -944, -224-5p and -3065-5p). Up- or down-regulation predicted by NGS was matched by qRT-PCR in every case. Human BE tissues and BE cell lines showed a high degree of concordance (70–80%) in miRNA expression. Prediction analysis identified targets that mapped to developmental signaling pathways such as TGFβ and Notch and inflammatory pathways such as toll-like receptor signaling and TGFβ. Cluster analysis found similarly regulated (up or down) miRNA to share common targets, suggesting coordination between miRNA. Conclusion: Using highly sensitive next-generation sequencing, we have performed a comprehensive genome-wide analysis of microRNA in BE and GERD patients. Differentially expressed miRNA between BE and GERD have been further validated. Expression of miRNA between BE human tissues and BE cell lines is highly correlated. These miRNA should be studied in biological models to further understand BE development. PMID:23372692
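    The significance rule in the record above (an adjusted p value below 0.05 at a 5% false discovery rate) can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which the abstract does not name; the function name `bh_adjust` and the example p values are hypothetical and are not data from the study.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values (assumed procedure, not confirmed by the abstract)."""
    n = len(pvals)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four miRNAs (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
significant = [p < 0.05 for p in adj]
```

    With these made-up inputs every adjusted value stays below 0.05, so all four hypothetical miRNAs would pass the 5% threshold.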

  20. Next-generation SELEX identifies sequence and structural determinants of splicing factor binding in human pre-mRNA sequence

    PubMed Central

    Reid, Daniel C.; Chang, Brian L.; Gunderson, Samuel I.; Alpert, Lauren; Thompson, William A.; Fairbrother, William G.

    2009-01-01

    Many splicing factors interact with both mRNA and pre-mRNA. The identification of these interactions has been greatly improved by the development of in vivo cross-linking immunoprecipitation. However, the output carries a strong sampling bias in favor of RNPs that form on more abundant RNA species like mRNA. We have developed a novel in vitro approach for surveying binding on pre-mRNA, without cross-linking or sampling bias. Briefly, this approach entails specifically designed oligonucleotide pools that tile through a pre-mRNA sequence. The pool is then partitioned into bound and unbound fractions, which are quantified by a two-color microarray. We applied this approach to locating splicing factor binding sites in and around ∼4000 exons. We also quantified the effect of secondary structure on binding. The method is validated by the finding that U1snRNP binds at the 5′ splice site (5′ss) with a specificity that is nearly identical to the splice donor motif. In agreement with prior reports, we also show that U1snRNP appears to have some affinity for intronic G triplets that are proximal to the 5′ss. Both U1snRNP and the polypyrimidine tract binding protein (PTB) avoid exonic binding, and the PTB binding map shows increased enrichment at the polypyrimidine tract. For PTB, we confirm polypyrimidine specificity and are also able to identify structural determinants of PTB binding. We detect multiple binding motifs enriched in the PTB bound fraction of oligonucleotides. These motif combinations augment binding in vitro and are also enriched in the vicinity of exons that have been determined to be in vivo targets of PTB. PMID:19861426

  1. The influence of the local sequence environment on RNA loop structures.

    PubMed

    Schudoma, Christian; Larhlimi, Abdelhalim; Walther, Dirk

    2011-07-01

    RNA folding is assumed to be a hierarchical process. The secondary structure of an RNA molecule, signified by base-pairing and stacking interactions between the paired bases, is formed first. Subsequently, the RNA molecule adopts an energetically favorable three-dimensional conformation in the structural space determined mainly by the rotational degrees of freedom associated with the backbone of regions of unpaired nucleotides (loops). To what extent the backbone conformation of RNA loops also results from interactions within the local sequence context or rather follows global optimization constraints alone has not been addressed yet. Because the majority of base stacking interactions are exerted locally, a critical influence of local sequence on local structure appears plausible. Thus, local loop structure ought to be predictable, at least in part, from the local sequence context alone. To test this hypothesis, we used Random Forests on a nonredundant data set of unpaired nucleotides extracted from 97 X-ray structures from the Protein Data Bank (PDB) to predict discrete backbone angle conformations given by the discretized η/θ-pseudo-torsional space. Predictions on balanced sets with four to six conformational classes using local sequence information yielded average accuracies of up to 55%, thus significantly better than expected by chance (17%-25%). Bases close to the central nucleotide appear to be most tightly linked to its conformation. Our results suggest that RNA loop structure does not only depend on long-range base-pairing interactions; instead, it appears that local sequence context exerts a significant influence on the formation of the local loop structure.
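    The chance levels quoted in the record above (17%-25%) are what uniform random guessing would give on balanced sets with six down to four classes. A minimal sketch of that arithmetic follows; the helper name `chance_accuracy` is hypothetical.

```python
def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy of uniform random guessing on a balanced class set."""
    if n_classes < 1:
        raise ValueError("n_classes must be positive")
    return 1.0 / n_classes


# Four to six conformational classes, as stated in the abstract.
baselines = {k: chance_accuracy(k) for k in (4, 5, 6)}
for k, acc in baselines.items():
    print(f"{k} classes: chance accuracy = {acc:.1%}")
```

    The reported average accuracy of up to 55% is therefore roughly two to three times the corresponding baseline.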

  2. Computational generation and screening of RNA motifs in large nucleotide sequence pools

    PubMed Central

    Kim, Namhee; Izzo, Joseph A.; Elmetwaly, Shereef; Gan, Hin Hark; Schlick, Tamar

    2010-01-01

    Although identification of active motifs in large random sequence pools is central to RNA in vitro selection, no systematic computational equivalent of this process has yet been developed. We develop a computational approach that combines target pool generation, motif scanning and motif screening using secondary structure analysis for applications to 10^12–10^14-sequence pools; large pool sizes are made possible using program redesign and supercomputing resources. We use the new protocol to search for aptamer and ribozyme motifs in pools up to experimental pool size (10^14 sequences). We show that motif scanning, structure matching and flanking sequence analysis, respectively, reduce the initial sequence pool by 6–8, 1–2 and 1 orders of magnitude, consistent with the rare occurrence of active motifs in random pools. The final yields match the theoretical yields from probability theory for simple motifs and overestimate experimental yields, which constitute lower bounds, for aptamers because screening analyses beyond secondary structure information are not considered systematically. We also show that designed pools using our nucleotide transition probability matrices can produce higher yields for RNA ligase motifs than random pools. Our methods for generating, analyzing and designing large pools can help improve RNA design via simulation of aspects of in vitro selection. PMID:20448026
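    The successive reductions quoted in the record above can be combined in a back-of-the-envelope sketch. It assumes the three filters act multiplicatively on a starting pool of 10^14 sequences; the resulting range is an illustration of the arithmetic, not a figure reported in the abstract, and the function name is hypothetical.

```python
def surviving_pool_exponents(start_exp, reductions):
    """Return (low, high) log10 pool size after successive filtering steps.

    reductions is a list of (min_orders, max_orders) removed by each step.
    """
    total_min = sum(lo for lo, _ in reductions)
    total_max = sum(hi for _, hi in reductions)
    return start_exp - total_max, start_exp - total_min


# Motif scanning, structure matching, flanking sequence analysis (from the abstract).
steps = [(6, 8), (1, 2), (1, 1)]
low, high = surviving_pool_exponents(14, steps)
print(f"about 10^{low} to 10^{high} candidate sequences remain")
```

    Under these assumptions a 10^14-sequence pool would shrink by 8 to 11 orders of magnitude in total.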

  3. Full Genome Sequence and sfRNA Interferon Antagonist Activity of Zika Virus from Recife, Brazil

    PubMed Central

    Rezelj, Veronica V.; Clark, Jordan J.; Cordeiro, Marli T.; Freitas de Oliveira França, Rafael; Pena, Lindomar J.; Wilkie, Gavin S.; Da Silva Filipe, Ana; Davis, Christopher; Hughes, Joseph; Varjak, Margus; Selinger, Martin; Zuvanov, Luíza; Owsianka, Ania M.; Patel, Arvind H.; McLauchlan, John; Lindenbach, Brett D.; Fall, Gamou; Sall, Amadou A.; Biek, Roman; Rehwinkel, Jan; Schnettler, Esther; Kohl, Alain

    2016-01-01

    Background: The outbreak of Zika virus (ZIKV) in the Americas has transformed a previously obscure mosquito-transmitted arbovirus of the Flaviviridae family into a major public health concern. Little is currently known about the evolution and biology of ZIKV and the factors that contribute to the associated pathogenesis. Determining genomic sequences of clinical viral isolates and characterizing elements within these are important prerequisites to advance our understanding of viral replicative processes and virus-host interactions. Methodology/Principal findings: We obtained a ZIKV isolate from a patient who presented with classical ZIKV-associated symptoms, and used high throughput sequencing and other molecular biology approaches to determine its full genome sequence, including non-coding regions. Genome regions were characterized and compared to the sequences of other isolates where available. Furthermore, we identified a subgenomic flavivirus RNA (sfRNA) in ZIKV-infected cells that has antagonist activity against RIG-I induced type I interferon induction, with a lesser effect on MDA-5 mediated action. Conclusions/Significance: The full-length genome sequence including non-coding regions of a South American ZIKV isolate from a patient with classical symptoms will support efforts to develop genetic tools for this virus. Detection of sfRNA that counteracts interferon responses is likely to be important for further understanding of pathogenesis and virus-host interactions. PMID:27706161

  4. RNA-guided complex from a bacterial immune system enhances target recognition through seed sequence interactions.

    PubMed

    Wiedenheft, Blake; van Duijn, Esther; Bultema, Jelle B; Waghmare, Sakharam P; Zhou, Kaihong; Barendregt, Arjan; Westphal, Wiebke; Heck, Albert J R; Boekema, Egbert J; Dickman, Mark J; Doudna, Jennifer A

    2011-06-21

    Prokaryotes have evolved multiple versions of an RNA-guided adaptive immune system that targets foreign nucleic acids. In each case, transcripts derived from clustered regularly interspaced short palindromic repeats (CRISPRs) are thought to selectively target invading phage and plasmids in a sequence-specific process involving a variable cassette of CRISPR-associated (cas) genes. The CRISPR locus in Pseudomonas aeruginosa (PA14) includes four cas genes that are unique to and conserved in microorganisms harboring the Csy-type (CRISPR system Yersinia) immune system. Here we show that the Csy proteins (Csy1-4) assemble into a 350 kDa ribonucleoprotein complex that facilitates target recognition by enhancing sequence-specific hybridization between the CRISPR RNA and complementary target sequences. Target recognition is enthalpically driven and localized to a "seed sequence" at the 5' end of the CRISPR RNA spacer. Structural analysis of the complex by small-angle X-ray scattering and single particle electron microscopy reveals a crescent-shaped particle that bears striking resemblance to the architecture of a large CRISPR-associated complex from Escherichia coli, termed Cascade. Although similarity between these two complexes is not evident at the sequence level, their unequal subunit stoichiometry and quaternary architecture reveal conserved structural features that may be common among diverse CRISPR-mediated defense systems.

  5. Cloning and characterization of a Leishmania gene encoding a RNA spliced leader sequence.

    PubMed Central

    Miller, S I; Landfear, S M; Wirth, D F

    1986-01-01

    Recent studies on Leishmania enriettii tubulin mRNAs revealed a 35-nucleotide addition to their 5' end. The gene that codes for this 35-nucleotide leader sequence has now been cloned and sequenced. In the Leishmania genome, the spliced leader gene exists as a tandem repeat of 438 bases. There are approximately 150 copies of this gene, comprising 0.1% of the parasite genome. This gene codes for an 85-nucleotide transcript that contains the spliced leader at its 5' end. The 35-nucleotide sequence and the regions immediately 5' and 3' to it are highly conserved across trypanosomatids. We have detected an RNA molecule that is a putative by-product of the processing reaction in which the 35-nucleotide spliced leader has been transferred to mRNA. We suggest that this molecule is the remnant of the spliced leader transcript after removal of the 35-nucleotide spliced leader. PMID:2429261

  6. Assessment of translational importance of mammalian mRNA sequence features based on Ribo-Seq and mRNA-Seq data.

    PubMed

    Volkova, Oxana A; Kondrakhin, Yury V; Yevshin, Ivan S; Valeev, Tagir F; Sharipov, Ruslan N

    2016-04-01

    Ribosome profiling technology (Ribo-Seq) has made it possible to examine mRNA translation in the cell in more detail and to obtain additional information on the importance of mRNA sequence features for this process. Application of translation inhibitors such as harringtonine and cycloheximide, along with the mRNA-Seq technique, has helped to assess translation efficiency, an important characteristic. We assessed the translational importance of mRNA sequence features through statistical analysis of Ribo-Seq and mRNA-Seq data. Translationally important features known from the literature, as well as features proposed by the authors, were used in the analysis. Comparisons such as protein-coding versus non-coding RNAs and high- versus low-translated mRNAs were performed. We identified a set of features that discriminate between the compared categories of RNA. Significant relationships between mRNA features and efficiency of translation were also established.

  7. A genome-wide view of microsatellite instability: old stories of cancer mutations revisited with new sequencing technologies

    PubMed Central

    Kim, Tae-Min; Park, Peter J

    2014-01-01

    Microsatellites are simple tandem repeats that are present at millions of loci in the human genome. Microsatellite instability (MSI) refers to DNA slippage events on microsatellites that occur frequently in cancer genomes when there is a defect in the DNA mismatch repair system. These somatic mutations can result in inactivation of tumor suppressor genes or disrupt other non-coding regulatory sequences, thereby playing a role in carcinogenesis. Here, we will discuss the ways in which high-throughput sequencing data can facilitate a genome- or exome-wide discovery and more detailed investigation of MSI events in microsatellite-unstable cancer genomes. We will address the methodological aspects of this approach and highlight insights from recent analyses of colorectal and endometrial cancer genomes from The Cancer Genome Atlas project. These include identification of novel MSI targets within and across tumor types and the relationship between the likelihood of MSI events to chromatin structure. Given the increasing popularity of exome and genome sequencing of cancer genomes, a comprehensive characterization of MSI may serve as a valuable marker of cancer evolution and aid in a search for therapeutic targets. PMID:25371413
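    As a small illustration of what "simple tandem repeats" means in the record above, the sketch below locates candidate microsatellites in a DNA string with a regular expression. The unit length of 1 to 6 bases, the minimum of five repeats, and the function name are assumptions chosen for illustration; the sketch only finds repeat tracts in one sequence and does not call MSI events, which requires comparing repeat lengths between tumor and matched normal data.

```python
import re


def find_microsatellites(seq, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    max_unit and min_repeats are illustrative thresholds, not values from the abstract.
    """
    # Shortest unit first (lazy quantifier), then the same unit repeated again.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Toy sequence: an (AC)5 dinucleotide repeat and an A6 mononucleotide run.
example = "GGT" + "AC" * 5 + "TT" + "A" * 6 + "GC"
repeats = find_microsatellites(example)
```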

  8. Profiling status epilepticus-induced changes in hippocampal RNA expression using high-throughput RNA sequencing

    PubMed Central

    Hansen, Katelin F.; Sakamoto, Kensuke; Pelz, Carl; Impey, Soren; Obrietan, Karl

    2014-01-01

    Status epilepticus (SE) is a life-threatening condition that can give rise to a number of neurological disorders, including learning deficits, depression, and epilepsy. Many of the effects of SE appear to be mediated by alterations in gene expression. To gain deeper insight into how SE affects the transcriptome, we employed the pilocarpine SE model in mice and Illumina-based high-throughput sequencing to characterize alterations in gene expression from the induction of SE, to the development of spontaneous seizure activity. While some genes were upregulated over the entire course of the pathological progression, each of the three sequenced time points (12-hour, 10-days and 6-weeks post-SE) had a largely unique transcriptional profile. Hence, genes that regulate synaptic physiology and transcription were most prominently altered at 12-hours post-SE; at 10-days post-SE, marked changes in metabolic and homeostatic gene expression were detected; at 6-weeks, substantial changes in the expression of cell excitability and morphogenesis genes were detected. At the level of cell signaling, KEGG analysis revealed dynamic changes within the MAPK pathways, as well as in CREB-associated gene expression. Notably, the inducible expression of several noncoding transcripts was also detected. These findings offer potential new insights into the cellular events that shape SE-evoked pathology. PMID:25373493

  9. Deep sequencing reveals global patterns of mRNA recruitment during translation initiation

    PubMed Central

    Gao, Rong; Yu, Kai; Nie, Jukui; Lian, Tengfei; Jin, Jianshi; Liljas, Anders; Su, Xiao-Dong

    2016-01-01

    In this work, we developed a method to systematically study the sequence preference of mRNAs during translation initiation. Traditionally, the dynamic process of translation initiation has been studied at the single molecule level with limited sequencing possibility. Using deep sequencing techniques, we identified the sequence preference at different stages of the initiation complexes. Our results provide a comprehensive and dynamic view of the initiation elements in the translation initiation region (TIR), including the S1 binding sequence, the Shine-Dalgarno (SD)/anti-SD interaction and the second codon, at the equilibrium of different initiation complexes. Moreover, our experiments reveal the conformational changes and regional dynamics throughout the dynamic process of mRNA recruitment. PMID:27460773

  10. An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data.

    PubMed Central

    Adzhubei, I A; Adzhubei, A A; Neidle, S

    1998-01-01

    We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship. PMID:9399866

  11. RNA shotgun metagenomic sequencing of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi.

    PubMed

    Chandler, James Angus; Liu, Rachel M; Bennett, Shannon N

    2015-01-01

    Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host.

  12. RNA shotgun metagenomic sequencing of northern California (USA) mosquitoes uncovers viruses, bacteria, and fungi

    PubMed Central

    Chandler, James Angus; Liu, Rachel M.; Bennett, Shannon N.

    2015-01-01

    Mosquitoes, most often recognized for the microbial agents of disease they may carry, harbor diverse microbial communities that include viruses, bacteria, and fungi, collectively called the microbiota. The composition of the microbiota can directly and indirectly affect disease transmission through microbial interactions that could be revealed by its characterization in natural populations of mosquitoes. Furthermore, the use of shotgun metagenomic sequencing (SMS) approaches could allow the discovery of unknown members of the microbiota. In this study, we use RNA SMS to characterize the microbiota of seven individual mosquitoes (species include Culex pipiens, Culiseta incidens, and Ochlerotatus sierrensis) collected from a variety of habitats in California, USA. Sequencing was performed on the Illumina HiSeq platform and the resulting sequences were quality-checked and assembled into contigs using the A5 pipeline. Sequences related to single stranded RNA viruses of the Bunyaviridae and Rhabdoviridae were uncovered, along with an unclassified genus of double-stranded RNA viruses. Phylogenetic analysis finds that in all three cases, the closest relatives of the identified viral sequences are other mosquito-associated viruses, suggesting widespread host-group specificity among disparate viral taxa. Interestingly, we identified a Narnavirus of fungi, also reported elsewhere in mosquitoes, that potentially demonstrates a nested host-parasite association between virus, fungi, and mosquito. Sequences related to 8 bacterial families and 13 fungal families were found across the seven samples. Bacillus and Escherichia/Shigella were identified in all samples and Wolbachia was identified in all Cx. pipiens samples, while no single fungal genus was found in more than two samples. This study exemplifies the utility of RNA SMS in the characterization of the natural microbiota of mosquitoes and, in particular, the value of identifying all microbes associated with a specific host.

  13. Comprehensive comparative analysis of RNA sequencing methods for degraded or low input samples

    PubMed Central

    Adiconis, Xian; Borges-Rivera, Diego; Satija, Rahul; DeLuca, David S.; Busby, Michele A.; Berlin, Aaron M.; Sivachenko, Andrey; Thompson, Dawn Anne; Wysoker, Alec; Fennell, Timothy; Gnirke, Andreas; Pochet, Nathalie; Regev, Aviv; Levin, Joshua Z.

    2013-01-01

    RNA-Seq is an effective method to study the transcriptome, but can be difficult to apply to scarce or degraded RNA from fixed clinical samples, rare cell populations, or cadavers. Recent studies have proposed several methods for RNA-Seq of low quality and/or low quantity samples, but their relative merits have not been systematically analyzed. Here, we compare five such methods using metrics relevant to transcriptome annotation, transcript discovery, and gene expression. Using a single human RNA sample, we constructed and sequenced ten libraries with these methods and two control libraries. We find that the RNase H method performed best for low quality RNA, and confirmed this with actual degraded samples. RNase H can even effectively replace oligo (dT) based methods for standard RNA-Seq. SMART and NuGEN had distinct strengths for low quantity RNA. Our analysis allows biologists to select the most suitable methods and provides a benchmark for future method development. PMID:23685885

  14. Comparative analysis of RNA sequencing methods for degraded or low-input samples.

    PubMed

    Adiconis, Xian; Borges-Rivera, Diego; Satija, Rahul; DeLuca, David S; Busby, Michele A; Berlin, Aaron M; Sivachenko, Andrey; Thompson, Dawn Anne; Wysoker, Alec; Fennell, Timothy; Gnirke, Andreas; Pochet, Nathalie; Regev, Aviv; Levin, Joshua Z

    2013-07-01

    RNA-seq is an effective method for studying the transcriptome, but it can be difficult to apply to scarce or degraded RNA from fixed clinical samples, rare cell populations or cadavers. Recent studies have proposed several methods for RNA-seq of low-quality and/or low-quantity samples, but the relative merits of these methods have not been systematically analyzed. Here we compare five such methods using metrics relevant to transcriptome annotation, transcript discovery and gene expression. Using a single human RNA sample, we constructed and sequenced ten libraries with these methods and compared them against two control libraries. We found that the RNase H method performed best for chemically fragmented, low-quality RNA, and we confirmed this through analysis of actual degraded samples. RNase H can even effectively replace oligo(dT)-based methods for standard RNA-seq. SMART and NuGEN had distinct strengths for measuring low-quantity RNA. Our analysis allows biologists to select the most suitable methods and provides a benchmark for future method development.

  15. SiRNA sequence model: redesign algorithm based on available genome-wide libraries.

    PubMed

    Kozak, Karol

    2013-12-01

    The evolution of RNA interference (RNAi) and the development of technologies exploiting its biology have enabled scientists to rapidly examine the consequences of depleting a particular gene product in cells. Design tools have been developed based on experimental data to increase the knockdown efficiency of siRNAs. Not all siRNAs that are developed to a given target mRNA are equally effective. Currently available design algorithms take an accession, identify conserved regions among its transcript space, find accessible regions within the mRNA, design all possible siRNAs for these regions, filter them based on multi-score thresholds, and then perform off-target filtration. These different criteria are used by commercial suppliers to produce siRNA genome-wide libraries for different organisms. In this article, we analyze existing siRNA design algorithms and evaluate the weight of design parameters for libraries produced in the last decade. We show that not all essential parameters are currently applied by siRNA vendors. Based on our evaluation results, we were able to suggest an siRNA sequence pattern. The findings in our study can be useful for commercial vendors improving the design of RNAi constructs, by addressing both the issue of potency and the issue of specificity.

  16. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3′ UTRs and coding sequences

    PubMed Central

    Šulc, Miroslav; Marín, Ray M.; Robins, Harlan S.; Vaníček, Jiří

    2015-01-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3′ untranslated regions (3′ UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3′ UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA–mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA–mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. PMID:25948580

  17. Strategy for microbiome analysis using 16S rRNA gene sequence analysis on the Illumina sequencing platform.

    PubMed

    Ram, Jeffrey L; Karim, Aos S; Sendler, Edward D; Kato, Ikuko

    2011-06-01

    Understanding the identity and changes of organisms in the urogenital and other microbiomes of the human body may be key to discovering causes and new treatments of many ailments, such as vaginosis. High-throughput sequencing technologies have recently enabled discovery of the great diversity of the human microbiome. The cost per base of many of these sequencing platforms remains high (thousands of dollars per sample); however, the Illumina Genome Analyzer (IGA) is estimated to have a cost per base less than one-fifth of its nearest competitor. The main disadvantage of the IGA for sequencing PCR-amplified 16S rRNA genes is that the maximum read-length of the IGA is only 100 bases, whereas at least 300 bases are needed to obtain phylogenetically informative data down to the genus and species level. In this paper we describe and conduct a pilot test of a multiplex sequencing strategy suitable for achieving total reads of > 300 bases per extracted DNA molecule on the IGA. Results show that all proposed primers produce products of the expected size and that correct sequences can be obtained with all proposed forward primers. Various bioinformatic optimizations of the Illumina Bustard analysis pipeline proved necessary to extract the correct sequence from IGA image data, and these modifications of the data files indicate that further optimization of the analysis pipeline may improve the quality rankings of the data and enable more sequence to be correctly analyzed. The successful application of this method could result in an unprecedentedly deep description (800,000 taxonomic identifications per sample) of the urogenital and other microbiomes in a large number of samples at a reasonable cost per sample.

  18. Comparison of five different RNA sources to examine the lactating bovine mammary gland transcriptome using RNA-Sequencing

    PubMed Central

    Cánovas, Angela; Rincón, Gonzalo; Bevilacqua, Claudia; Islas-Trejo, Alma; Brenaut, Pauline; Hovey, Russell C.; Boutinaud, Marion; Morgenthaler, Caroline; VanKlompenberg, Monica K.; Martin, Patrice; Medrano, Juan F.

    2014-01-01

    The objective of this study was to examine five different sources of RNA, namely mammary gland tissue (MGT), milk somatic cells (SC), laser microdissected mammary epithelial cells (LCMEC), milk fat globules (MFG) and antibody-captured milk mammary epithelial cells (mMEC) to analyze the bovine mammary gland transcriptome using RNA-Sequencing. Our results provide a comparison between different sampling methods (invasive and non-invasive) to define the transcriptome of mammary gland tissue and milk cells. This information will be of value to investigators in choosing the most appropriate sampling method for different research applications to study specific physiological states during lactation. One of the simplest procedures to study the transcriptome associated with milk appears to be the isolation of total RNA directly from SC or MFG released into milk during lactation. Our results indicate that the SC and MFG transcriptome are representative of MGT and LCMEC and can be used as effective and alternative samples to study mammary gland expression without the need to perform a tissue biopsy. PMID:25001089

  19. Analysis of Human mRNAs With the Reference Genome Sequence Reveals Potential Errors, Polymorphisms, and RNA Editing

    PubMed Central

    Furey, Terrence S.; Diekhans, Mark; Lu, Yontao; Graves, Tina A.; Oddy, Lachlan; Randall-Maher, Jennifer; Hillier, LaDeana W.; Wilson, Richard K.; Haussler, David

    2004-01-01

    The NCBI Reference Sequence (RefSeq) project and the NIH Mammalian Gene Collection (MGC) together define a set of ∼30,000 nonredundant human mRNA sequences with identified coding regions representing 17,000 distinct loci. These high-quality mRNA sequences allow for the identification of transcribed regions in the human genome sequence, and many researchers accept them as the correct representation of each defined gene sequence. Computational comparison of these mRNA sequences and the recently published essentially finished human genome sequence reveals several thousand undocumented nonsynonymous substitution and frame shift discrepancies between the two resources. Additional analysis is undertaken to verify that the euchromatic human genome is sufficiently complete—containing nearly the whole mRNA collection, thus allowing for a comprehensive analysis to be undertaken. Many of the discrepancies will prove to be genuine polymorphisms in the human population, somatic cell genomic variants, or examples of RNA editing. It is observed that the genome sequence variant has significant additional support from other mRNAs and ESTs, almost four times more often than does the mRNA variant, suggesting that the genome sequence is more accurate. In ∼15% of these cases, there is substantial support for both variants, suggestive of an undocumented polymorphism. An initial screening against a 24-individual genomic DNA diversity panel verified 60% of a small set of potential single nucleotide polymorphisms from which successful results could be obtained. We also find statistical evidence that a few of these discrepancies are due to RNA editing. Overall, these results suggest that the mRNA collections may contain a substantial number of errors. For current and future mRNA collections, it may be prudent to fully reconcile each genome sequence discrepancy, classifying each as a polymorphism, site of RNA editing or somatic cell variation, or genome sequence error. PMID:15489323

  20. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data.

    PubMed

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R; Krogh, Anders; Vinther, Jeppe

    2015-05-01

    Selective 2' Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA-RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing.

  1. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data

    PubMed Central

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R.; Krogh, Anders; Vinther, Jeppe

    2015-01-01

    Selective 2′ Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA–RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing. PMID:25805860

  2. Whole Transcriptome Sequencing Reveals Extensive Unspliced mRNA in Metastatic Castration-Resistant Prostate Cancer

    PubMed Central

    Sowalsky, Adam G.; Xia, Zheng; Wang, Liguo; Zhao, Hao; Chen, Shaoyong; Bubley, Glenn J.; Balk, Steven P.; Li, Wei

    2014-01-01

    Men with metastatic prostate cancer (PCa) who are treated with androgen deprivation therapies (ADT) usually relapse within 2–3 years with disease that is termed castration-resistant prostate cancer (CRPC). To identify the mechanism that drives these advanced tumors, paired-end RNA-sequencing (RNA-seq) was performed on a panel of CRPC bone marrow biopsy specimens. From this genome-wide approach, mutations were found in a series of genes with PCa relevance including: AR, NCOR1, KDM3A, KDM4A, CHD1, SETD5, SETD7, INPP4B, RASGRP3, RASA1, TP53BP1 and CDH1, and a novel SND1:BRAF gene fusion. Amongst the most highly-expressed transcripts were ten non-coding RNAs (ncRNAs), including MALAT1 and PABPC1, which are involved in RNA processing. Notably, a high percentage of sequence reads mapped to introns, which were determined to be the result of incomplete splicing at canonical splice junctions. Using quantitative PCR (qPCR) a series of genes (AR, KLK2, KLK3, STEAP2, CPSF6, and CDK19) were confirmed to have a greater proportion of unspliced RNA in CRPC specimens than in normal prostate epithelium, untreated primary PCa, and cultured PCa cells. This inefficient coupling of transcription and mRNA splicing suggests an overall increase in transcription or defect in splicing. PMID:25189356

  3. Mitochondrial tRNA sequences as unusual replication origins: pathogenic implications for Homo sapiens.

    PubMed

    Seligmann, Hervé; Krishnan, Neeraja M; Rao, Basuthkar J

    2006-12-07

    The heavy strand of vertebrate mitochondrial genomes accumulates deaminations proportionally to the time it spends single-stranded during replication. A previous study showed that the strength of genome-wide deamination gradients originating from tRNA gene's locations increases with their capacities to form secondary structures resembling mitochondrial origins of light strand replication (OL), suggesting an alternative function for tRNA sequences. We hypothesize that this function is frequently pathogenic for those tRNA genes that normally do not form OL-like structures, because this could cause excess mutations in genome regions unadapted to tolerate them. In human mitochondrial genomes, pathogenic tRNA variants usually form less OL-like structures than non-pathogenic ones in cases where the normal non-pathogenic tRNA variant can function as OL, as evolutionary analyses reveal. For tRNAs lacking the putative OL-like functioning capacity, pathogenic variants form more OL-like secondary structures, particularly structures that might invoke bi-directional replication (true for 14 among 21 tRNA species, p<0.05, sign test; significantly at p<0.05 (1 tailed test) for 7 tRNA species), but not more unidirectional replication invoking structures. Accounting for the functional cloverleaf-like structure-forming capacities of tRNAs yields similar results. Rare, non-pathogenic tRNA mutants tend to form more OL-like structures than the common, non-pathogenic ones, suggesting weak directional selection also among non-pathogenic variants. The duration spent single stranded by a region of the heavy strand (D(ssH)) during replication, estimated by integrating over all regions that can function as OL in Homo sapiens mitochondrial genomes, increases with distance of that region from the Dloop. This suggests convergence of single-strandedness during replication and transcription, and explains conserved locations of tRNA species in mitochondrial genomes and bacterial operons. These

  4. A novel RNA sequencing data analysis method for cell line authentication

    PubMed Central

    Fasterius, Erik; Rauch, Nora; Lundin, Pär; Kolch, Walter; Uhlén, Mathias

    2017-01-01

    We have developed a novel analysis method that can interrogate the authenticity of biological samples used for generation of transcriptome profiles in public data repositories. The method uses RNA sequencing information to reveal mutations in expressed transcripts and subsequently confirms the identity of analysed cells by comparison with publicly available cell-specific mutational profiles. Cell lines constitute key model systems widely used within cancer research, but their identity needs to be confirmed in order to minimise the influence of cell contaminations and genetic drift on the analysis. Using both public and novel data, we demonstrate the use of RNA-sequencing data analysis for cell line authentication by examining the validity of COLO205, DLD1, HCT15, HCT116, HKE3, HT29 and RKO colorectal cancer cell lines. We successfully authenticate the studied cell lines and validate previous reports indicating that DLD1 and HCT15 are synonymous. We also show that the analysed HKE3 cells harbour an unexpected KRAS-G13D mutation and confirm that this cell line is a genuine KRAS dosage mutant, rather than a true isogenic derivative of HCT116 expressing only the wild type KRAS. This authentication method could be used to revisit the numerous cell line based RNA sequencing experiments available in public data repositories, analyse new experiments where whole genome sequencing is not available, as well as facilitate comparisons of data from different experiments, platforms and laboratories. PMID:28192450

  5. A novel RNA sequencing data analysis method for cell line authentication.

    PubMed

    Fasterius, Erik; Raso, Cinzia; Kennedy, Susan; Rauch, Nora; Lundin, Pär; Kolch, Walter; Uhlén, Mathias; Al-Khalili Szigyarto, Cristina

    2017-01-01

    We have developed a novel analysis method that can interrogate the authenticity of biological samples used for generation of transcriptome profiles in public data repositories. The method uses RNA sequencing information to reveal mutations in expressed transcripts and subsequently confirms the identity of analysed cells by comparison with publicly available cell-specific mutational profiles. Cell lines constitute key model systems widely used within cancer research, but their identity needs to be confirmed in order to minimise the influence of cell contaminations and genetic drift on the analysis. Using both public and novel data, we demonstrate the use of RNA-sequencing data analysis for cell line authentication by examining the validity of COLO205, DLD1, HCT15, HCT116, HKE3, HT29 and RKO colorectal cancer cell lines. We successfully authenticate the studied cell lines and validate previous reports indicating that DLD1 and HCT15 are synonymous. We also show that the analysed HKE3 cells harbour an unexpected KRAS-G13D mutation and confirm that this cell line is a genuine KRAS dosage mutant, rather than a true isogenic derivative of HCT116 expressing only the wild type KRAS. This authentication method could be used to revisit the numerous cell line based RNA sequencing experiments available in public data repositories, analyse new experiments where whole genome sequencing is not available, as well as facilitate comparisons of data from different experiments, platforms and laboratories.

  6. Sequence analysis and location of capsid proteins within RNA 2 of strawberry latent ringspot virus.

    PubMed

    Kreiah, S; Strunk, G; Cooper, J I

    1994-09-01

    The nucleotide sequence of the RNA 2 of a strawberry isolate (H) of strawberry latent ringspot virus (SLRSV) comprised 3824 nucleotides and contained one long open reading frame with a theoretical coding capacity of 890 amino acids equivalent to a protein of 98.8K. The N-terminal amino acid sequences of virion-derived proteins were determined by Edman degradation allowing the capsid coding regions to be located and serine/glycine cleavage sites to be identified within the polyprotein. The amino acid sequence in the capsid coding region of an isolate of SLRSV from flowering cherry in New Zealand was 97% identical to that of SLRSV-H. Except in the 3' and 5' terminal non-coding sequences, computer-based alignment and comparison algorithms did not reveal any substantial homologies between RNA 2 of SLRSV-H and the equivalent genomic segments in the nepoviruses arabis mosaic, cherry leaf roll, grapevine fanleaf, raspberry ringspot, grapevine hungarian chrome mosaic, tomato blackring, tomato ringspot, tobacco ringspot, or in the comoviruses cowpea mosaic and red clover mottle. Despite the similarities in overall genome organization, data from RNA 2 remain insufficient for unambiguous positioning of SLRSV in relation to species/genera in the Comoviridae.

  7. Enhancing potency of siRNA targeting fusion genes by optimization outside of target sequence

    PubMed Central

    Gavrilov, Kseniya; Seo, Young-Eun; Tietjen, Gregory T.; Cui, Jiajia; Cheng, Christopher J.; Saltzman, W. Mark

    2015-01-01

    Canonical siRNA design algorithms have become remarkably effective at predicting favorable binding regions within a target mRNA, but in some cases (e.g., a fusion junction site) region choice is restricted. In these instances, alternative approaches are necessary to obtain a highly potent silencing molecule. Here we focus on strategies for rational optimization of two siRNAs that target the junction sites of fusion oncogenes BCR-ABL and TMPRSS2-ERG. We demonstrate that modifying the termini of these siRNAs with a terminal G-U wobble pair or a carefully selected pair of terminal asymmetry-enhancing mismatches can result in an increase in potency at low doses. Importantly, we observed that improvements in silencing at the mRNA level do not necessarily translate to reductions in protein level and/or cell death. Decline in protein level is also heavily influenced by targeted protein half-life, and delivery vehicle toxicity can confound measures of cell death due to silencing. Therefore, for BCR-ABL, which has a long protein half-life that is difficult to overcome using siRNA, we also developed a nontoxic transfection vector: poly(lactic-coglycolic acid) nanoparticles that release siRNA over many days. We show that this system can achieve effective killing of leukemic cells. These findings provide insights into the implications of siRNA sequence for potency and suggest strategies for the design of more effective therapeutic siRNA molecules. Furthermore, this work points to the importance of integrating studies of siRNA design and delivery, while heeding and addressing potential limitations such as restricted targetable mRNA regions, long protein half-lives, and nonspecific toxicities. PMID:26627251

  8. Mutation Detection in an Antibody-Producing Chinese Hamster Ovary Cell Line by Targeted RNA Sequencing

    PubMed Central

    Zhang, Siyan; Hughes, Jason D.; Murgolo, Nicholas; Levitan, Diane; Chen, Janice; Liu, Zhong

    2016-01-01

    Chinese hamster ovary (CHO) cells have been used widely in the pharmaceutical industry for production of biological therapeutics including monoclonal antibodies (mAb). The integrity of the gene of interest and the accuracy of the relay of genetic information impact product quality and patient safety. Here we employed next-generation sequencing, particularly RNA-seq, and developed a method to systematically analyze the mutation rate of the mRNA of CHO cell lines producing a mAb. The effect of an extended culturing period to mimic the scale of cell expansion in a manufacturing process and varying selection pressure in the cell culture were also closely examined. PMID:27088091

  9. tRNADB-CE: tRNA gene database well-timed in the era of big sequence data.

    PubMed

    Abe, Takashi; Inokuchi, Hachiro; Yamada, Yuko; Muto, Akira; Iwasaki, Yuki; Ikemura, Toshimichi

    2014-01-01

    The tRNA gene database curated by experts, "tRNADB-CE" (http://trna.ie.niigata-u.ac.jp), was constructed by analyzing 1,966 complete and 5,272 draft genomes of prokaryotes, the genomes of 171 viruses, 121 chloroplasts, and 12 eukaryotes, plus fragment sequences obtained by metagenome studies of environmental samples. A total of 595,115 tRNA genes, twice the number compiled previously, have been registered; for each, the sequence, clover-leaf structure, and results of sequence-similarity and oligonucleotide-pattern searches can be browsed. To gather collective knowledge with help from experts in tRNA research, we added a column for registering comments on each tRNA. By grouping bacterial tRNAs with an identical sequence, we have found high phylogenetic preservation of tRNA sequences, especially at the phylum level. Since many species-unknown tRNAs from metagenomic sequences have sequences identical to those found in species-known prokaryotes, the identical sequence group (ISG) can provide phylogenetic markers to investigate the microbial community in an environmental ecosystem. This strategy can be applied to the huge number of short sequences obtained from next-generation sequencers, showing that tRNADB-CE is a well-timed database in the era of big sequence data. We also discuss how a batch-learning self-organizing map based on oligonucleotide composition is useful for efficient knowledge discovery from big sequence data.

  10. Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses from clinical and biological samples.

    PubMed

    Matranga, Christian B; Andersen, Kristian G; Winnicki, Sarah; Busby, Michele; Gladden, Adrianne D; Tewhey, Ryan; Stremlau, Matthew; Berlin, Aaron; Gire, Stephen K; England, Eleina; Moses, Lina M; Mikkelsen, Tarjei S; Odia, Ikponmwonsa; Ehiane, Philomena E; Folarin, Onikepe; Goba, Augustine; Kahn, S Humarr; Grant, Donald S; Honko, Anna; Hensley, Lisa; Happi, Christian; Garry, Robert F; Malboeuf, Christine M; Birren, Bruce W; Gnirke, Andreas; Levin, Joshua Z; Sabeti, Pardis C

    2014-01-01

    We have developed a robust RNA sequencing method for generating complete de novo assemblies with intra-host variant calls of Lassa and Ebola virus genomes in clinical and biological samples. Our method uses targeted RNase H-based digestion to remove contaminating poly(rA) carrier and ribosomal RNA. This depletion step improves both the quality of data and quantity of informative reads in unbiased total RNA sequencing libraries. We have also developed a hybrid-selection protocol to further enrich the viral content of sequencing libraries. These protocols have enabled rapid deep sequencing of both Lassa and Ebola virus and are broadly applicable to other viral genomics studies.

  11. Strange mode instability driven finite amplitude pulsations and mass-loss in models of massive zero-age main-sequence stars

    NASA Astrophysics Data System (ADS)

    Yadav, Abhay Pratap; Glatzel, Wolfgang

    2017-02-01

    The stability with respect to radial perturbations of massive zero-age main-sequence stars having solar chemical composition and masses between 50 and 150 M⊙ is reinvestigated. As a first step, a linear non-adiabatic stability analysis is performed, confirming the existence of dynamical strange mode instabilities for models with masses above 58 M⊙. For selected models, the evolution of the strange mode instabilities into the non-linear regime is followed by numerical simulation. The final results of strange mode instabilities are thus found to be finite amplitude pulsations with periods between 3 and 24 h. Mean acoustic luminosities capable of driving winds with mass-loss rates of the order of 0.5 × 10⁻⁷ M⊙ yr⁻¹, which can at most marginally affect stellar evolution in the vicinity of the zero-age main sequence, are associated with these finite amplitude pulsations.

  12. Genome-Wide Identification of Long Noncoding RNAs in Human Intervertebral Disc Degeneration by RNA Sequencing

    PubMed Central

    Zhao, Bo; Lu, Minjuan; Wang, Dong; Li, Haopeng

    2016-01-01

    Long noncoding RNAs (lncRNAs) are emerging as crucial players in a myriad of biological processes. However, the precise mechanism and functions of most lncRNAs are poorly characterized. In this study, we present a genome-wide identification of lncRNAs in patients with intervertebral disc degeneration (IDD) and spinal cord injury (control) using RNA sequencing (RNA-seq). A total of 124.6 million raw reads were generated using the HiSeq 2500 platform, and approximately 88% of clean reads could be aligned to the human reference genome in both IDD and control groups. RNA-seq profiling indicated that 1,854 lncRNAs were differentially expressed (log2 fold change ≥ 1 or ≤ −1, p < 0.05), of which 1,530 could potentially target 6,386 genes via cis-regulatory effects. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis for these target genes suggested that lncRNAs were involved in diverse pathways, such as lysosome, focal adhesion, and MAPK signaling. In addition, a competing endogenous RNA (ceRNA) network was constructed for analyzing the function of lncRNAs. Further, quantitative real time PCR (qRT-PCR) was used to confirm the differentially expressed lncRNAs and ceRNA network. In conclusion, our results present the first global identification of lncRNAs in IDD and may provide candidate diagnostic biomarkers for IDD treatment. PMID:28097131
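
    The differential-expression criterion quoted above (log2 fold change of at least 1 in either direction, with p < 0.05) can be written as a small filter. This is a sketch under assumptions: the abstract does not describe how expression values were normalized or how p-values were computed, so the function below takes them as given, and the pseudocount used to avoid division by zero is a hypothetical choice.

```python
import math


def is_differentially_expressed(mean_case: float, mean_control: float,
                                p_value: float, lfc_cutoff: float = 1.0,
                                p_cutoff: float = 0.05,
                                pseudocount: float = 1.0) -> bool:
    """Apply the |log2 fold change| >= 1 and p < 0.05 thresholds.

    `pseudocount` is an assumption added to keep the ratio defined when
    a transcript has zero expression in one group.
    """
    ratio = (mean_case + pseudocount) / (mean_control + pseudocount)
    log2_fc = math.log2(ratio)
    return abs(log2_fc) >= lfc_cutoff and p_value < p_cutoff
```

    With the default pseudocount, mean values of 3 and 1 give a ratio of exactly 2, that is, a log2 fold change of 1, which sits on the threshold and passes.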

  13. Transcriptome Sequencing Identifies PCAT-1, a Novel lincRNA Implicated in Prostate Cancer Progression

    PubMed Central

    Prensner, John R.; Iyer, Matthew K.; Balbin, O. Alejandro; Dhanasekaran, Saravana M.; Cao, Qi; Brenner, J. Chad; Laxman, Bharathi; Asangani, Irfan; Grasso, Catherine; Kominsky, Hal D.; Cao, Xuhong; Jing, Xiaojun; Wang, Xiaoju; Siddiqui, Javed; Wei, John T.; Robinson, Daniel; Iyer, Hari K.; Palanisamy, Nallasivam; Maher, Christopher A.; Chinnaiyan, Arul M.

    2011-01-01

    High-throughput sequencing of polyA+ RNA (RNA-Seq) in human cancer shows remarkable potential to identify both novel markers of disease and uncharacterized aspects of tumor biology, particularly non-coding RNA (ncRNA) species. We employed RNA-Seq on a cohort of 102 prostate tissues and cell lines and performed ab initio transcriptome assembly to discover unannotated ncRNAs. We nominated 121 such Prostate Cancer Associated Transcripts (PCATs) with cancer-specific expression patterns. Among these, we characterized PCAT-1 as a novel prostate-specific regulator of cell proliferation and target of the Polycomb Repressive Complex 2 (PRC2). We further found that high PCAT-1 and PRC2 expression stratified patient tissues into molecular subtypes distinguished by expression signatures of PCAT-1-repressed target genes. Taken together, the findings presented herein identify PCAT-1 as a novel transcriptional repressor implicated in a subset of prostate cancer patients. These findings establish the utility of RNA-Seq to identify disease-associated ncRNAs that may improve the stratification of cancer subtypes. PMID:21804560

  14. Tools for Sequence-Based miRNA Target Prediction: What to Choose?

    PubMed Central

    Riffo-Campos, Ángela L.; Riquelme, Ismael; Brebi-Mieville, Priscilla

    2016-01-01

    MicroRNAs (miRNAs) are defined as small non-coding RNAs ~22 nt in length. They regulate gene expression at a post-transcriptional level through complementary base pairing with the target mRNA, leading to mRNA degradation and therefore blocking translation. In the last decade, the dysfunction of miRNAs has been related to the development and progression of many diseases. Currently, researchers need a method to identify precisely the miRNA targets, prior to applying experimental approaches that allow a better functional characterization of miRNAs in biological processes and can thus predict their effects. Computational prediction tools provide a rapid method to identify putative miRNA targets. However, since a large number of tools for the prediction of miRNA:mRNA interactions have been developed, all with different algorithms, the biological researcher sometimes does not know which is the best choice for his study and many times does not understand the bioinformatic basis of these tools. This review describes the biological fundamentals of these prediction tools, characterizes the main sequence-based algorithms, and offers some insights into their uses by biologists. PMID:27941681
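
    The "complementary base pairing" that sequence-based tools search for is often approximated by a seed match. The sketch below assumes the common convention that the seed is nucleotides 2-8 of the miRNA and reports every position in an mRNA where the reverse complement of that seed occurs. The abstract does not name a specific rule, so the seed definition and the function name are assumptions, and real tools add further criteria such as conservation and site accessibility.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_sites(mirna: str, mrna: str):
    """Return 0-based start positions of perfect seed matches in `mrna`.

    Assumes the seed is miRNA nucleotides 2-8 (read 5'->3'); the target
    site is the reverse complement of that seed.
    """
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    seed = mirna[1:8]
    site = seed.translate(COMPLEMENT)[::-1]
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)  # allow overlapping sites
    return positions
```

    For a miRNA beginning UGAGGUAG, the seed is GAGGUAG and the site searched for in the mRNA is CUACCUC.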

  15. Tools for Sequence-Based miRNA Target Prediction: What to Choose?

    PubMed

    Riffo-Campos, Ángela L; Riquelme, Ismael; Brebi-Mieville, Priscilla

    2016-12-09

    MicroRNAs (miRNAs) are defined as small non-coding RNAs ~22 nt in length. They regulate gene expression at a post-transcriptional level through complementary base pairing with the target mRNA, leading to mRNA degradation and therefore blocking translation. In the last decade, the dysfunction of miRNAs has been related to the development and progression of many diseases. Currently, researchers need a method to identify precisely the miRNA targets, prior to applying experimental approaches that allow a better functional characterization of miRNAs in biological processes and can thus predict their effects. Computational prediction tools provide a rapid method to identify putative miRNA targets. However, since a large number of tools for the prediction of miRNA:mRNA interactions have been developed, all with different algorithms, the biological researcher sometimes does not know which is the best choice for his study and many times does not understand the bioinformatic basis of these tools. This review describes the biological fundamentals of these prediction tools, characterizes the main sequence-based algorithms, and offers some insights into their uses by biologists.

  16. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.

    PubMed

    Alipanahi, Babak; Delong, Andrew; Weirauch, Matthew T; Frey, Brendan J

    2015-08-01

    Knowing the sequence specificities of DNA- and RNA-binding proteins is essential for developing models of the regulatory processes in biological systems and for identifying causal disease variants. Here we show that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery. Using a diverse array of experimental data and evaluation metrics, we find that deep learning outperforms other state-of-the-art methods, even when training on in vitro data and testing on in vivo data. We call this approach DeepBind and have built a stand-alone software tool that is fully automatic and handles millions of sequences per experiment. Specificities determined by DeepBind are readily visualized as a weighted ensemble of position weight matrices or as a 'mutation map' that indicates how variations affect binding within a specific sequence.

  17. The nucleotide sequence of Beneckea harveyi 5S rRNA. [bioluminescent marine bacterium]

    NASA Technical Reports Server (NTRS)

    Luehrsen, K. R.; Fox, G. E.

    1981-01-01

    The primary sequence of the 5S ribosomal RNA isolated from the free-living bioluminescent marine bacterium Beneckea harveyi is reported and discussed in regard to indications of phylogenetic relationships with the bacteria Escherichia coli and Photobacterium phosphoreum. Sequences were determined for oligonucleotide products generated by digestion with ribonuclease T1, pancreatic ribonuclease and ribonuclease T2. The presence of heterogeneity is indicated for two sites. The B. harveyi sequence can be arranged into the same four-helix secondary structure as E. coli and other prokaryotic 5S rRNAs. Examination of the 5S RNA sequences of the three bacteria indicates that B. harveyi and P. phosphoreum are specifically related and share a common ancestor which diverged from an ancestor of E. coli at a somewhat earlier time, consistent with previous studies.

  18. Improving mRNA 5' coding sequence determination in the mouse genome.

    PubMed

    Piovesan, Allison; Caracausi, Maria; Pelleri, Maria Chiara; Vitale, Lorenza; Martini, Silvia; Bassani, Chiara; Gurioli, Annalisa; Casadei, Raffaella; Soldà, Giulia; Strippoli, Pierluigi

    2014-04-01

    The incomplete determination of the mRNA 5' end sequence may lead to the incorrect assignment of the first AUG codon and to errors in the prediction of the encoded protein product. Due to the significance of the mouse as a model organism in biomedical research, we performed a systematic identification of coding regions at the 5' end of all known mouse mRNAs, using an automated expressed sequence tag (EST)-based approach which we have previously described. By parsing almost 4 million BLAT alignments we found 351 mouse loci, out of 20,221 analyzed, in which an extension of the mRNA 5' coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for Apc2 and Mknk2 cDNAs. We also generated a list of 16,330 mouse mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5' end in the current form. Systematic searches in the main mouse genome databases and genome browsers showed that 82% of our results are original and have not been identified by their annotation pipelines. Moreover, the same information is not easily derivable from RNA-Seq data, due to short sequence length and laboriousness in building full-length transcript structures. In conclusion, our results improve the determination of full-length 5' coding sequences and might be useful in order to reduce errors when studying mouse gene structure and function in biomedical research.
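
    The completeness criterion described above (an in-frame stop codon upstream of the annotated start codon) can be sketched as follows. This covers only that single check, not the authors' EST-based pipeline; the function name and the example sequences are hypothetical.

```python
# Sketch of the 5' completeness check: is there a stop codon upstream of,
# and in the same reading frame as, the annotated start codon?
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_upstream_inframe_stop(mrna, start_index):
    """Return True if an in-frame stop codon precedes the start codon.

    mrna        -- transcript sequence (DNA or RNA alphabet, any case)
    start_index -- 0-based index of the first base of the start codon
    """
    seq = mrna.upper().replace("U", "T")
    pos = start_index - 3
    # Walk upstream one codon at a time, staying in the start codon's frame.
    while pos >= 0:
        if seq[pos:pos + 3] in STOP_CODONS:
            return True
        pos -= 3
    return False
```

    For example, has_upstream_inframe_stop("TAGGCCATGAAA", 6) is True because TAG sits two codons upstream in frame, whereas has_upstream_inframe_stop("ATAGCCATGAAA", 6) is False because the TAG there is out of frame.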

  19. Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

    NASA Technical Reports Server (NTRS)

    Fox, G. E.

    1985-01-01

    Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.

  20. Two distinct extracellular RNA signatures released by a single cell type identified by microarray and next-generation sequencing

    PubMed Central

    Lässer, Cecilia; Shelke, Ganesh Vilas; Yeri, Ashish; Kim, Dae-Kyum; Crescitelli, Rossella; Raimondo, Stefania; Sjöstrand, Margareta; Gho, Yong Song; Van Keuren Jensen, Kendall; Lötvall, Jan

    2017-01-01

    Cells secrete extracellular RNA (exRNA) to their surrounding environment and exRNA has been found in many body fluids such as blood, breast milk and cerebrospinal fluid. However, there are conflicting results regarding the nature of exRNA. Here, we have separated 2 distinct exRNA profiles released by mast cells, here termed high-density (HD) and low-density (LD) exRNA. The exRNA in both fractions was characterized by microarray and next-generation sequencing. Both exRNA fractions contained mRNA and miRNA, and the mRNAs in the LD exRNA correlated closely with the cellular mRNA, whereas the HD mRNA did not. Furthermore, the HD exRNA was enriched in lincRNA, antisense RNA, vault RNA, snoRNA, and snRNA with little or no evidence of full-length 18S and 28S rRNA. The LD exRNA was enriched in mitochondrial rRNA, mitochondrial tRNA, tRNA, piRNA, Y RNA, and full-length 18S and 28S rRNA. The proteomes of the HD and LD exRNA-containing fractions were determined with LC-MS/MS and analyzed with Gene Ontology term finder, which showed that both proteomes were associated with the term extracellular vesicles and electron microscopy suggests that at least a part of the exRNA is associated with exosome-like extracellular vesicles. Additionally, the proteins in the HD fractions tended to be associated with the nucleus and ribosomes, whereas the LD fraction proteome tended to be associated with the mitochondrion. We show that the 2 exRNA signatures released by a single cell type can be separated by floatation on a density gradient. These results show that cells can release multiple types of exRNA with substantial differences in RNA species content. This is important for any future studies determining the nature and function of exRNA released from different cells under different conditions. PMID:27791479

  1. Determining RNA quality for NextGen sequencing: some exceptions to the gold standard rule of 23S to 16S rRNA ratio

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Using next-generation-sequencing technology to assess entire transcriptomes requires high quality starting RNA. Currently, RNA quality is routinely judged using automated microfluidic gel electrophoresis platforms and associated algorithms. Here we report that such automated methods generate false-n...

  2. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    SciTech Connect

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S.; Arbuthnot, Patrick

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  3. Identification of RNA sequence isomer by isotope labeling and LC-MS/MS.

    PubMed

    Li, Siwei; Limbach, Patrick A

    2014-11-01

    Recently, we developed a method for modified ribonucleic acid (RNA) analysis based on the comparative analysis of RNA digests (CARD). Within this CARD approach, sequence or modification differences between two samples are identified through differential isotopic labeling of two samples. Components present in both samples will each be labeled, yielding doublets in the CARD mass spectrum. Components unique to only one sample should be detected as singlets. A limitation of the prior singlet identification strategy occurs when the two samples contain components of unique sequence but identical base composition. At the first stage of mass spectrometry, these sequence isomers cannot be differentiated and would appear as doublets rather than singlets. However, underlying sequence differences should be detectable by collision-induced dissociation tandem mass spectrometry (CID MS/MS), as y-type product ions will retain the original enzymatically incorporated isotope label. Here, we determine appropriate instrumental conditions that enable CID MS/MS of isotopically labeled ribonuclease T1 (RNase T1) digestion products such that the original isotope label is maintained in the product ion mass spectrum. Next, we demonstrate how y-type product ions can be used to differentiate singlets and doublets from isomer sequences. We were then able to extend the utility of this approach by using CID MS/MS for the confirmation of an expected RNase T1 digestion product within the CARD analysis of an Escherichia coli mutant strain even in the presence of interfering and overlapping digestion products from other transfer RNAs.

  4. GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences.

    PubMed

    Antonov, Ivan; Baranov, Pavel; Borodovsky, Mark

    2013-01-01

    Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively low attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) could be caused by sequencing errors or indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (that evolved to control expression of some genes). Earlier, we have developed an algorithm and software program GeneTack for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (-1, +1). The fs-genes can be retrieved by similarity search to a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programed frameshifts (related to recoding events).

  5. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies.

    PubMed

    Rössler, D; Ludwig, W; Schleifer, K H; Lin, C; McGill, T J; Wisotzkey, J D; Jurtshuk, P; Fox, G E

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa and a "B. cycloheptanicus branch".

  6. Phylogenetic diversity in the genus Bacillus as seen by 16S rRNA sequencing studies

    NASA Technical Reports Server (NTRS)

    Rossler, D.; Ludwig, W.; Schleifer, K. H.; Lin, C.; McGill, T. J.; Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.

    1991-01-01

    Comparative sequence analysis of 16S ribosomal (r)RNAs or DNAs of Bacillus alvei, B. laterosporus, B. macerans, B. macquariensis, B. polymyxa and B. stearothermophilus revealed the phylogenetic diversity of the genus Bacillus. Based on the presently available data set of 16S rRNA sequences from bacilli and relatives at least four major "Bacillus clusters" can be defined: a "Bacillus subtilis cluster" including B. stearothermophilus, a "B. brevis cluster" including B. laterosporus, a "B. alvei cluster" including B. macerans, B. macquariensis and B. polymyxa and a "B. cycloheptanicus branch".

  7. An analysis of vertebrate mRNA sequences: intimations of translational control

    PubMed Central

    1991-01-01

    Five structural features in mRNAs have been found to contribute to the fidelity and efficiency of initiation by eukaryotic ribosomes. Scrutiny of vertebrate cDNA sequences in light of these criteria reveals a set of transcripts--encoding oncoproteins, growth factors, transcription factors, and other regulatory proteins--that seem designed to be translated poorly. Thus, throttling at the level of translation may be a critical component of gene regulation in vertebrates. An alternative interpretation is that some (perhaps many) cDNAs with encumbered 5' noncoding sequences represent mRNA precursors, which would imply extensive regulation at a posttranscriptional step that precedes translation. PMID:1955461

  8. Combined small RNA and degradome sequencing reveals complex microRNA regulation of catechin biosynthesis in tea (Camellia sinensis)

    PubMed Central

    Sun, Ping; Cheng, Chunzhen; Lin, Yuling; Zhu, Qiufang; Lin, Jinke

    2017-01-01

    MicroRNAs are endogenous non-coding small RNAs playing crucial regulatory roles in plants. Tea, a globally popular non-alcoholic drink, is rich in health-enhancing catechins. In this study, 69 conserved and 47 novel miRNAs targeting 644 genes were identified by high-throughput sequencing. Predicted target genes of miRNAs were mainly involved in plant growth, signal transduction, morphogenesis and defense. To further identify targets of tea miRNAs, degradome sequencing and RNA ligase-mediated rapid amplification of 5' cDNA ends (RLM-RACE) were applied. Using degradome sequencing, 26 genes mainly involved in transcription factor, resistance protein and signal transduction protein synthesis were identified as potential miRNA targets, with 5 genes subsequently verified. Quantitative real-time PCR (qRT-PCR) revealed that the expression patterns of novel-miR1, novel-miR2, csn-miR160a, csn-miR162a, csn-miR394 and csn-miR396a were negatively correlated with catechin content. The expression of six miRNAs (csn-miRNA167a, csn-miR2593e, csn-miR4380a, csn-miR3444b, csn-miR5251 and csn-miR7777-5p.1) and their target genes involved in catechin biosynthesis were also analyzed by qRT-PCR. Negative and positive correlations were found between these miRNAs and catechin contents, while positive correlations were found between their target genes and catechin content. This result suggests that these miRNAs may negatively regulate catechin biosynthesis by down-regulating their biosynthesis-related target genes. Taken together, our results indicate that miRNAs are crucial regulators in tea, with the results of 5'-RLM-RACE and expression analyses revealing the important role of miRNAs in catechin anabolism. Our findings should facilitate future research to elucidate the function of miRNAs in catechin biosynthesis. PMID:28225779

  9. Phylogeny of Metschnikowia species estimated from partial rRNA sequences.

    PubMed

    Mendonça-Hagler, L C; Hagler, A N; Kurtzman, C P

    1993-04-01

    Phylogenetic relationships of species assigned to the genus Metschnikowia were estimated from the extents of divergence among partial sequences of rRNA. The data suggest that the aquatic species (Metschnikowia australis, Metschnikowia bicuspidata, Metschnikowia krissii, and Metschnikowia zobellii) and the terrestrial species (Metschnikowia hawaiiensis, Metschnikowia lunata, Metschnikowia pulcherrima, and Metschnikowia reukaufii) form two groups within the genus. M. lunata and M. hawaiiensis are well separated from other members of the genus, and M. hawaiiensis may be sufficiently divergent that it could be placed in a new genus. Species of the genus Metschnikowia are unique compared with other ascomycetous yeasts because they have a deletion in the large-subunit rRNA sequence that includes nucleotides 434 to 483.

  10. RNA-sequencing profiles hippocampal gene expression in a validated model of cancer-induced depression.

    PubMed

    Nashed, M G; Linher-Melville, K; Frey, B N; Singh, G

    2016-11-01

    To investigate the pathophysiology of cancer-induced depression (CID), we have recently developed a validated CID mouse model. Given that the efficacy of antidepressants in cancer patients is controversial, it remains unclear whether CID is a biologically distinct form of depression. We used RNA-sequencing (RNA-seq) to investigate differentially expressed genes (DEGs) in hippocampi of animals from our CID model relative to a positive control model of depressive-like behavior induced with chronic corticosterone (CORT). To validate RNA-seq results, we performed quantitative real-time RT-PCR (qRT-PCR) on a subset of DEGs. Enrichment analysis using DAVID was performed on DEGs to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and biological process gene ontologies (GO:BP). qRT-PCR results significantly predicted RNA-seq results. RNA-seq revealed that most DEGs identified in the CORT model overlapped with the CID model. Enrichment analyses identified KEGG pathways and GO:BP terms associated with ion homeostasis and neuronal communication for both the CORT and CID model. In addition, CID DEGs were enriched in pathways and terms relating to neuronal development, intracellular signaling, learning and memory. This study is the first to investigate CID at the mRNA level. We have shown that most hippocampal mRNA changes that are associated with a depressive-like state are also associated with cancer. Several other changes occur at the mRNA level in cancer, suggesting that the CID model may represent a biologically distinct form of a depressive-like state.

  11. Full-length HLA-DRB1 coding sequences generated by a hemizygous RNA-SBT approach.

    PubMed

    Gerritsen, K E H; Groeneweg, M; Meertens, C M H; Voorter, C E M; Tilanus, M G J

    2015-11-01

    Currently 1582 HLA-DRB1 alleles have been identified in the IMGT/HLA database (v3.18). Among those alleles, more than 90% have incomplete allele sequences, which complicates the analysis of the functional relevance of polymorphism beyond exon 2. The polymorphic index of each individual exon of the currently known allele sequences shows that polymorphism is present in all exons, albeit not equally abundant. Full-length HLA-DRB1 RNA sequencing identifies polymorphism of the complete coding region. Here we describe a hemizygous full-length RNA sequence-based typing (SBT) approach based on group-specific HLA-DRB1 amplification and subsequent sequencing. RNA full-length sequences can easily be accessed because of the short amplicon length (801 bp). The RNA-SBT approach was successfully validated on a panel of DRB1 alleles having fully known coding sequences according to the IMGT/HLA database and covering all serological equivalents. Subsequently, the approach was applied on a panel of 54 alleles with incomplete allele sequences, resulting in full-length coding sequences and the identification of one new and one corrected allele. This study shows the universal applicability of the RNA-based sequencing approach to identify full-length coding sequences and to define the polymorphic content of HLA-DRB1 alleles.

  12. Characterization of mitochondrial ribosomal RNA genes in gadiformes: sequence variations, secondary structural features, and phylogenetic implications.

    PubMed

    Bakke, Ingrid; Johansen, Steinar

    2002-10-01

    Secondary structure features of mitochondrial ribosomal RNAs (mt-rRNAs) of bony fishes were investigated by a DNA sequence alignment approach. The small subunit (SSU) and large subunit (LSU) mt-rRNA genes were found to contain several additional variable regions compared to their mammalian counterparts. Fish mt-LSU rRNA genes were found to be longer than the mammalian genes due to increased length of some of the variable regions. The 5' and 3' ends of Atlantic cod mt-rRNAs were precisely mapped. The 3' ends of mt-SSU rRNAs were found to be homogeneous and mono-adenylated, whereas those of the mt-LSU rRNAs were heterogeneous and oligo-adenylated. The 5' ends of mt-SSU rRNAs appeared to be heterogeneous, corresponding to the presumed first and second positions of the gene. Sequences of the central domain and the D-domain of the mt-SSU and mt-LSU rRNA genes, respectively, were determined and characterized for 11 gadiform species (representing the families Gadidae, Lotidae, Ranicipitidae, Merlucciidae, Phycidae, and Macrouridae) and one Lophiidae species. Detailed secondary structure models of the RNA regions are presented for the Atlantic cod (Gadus morhua) and Roundnose grenadier (Coryphaenoides rupestris). Saturation plots revealed that DNA nucleotide positions corresponding to unpaired RNA regions become saturated with transitions at sequence divergence levels of about 0.15. Phylogenetic analyses revealed some aspects of gadiform relationships. Gadidae was identified as the most derived of the gadiform families. Lotidae was found to be the family most closely related to Gadidae, and Ranicipitidae was also recognized as a derived gadiform taxon.

  13. Mapping specificity landscapes of RNA-protein interactions by high throughput sequencing.

    PubMed

    Jankowsky, Eckhard; Harris, Michael E

    2017-03-02

    To function in a biological setting, RNA binding proteins (RBPs) have to discriminate between alternative binding sites in RNAs. This discrimination can occur in the ground state of an RNA-protein binding reaction, in its transition state, or in both. The extent to which RBPs discriminate at these reaction states defines RBP specificity landscapes. Here, we describe the HiTS-Kin and HiTS-EQ techniques, which combine kinetic and equilibrium binding experiments with high throughput sequencing to quantitatively assess substrate discrimination for large numbers of substrate variants at ground and transition states of RNA-protein binding reactions. We discuss experimental design, practical considerations and data analysis and outline how a combination of HiTS-Kin and HiTS-EQ allows the mapping of RBP specificity landscapes.

  14. Structure and Genome Organization of Cherry Virus A (Capillovirus, Betaflexiviridae) from China Using Small RNA Sequencing

    PubMed Central

    Wang, Jiawei; Zhai, Ying; Liu, Weizhen; Dhingra, Amit

    2016-01-01

    Cherry virus A (CVA) (Capillovirus, Betaflexiviridae) is widely present in cherry-growing areas. We obtained the complete genome of a CVA isolate (CVA-TA) using small RNA deep sequencing, followed by overlapping reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends (RACE). The newly identified 5′-untranslated region (5′-UTR) from CVA-TA may form additional hairpin and loop structures to stabilize the CVA genome. PMID:27174277

  15. Phenotypic characterisation and 16S rRNA sequence analysis of veterinary isolates of Streptococcus pluranimalium.

    PubMed

    Twomey, D F; Carson, T; Foster, G; Koylass, M S; Whatmore, A M

    2012-05-01

    Forty-two isolates of Streptococcus pluranimalium were identified from cattle (n=38), sheep (n=2), an alpaca (n=1) and a pheasant (n=1) in the United Kingdom. The isolates were confirmed as S. pluranimalium by 16S rRNA sequence analysis but could not be differentiated reliably from Streptococcus acidominimus by phenotypic characterisation using commercial kits routinely used in veterinary laboratories. The alanyl-phenylalanyl-proline arylamidase reaction could be used to differentiate S. pluranimalium (positive) from Aerococcus urinae (negative).

  16. Prediction of human miRNA target genes using computationally reconstructed ancestral mammalian sequences

    PubMed Central

    Leclercq, Mickael; Diallo, Abdoulaye Baniré; Blanchette, Mathieu

    2017-01-01

    MicroRNAs (miRNA) are short single-stranded RNA molecules derived from hairpin-forming precursors that play a crucial role as post-transcriptional regulators in eukaryotes and viruses. In the past years, many microRNA target genes (MTGs) have been identified experimentally. However, because of the high costs of experimental approaches, target genes databases remain incomplete. Although several target prediction programs have been developed in recent years to identify MTGs in silico, their specificity and sensitivity remain low. Here, we propose a new approach called MirAncesTar, which uses ancestral genome reconstruction to boost the accuracy of existing MTG prediction tools for human miRNAs. For each miRNA and each putative human target UTR, our algorithm makes use of existing prediction tools to identify putative target sites in the human UTR, as well as in its mammalian orthologs and inferred ancestral sequences. It then evaluates evidence in support of selective pressure to maintain target site counts (rather than sequences), accounting for the possibility of target site turnover. It finally integrates this measure with several simpler ones using a logistic regression predictor. MirAncesTar improves the accuracy of existing MTG predictors by 26% to 157%. Source code and prediction results for human miRNAs, as well as supporting evolutionary data are available at http://cs.mcgill.ca/~blanchem/mirancestar. PMID:27899600

  17. Higher order asymptotics for negative binomial regression inferences from RNA-sequencing data.

    PubMed

    Di, Yanming; Emerson, Sarah C; Schafer, Daniel W; Kimbrel, Jeffrey A; Chang, Jeff H

    2013-03-26

    RNA sequencing (RNA-Seq) is the current method of choice for characterizing transcriptomes and quantifying gene expression changes. This next generation sequencing-based method provides unprecedented depth and resolution. The negative binomial (NB) probability distribution has been shown to be a useful model for frequencies of mapped RNA-Seq reads and consequently provides a basis for statistical analysis of gene expression. Negative binomial exact tests are available for two-group comparisons but do not extend to negative binomial regression analysis, which is important for examining gene expression as a function of explanatory variables and for adjusted group comparisons accounting for other factors. We address the adequacy of available large-sample tests for the small sample sizes typically available from RNA-Seq studies and consider a higher-order asymptotic (HOA) adjustment to likelihood ratio tests. We demonstrate that 1) the HOA-adjusted likelihood ratio test is practically indistinguishable from the exact test in situations where the exact test is available, 2) the type I error of the HOA test matches the nominal specification in regression settings we examined via simulation, and 3) the power of the likelihood ratio test does not appear to be affected by the HOA adjustment. This work helps clarify the accuracy of the unadjusted likelihood ratio test and the degree of improvement available with the HOA adjustment. Furthermore, the HOA test may be preferable even when the exact test is available because it does not require ad hoc library size adjustments.
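
    As a point of reference for the tests discussed above, the following is a minimal sketch of the unadjusted negative binomial likelihood ratio test for a two-group comparison of one gene. It does not implement the HOA adjustment. It assumes a known, shared dispersion parameter (called size here, with variance mu + mu^2/size), equal library sizes, strictly positive group means, and the usual chi-square approximation with 1 degree of freedom; all function names are hypothetical.

```python
import math


def nb_logpmf(y, mu, size):
    """Log-probability of count y under NB(mean=mu, variance=mu + mu**2/size)."""
    return (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )


def nb_loglik(counts, mu, size):
    """Log-likelihood of independent counts sharing one mean."""
    return sum(nb_logpmf(y, mu, size) for y in counts)


def lrt_two_group(group_a, group_b, size):
    """Unadjusted likelihood ratio test of equal means in two groups.

    With size fixed, the maximum-likelihood estimate of each mean is the
    sample mean. Returns (statistic, approximate p-value).
    """
    pooled = list(group_a) + list(group_b)
    mu_null = sum(pooled) / len(pooled)
    mu_a = sum(group_a) / len(group_a)
    mu_b = sum(group_b) / len(group_b)
    ll_null = nb_loglik(pooled, mu_null, size)
    ll_alt = nb_loglik(group_a, mu_a, size) + nb_loglik(group_b, mu_b, size)
    stat = max(2.0 * (ll_alt - ll_null), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

    With only three replicates per group, as in many RNA-Seq studies, the chi-square approximation in the last step is exactly the large-sample assumption whose adequacy the abstract questions.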

  18. Analysis and expansion of the eosinophilic esophagitis transcriptome by RNA sequencing

    PubMed Central

    Sherrill, Joseph D.; Kiran, KC; Blanchard, Carine; Stucke, Emily M.; Kemme, Katherine A.; Collins, Margaret H.; Abonia, J. Pablo; Putnam, Philip E.; Mukkada, Vincent A.; Kaul, Ajay; Kocoshis, Samuel A.; Kushner, Jonathan P.; Plassard, Andrew J.; Karns, Rebekah A.; Dexheimer, Phillip J.; Aronow, Bruce J.; Rothenberg, Marc E.

    2014-01-01

    Eosinophilic esophagitis (EoE) is an allergic inflammatory disorder of the esophagus that is compounded by genetic predisposition and hypersensitivity to environmental antigens. Using high-density oligonucleotide expression chips, a disease-specific esophageal transcript signature was identified and shown to be largely reversible with therapy. In an effort to expand the molecular signature of EoE, we performed RNA sequencing on esophageal biopsies from healthy controls and patients with active EoE and identified a total of 1 607 significantly dysregulated transcripts (1 096 upregulated, 511 downregulated). When clustered by raw expression levels, an abundance of immune-cell specific transcripts that are highly induced in EoE are expressed at low (or undetectable) levels in healthy controls. Moreover, 66% of the gene signature identified by RNA sequencing was previously unrecognized in the EoE transcript signature by microarray-based expression profiling and included several long non-coding RNAs (lncRNA), an emerging class of transcriptional regulators. The lncRNA BANCR was upregulated in EoE and induced in IL-13–treated primary esophageal epithelial cells. Repression of BANCR significantly altered the expression of IL-13–induced pro-inflammatory genes. Together, these data comprise new potential biomarkers of EoE and demonstrate a novel role for lncRNAs in EoE and IL-13–associated responses. PMID:24920534

  19. Exploration of sequence space as the basis of viral RNA genome segmentation.

    PubMed

    Moreno, Elena; Ojosnegros, Samuel; García-Arriaza, Juan; Escarmís, Cristina; Domingo, Esteban; Perales, Celia

    2014-05-06

    The mechanisms of viral RNA genome segmentation are unknown. On extensive passage of foot-and-mouth disease virus in baby hamster kidney-21 cells, the virus accumulated multiple point mutations and underwent a transition akin to genome segmentation. The standard single RNA genome molecule was replaced by genomes harboring internal in-frame deletions affecting the L- or capsid-coding region. These genomes were infectious and killed cells by complementation. Here we show that the point mutations in the nonstructural protein-coding region (P2, P3) that accumulated in the standard genome before segmentation increased the fitness of the segmented version relative to the standard genome. Fitness increase was documented by intracellular expression of virus-coded proteins and infectious progeny production by RNAs with the internal deletions placed in the sequence context of the parental and evolved genome. The complementation activity involved several viral proteins, one of them being the leader proteinase L. Thus, a history of genetic drift with accumulation of point mutations was needed to allow a major variation in the structure of a viral genome. Thus, exploration of sequence space by a viral genome (in this case an unsegmented RNA) can reach a point of the space in which a totally different genome structure (in this case, a segmented RNA) is favored over the form that performed the exploration.

  20. Identification of purple sea urchin telomerase RNA using a next-generation sequencing based approach.

    PubMed

    Li, Yang; Podlevsky, Joshua D; Marz, Manja; Qi, Xiaodong; Hoffmann, Steve; Stadler, Peter F; Chen, Julian J-L

    2013-06-01

    Telomerase is a ribonucleoprotein (RNP) enzyme essential for telomere maintenance and chromosome stability. While the catalytic telomerase reverse transcriptase (TERT) protein is well conserved across eukaryotes, telomerase RNA (TR) is extensively divergent in size, sequence, and structure. This diversity prohibits TR identification from many important organisms. Here we report a novel approach for TR discovery that combines in vitro TR enrichment from total RNA, next-generation sequencing, and a computational screening pipeline. With this approach, we have successfully identified TR from Strongylocentrotus purpuratus (purple sea urchin) from the phylum Echinodermata. Reconstitution of activity in vitro confirmed that this RNA is an integral component of sea urchin telomerase. Comparative phylogenetic analysis against vertebrate TR sequences revealed that the purple sea urchin TR contains vertebrate-like template-pseudoknot and H/ACA domains. While lacking a vertebrate-like CR4/5 domain, sea urchin TR has a unique central domain critical for telomerase activity. This is the first TR identified from the previously unexplored invertebrate clade and provides the first glimpse of TR evolution in the deuterostome lineage. Moreover, our TR discovery approach is a significant step toward the comprehensive understanding of telomerase RNP evolution.

  1. Nucleotide sequence of miRNA precursor contributes to cleavage site selection by Dicer.

    PubMed

    Starega-Roslan, Julia; Galka-Marciniak, Paulina; Krzyzosiak, Wlodzimierz J

    2015-12-15

    The ribonuclease Dicer excises mature miRNAs from a diverse group of precursors (pre-miRNAs), most of which contain various secondary structure motifs in their hairpin stem. In this study, we analyzed Dicer cleavage in hairpin substrates deprived of such motifs. We searched for the factors other than the secondary structure, which may influence the length diversity and heterogeneity of miRNAs. We found that the nucleotide sequence at the Dicer cleavage site influences both of these miRNA characteristics. With regard to cleavage mechanism, we demonstrate that the Dicer RNase IIIA domain that cleaves within the 3' arm of the pre-miRNA is more sensitive to the nucleotide sequence of its substrate than is the RNase IIIB domain. The RNase IIIA domain avoids releasing miRNAs with G nucleotide and prefers to generate miRNAs with a U nucleotide at the 5' end. We also propose that the sequence restrictions at the Dicer cleavage site might be the factor that contributes to the generation of miRNA duplexes with 3' overhangs of atypical lengths. This finding implies that the two RNase III domains forming the single processing center of Dicer may exhibit some degree of flexibility, which allows for the formation of these non-standard 3' overhangs.

  2. Mutational and fitness landscapes of an RNA virus revealed through population sequencing.

    PubMed

    Acevedo, Ashley; Brodsky, Leonid; Andino, Raul

    2014-01-30

    RNA viruses exist as genetically diverse populations. It is thought that diversity and genetic structure of viral populations determine the rapid adaptation observed in RNA viruses and hence their pathogenesis. However, our understanding of the mechanisms underlying virus evolution has been limited by the inability to accurately describe the genetic structure of virus populations. Next-generation sequencing technologies generate data of sufficient depth to characterize virus populations, but are limited in their utility because most variants are present at very low frequencies and are thus indistinguishable from next-generation sequencing errors. Here we present an approach that reduces next-generation sequencing errors and allows the description of virus populations with unprecedented accuracy. Using this approach, we define the mutation rates of poliovirus and uncover the mutation landscape of the population. Furthermore, by monitoring changes in variant frequencies on serially passaged populations, we determined fitness values for thousands of mutations across the viral genome. Mapping of these fitness values onto three-dimensional structures of viral proteins offers a powerful approach for exploring structure-function relationships and potentially uncovering new functions. To our knowledge, our study provides the first single-nucleotide fitness landscape of an evolving RNA virus and establishes a general experimental platform for studying the genetic changes underlying the evolution of virus populations.

  3. Determination and augmentation of RNA sequence specificity of the Nova K-homology domains.

    PubMed

    Musunuru, Kiran; Darnell, Robert B

    2004-01-01

    The Nova onconeural antigens are implicated in the pathogenesis of paraneoplastic opsoclonus-myoclonus-ataxia (POMA). The Nova antigens are neuron-specific RNA-binding proteins harboring three repeats of the K-homology (KH) motif; they have been implicated in the regulation of alternative splicing of a host of genes involved in inhibitory synaptic transmission. Although the third Nova KH domain (KH3) has been extensively characterized using biochemical and crystallographic techniques, the roles of the KH1 and KH2 domains remain unclear. Furthermore, the specificity determinants that distinguish the Nova KH domains from those of the closely related hnRNP E and hnRNP K proteins are undefined. We demonstrate through the use of RNA selection and biochemical analysis that the sequence specificity of the Nova KH1/2 domains is similar to that of Nova KH3. We also show that mutagenesis of a Nova KH domain to render it similar to the KH domains of the heterogeneous nuclear ribonucleoprotein E (hnRNP E) and hnRNP K allows it to recognize longer RNA sequences. These data yield important insights into KH domain function and suggest a strategy by which to engineer KH domains with novel sequence preferences.

  4. Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

    PubMed Central

    Dowell, Robin D; Eddy, Sean R

    2006-01-01

    Background: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm. Results: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. Conclusion: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses. PMID:16952317

  5. Analysis of the sequence and embryonic expression of chicken neurofibromin mRNA.

    PubMed

    Schafer, G L; Ciment, G; Stocker, K M; Baizer, L

    1993-04-01

    Neurofibromatosis type 1 (NF1) is a common inherited disorder that primarily affects tissues derived from the neural crest. Recent identification and characterization of the human NF1 gene has revealed that it encodes a protein (now called neurofibromin) that is similar in sequence to the ras-GTPase activator protein (or ras-GAP), suggesting that neurofibromin may be a component of cellular signal transduction pathways regulating cellular proliferation and/or differentiation. To initiate investigations on the role of the NF1 gene product in embryonic development, we have isolated a partial cDNA for chicken neurofibromin. Sequence analysis reveals that the predicted amino acid sequence is highly conserved between chick and human. The chicken cDNA hybridizes to a 12.5-kb transcript on RNA blots, a mol wt similar to that reported for the human and murine mRNAs. Ribonuclease protection assays indicate that NF1 mRNA is expressed in a variety of tissues in the chick embryo; this is confirmed by in situ hybridization analysis. NF1 mRNA expression is detectable as early as embryonic stage 18 in the neural plate. This pattern of expression may suggest a role for neurofibromin during normal development, including that of the nervous system.

  6. CoverageAnalyzer (CAn): A Tool for Inspection of Modification Signatures in RNA Sequencing Profiles

    PubMed Central

    Hauenschild, Ralf; Werner, Stephan; Tserovski, Lyudmil; Hildebrandt, Andreas; Motorin, Yuri; Helm, Mark

    2016-01-01

    Combination of reverse transcription (RT) and deep sequencing has emerged as a powerful instrument for the detection of RNA modifications, a field that has seen a recent surge in activity because of its importance in gene regulation. Recent studies yielded high-resolution RT signatures of modified ribonucleotides relying on both sequence-dependent mismatch patterns and reverse transcription arrests. Common alignment viewers lack specialized functionality, such as filtering, tailored visualization, image export and differential analysis. Consequently, the community will profit from a platform seamlessly connecting detailed visual inspection of RT signatures and automated screening for modification candidates. CoverageAnalyzer (CAn) was developed in response to the demand for a powerful inspection tool. It is freely available for all three main operating systems. With SAM file format as standard input, CAn is an intuitive and user-friendly tool that is generally applicable to the large community of biomedical users, starting from simple visualization of RNA sequencing (RNA-Seq) data, up to sophisticated modification analysis with significance-based modification candidate calling. PMID:27834909

  7. Profiling single-guide RNA specificity reveals a mismatch sensitive core sequence

    PubMed Central

    Zheng, Ting; Hou, Yingzi; Zhang, Pingjing; Zhang, Zhenxi; Xu, Ying; Zhang, Letian; Niu, Leilei; Yang, Yi; Liang, Da; Yi, Fan; Peng, Wei; Feng, Wenjian; Yang, Ying; Chen, Jianxin; Zhu, York Yuanyuan; Zhang, Li-He; Du, Quan

    2017-01-01

    Targeting specificity is an essential issue in the development of CRISPR-Cas technology. Using a luciferase activation assay, off-target cleavage activity of sgRNA was systematically investigated on single nucleotide-mismatched targets. In addition to confirming that PAM-proximal mismatches are less tolerated than PAM-distal mismatches, our study further identified a “core” sequence that is highly sensitive to target mismatch. This sequence is 4 nucleotides long, is located at positions +4 to +7 upstream of the PAM, and is positioned in a steric restriction region when assembled into the Cas9 endonuclease. Our study also found that single or multiple target mismatches in this region abolished off-target cleavage mediated by active sgRNAs, suggesting a principle for gene-specific sgRNA design. Characterization of a mismatch-sensitive “core” sequence not only enhances our understanding of how this elegant system functions, but also facilitates our efforts to improve the targeting specificity of an sgRNA. PMID:28098181
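
    The positional rule in this abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that positions are counted upstream from the PAM with +1 being the nucleotide adjacent to the PAM, and that guide and target are given as equal-length strings in 5'-to-3' orientation on the same strand. The function names are hypothetical and are not taken from the paper.

```python
# Assumption: +1 is the nucleotide immediately 5' of the PAM, so the
# "core" reported in the abstract covers positions +4, +5, +6 and +7.
CORE_POSITIONS = range(4, 8)


def mismatch_positions(guide: str, target: str) -> list[int]:
    """Return PAM-relative positions at which guide and target differ."""
    # Treat RNA and DNA spellings alike by mapping U to T.
    g = guide.upper().replace("U", "T")
    t = target.upper().replace("U", "T")
    if len(g) != len(t):
        raise ValueError("guide and target must have the same length")
    n = len(g)
    # Index i counts from the 5' end; the PAM lies 3' of the last base,
    # so the PAM-relative position of index i is n - i.
    return sorted(n - i for i, (a, b) in enumerate(zip(g, t)) if a != b)


def hits_core(guide: str, target: str) -> bool:
    """True if at least one mismatch falls in the +4 to +7 core."""
    return any(p in CORE_POSITIONS for p in mismatch_positions(guide, target))
```

    Under these assumptions, a candidate off-target site for which hits_core returns True would be expected, according to the abstract, to escape cleavage; the sketch only locates mismatches and makes no quantitative prediction.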

  8. Nucleotide sequence of miRNA precursor contributes to cleavage site selection by Dicer

    PubMed Central

    Starega-Roslan, Julia; Galka-Marciniak, Paulina; Krzyzosiak, Wlodzimierz J.

    2015-01-01

    The ribonuclease Dicer excises mature miRNAs from a diverse group of precursors (pre-miRNAs), most of which contain various secondary structure motifs in their hairpin stem. In this study, we analyzed Dicer cleavage in hairpin substrates deprived of such motifs. We searched for the factors other than the secondary structure, which may influence the length diversity and heterogeneity of miRNAs. We found that the nucleotide sequence at the Dicer cleavage site influences both of these miRNA characteristics. With regard to cleavage mechanism, we demonstrate that the Dicer RNase IIIA domain that cleaves within the 3′ arm of the pre-miRNA is more sensitive to the nucleotide sequence of its substrate than is the RNase IIIB domain. The RNase IIIA domain avoids releasing miRNAs with G nucleotide and prefers to generate miRNAs with a U nucleotide at the 5′ end. We also propose that the sequence restrictions at the Dicer cleavage site might be the factor that contributes to the generation of miRNA duplexes with 3′ overhangs of atypical lengths. This finding implies that the two RNase III domains forming the single processing center of Dicer may exhibit some degree of flexibility, which allows for the formation of these non-standard 3′ overhangs. PMID:26424848

  9. Isolation and sequence of four small nuclear U RNA genes of Trypanosoma brucei subsp. brucei: identification of the U2, U4, and U6 RNA analogs.

    PubMed Central

    Mottram, J; Perry, K L; Lizardi, P M; Lührmann, R; Agabian, N; Nelson, R G

    1989-01-01

    Trypanosomes use trans splicing to place a common 39-nucleotide spliced-leader sequence on the 5' ends of all of their mRNAs. To identify likely participants in this reaction, we used antiserum directed against the characteristic U RNA 2,2,7-trimethylguanosine (TMG) cap to immunoprecipitate six candidate U RNAs from total trypanosome RNA. Genomic Southern analysis using oligonucleotide probes constructed from partial RNA sequence indicated that the four largest RNAs (A through D) are encoded by single-copy genes that are not closely linked to one another. We have cloned and sequenced these genes, mapped the 5' ends of the encoded RNAs, and identified three of the RNAs as the trypanosome U2, U4, and U6 analogs by virtue of their sequences and structural homologies with the corresponding metazoan U RNAs. The fourth RNA, RNA B (144 nucleotides), was not sufficiently similar to known U RNAs to allow us to propose an identity. Surprisingly, none of these U RNAs contained the consensus Sm antigen-binding site, a feature totally conserved among several classes of U RNAs, including U2 and U4. Similarly, the sequence of the U2 RNA region shown to be involved in pre-mRNA branchpoint recognition in yeast, and exactly conserved in metazoan U2 RNAs, was totally divergent in trypanosomes. Like all other U6 RNAs, trypanosome U6 did not contain a TMG cap and was immunoprecipitated from deproteinized RNA by anti-TMG antibody because of its association with the TMG-capped U4 RNA. These two RNAs contained extensive regions of sequence complementarity which phylogenetically support the secondary-structure model proposed by D. A. Brow and C. Guthrie (Nature [London] 334:213-218, 1988) for the organization of the analogous yeast U4-U6 complex. PMID:2725495

  10. Isolation and sequence of four small nuclear U RNA genes of Trypanosoma brucei subsp. brucei: Identification of the U2, U4, and U6 RNA analogs

    SciTech Connect

    Mottram, J.; Perry, K.L.; Agabian, N. (School of Medicine); Lizardi, P.M.; Luhrmann, R.; Nelson, R.G. (Dept. of Pharmaceutical Chemistry)

    1989-03-01

    Trypanosomes use trans splicing to place a common 39-nucleotide spliced-leader sequence on the 5' ends of all of their mRNAs. To identify likely participants in this reaction, the authors used antiserum directed against the characteristic U RNA 2,2,7-trimethylguanosine (TMG) cap to immunoprecipitate six candidate U RNAs from total trypanosome RNA. Genomic Southern analysis using oligonucleotide probes constructed from partial RNA sequence indicated that the four largest RNAs (A through D) are encoded by single-copy genes that are not closely linked to one another. The authors have cloned and sequenced these genes, mapped the 5' ends of the encoded RNAs, and identified three of the RNAs as the trypanosome U2, U4, and U6 analogs by virtue of their sequences and structural homologies with the corresponding metazoan U RNAs. The fourth RNA, RNA B (144 nucleotides), was not sufficiently similar to known U RNAs to allow them to propose an identity. Surprisingly, none of the U RNAs contained the consensus Sm antigen-binding site, a feature totally conserved among several classes of U RNAs, including U2 and U4. Similarly, the sequence of the U2 RNA region shown to be involved in pre-mRNA branchpoint recognition in yeast, and exactly conserved in metazoan U2 RNAs, was totally divergent in trypanosomes. Like all other U6 RNAs, trypanosome U6 did not contain a TMG cap and was immunoprecipitated from deproteinized RNA by anti-TMG antibody because of its association with the TMG-capped U4 RNA. These two RNAs contained extensive regions of sequence complementarity which phylogenetically support the secondary-structure model proposed by D.A. Brow and C. Guthrie (Nature (London) 334:213-218, 1988) for the organization of the analogous yeast U4-U6 complex.

  11. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGES

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; ...

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, and sequencing technologies, as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enable the use of more aggressive quality control parameters than 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  12. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, and sequencing technologies, as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enable the use of more aggressive quality control parameters than 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  13. Utility of next-generation RNA-sequencing in identifying chimeric transcription involving human endogenous retroviruses.

    PubMed

    Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou

    2016-01-01

    Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region.
Taken together, this article highlights

  14. Sequence-Based Analysis Uncovers an Abundance of Non-Coding RNA in the Total Transcriptome of Mycobacterium tuberculosis

    PubMed Central

    Arnvig, Kristine B.; Comas, Iñaki; Thomson, Nicholas R.; Houghton, Joanna; Boshoff, Helena I.; Croucher, Nicholas J.; Rose, Graham; Perkins, Timothy T.; Parkhill, Julian; Dougan, Gordon; Young, Douglas B.

    2011-01-01

    RNA sequencing provides a new perspective on the genome of Mycobacterium tuberculosis by revealing an extensive presence of non-coding RNA, including long 5’ and 3’ untranslated regions, antisense transcripts, and intergenic small RNA (sRNA) molecules. More than a quarter of all sequence reads mapping outside of ribosomal RNA genes represent non-coding RNA, and the density of reads mapping to intergenic regions was more than two-fold higher than that mapping to annotated coding sequences. Selected sRNAs were found at increased abundance in stationary phase cultures and accumulated to remarkably high levels in the lungs of chronically infected mice, indicating a potential contribution to pathogenesis. The ability of tubercle bacilli to adapt to changing environments within the host is critical to their ability to cause disease and to persist during drug treatment; it is likely that novel post-transcriptional regulatory networks will play an important role in these adaptive responses. PMID:22072964

  15. Phylogenetic Sequence Variations in Bacterial rRNA Affect Species-Specific Susceptibility to Drugs Targeting Protein Synthesis

    PubMed Central

    Akshay, Subramanian; Bertea, Mihai; Hobbie, Sven N.; Oettinghaus, Björn; Shcherbakov, Dimitri; Böttger, Erik C.; Akbergenov, Rashid

    2011-01-01

    Antibiotics targeting the bacterial ribosome typically bind to highly conserved rRNA regions with only minor phylogenetic sequence variations. It is unclear whether these sequence variations affect antibiotic susceptibility or resistance development. To address this question, we have investigated the drug binding pockets of aminoglycosides and macrolides/ketolides. The binding site of aminoglycosides is located within helix 44 of the 16S rRNA (A site); macrolides/ketolides bind to domain V of the 23S rRNA (peptidyltransferase center). We have used mutagenesis of rRNA sequences in Mycobacterium smegmatis ribosomes to reconstruct the different bacterial drug binding sites and to study the effects of rRNA sequence variations on drug activity. Our results provide a rationale for differences in species-specific drug susceptibility patterns and species-specific resistance phenotypes associated with mutational alterations in the drug binding pocket. PMID:21730122

  16. Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences.

    PubMed

    Yarza, Pablo; Yilmaz, Pelin; Pruesse, Elmar; Glöckner, Frank Oliver; Ludwig, Wolfgang; Schleifer, Karl-Heinz; Whitman, William B; Euzéby, Jean; Amann, Rudolf; Rosselló-Móra, Ramon

    2014-09-01

    Publicly available sequence databases of the small subunit ribosomal RNA gene, also known as 16S rRNA in bacteria and archaea, are growing rapidly, and the number of entries currently exceeds 4 million. However, a unified classification and nomenclature framework for all bacteria and archaea does not yet exist. In this Analysis article, we propose rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggest a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea. Our analyses show that only nearly complete 16S rRNA sequences give accurate measures of taxonomic diversity. In addition, our analyses suggest that most of the 16S rRNA sequences of the high taxa will be discovered in environmental surveys by the end of the current decade.

  17. The siRNA Non-seed Region and Its Target Sequences Are Auxiliary Determinants of Off-Target Effects

    PubMed Central

    Kamola, Piotr J.; Nakano, Yuko; Takahashi, Tomoko; Wilson, Paul A.; Ui-Tei, Kumiko

    2015-01-01

    RNA interference (RNAi) is a powerful tool for post-transcriptional gene silencing. However, the siRNA guide strand may bind unintended off-target transcripts via partial sequence complementarity by a mechanism closely mirroring microRNA (miRNA) silencing. To better understand these off-target effects, we investigated the correlation between sequence features within various subsections of siRNA guide strands, and their corresponding target sequences, with off-target activities. Our results confirm previous reports that strength of base-pairing in the siRNA seed region is the primary factor determining the efficiency of off-target silencing. However, the degree of downregulation of off-target transcripts with shared seed sequence is not necessarily similar, suggesting that there are additional auxiliary factors that influence the silencing potential. Here, we demonstrate that both the melting temperature (Tm) in a subsection of the siRNA non-seed region, and the GC content of its corresponding target sequences, are negatively correlated with the efficiency of the off-target effect. Analysis of experimentally validated miRNA targets demonstrated a similar trend, indicating a putative conserved mechanistic feature of the seed region-dependent targeting mechanism. These observations may prove useful as parameters for off-target prediction algorithms and improve siRNA ‘specificity’ design rules. PMID:26657993
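
    One of the two sequence features named in this abstract, GC content of a subregion, is simple to compute, and a minimal sketch follows. The abstract does not say which subsection of the non-seed region was used, so the positions are left as caller-supplied parameters; the function names and the example sequence are hypothetical. Melting temperature is not attempted here, since a meaningful Tm for an RNA duplex would need a nearest-neighbor thermodynamic model.

```python
def subregion(guide: str, start: int, end: int) -> str:
    """Return guide positions start..end (1-based, inclusive, from the 5' end)."""
    if not 1 <= start <= end <= len(guide):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(guide)")
    return guide[start - 1:end]


def gc_content(seq: str) -> float:
    """Fraction of G and C residues in seq (works for RNA or DNA spelling)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


# Hypothetical 21-nt guide strand; positions 9-14 are an arbitrary
# illustration, not the subsection analysed in the paper.
example_guide = "UCGAAGUAUUCCGCGUACGUG"
example_region = subregion(example_guide, 9, 14)
example_gc = gc_content(example_region)
```

    Computing this value for each candidate guide, and for the matching stretch of each predicted off-target site, would give the kind of per-site parameter the abstract suggests feeding into off-target prediction.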

  18. RNA Sequencing Reveals that Kaposi Sarcoma-Associated Herpesvirus Infection Mimics Hypoxia Gene Expression Signature

    PubMed Central

    Viollet, Coralie; Davis, David A.; Tekeste, Shewit S.; Reczko, Martin; Pezzella, Francesco; Ragoussis, Jiannis

    2017-01-01

    Kaposi sarcoma-associated herpesvirus (KSHV) causes several tumors and hyperproliferative disorders. Hypoxia and hypoxia-inducible factors (HIFs) activate latent and lytic KSHV genes, and several KSHV proteins increase the cellular levels of HIF. Here, we used RNA sequencing, qRT-PCR, Taqman assays, and pathway analysis to explore the miRNA and mRNA response of uninfected and KSHV-infected cells to hypoxia, to compare this with the genetic changes seen in chronic latent KSHV infection, and to explore the degree to which hypoxia and KSHV infection interact in modulating mRNA and miRNA expression. We found that the gene expression signatures for KSHV infection and hypoxia have a 34% overlap. Moreover, there were considerable similarities between the genes up-regulated by hypoxia in uninfected (SLK) and in KSHV-infected (SLKK) cells. hsa-miR-210, a HIF-target known to have pro-angiogenic and anti-apoptotic properties, was significantly up-regulated by both KSHV infection and hypoxia using Taqman assays. Interestingly, expression of KSHV-encoded miRNAs was not affected by hypoxia. These results demonstrate that KSHV harnesses a part of the hypoxic cellular response and that a substantial portion of hypoxia-induced changes in cellular gene expression are induced by KSHV infection. Therefore, targeting hypoxic pathways may be a useful way to develop therapeutic strategies for KSHV-related diseases. PMID:28046107

  19. Robust detection of immune transcripts in FFPE samples using targeted RNA sequencing.

    PubMed

    Paluch, Benjamin E; Glenn, Sean T; Conroy, Jeffrey M; Papanicolau-Sengos, Antonios; Bshara, Wiam; Omilian, Angela R; Brese, Elizabeth; Nesline, Mary; Burgher, Blake; Andreas, Jonathan; Odunsi, Kunle; Eng, Kevin; He, Ji; Qin, Maochun; Gardner, Mark; Galluzzi, Lorenzo; Morrison, Carl D

    2017-01-10

    Current criteria for identifying cancer patients suitable for immunotherapy with immune checkpoint blockers (ICBs) are subjective and prone to misinterpretation, as they mainly rely on the visual assessment of CD274 (best known as PD-L1) expression levels by immunohistochemistry (IHC). To address this issue, we developed an RNA sequencing (RNAseq)-based approach that specifically measures the abundance of immune transcripts in formalin-fixed paraffin embedded (FFPE) specimens. Besides exhibiting superior sensitivity as compared to whole transcriptome RNAseq, our assay requires little starting material, implying that it is compatible with RNA degradation normally caused by formalin. Here, we demonstrate that a targeted RNAseq panel reliably profiles mRNA expression levels in FFPE samples from a cohort of ovarian carcinoma patients. The expression profile of immune transcripts as measured by targeted RNAseq in FFPE versus freshly frozen (FF) samples from the same tumor was highly concordant, in spite of the RNA quality issues associated with formalin fixation. Moreover, the results of targeted RNAseq on FFPE specimens exhibited a robust correlation with mRNA expression levels as measured on the same samples by quantitative RT-PCR, as well as with protein abundance as determined by IHC. These findings demonstrate that RNAseq profiling on archival FFPE tissues can be used reliably in studies assessing the efficacy of cancer immunotherapy.

  20. Gene profiling of bone around orthodontic mini-implants by RNA-sequencing analysis.

    PubMed

    Nahm, Kyung-Yen; Heo, Jung Sun; Lee, Jae-Hyung; Lee, Dong-Yeol; Chung, Kyu-Rhim; Ahn, Hyo-Won; Kim, Seong-Hun

    2015-01-01

    This study aimed to evaluate the genes that were expressed in the healing bones around SLA-treated titanium orthodontic mini-implants in a beagle at early (1-week) and late (4-week) stages with RNA-sequencing (RNA-Seq). Samples from sites of surgical defects were used as controls. Total RNA was extracted from the tissue around the implants, and an RNA-Seq analysis was performed with Illumina TruSeq. In the 1-week group, genes in the gene ontology (GO) categories of cell growth and the extracellular matrix (ECM) were upregulated, while genes in the categories of the oxidation-reduction process, intermediate filaments, and structural molecule activity were downregulated. In the 4-week group, the upregulated genes included those in the categories of ECM binding, stem cell fate specification, and intramembranous ossification, while genes in the oxidation-reduction process category were downregulated. GO analysis revealed an upregulation of genes that were related to significant mechanisms, including those with roles in cell proliferation, the ECM, growth factors, and osteogenic-related pathways, which are associated with bone formation. These results indicate that implant-induced bone formation progressed considerably during the times examined in this study. The upregulation or downregulation of selected genes was confirmed with real-time reverse transcription polymerase chain reaction. The RNA-Seq strategy was useful for defining the biological responses to orthodontic mini-implants and identifying the specific genetic networks for targeted evaluations of successful peri-implant bone remodeling.

  1. High quality RNA extraction from Maqui berry for its application in next-generation sequencing.

    PubMed

    Sánchez, Carolina; Villacreses, Javier; Blanc, Noelle; Espinoza, Loreto; Martinez, Camila; Pastor, Gabriela; Manque, Patricio; Undurraga, Soledad F; Polanco, Victor

    2016-01-01

    Maqui berry (Aristotelia chilensis) is a native Chilean species that produces berries that are exceptionally rich in anthocyanins and natural antioxidants. These natural compounds provide an array of health benefits for humans, making them very desirable in a fruit. At the same time, these substances also interfere with nucleic acid preparations, making RNA extraction from Maqui berry a major challenge. Our group established a method for extracting high-quality RNA (good purity, good integrity and higher yield) from Maqui berry. This procedure is based on the adapted CTAB method using high concentrations of PVP (4%), β-mercaptoethanol (4%) and spermidine in the extraction buffer. These reagents help to remove contaminants such as polysaccharides, proteins and phenols, and also prevent the oxidation of phenolic compounds. The high quality of the RNA isolated through this method allowed its successful use in molecular applications for this endemic Chilean fruit, such as differential expression analysis of RNA-Seq data using next generation sequencing (NGS). Furthermore, we consider that our method could potentially be used for other plant species with extremely high levels of antioxidants and anthocyanins.

  2. A functional sequence-specific interaction between influenza A virus genomic RNA segments

    PubMed Central

    Gavazzi, Cyrille; Yver, Matthieu; Isel, Catherine; Smyth, Redmond P.; Rosa-Calatrava, Manuel; Lina, Bruno; Moulès, Vincent; Marquet, Roland

    2013-01-01

    Influenza A viruses cause annual influenza epidemics and occasional severe pandemics. Their genome is segmented into eight fragments, which offers evolutionary advantages but complicates genomic packaging. The existence of a selective packaging mechanism, in which one copy of each viral RNA is specifically packaged into each virion, is suspected, but its molecular details remain unknown. Here, we identified a direct intermolecular interaction between two viral genomic RNA segments of an avian influenza A virus using in vitro experiments. Using silent trans-complementary mutants, we then demonstrated that this interaction takes place in infected cells and is required for optimal viral replication. Disruption of this interaction did not affect the HA titer of the mutant viruses, suggesting that the same amount of viral particles was produced. However, it nonspecifically decreased the amount of viral RNA in the viral particles, resulting in an eightfold increase in empty viral particles. Competition experiments indicated that this interaction favored copackaging of the interacting viral RNA segments. The interaction we identified involves regions not previously designated as packaging signals and is not widely conserved among influenza A viruses. Combined with previous studies, our experiments indicate that viral RNA segments can promote the selective packaging of the influenza A virus genome by forming a sequence-dependent supramolecular network of interactions. The lack of conservation of these interactions might limit genetic reassortment between divergent influenza A viruses. PMID:24067651

  3. RNA Sequencing Reveals that Kaposi Sarcoma-Associated Herpesvirus Infection Mimics Hypoxia Gene Expression Signature.

    PubMed

    Viollet, Coralie; Davis, David A; Tekeste, Shewit S; Reczko, Martin; Ziegelbauer, Joseph M; Pezzella, Francesco; Ragoussis, Jiannis; Yarchoan, Robert

    2017-01-01

    Kaposi sarcoma-associated herpesvirus (KSHV) causes several tumors and hyperproliferative disorders. Hypoxia and hypoxia-inducible factors (HIFs) activate latent and lytic KSHV genes, and several KSHV proteins increase the cellular levels of HIF. Here, we used RNA sequencing, qRT-PCR, Taqman assays, and pathway analysis to explore the miRNA and mRNA response of uninfected and KSHV-infected cells to hypoxia, to compare this with the genetic changes seen in chronic latent KSHV infection, and to explore the degree to which hypoxia and KSHV infection interact in modulating mRNA and miRNA expression. We found that the gene expression signatures for KSHV infection and hypoxia have a 34% overlap. Moreover, there were considerable similarities between the genes up-regulated by hypoxia in uninfected (SLK) and in KSHV-infected (SLKK) cells. hsa-miR-210, a HIF-target known to have pro-angiogenic and anti-apoptotic properties, was significantly up-regulated by both KSHV infection and hypoxia using Taqman assays. Interestingly, expression of KSHV-encoded miRNAs was not affected by hypoxia. These results demonstrate that KSHV harnesses a part of the hypoxic cellular response and that a substantial portion of hypoxia-induced changes in cellular gene expression are induced by KSHV infection. Therefore, targeting hypoxic pathways may be a useful way to develop therapeutic strategies for KSHV-related diseases.

  4. Cotranscriptional recruitment of yeast TRAMP complex to intronic sequences promotes optimal pre-mRNA splicing.

    PubMed

    Kong, Ka-Yiu Edwin; Tang, Hei-Man Vincent; Pan, Kewu; Huang, Zhe; Lee, Tsz-Hang Jimmy; Hinnebusch, Alan G; Jin, Dong-Yan; Wong, Chi-Ming

    2014-01-01

    Most unwanted RNA transcripts in the nucleus of eukaryotic cells, such as splicing-defective pre-mRNAs and spliced-out introns, are rapidly degraded by the nuclear exosome. In budding yeast, a number of these unwanted RNA transcripts, including spliced-out introns, are first recognized by the nuclear exosome cofactor Trf4/5p-Air1/2p-Mtr4p polyadenylation (TRAMP) complex before subsequent nuclear-exosome-mediated degradation. However, it remains unclear when spliced-out introns are recognized by TRAMP, and whether TRAMP may have any potential roles in pre-mRNA splicing. Here, we demonstrated that TRAMP is cotranscriptionally recruited to nascent RNA transcripts, with particular enrichment at intronic sequences. Deletion of TRAMP components led to further accumulation of unspliced pre-mRNAs even in a yeast strain defective in nuclear exosome activity, suggesting a novel stimulatory role of TRAMP in splicing. We also uncovered new genetic and physical interactions between TRAMP and several splicing factors, and further showed that TRAMP is required for optimal recruitment of the splicing factor Msl5p. Our study provided the first evidence that TRAMP facilitates pre-mRNA splicing, and we interpreted this as a fail-safe mechanism to ensure the cotranscriptional recruitment of TRAMP before or during splicing to prepare for the subsequent targeting of spliced-out introns to rapid degradation by the nuclear exosome.

  5. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    PubMed Central

    2011-01-01

    Background: MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. Results: We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354 each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent and was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5 followed by a near complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions: We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development. PMID:21466694

  6. Comparison of hepatocellular carcinoma miRNA expression profiling as evaluated by next generation sequencing and microarray.

    PubMed

    Murakami, Yoshiki; Tanahashi, Toshihito; Okada, Rina; Toyoda, Hidenori; Kumada, Takashi; Enomoto, Masaru; Tamori, Akihiro; Kawada, Norifumi; Taguchi, Y-h; Azuma, Takeshi

    2014-01-01

    MicroRNA (miRNA) expression profiling has proven useful in diagnosing and understanding the development and progression of several diseases. Microarray is the standard method for analyzing miRNA expression profiles; however, it has several disadvantages, including its limited detection of miRNAs. In recent years, advances in genome sequencing have led to the development of next-generation sequencing (NGS) technologies, which significantly advance genome sequencing speed and discovery. In this study, we compared the expression profiles obtained by NGS with the profiles created using microarray to assess whether NGS could produce a more accurate and complete miRNA profile. Total RNA from 14 hepatocellular carcinoma (HCC) tumors and 6 matched non-tumor control tissues was sequenced with Illumina MiSeq 50-bp single-end reads. MicroRNA expression profiles were estimated using miRDeep2 software. As a comparison, miRNA expression profiles for 11 out of 14 HCCs were also established by microarray (Agilent human microRNA microarray). The average total sequencing exceeded 2.2 million reads per sample and of those reads, approximately 57% mapped to the human genome. The average correlation for miRNA expression between microarray and NGS and subtraction were 0.613 and 0.587, respectively, while miRNA expression between technical replicates was 0.976. The diagnostic accuracy of HCC, p-value, and AUC were 90.0%, 7.22×10^-4, and 0.92, respectively. In summary, NGS created an miRNA expression profile that was reproducible and comparable to that produced by microarray. Moreover, NGS discovered novel miRNAs that were otherwise undetectable by microarray. We believe that miRNA expression profiling by NGS can be a useful diagnostic tool applicable to multiple fields of medicine.

  7. Efficient system of homologous RNA recombination in brome mosaic virus: sequence and structure requirements and accuracy of crossovers.

    PubMed Central

    Nagy, P D; Bujarski, J J

    1995-01-01

    Brome mosaic virus (BMV), a tripartite positive-stranded RNA virus of plants engineered to support intersegment RNA recombination, was used for the determination of sequence and structural requirements of homologous crossovers. A 60-nucleotide (nt) sequence, common between wild-type RNA2 and mutant RNA3, supported efficient repair (90%) of a modified 3' noncoding region in the RNA3 segment by homologous recombination with wild-type RNA2 3' noncoding sequences. Deletions within this sequence in RNA3 demonstrated that a nucleotide identity as short as 15 nt can support efficient homologous recombination events, while shorter (5-nt) sequence identity resulted in reduced recombination frequency (5%) within this region. Three or more mismatches within a downstream portion of the common 60-nt RNA3 sequence affected both the incidence of recombination and the distribution of crossover sites, suggesting that besides the length, the extent of sequence identity between two recombining BMV RNAs is an important factor in homologous recombination. Site-directed mutagenesis of the common sequence in RNA3 did not reveal a clear correlation between the stability of predicted secondary structures and recombination activity. This indicates that homologous recombination does not require similar secondary structures between two recombining RNAs at the sites of crossovers. Nearly 20% of homologous recombinants were imprecise (aberrant), containing either nucleotide mismatches, small deletions, or small insertions within the region of crossovers. This implies that homologous RNA recombination is not as accurate as proposed previously. Our results provide experimental evidence that the requirements and thus the mechanism of homologous recombination in BMV differ from those of previously described heteroduplex-mediated nonhomologous recombination (P. D. Nagy and J. J. Bujarski, Proc. Natl. Acad. Sci. USA 90:6390-6394, 1993). PMID:7983703

  8. Diversity, distribution, and evolution of tomato viruses in China uncovered by small RNA sequencing.

    PubMed

    Xu, Chenxi; Sun, Xuepeng; Taylor, Angela; Jiao, Chen; Xu, Yimin; Cai, Xiaofeng; Wang, Xiaoli; Ge, Chenhui; Pan, Guanghui; Wang, Quanxi; Fei, Zhangjun; Wang, Quanhua

    2017-03-22

    Tomato is a major vegetable crop that has tremendous popularity. However, viral disease is still a major factor limiting tomato production. Here we report the tomato virome identified through sequencing small RNAs of 170 field-grown samples collected in China. A total of 22 viruses were identified including both well-documented and newly detected viruses. The tomato viral community is dominated by a few species, and they exhibit polymorphisms and recombination in the genomes with coldspots and hotspots. Most samples were co-infected by multiple viruses and the majority of identified viruses are positive-sense single-stranded RNA viruses. Evolutionary analysis of one of the most dominant tomato viruses, Tomato yellow leaf curl virus (TYLCV), predicts its origin and the time back to its most recent common ancestor. The broadly sampled data has enabled us to identify several unreported viruses in tomato including a completely new virus, which has a genome of ∼13.4 kb and groups with aphid-transmitted viruses in genus Cytorhabdovirus. Although both DNA and RNA viruses can trigger the biogenesis of virus-derived small interfering RNAs (vsiRNAs), we show that features such as length distribution, paired distance and base selection bias of vsiRNA sequences reflect different plant Dicer-like proteins and Argonautes involved in vsiRNA biogenesis. Collectively, this study offers insights into host-virus interaction in tomato and provides valuable information to facilitate the management of viral diseases. IMPORTANCE: Tomato is an important source of micronutrients in the human diet and is extensively consumed in the world. Viruses are among the major constraints to tomato production. Categorizing virus species that are capable of infecting tomato and understanding their diversity and evolution are challenging due to difficulties in detecting such fast evolving biological entities. Here we report the landscape of tomato virome in China, the leading country of tomato production. We

  9. DNA and RNA sequencing by nanoscale reading through programmable electrophoresis and nanoelectrode-gated tunneling and dielectric detection

    DOEpatents

    Lee, James W.; Thundat, Thomas G.

    2005-06-14

    An apparatus and method for performing nucleic acid (DNA and/or RNA) sequencing on a single molecule. The genetic sequence information is obtained by probing through a DNA or RNA molecule base by base at nanometer scale as though looking through a strip of movie film. This DNA sequencing nanotechnology has the theoretical capability of performing DNA sequencing at a maximal rate of about 1,000,000 bases per second. This enhanced performance is made possible by a series of innovations including: novel applications of a fine-tuned nanometer gap for passage of a single DNA or RNA molecule; thin layer microfluidics for sample loading and delivery; and programmable electric fields for precise control of DNA or RNA movement. Detection methods include nanoelectrode-gated tunneling current measurements, dielectric molecular characterization, and atomic force microscopy/electrostatic force microscopy (AFM/EFM) probing for nanoscale reading of the nucleic acid sequences.

  10. Harnessing RNA sequencing for global, unbiased evaluation of two new adjuvants for dendritic-cell immunotherapy.

    PubMed

    Mathan, Till S M; Textor, Johannes; Sköld, Annette E; Reinieren-Beeren, Inge; van Oorschot, Tom; Brüning, Mareke; Figdor, Carl G; Buschow, Sonja I; Bakdash, Ghaith; de Vries, I Jolanda M

    2017-02-08

    Effective stimulation of immune cells is crucial for the success of cancer immunotherapies. Current approaches to evaluate the efficiency of stimuli are mainly defined by known flow cytometry-based cell activation or cell maturation markers. This method, however, does not give a complete overview of the achieved activation state and may leave important side effects unnoticed. Here, we used an unbiased RNA sequencing (RNA-seq)-based approach to compare the capacity of four clinical-grade dendritic cell (DC) activation stimuli used to prepare DC-vaccines composed of various types of DC subsets: the already clinically applied GM-CSF and Frühsommer meningoencephalitis (FSME) prophylactic vaccine and the novel clinical grade adjuvants protamine-RNA complexes (pRNA) and CpG-P. We found that GM-CSF and pRNA had similar effects on their target cells, whereas pRNA and CpG-P induced stronger type I interferon (IFN) expression than FSME. In general, the pathways most affected by all stimuli were related to immune activity and cell migration. GM-CSF stimulation, however, also induced a significant increase of genes related to nonsense-mediated decay, indicating a possible deleterious effect of this stimulus. Taken together, the two novel stimuli appear to be promising alternatives. Our study demonstrates how RNA-seq based investigation of changes in a large number of genes and gene groups can be exploited for fast and unbiased, global evaluation of clinical-grade stimuli, as opposed to the generally limited evaluation of a pre-specified set of genes, by which one might miss important biological effects that are detrimental for vaccine efficacy.

  11. A highly efficient method for extracting next-generation sequencing quality RNA from adipose tissue of recalcitrant animal species.

    PubMed

    Sharma, Davinder; Golla, Naresh; Singh, Dheer; Onteru, Suneel Kumar

    2017-04-13

    Next-generation sequencing (NGS)-based RNA sequencing (RNA-Seq) and transcriptome profiling offer an opportunity to unveil complex evolutionary processes. Successful RNA-Seq and transcriptome profiling requires a large amount of high-quality RNA. However, NGS-quality RNA isolation is extremely difficult from recalcitrant adipose tissue (AT) with high lipid content and low cell numbers. Further, the amount and biochemical composition of AT lipid vary depending upon the animal species, which can pose different degrees of resistance to RNA extraction. Currently available approaches may work effectively in one species but can be almost unproductive in another species. Herein, we report a two-step protocol for the extraction of NGS-quality RNA from AT across a broad range of animal species.

  12. The signal sequence coding region promotes nuclear export of mRNA.

    PubMed

    Palazzo, Alexander F; Springer, Michael; Shibata, Yoko; Lee, Chung-Sheng; Dias, Anusha P; Rapoport, Tom A

    2007-12-01

    In eukaryotic cells, most mRNAs are exported from the nucleus by the transcription export (TREX) complex, which is loaded onto mRNAs after their splicing and capping. We have studied in mammalian cells the nuclear export of mRNAs that code for secretory proteins, which are targeted to the endoplasmic reticulum membrane by hydrophobic signal sequences. The mRNAs were injected into the nucleus or synthesized from injected or transfected DNA, and their export was followed by fluorescent in situ hybridization. We made the surprising observation that the signal sequence coding region (SSCR) can serve as a nuclear export signal of an mRNA that lacks an intron or functional cap. Even the export of an intron-containing natural mRNA was enhanced by its SSCR. Like conventional export, the SSCR-dependent pathway required the factor TAP, but depletion of the TREX components had only moderate effects. The SSCR export signal appears to be characterized in vertebrates by a low content of adenines, as demonstrated by genome-wide sequence analysis and by the inhibitory effect of silent adenine mutations in SSCRs. The discovery of an SSCR-mediated pathway explains the previously noted amino acid bias in signal sequences and suggests a link between nuclear export and membrane targeting of mRNAs.

  13. Identification and sequence of the initiation site for rat 45S ribosomal RNA synthesis.

    PubMed Central

    Harrington, C A; Chikaraishi, D M

    1983-01-01

    The transcription initiation site for rat 45S precursor ribosomal RNA synthesis was determined by nuclease protection mapping with two single-strand endonucleases, S1 and mung bean, and one single-strand exonuclease, ExoVII. These experiments were performed with end-labeled ribosomal DNA from double-stranded pBR322 recombinants and from single-stranded M13 recombinants. Results from experiments using both kinds of DNA and all three enzymes showed that the 5' end of 45S RNA mapped to a unique site 125 bases upstream from the Hind III site in the ribosomal DNA gene. The DNA surrounding this site (designated +1) was sequenced from -281 to +641. The entire sequence of this region shows extensive homology to the comparable region of mouse. This includes three stretches of T residues in the non-coding strand between +300 and +630. Two sets of direct repeats adjacent to these T-rich regions are observed. Comparison of the mouse and human ribosomal DNA transcription initiation sites with the rat sequence reported in this paper demonstrates a conserved sequence at +2 to +16, CTGACACGCTGTCCT. This suggests that this region may be important for the initiation of transcription on mammalian ribosomal DNAs. PMID:6304628
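
    The coordinates quoted in this abstract can be checked with a short sketch. It assumes the usual convention that the initiation site is +1 and there is no position 0, which is an assumption and not something the abstract states; the helper names and the all-N placeholder region are hypothetical, and only the motif and the coordinates (-281, +641, +2 to +16) come from the record.

```python
# Motif and coordinates quoted in the abstract.
MOTIF = "CTGACACGCTGTCCT"
UPSTREAM = 281    # sequenced region starts at -281
DOWNSTREAM = 641  # sequenced region ends at +641


def to_index(pos: int) -> int:
    """Map a position relative to +1 onto a 0-based string index.

    Assumes there is no position 0 (an assumption, not from the abstract).
    """
    if pos == 0 or pos < -UPSTREAM or pos > DOWNSTREAM:
        raise ValueError(f"position {pos} is outside the sequenced region")
    return pos + UPSTREAM if pos < 0 else pos + UPSTREAM - 1


def extract(region: str, start: int, end: int) -> str:
    """Return the bases from start to end inclusive."""
    return region[to_index(start):to_index(end) + 1]


def placeholder_region() -> str:
    """Build an all-N stand-in for the real sequence with the motif at +2..+16."""
    bases = ["N"] * (UPSTREAM + DOWNSTREAM)
    bases[to_index(2):to_index(16) + 1] = MOTIF
    return "".join(bases)


if __name__ == "__main__":
    region = placeholder_region()
    print(len(region), extract(region, 2, 16))
```

    Under that convention the interval +2 to +16 covers 15 positions, which matches the length of the quoted motif.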

  14. The RNase P RNA from cyanobacteria: short tandemly repeated repetitive (STRR) sequences are present within the RNase P RNA gene in heterocyst-forming cyanobacteria.

    PubMed Central

    Vioque, A

    1997-01-01

    The RNase P RNA gene (rnpB) from 10 cyanobacteria has been characterized. These new RNAs, together with the previously available ones, provide a comprehensive data set of RNase P RNA from diverse cyanobacterial lineages. All heterocystous cyanobacteria, but none of the non-heterocystous strains analyzed, contain short tandemly repeated repetitive (STRR) sequences that increase the length of helix P12. Site-directed mutagenesis experiments indicate that the STRR sequences are not required for catalytic activity in vitro. STRR sequences seem to have recently and independently invaded the RNase P RNA genes in heterocyst-forming cyanobacteria because closely related strains contain unrelated STRR sequences. Most cyanobacterial RNase P RNAs lack the sequence GGU in the loop connecting helices P15 and P16 that has been established to interact with the 3'-end CCA in precursor tRNA substrates in other bacteria. This character is shared with plastid RNase P RNA. Helix P6 is longer than usual in most cyanobacteria as well as in plastid RNase P RNA. PMID:9254706

  15. Gene arrangement and sequence of the 5S rRNA in Filobasidiella neoformans (Cryptococcus neoformans) as a phylogenetic indicator.

    PubMed

    Kwon-Chung, K J; Chang, Y C

    1994-04-01

    We cloned the 5S rRNA gene and determined its organization in the four genes encoding rRNAs in a ribosomal DNA repeat unit of Filobasidiella neoformans, the teleomorph of Cryptococcus neoformans. The 5S rRNA gene contained 118 nucleotides and was located 1 kb upstream from the 18S rRNA gene within the 8.6-kb fragment of the ribosomal DNA repeat unit. The sequence of the 5S rRNA gene from F. neoformans was more similar to the sequence of the 5S rRNA gene from Tremella mesenterica than to the sequences of the 5S rRNA genes from Filobasidium species. The arrangement of the rRNA genes in F. neoformans closely resembles the arrangement of the rRNA genes in mushrooms such as Schizophyllum commune, Agaricus bisporus, and Coprinus cinereus in that the 5S rRNA-coding region not only is located within the repeat unit that encodes the other rRNAs but also is transcribed in the same direction as the other rRNA genes. This is the first description of the arrangement of rRNA genes in a species belonging to the Heterobasidiomycetes.

  16. The 3'-terminal consensus sequence of rotavirus mRNA is the minimal promoter of negative-strand RNA synthesis.

    PubMed Central

    Wentz, M J; Patton, J T; Ramig, R F

    1996-01-01

    We used an in vitro template-dependent replicase assay (D. Chen, C. Zeng, M. Wentz, M. Gorziglia, M. Estes, and R. Ramig. J. Virol. 68:7030-7039, 1994) to identify the cis-acting signals required for replication of a genome segment 9 template from the group A rotavirus strain OSU. The replicase phenotypes for a panel of templates with internal deletions or 3'-terminal truncations indicated that no essential replication signals were present within the open reading frame and that key elements were present in the 5' and 3' noncoding regions. Chimeric constructs containing portions of viral sequence ligated to a nonviral backbone were generated to further map the regions required for in vitro replication of segment 9. The data from these constructs showed that the 3'-terminal seven nucleotides of the segment 9 mRNA provided the minimum requirement for replication (minimal promoter). Analysis of additional chimeric templates demonstrated that sequences capable of enhancing replication from the minimal promoter were located immediately upstream of the minimal promoter and at the extreme 5' terminus of the template. Mutational analysis of the minimal promoter revealed that the 3'-terminal -CC residues are required for efficient replication. Comparison of the replication levels for templates with guanosines and uridines at nucleotides -4 to -6 from the 3' terminus compared with levels for templates containing neither of these residues at these positions indicated that either or both residues must be present in this region for efficient replication in vitro. PMID:8892905

  17. Spectrum of Text Information Content in the RNA Sequence of the Vesicular Stomatitis Virus

    NASA Astrophysics Data System (ADS)

    Filyukov, Alexander A.

    A new strategy to recognize patterns in the DNA sequences with functional significance is proposed. The strategy is based on the general definition of any individual organism as a Gibbsian ensemble of identical personal DNA molecules. This approach provides application of the methods of statistical thermodynamics of irreversible steady processes to genome informatics. The random processes theory and its Markov chains approximation lead in this approach directly to the definition of the generalized concept of evolution entropy and to the genuine measure of text information content in the sequences. Computer-assisted proofs of the existence of the nonequilibrium steady state conditions in genome molecule were obtained by investigation of the special type balance relations in the vesicular stomatitis virus (VSV) RNA sequence. The main maxima of the text information content were decoded and denominated. The established coding principles are connected with deviations from equilibrium conditions and from equipartition.

  18. An automated approach to prepare tissue-derived spatially barcoded RNA-sequencing libraries

    PubMed Central

    Jemt, Anders; Salmén, Fredrik; Lundmark, Anna; Mollbrink, Annelie; Fernández Navarro, José; Ståhl, Patrik L.; Yucel-Lindberg, Tülay; Lundeberg, Joakim

    2016-01-01

    Sequencing the nucleic acid content of individual cells or specific biological samples is becoming increasingly common. This drives the need for robust, scalable and automated library preparation protocols. Furthermore, an increased understanding of tissue heterogeneity has led to the development of several unique sequencing protocols that aim to retain or infer spatial context. In this study, a protocol for retaining spatial information of transcripts has been adapted to run on a robotic workstation. The method, spatial transcriptomics, is evaluated in terms of robustness and variability through the preparation of reference RNA, as well as through preparation and sequencing of six replicate sections of a gingival tissue biopsy from a patient with periodontitis. The results are reduced technical variability between replicates and a higher throughput, processing four times more samples with less than a third of the hands-on time, compared to the standard protocol. PMID:27849009

  19. The Rhinovirus Subviral A-Particle Exposes 3′-Terminal Sequences of Its Genomic RNA

    PubMed Central

    Harutyunyan, Shushan; Kowalski, Heinrich

    2014-01-01

    Enteroviruses, which represent a large genus within the family Picornaviridae, undergo important conformational modifications during infection of the host cell. Once internalized by receptor-mediated endocytosis, receptor binding and/or the acidic endosomal environment triggers the native virion to expand and convert into the subviral (altered) A-particle. The A-particle is lacking the internal capsid protein VP4 and exposes N-terminal amphipathic sequences of VP1, allowing for its direct interaction with a lipid bilayer. The genomic single-stranded (+)RNA then exits through a hole close to a 2-fold axis of icosahedral symmetry and passes through a pore in the endosomal membrane into the cytosol, leaving behind the empty shell. We demonstrate that in vitro acidification of a prototype of the minor receptor group of common cold viruses, human rhinovirus A2 (HRV-A2), also results in egress of the poly(A) tail of the RNA from the A-particle, along with adjacent nucleotides totaling ∼700 bases. However, even after hours of incubation at pH 5.2, 5′-proximal sequences remain inside the capsid. In contrast, the entire RNA genome is released within minutes of exposure to the acidic endosomal environment in vivo. This finding suggests that the exposed 3′-poly(A) tail facilitates the positioning of the RNA exit site onto the putative channel in the lipid bilayer, thereby preventing the egress of viral RNA into the endosomal lumen, where it may be degraded. IMPORTANCE: For host cell infection, a virus transfers its genome from within the protective capsid into the cytosol; this requires modifications of the viral shell. In common cold viruses, exit of the RNA genome is prepared by the acidic environment in endosomes converting the native virion into the subviral A-particle. We demonstrate that acidification in vitro results in RNA exit starting from the 3′-terminal poly(A). However, the process halts as soon as about 700 bases have left the viral shell

  20. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

    SciTech Connect

    Larsen, P. E.; Trivedi, G.; Sreedasyam, A.; Lu, V.; Podila, G. K.; Collart, F. R.; Biosciences Division; Univ. of Alabama

    2010-07-06

    Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. 
The method described here can be applied to improve gene structural annotation in other species, provided that there

  1. Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25

    SciTech Connect

    Fujiwara, Takashi; Suzuki, Shunji (E-mail: suzukis@yamanashi.ac.jp); Kanno, Motoko; Sugiyama, Hironobu; Takahashi, Hisaaki; Tanaka, Junya

    2006-06-10

    Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25.
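
    The "lysine and arginine residue-enriched" description above can be checked directly against the two sequences quoted in the abstract. The following is a minimal sketch, not part of the original study; the helper names `basic_count` and `basic_fraction` are hypothetical and introduced only for illustration.

```python
from collections import Counter

# Basic residues considered in the abstract: lysine (K) and arginine (R).
BASIC_RESIDUES = ("K", "R")


def basic_count(peptide: str) -> int:
    """Count lysine and arginine residues in a one-letter peptide string."""
    counts = Counter(peptide.upper())
    return sum(counts[aa] for aa in BASIC_RESIDUES)


def basic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are lysine or arginine."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return basic_count(peptide) / len(peptide)


# Sequences exactly as quoted in the abstract.
peptide = "KRKHPRRAQDSTKKPPSATRTSKTQRRRR"
subdomain = "KRKHPRRAQ"

for name, seq in (("peptide", peptide), ("subdomain", subdomain)):
    print(f"{name}: length={len(seq)}, K+R={basic_count(seq)}, "
          f"fraction={basic_fraction(seq):.3f}")
```

    A simple composition count like this only quantifies enrichment; it says nothing about localization, which the study established experimentally.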

  2. Chromosomal instability in Afrotheria: fragile sites, evolutionary breakpoints and phylogenetic inference from genome sequence assemblies

    PubMed Central

    Ruiz-Herrera, Aurora; Robinson, Terence J

    2007-01-01

    Background: Extant placental mammals are divided into four major clades (Laurasiatheria, Supraprimates, Xenarthra and Afrotheria). Given that Afrotheria is generally thought to root the eutherian tree in phylogenetic analysis of large nuclear gene data sets, the study of the organization of the genomes of afrotherian species provides new insights into the dynamics of mammalian chromosomal evolution. Here we test if there are chromosomal bands with a high tendency to break and reorganize in Afrotheria, and by analyzing the expression of aphidicolin-induced common fragile sites in three afrotherian species, whether these are coincidental with recognized evolutionary breakpoints. Results: We described 29 fragile sites in the aardvark (OAF) genome, 27 in the golden mole (CAS), and 35 in the elephant-shrew (EED) genome. We show that fragile sites are conserved among afrotherian species and these are correlated with evolutionary breakpoints when compared to the human (HSA) genome. In addition, by computationally scanning the newly released opossum (Monodelphis domestica) and chicken sequence assemblies for use as outgroups to Placentalia, we validate the HSA 3/21/5 chromosomal synteny as a rare genomic change that defines the monophyly of this ancient African clade of mammals. On the other hand, support for HSA 1/19p, which is also thought to underpin Afrotheria, is currently ambiguous. Conclusion: We provide evidence that (i) the evolutionary breakpoints that characterise human syntenies detected in the basal Afrotheria correspond at the chromosomal band level with fragile sites, (ii) that HSA 3p/21 was in the amniote ancestor (i.e., common to turtles, lepidosaurs, crocodilians, birds and mammals) and was subsequently disrupted in the lineage leading to marsupials. Its expansion to include HSA 5 in Afrotheria is unique and (iii) that its fragmentation to HSA 3p/21 + HSA 5/21 in elephant and manatee was due to a fission within HSA 21 that is probably shared by all

  3. The nucleotide sequence of blue-green algae phenylalanine-tRNA and the evolutionary origin of chloroplasts.

    PubMed Central

    Hecker, L I; Barnett, W E; Lin, F K; Furr, T D; Heckman, J E; RajBhandary, U L; Chang, S H

    1982-01-01

    Phenylalanine tRNA from the blue-green alga, Agmenellum quadruplicatum, has been purified to homogeneity. The nucleotide sequence of this tRNA was determined to be: (see text) Comparisons of the sequence and the modified nucleosides of this tRNA with those of other tRNAPhes thus far sequenced indicate that this blue-green algal tRNAPhe is typically prokaryotic and closely resembles the chloroplast tRNAPhes of higher plants and Euglena. The significance of this observation to the evolutionary origin of chloroplasts is discussed. PMID:6817301

  4. Identification of Genes Potentially Associated with the Fertility Instability of S-Type Cytoplasmic Male Sterility in Maize via Bulked Segregant RNA-Seq

    PubMed Central

    Xing, Jinfeng; Zhao, Yanxin; Zhang, Ruyang; Li, Chunhui; Duan, Minxiao; Luo, Meijie; Shi, Zi; Zhao, Jiuran

    2016-01-01

    S-type cytoplasmic male sterility (CMS-S) is the largest group among the three major types of CMS in maize. CMS-S exhibits fertility instability as a partial fertility restoration in a specific nuclear genetic background, which impedes its commercial application in hybrid breeding programs. The fertility instability phenomenon of CMS-S is controlled by several minor quantitative trait loci (QTLs), but not the major nuclear fertility restorer (Rf3). However, the gene mapping of these minor QTLs and the molecular mechanism of the genetic modifications are still unclear. Using completely sterile and partially rescued plants of fertility instable line (FIL)-B, we performed bulked segregant RNA-Seq and identified six potential associated genes in minor effect QTLs contributing to fertility instability. Analyses demonstrate that these potential associated genes may be involved in biological processes, such as floral organ differentiation and development regulation, energy metabolism and carbohydrate biosynthesis, which result in a partial anther exsertion and pollen fertility restoration in the partially rescued plants. The single nucleotide polymorphisms (SNPs) identified in two potential associated genes were validated to be related to the fertility restoration phenotype by KASP marker assays. This novel knowledge contributes to the understanding of the molecular mechanism of the partial fertility restoration of CMS-S in maize and thus helps to guide the breeding programs. PMID:27669430

  5. Detecting cooperative sequences in the binding of RNA Polymerase-II

    NASA Astrophysics Data System (ADS)

    Glass, Kimberly; Rozenberg, Julian; Girvan, Michelle; Losert, Wolfgang; Ott, Ed; Vinson, Charles

    2008-03-01

    Regulation of the expression level of genes is a key biological process controlled largely by the 1000 base pair (bp) sequence preceding each gene (the promoter region). Within that region transcription factor binding sites (TFBS), 5-10 bp long sequences, act individually or cooperate in the recruitment of, and therefore subsequent gene transcription by, RNA Polymerase-II (RNAP). We have measured the binding of RNAP to promoters on a genome-wide basis using Chromatin Immunoprecipitation (ChIP-on-Chip) microarray assays. Using all 8-base pair long sequences as a test set, we have identified the DNA sequences that are enriched in promoters with high RNAP binding values. We are able to demonstrate that virtually all sequences enriched in such promoters contain a CpG dinucleotide, indicating that TFBS that contain the CpG dinucleotide are involved in RNAP binding to promoters. Further analysis shows that pairs of CpG-containing sequences cooperate to enhance the binding of RNAP to the promoter.
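
    The idea of scanning all 8-bp sequences for enrichment in strongly bound promoters can be illustrated with a small sketch. This is not the authors' statistical procedure: the presence-based ratio, the pseudocount, the function names and the toy sequences below are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq: str, k: int = 8) -> set:
    """Return the set of distinct k-mers present in one promoter sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def enrichment(high, low, k: int = 8, pseudocount: float = 1.0) -> dict:
    """Ratio of the fraction of high-binding promoters containing each k-mer
    to the fraction of low-binding promoters containing it (smoothed)."""
    high_counts = Counter(m for s in high for m in kmers(s, k))
    low_counts = Counter(m for s in low for m in kmers(s, k))
    scores = {}
    for m in set(high_counts) | set(low_counts):
        f_high = (high_counts[m] + pseudocount) / (len(high) + 2 * pseudocount)
        f_low = (low_counts[m] + pseudocount) / (len(low) + 2 * pseudocount)
        scores[m] = f_high / f_low
    return scores


def has_cpg(kmer: str) -> bool:
    """True if the k-mer contains a CpG dinucleotide (C followed by G)."""
    return "CG" in kmer.upper()


# Toy promoters (hypothetical, far shorter than the ~1000 bp regions studied).
high_binding = ["AACGCGTTAGCA", "TTACGCGTTAGG"]
low_binding = ["AATTTTAAGGCA", "TTAATTAAGGAA"]

scores = enrichment(high_binding, low_binding)
for kmer, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(score, 2), "CpG" if has_cpg(kmer) else "no CpG")
```

    On real data a significance test and multiple-testing correction would be needed before calling any 8-mer enriched; the ratio here only ranks candidates.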

  6. Molecular cloning of mRNA sequences encoding rat lens crystallins.

    PubMed Central

    Dodemont, H J; Andreoli, P M; Moormann, R J; Ramaekers, F C; Schoenmakers, J G; Bloemendal, H

    1981-01-01

    To provide access to crystallin-specific DNA sequences, we have constructed plasmid clones bearing duplex DNA sequences complementary to crystallin mRNAs isolated from rat lens. Optimization of the cDNA reaction conditions enabled us to fractionate three double-stranded (ds) cDNA groups. Molecular cloning of dC-tailed ds cDNAs into the Pst I site of dG-tailed pBR322 yielded crystallin-specific clones of each group. By means of positive hybridization selection and translation, recombinant plasmids containing cDNA sequences coding for rat lens polypeptides from alpha-, beta-, and gamma-crystallins could be identified. The established cDNA clones have been used for a blot-hybridization analysis to map the crystallin mRNAs from which they originated. Both procedures revealed a high degree of homology between the gamma-crystallin sequences. From the beta-crystallin class, the beta H-specific cDNA coding for the beta B1a polypeptide was obtained. The alpha A-chain clone did not show any cross-hybridization to the alpha B-chain mRNA despite the existence of 60% homology between the corresponding gene products. As this clone hybridized to both alpha A2 and alpha AIns mRNAs, sequence analysis was applied for further characterization. The results showed that the cloned cDNA corresponds to the alpha A2 sequence exclusively. PMID:6946472

  7. mRNA deep sequencing reveals 75 new genes and a complex transcriptional landscape in Mimivirus.

    PubMed

    Legendre, Matthieu; Audic, Stéphane; Poirot, Olivier; Hingamp, Pascal; Seltzer, Virginie; Byrne, Deborah; Lartigue, Audrey; Lescot, Magali; Bernadac, Alain; Poulain, Julie; Abergel, Chantal; Claverie, Jean-Michel

    2010-05-01

    Mimivirus, a virus infecting Acanthamoeba, is the prototype of the Mimiviridae, the latest addition to the nucleocytoplasmic large DNA viruses. The Mimivirus genome encodes close to 1000 proteins, many of them never before encountered in a virus, such as four amino-acyl tRNA synthetases. To explore the physiology of this exceptional virus and identify the genes involved in the building of its characteristic intracytoplasmic "virion factory," we coupled electron microscopy observations with the massively parallel pyrosequencing of the polyadenylated RNA fractions of Acanthamoeba castellanii cells at various times post-infection. We generated 633,346 reads, of which 322,904 correspond to Mimivirus transcripts. This first application of deep mRNA sequencing (454 Life Sciences [Roche] FLX) to a large DNA virus allowed the precise delineation of the 5' and 3' extremities of Mimivirus mRNAs and revealed 75 new transcripts including several noncoding RNAs. Mimivirus genes are expressed across a wide dynamic range, in a finely regulated manner broadly described by three main temporal classes: early, intermediate, and late. This RNA-seq study confirmed the AAAATTGA sequence as an early promoter element, as well as the presence of palindromes at most of the polyadenylation sites. It also revealed a new promoter element correlating with late gene expression, which is also prominent in Sputnik, the recently described Mimivirus "virophage." These results, validated genome-wide by the hybridization of total RNA extracted from infected Acanthamoeba cells on a tiling array (Agilent), will constitute the foundation on which to build subsequent functional studies of the Mimivirus/Acanthamoeba system.

  8. Selection and Characterization of Pre-mRNA Splicing Enhancers: Identification of Novel SR Protein-Specific Enhancer Sequences

    PubMed Central

    Schaal, Thomas D.; Maniatis, Tom

    1999-01-01

    Splicing enhancers are RNA sequences required for accurate splice site recognition and the control of alternative splicing. In this study, we used an in vitro selection procedure to identify and characterize novel RNA sequences capable of functioning as pre-mRNA splicing enhancers. Randomized 18-nucleotide RNA sequences were inserted downstream from a Drosophila doublesex pre-mRNA enhancer-dependent splicing substrate. Functional splicing enhancers were then selected by multiple rounds of in vitro splicing in nuclear extracts, reverse transcription, and selective PCR amplification of the spliced products. Characterization of the selected splicing enhancers revealed a highly heterogeneous population of sequences, but we identified six classes of recurring degenerate sequence motifs five to seven nucleotides in length including novel splicing enhancer sequence motifs. Analysis of selected splicing enhancer elements and other enhancers in S100 complementation assays led to the identification of individual enhancers capable of being activated by specific serine/arginine (SR)-rich splicing factors (SC35, 9G8, and SF2/ASF). In addition, a potent splicing enhancer sequence isolated in the selection specifically binds a 20-kDa SR protein. This enhancer sequence has a high level of sequence homology with a recently identified RNA-protein adduct that can be immunoprecipitated with an SRp20-specific antibody. We conclude that distinct classes of selected enhancers are activated by specific SR proteins, but there is considerable sequence degeneracy within each class. The results presented here, in conjunction with previous studies, reveal a remarkably broad spectrum of RNA sequences capable of binding specific SR proteins and/or functioning as SR-specific splicing enhancers. PMID:10022858

  9. Deep RNA Sequencing of the Skeletal Muscle Transcriptome in Swimming Fish

    PubMed Central

    Palstra, Arjan P.; Beltran, Sergi; Burgerhout, Erik; Brittijn, Sebastiaan A.; Magnoni, Leonardo J.; Henkel, Christiaan V.; Jansen, Hans J.; van den Thillart, Guido E. E. J. M.; Spaink, Herman P.; Planas, Josep V.

    2013-01-01

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming-induced exercise. Pubertal autumn-spawning seawater-raised female rainbow trout were rested (n = 10) or swum (n = 10) for 1176 km at 0.75 body-lengths per second in a 6,000-L swim-flume under reproductive conditions for 40 days. Red and white muscle RNA of exercised and non-exercised fish (4 lanes) was sequenced and resulted in 15–17 million reads per lane that, after de novo assembly, yielded 149,159 red and 118,572 white muscle contigs. Most contigs were annotated using an iterative homology search strategy against salmonid ESTs, the zebrafish Danio rerio genome and general Metazoan genes. When selecting for large contigs (>500 nucleotides), a number of novel rainbow trout gene sequences were identified in this study: 1,085 and 1,228 novel gene sequences for red and white muscle, respectively, which included a number of important molecules for skeletal muscle function. Transcriptomic analysis revealed that sustained swimming increased transcriptional activity in skeletal muscle and specifically an up-regulation of genes involved in muscle growth and developmental processes in white muscle. The unique collection of transcripts will contribute to our understanding of red and white muscle physiology, specifically during the long-term reproductive migration of salmonids. PMID:23308156

  10. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments

    PubMed Central

    Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

    2014-01-01

    This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D-structure-based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831

  11. Delivery of siRNA using ternary complexes containing branched cationic peptides: the role of peptide sequence, branching and targeting.

    PubMed

    Kudsiova, Laila; Welser, Katharina; Campbell, Frederick; Mohammadi, Atefeh; Dawson, Natalie; Cui, Lili; Hailes, Helen C; Lawrence, M Jayne; Tabor, Alethea B

    2016-03-01

    Ternary nanocomplexes, composed of bifunctional cationic peptides, lipids and siRNA, as delivery vehicles for siRNA have been investigated. The study is the first to determine the optimal sequence and architecture of the bifunctional cationic peptide used for siRNA packaging and delivery using lipopolyplexes. Specifically three series of cationic peptides of differing sequence, degrees of branching and cell-targeting sequences were co-formulated with siRNA and vesicles prepared from a 1 : 1 molar ratio of the cationic lipid DOTMA and the helper lipid, DOPE. The level of siRNA knockdown achieved in the human alveolar cell line, A549-luc cells, in both reduced serum and in serum supplemented media was evaluated, and the results correlated to the nanocomplex structure (established using a range of physico-chemical tools, namely small angle neutron scattering, transmission electron microscopy, dynamic light scattering and zeta potential measurement); the conformational properties of each component (circular dichroism); the degree of protection of the siRNA in the lipopolyplex (using gel shift assays) and to the cellular uptake, localisation and toxicity of the nanocomplexes (confocal microscopy). Although the size, charge, structure and stability of the various lipopolyplexes were broadly similar, it was clear that lipopolyplexes formulated from branched peptides containing His-Lys sequences perform best as siRNA delivery agents in serum, with protection of the siRNA in serum balanced against efficient release of the siRNA into the cytoplasm of the cell.

  12. Crystallographic Analysis of Rotavirus NSP2-RNA Complex Reveals Specific Recognition of 5′ GG Sequence for RTPase Activity

    PubMed Central

    Hu, Liya; Chow, Dar-Chone; Patton, John T.; Palzkill, Timothy; Estes, Mary K.

    2012-01-01

    Rotavirus nonstructural protein NSP2, a functional octamer, is critical for the formation of viroplasms, which are exclusive sites for replication and packaging of the segmented double-stranded RNA (dsRNA) rotavirus genome. As a component of replication intermediates, NSP2 is also implicated in various replication-related activities. In addition to sequence-independent single-stranded RNA-binding and helix-destabilizing activities, NSP2 exhibits monomer-associated nucleoside and 5′ RNA triphosphatase (NTPase/RTPase) activities that are mediated by a conserved H225 residue within a narrow enzymatic cleft. Lack of a 5′ γ-phosphate is a common feature of the negative-strand RNA [(−)RNA] of the packaged dsRNA segments in rotavirus. Strikingly, all (−)RNAs (of group A rotaviruses) have a 5′ GG dinucleotide sequence. As the only rotavirus protein with 5′ RTPase activity, NSP2 is implicated in the removal of the γ-phosphate from the rotavirus (−)RNA. To understand how NSP2, despite its sequence-independent RNA-binding property, recognizes (−)RNA to hydrolyze the γ-phosphate within the catalytic cleft, we determined a crystal structure of NSP2 in complex with the 5′ consensus sequence of minus-strand rotavirus RNA. Our studies show that the 5′ GG of the bound oligoribonucleotide interacts extensively with highly conserved residues in the NSP2 enzymatic cleft. Although these residues provide GG-specific interactions, surface plasmon resonance studies suggest that the C-terminal helix and other basic residues outside the enzymatic cleft account for sequence-independent RNA binding of NSP2. A novel observation from our studies, which may have implications in viroplasm formation, is that the C-terminal helix of NSP2 exhibits two distinct conformations and engages in domain-swapping interactions, which result in the formation of NSP2 octamer chains. PMID:22811529

  13. Sequence-based discrimination of protein-RNA interacting residues using a probabilistic approach.

    PubMed

    Pai, Priyadarshini P; Dash, Tirtharaj; Mondal, Sukanta

    2017-04-07

    Protein interactions with ribonucleic acids (RNA) are well-known to be crucial for a wide range of cellular processes such as transcriptional regulation, protein synthesis or translation, and post-translational modifications. Identification of the RNA-interacting residues can provide insights into these processes and aid in relevant biotechnological manipulations. Owing to their eventual potential in combating diseases and industrial production, several computational attempts have been made over the years using sequence- and structure-based information. Recent comparative studies suggest that despite these developments, many problems are faced with respect to the usability, prerequisites, and accessibility of various tools, thereby calling for an alternative approach and perspective supplementation in the prediction scenario. With this motivation, in this paper, we propose the use of a simple-yet-efficient conditional probabilistic approach based on the application of local occurrence of amino acids in the interacting region in a non-numeric sequence feature space, for discriminating between RNA-interacting and non-interacting residues. The proposed method has been meticulously tested for robustness using a cross-estimation method showing MCC of 0.341 and F-measure of 66.84%. Upon exploring large scale applications using benchmark datasets available to date, this approach showed an encouraging performance comparable with the state of the art. The software is available at https://github.com/ABCgrp/DORAEMON.
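
    The two scores quoted above, the Matthews correlation coefficient (MCC) and the F-measure, are both computed from a binary confusion matrix. The sketch below uses the standard definitions; it is independent of the DORAEMON software, and the counts in the example are hypothetical, not results from the paper.

```python
import math


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical residue-level counts for illustration only.
example = dict(tp=8, fp=2, fn=2, tn=8)
print("MCC:", round(mcc(**example), 3))
print("F-measure:", round(f_measure(example["tp"], example["fp"], example["fn"]), 3))
```

    MCC uses all four cells of the confusion matrix, which is why it is often reported alongside the F-measure when interacting residues are much rarer than non-interacting ones.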

  14. Combined heat shock protein 90 and ribosomal RNA sequence phylogeny supports multiple replacements of dinoflagellate plastids.

    PubMed

    Shalchian-Tabrizi, Kamran; Minge, Marianne A; Cavalier-Smith, Tom; Nedreklepp, Joachim M; Klaveness, Dag; Jakobsen, Kjetill S

    2006-01-01

    Dinoflagellates harbour diverse plastids obtained from several algal groups, including haptophytes, diatoms, cryptophytes, and prasinophytes. Their major plastid type with the accessory pigment peridinin is found in the vast majority of photosynthetic species. Some species of dinoflagellates have other aberrantly pigmented plastids. We sequenced the nuclear small subunit (SSU) ribosomal RNA (rRNA) gene of the "green" dinoflagellate Gymnodinium chlorophorum and show that it is sister to Lepidodinium viride, indicating that their common ancestor obtained the prasinophyte (or other green alga) plastid in one event. As the placement of dinoflagellate species that acquired green algal or haptophyte plastids is unclear from small and large subunit (LSU) rRNA trees, we tested the usefulness of the heat shock protein (Hsp) 90 gene for dinoflagellate phylogeny by sequencing it from four species with aberrant plastids (G. chlorophorum, Karlodinium micrum, Karenia brevis, and Karenia mikimotoi) plus Alexandrium tamarense, and constructing phylogenetic trees for Hsp90 and rRNAs, separately and together. Analyses of the Hsp90 and concatenated data suggest an ancestral origin of the peridinin-containing plastid, and two independent replacements of the peridinin plastid soon after the early radiation of the dinoflagellates. Thus, the Hsp90 gene seems to be a promising phylogenetic marker for dinoflagellate phylogeny.

  15. The Role of 16S rRNA Gene Sequencing in Confirmation of Suspected Neonatal Sepsis

    PubMed Central

    El Gawhary, Somaia; El-Anany, Mervat; Ali, Doaa; El Gameel, El Qassem

    2016-01-01

    Different molecular assays for the detection of bacterial DNA in the peripheral blood represent a diagnostic tool for neonatal sepsis. We aimed to evaluate the role of 16S rRNA gene sequencing to screen for bacteremia to confirm suspected neonatal sepsis (NS) and compare with risk factors and septic screen testing. Sixty-two neonates with suspected NS were enrolled. White blood cell count, I/T ratio, C-reactive protein, blood culture and 16S rRNA sequencing were performed. Blood culture was positive in 26% of cases, and PCR was positive in 26% of cases. Evaluation of PCR for the diagnosis of NS showed sensitivity 62.5%, specificity 86.9%, PPV 62.5%, NPV 86.9% and accuracy of 79.7%. 16S rRNA PCR increased the sensitivity of detecting bacterial DNA in newborns with signs of sepsis from 26 to 35.4%, and its use can be limited to cases with the most significant risk factors and positive septic screen. PMID:26494728
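
    For readers unfamiliar with the diagnostic measures reported above, the following sketch shows how sensitivity, specificity, PPV, NPV and accuracy are derived from a 2x2 table of test result against reference standard. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2x2 contingency table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives detected among true cases
        "specificity": tn / (tn + fp),   # negatives detected among non-cases
        "ppv": tp / (tp + fp),           # probability of disease given a positive test
        "npv": tn / (tn + fn),           # probability of no disease given a negative test
        "accuracy": (tp + tn) / total,   # overall fraction of correct calls
    }


# Hypothetical counts (test vs. reference), not taken from the abstract.
metrics = diagnostic_metrics(tp=30, fp=10, fn=20, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    PPV and NPV depend on how common the condition is in the tested group, so values from one cohort do not transfer directly to a population with a different prevalence.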

  16. The Role of 16S rRNA Gene Sequencing in Confirmation of Suspected Neonatal Sepsis.

    PubMed

    El Gawhary, Somaia; El-Anany, Mervat; Hassan, Reem; Ali, Doaa; El Gameel, El Qassem

    2016-02-01

    Different molecular assays for the detection of bacterial DNA in the peripheral blood represent a diagnostic tool for neonatal sepsis. We aimed to evaluate the role of 16S rRNA gene sequencing to screen for bacteremia to confirm suspected neonatal sepsis (NS) and compare with risk factors and septic screen testing. Sixty-two neonates with suspected NS were enrolled. White blood cell count, I/T ratio, C-reactive protein, blood culture and 16S rRNA sequencing were performed. Blood culture was positive in 26% of cases, and PCR was positive in 26% of cases. Evaluation of PCR for the diagnosis of NS showed sensitivity 62.5%, specificity 86.9%, PPV 62.5%, NPV 86.9% and accuracy of 79.7%. 16S rRNA PCR increased the sensitivity of detecting bacterial DNA in newborns with signs of sepsis from 26 to 35.4%, and its use can be limited to cases with the most significant risk factors and positive septic screen.

  17. Deep sequencing reveals small RNA characterization of invasive micropapillary carcinomas of the breast.

    PubMed

    Li, Shuai; Yang, Cuicui; Zhai, Lili; Zhang, Wenwei; Yu, Jing; Gu, Feng; Lang, Ronggang; Fan, Yu; Gong, Meihua; Zhang, Xiuqing; Fu, Li

    2012-11-01

    Invasive micropapillary carcinoma (IMPC) is an uncommon histological type of breast cancer. IMPC has a special growth pattern and a more aggressive behavior than invasive ductal carcinomas of no special types (IDC-NSTs). microRNAs are a large class of non-coding RNAs involved in the regulation of various biological processes. Here, we analyzed the small RNA transcriptomes of five formalin-fixed paraffin-embedded (FFPE) pure IMPC samples and five FFPE IDC-NSTs samples by means of next-generation sequencing, generating a total of >170,000,000 clean reads. In an unsupervised cluster analysis, differentially expressed miRNAs generated a tree with clear distinction between IMPC and IDC-NSTs classes. Paired fresh-frozen and FFPE specimens showed very similar miRNA expression profiles. By means of RT-qPCR, we further investigated miRNA expression in more IMPC (n = 22) and IDC-NSTs (n = 24) FFPE samples and found let-7b, miR-30c, miR-148a, miR-181a, miR-181a*, and miR-181b were significantly differentially expressed between the two groups. We also elucidated several features of miRNA in these breast cancer tissues including 5' variability, miRNA editing, and 3' untemplated addition. Our findings will lead to further understanding of the invasive potency of IMPC and provide insight into the diversity and complexity of small RNA molecules in breast cancer tissues.

  18. MicroRNA deep-sequencing reveals master regulators of follicular and papillary thyroid tumors.

    PubMed

    Mancikova, Veronika; Castelblanco, Esmeralda; Pineiro-Yanez, Elena; Perales-Paton, Javier; de Cubas, Aguirre A; Inglada-Perez, Lucia; Matias-Guiu, Xavier; Capel, Ismael; Bella, Maria; Lerma, Enrique; Riesco-Eizaguirre, Garcilaso; Santisteban, Pilar; Maravall, Francisco; Mauricio, Didac; Al-Shahrour, Fatima; Robledo, Mercedes

    2015-06-01

    MicroRNA deregulation could be a crucial event in thyroid carcinogenesis. However, current knowledge is based on studies that have used inherently biased methods. Thus, we aimed to define in an unbiased way a list of deregulated microRNAs in well-differentiated thyroid cancer in order to identify diagnostic and prognostic markers. We performed a microRNA deep-sequencing study using the largest well-differentiated thyroid tumor collection reported to date, comprising 127 molecularly characterized tumors with follicular or papillary patterns of growth and available clinical follow-up data, and 17 normal tissue samples. Furthermore, we integrated microRNA and gene expression data for the same tumors to propose targets for the novel molecules identified. Two main microRNA expression profiles were identified: one common for follicular-pattern tumors, and a second for papillary tumors. Follicular tumors showed a notable overexpression of several members of miR-515 family, and downregulation of the novel microRNA miR-1247. Among papillary tumors, top upregulated microRNAs were miR-146b and the miR-221~222 cluster, while miR-1179 was downregulated. BRAF-positive samples displayed extreme downregulation of miR-7 and -204. The identification of the predicted targets for the novel molecules gave insights into the proliferative potential of the transformed follicular cell. Finally, by integrating clinical follow-up information with microRNA expression, we propose a prediction model for disease relapse based on expression of two miRNAs (miR-192 and let-7a) and several other clinicopathological features. This comprehensive study complements the existing knowledge about deregulated microRNAs in the development of well-differentiated thyroid cancer and identifies novel markers associated with recurrence-free survival.

  19. The binding of TIA-1 to RNA C-rich sequences is driven by its C-terminal RRM domain.

    PubMed

    Cruz-Gallardo, Isabel; Aroca, Ángeles; Gunzburg, Menachem J; Sivakumaran, Andrew; Yoon, Je-Hyun; Angulo, Jesús; Persson, Cecilia; Gorospe, Myriam; Karlsson, B Göran; Wilce, Jacqueline A; Díaz-Moreno, Irene

    2014-01-01

    T-cell intracellular antigen-1 (TIA-1) is a key DNA/RNA binding protein that regulates translation by sequestering target mRNAs in stress granules (SG) in response to stress conditions. TIA-1 possesses three RNA recognition motifs (RRM) along with a glutamine-rich domain, with the central domains (RRM2 and RRM3) acting as RNA binding platforms. While the RRM2 domain, which displays high affinity for U-rich RNA sequences, is primarily responsible for interaction with RNA, the contribution of RRM3 to RNA binding as well as the target RNA sequences that it binds preferentially are still unknown. Here we combined nuclear magnetic resonance (NMR) and surface plasmon resonance (SPR) techniques to elucidate the sequence specificity of TIA-1 RRM3. With a novel approach using saturation transfer difference NMR (STD-NMR) to quantify protein-nucleic acid interactions, we demonstrate that isolated RRM3 binds to both C- and U-rich stretches with micromolar affinity. In combination with RRM2 and in the context of full-length TIA-1, RRM3 significantly enhanced the binding to RNA, particularly to cytosine-rich RNA oligos, as assessed by biotinylated RNA pull-down analysis. Our findings provide new insight into the role of RRM3 in regulating TIA-1 binding to C-rich stretches, which are abundant at the 5' TOPs (5' terminal oligopyrimidine tracts) of mRNAs whose translation is repressed under stress situations.

  20. CLUSTOM: A Novel Method for Clustering 16S rRNA Next Generation Sequences by Overlap Minimization

    PubMed Central

    Kim, Byung Kwon; Yu, Dong Su; Hou, Bo Kyeng; Caetano-Anollés, Gustavo; Hong, Soon Gyu; Kim, Kyung Mo

    2013-01-01

    The recent nucleic acid sequencing revolution driven by shotgun and high-throughput technologies has led to a rapid increase in the number of sequences for microbial communities. The availability of 16S ribosomal RNA (rRNA) gene sequences from a multitude of natural environments now offers a unique opportunity to study microbial diversity and community structure. The large volume of sequencing data, however, makes it time-consuming to assign individual sequences to phylotypes by searching them against public databases. Since ribosomal sequences have diverged across prokaryotic species, they can be grouped into clusters that represent operational taxonomic units. However, available clustering programs suffer from overlap of sequence spaces in adjacent clusters. In natural environments, gene sequences are homogeneous within species but divergent between species. This evolutionary constraint results in an uneven distribution of genetic distances of genes in sequence space. To cluster 16S rRNA sequences more accurately, it is therefore essential to select core sequences that are located at the centers of the distributions represented by the genetic distance of sequences in taxonomic units. Based on this idea, we here describe a novel sequence clustering algorithm named CLUSTOM that minimizes the overlaps between adjacent clusters. The performance of this algorithm was evaluated in a comparative exercise with existing programs, using the reference sequences of the SILVA database as well as published pyrosequencing datasets. The test revealed that our algorithm achieves higher accuracy than ESPRIT-Tree and mothur, two of the best clustering algorithms. Results indicate that the concept of an uneven distribution of sequence distances can effectively and successfully cluster 16S rRNA gene sequences. The algorithm of CLUSTOM has been implemented both as a web and as a standalone command line application, which are available at http://clustom.kribb.re.kr. PMID:23650520
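
To make the idea of grouping sequences into operational taxonomic units by genetic distance concrete, here is a minimal sketch of a generic greedy, seed-based clustering. It is not the CLUSTOM algorithm: the function names, the toy reads and the distance threshold are all assumptions made for illustration, and the sequences are assumed to be pre-aligned to equal length. Seed-based schemes of this kind are the sort that can produce overlapping sequence spaces between neighbouring clusters, which is the problem the abstract says CLUSTOM is designed to reduce.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def greedy_cluster(seqs: list[str], threshold: float) -> list[list[str]]:
    """Put each sequence into the first cluster whose seed is within `threshold`.

    The first member of each cluster acts as its seed. This is a generic
    illustration (hypothetical helper), not the method described in the abstract.
    """
    clusters: list[list[str]] = []
    for s in seqs:
        for cluster in clusters:
            if p_distance(s, cluster[0]) <= threshold:
                cluster.append(s)
                break
        else:
            # No existing seed is close enough, so this sequence starts a new cluster.
            clusters.append([s])
    return clusters


# Toy reads (invented for illustration only).
reads = ["ACGTACGTAC", "ACGTACGTAT", "TTGTACCTAC", "TTGTACCTAA"]
otus = greedy_cluster(reads, threshold=0.1)
print(len(otus), "clusters")
```

Because the result depends on which sequence happens to become a seed, a different input order can change the cluster boundaries; choosing core sequences near the centre of each distance distribution, as the abstract proposes, is one way to address that.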

  1. RNA transcript sequencing reveals inorganic sulfur compound oxidation pathways in the acidophile Acidithiobacillus ferrivorans.

    PubMed

    Christel, Stephan; Fridlund, Jimmy; Buetti-Dinh, Antoine; Buck, Moritz; Watkin, Elizabeth L; Dopson, Mark

    2016-04-01

    Acidithiobacillus ferrivorans is an acidophile implicated in low-temperature biomining for the recovery of metals from sulfide minerals. Acidithiobacillus ferrivorans obtains its energy from the oxidation of inorganic sulfur compounds, and genes encoding several alternative pathways have been identified. Next-generation sequencing of At. ferrivorans RNA transcripts identified the genes coding for metabolic and electron transport proteins for energy conservation from tetrathionate as electron donor. RNA transcripts suggested that tetrathionate was hydrolyzed by the tetH1 gene product to form thiosulfate, elemental sulfur and sulfate. Despite two of the genes being truncated, RNA transcripts for the SoxXYZAB complex had higher levels than for thiosulfate quinone oxidoreductase (doxDA genes). However, a lack of heme-binding sites in soxX suggested that DoxDA was responsible for thiosulfate metabolism. Higher RNA transcript counts also suggested that elemental sulfur was metabolized by heterodisulfide reductase (hdr genes) rather than sulfur oxygenase reductase (sor). The sulfite produced as a product of heterodisulfide reductase was suggested to be oxidized by a pathway involving the sat gene product or abiotically react with elemental sulfur to form thiosulfate. Finally, several electron transport complexes were involved in energy conservation. This study has elucidated the previously unknown At. ferrivorans tetrathionate metabolic pathway that is important in biomining.

  2. Identifying and removing the cell-cycle effect from single-cell RNA-Sequencing data

    PubMed Central

    Barron, Martin; Li, Jun

    2016-01-01

    Single-cell RNA-Sequencing (scRNA-Seq) is a revolutionary technique for discovering and describing cell types in heterogeneous tissues, yet its measurement of expression often suffers from large systematic bias. A major source of this bias is the cell cycle, which introduces large within-cell-type heterogeneity that can obscure the differences in expression between cell types. The current method for removing the cell-cycle effect is unable to effectively identify this effect and has a high risk of removing other biological components of interest, compromising downstream analysis. We present ccRemover, a new method that reliably identifies the cell-cycle effect and removes it. ccRemover preserves other biological signals of interest in the data and thus can serve as an important pre-processing step for many scRNA-Seq data analyses. The effectiveness of ccRemover is demonstrated using simulation data and three real scRNA-Seq datasets, where it boosts the performance of existing clustering algorithms in distinguishing between cell types. PMID:27670849

  3. Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.

    PubMed

    Mi, Gu; Di, Yanming; Schafer, Daniel W

    2015-01-01

    This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
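
For orientation, the NB mean-variance relationship that the abstract refers to is commonly written in the quadratic form below. The notation is assumed here rather than taken from the abstract: Y_ij is the read count for gene i in sample j, mu_ij its mean, and phi_i the gene's dispersion. The second line is only one hypothetical example of a "simple model" relating dispersion to mean expression, with illustrative coefficients alpha_0 and alpha_1.

```latex
\operatorname{Var}(Y_{ij}) = \mu_{ij} + \phi_{i}\,\mu_{ij}^{2}
\qquad\text{and, for example,}\qquad
\log \phi_{i} = \alpha_{0} + \alpha_{1} \log \bar{\mu}_{i}
```

When phi_i approaches 0 the variance reduces to the Poisson case, so checking the NB assumption amounts to asking whether the extra quadratic term describes the observed overdispersion.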

  4. Single-cell RNA sequencing identifies diverse roles of epithelial cells in idiopathic pulmonary fibrosis

    PubMed Central

    Mizuno, Takako; Sridharan, Anusha; Du, Yina; Guo, Minzhe; Wikenheiser-Brokamp, Kathryn A.; Perl, Anne-Karina T.; Funari, Vincent A.; Gokey, Jason J.; Stripp, Barry R.; Whitsett, Jeffrey A.

    2016-01-01

    Idiopathic pulmonary fibrosis (IPF) is a lethal interstitial lung disease characterized by airway remodeling, inflammation, alveolar destruction, and fibrosis. We utilized single-cell RNA sequencing (scRNA-seq) to identify epithelial cell types and associated biological processes involved in the pathogenesis of IPF. Transcriptomic analysis of normal human lung epithelial cells defined gene expression patterns associated with highly differentiated alveolar type 2 (AT2) cells, indicated by enrichment of RNAs critical for surfactant homeostasis. In contrast, scRNA-seq of IPF cells identified 3 distinct subsets of epithelial cell types with characteristics of conducting airway basal and goblet cells and an additional atypical transitional cell that contributes to pathological processes in IPF. Individual IPF cells frequently coexpressed alveolar type 1 (AT1), AT2, and conducting airway selective markers, demonstrating “indeterminate” states of differentiation not seen in normal lung development. Pathway analysis predicted aberrant activation of canonical signaling via TGF-β, HIPPO/YAP, P53, WNT, and AKT/PI3K. Immunofluorescence confocal microscopy identified the disruption of alveolar structure and loss of the normal proximal-peripheral differentiation of pulmonary epithelial cells. scRNA-seq analyses identified loss of normal epithelial cell identities and unique contributions of epithelial cells to the pathogenesis of IPF. The present study provides a rich data source to further explore lung health and disease. PMID:27942595

  5. RNA sequencing revealed novel actors of the acquisition of drug resistance in Candida albicans

    PubMed Central

    2012-01-01

    Background Drug-susceptible clinical isolates of Candida albicans frequently become highly tolerant to drugs during chemotherapy, with dreadful consequences to patient health. We used RNA sequencing (RNA-seq) to analyze the transcriptomes of a CDR (Candida Drug Resistance) strain and its isogenic drug-sensitive counterpart. Results RNA-seq unveiled differential expression of 228 genes including a) genes previously identified as involved in CDR, b) genes not previously associated with the CDR phenotype, and c) novel transcripts whose function as a gene is uncharacterized. In particular, we show for the first time that CDR acquisition is correlated with an overexpression of the transcription factor encoding gene CZF1. CZF1 null mutants were susceptible to many drugs, independently of known multidrug resistance mechanisms. We show that CZF1 acts as a repressor of β-glucan synthesis, thus negatively regulating cell wall integrity. Finally, our RNA-seq data allowed us to identify a new transcribed region, upstream of the TAC1 gene, which encodes the major CDR transcriptional regulator. Conclusion Our results open new perspectives of the role of Czf1 and of our understanding of the transcriptional and post-transcriptional mechanisms that lead to the acquisition of drug resistance in C. albicans, with potential for future improvements of therapeutic strategies. PMID:22897889

  6. Transcriptome analysis of the silkworm (Bombyx mori) by high-throughput RNA sequencing.

    PubMed

    Li, Yinü; Wang, Guozeng; Tian, Jian; Liu, Huifen; Yang, Huipeng; Yi, Yongzhu; Wang, Jinhui; Shi, Xiaofeng; Jiang, Feng; Yao, Bin; Zhang, Zhifang

    2012-01-01

    The domestic silkworm, Bombyx mori, is a model insect with important economic value for silk production that also acts as a bioreactor for biomaterial production. The functional complexity of the silkworm transcriptome has not yet been fully elucidated, although genomic sequencing and other tools have been widely used in its study. We explored the transcriptome of silkworm at different developmental stages using high-throughput paired-end RNA sequencing. A total of about 3.3 gigabases (Gb) of sequence was obtained, representing about a 7-fold coverage of the B. mori genome. From the reads that were mapped to the genome sequence, 23,461 transcripts were obtained, of which 5,428 were novel. Of the 14,623 predicted protein-coding genes in the silkworm genome database, 11,884 were found to be expressed in the silkworm transcriptome, giving a coverage of 81.3%. A total of 13,195 new exons were detected, of which 5,911 were found in the annotated genes in the Silkworm Genome Database (SilkDB). An analysis of alternative splicing in the transcriptome revealed that 3,247 genes had undergone alternative splicing. To help with the data analysis, a transcriptome database that integrates our transcriptome data with the silkworm genome data was constructed and is publicly available at http://124.17.27.136/gbrowse2/. To our knowledge, this is the first study to elucidate the silkworm transcriptome using high-throughput RNA sequencing technology. Our data indicate that the transcriptome of silkworm is much more complex than previously anticipated. This work provides tools and resources for the identification of new functional elements and paves the way for future functional genomics studies.
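
The percentages in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable names are invented for illustration, and the implied genome size is a rough back-calculation from the rounded "about 3.3 Gb" and "about 7-fold" figures, not a value stated in the abstract.

```python
# Counts quoted in the abstract.
predicted_genes = 14_623
expressed_genes = 11_884
total_transcripts = 23_461
novel_transcripts = 5_428

# Share of predicted protein-coding genes detected as expressed.
gene_coverage_pct = round(100 * expressed_genes / predicted_genes, 1)

# Share of assembled transcripts reported as novel.
novel_pct = round(100 * novel_transcripts / total_transcripts, 1)

# Rough genome size implied by "about 3.3 Gb" at "about 7-fold" coverage
# (approximate, because both inputs are rounded in the abstract).
implied_genome_gb = 3.3 / 7

print(gene_coverage_pct, novel_pct, round(implied_genome_gb, 2))
```

The first figure reproduces the 81.3% coverage reported in the abstract.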

  7. Identification of the Hevea brasiliensis AP2/ERF superfamily by RNA sequencing

    PubMed Central

    2013-01-01

    Background Rubber tree (Hevea brasiliensis) laticifers are the source of natural rubber. Rubber production depends on endogenous and exogenous ethylene (ethephon). AP2/ERF transcription factors, and especially Ethylene-Response Factors, play a crucial role in plant development and response to biotic and abiotic stresses. This study set out to sequence transcripts expressed in various tissues using next-generation sequencing and to identify the AP2/ERF superfamily in the rubber tree. Results The 454 sequencing technique was used to produce five tissue-type transcript libraries (leaf, bark, latex, embryogenic tissues and root). Reads from all libraries were pooled and reassembled to improve mRNA lengths and produce a global library. One hundred and seventy-three AP2/ERF contigs were identified by in silico analysis based on the amino acid sequence of the conserved AP2 domain from the global library. The 142 contigs with the full AP2 domain were classified into three main families (20 AP2 members, 115 ERF members divided into 11 groups, and 4 RAV members) and 3 soloist members. Fifty-nine AP2/ERF transcripts were found in latex. Alongside the microRNA172 already described in plants, eleven additional microRNAs were predicted to inhibit Hevea AP2/ERF transcripts. Conclusions Hevea has a similar number of AP2/ERF genes to that of other dicot species. We adapted the alignment and classification methods to data from next-generation sequencing techniques to provide reliable information. We observed several specific features for the ERF family. Three HbSoloist members form a group in Hevea. The high expression of several AP2/ERF genes in latex suggests that they have a specific function in Hevea. The analysis of AP2/ERF transcripts in Hevea presented here provides the basis for studying the molecular regulation of latex production in response to abiotic stresses and latex cell differentiation. PMID:23324139

  8. Application of sequence-independent amplification (SIA) for the identification of RNA viruses in bioenergy crops.

    PubMed

    Agindotan, Bright O; Ahonsi, Monday O; Domier, Leslie L; Gray, Michael E; Bradley, Carl A

    2010-10-01

    Miscanthus x giganteus, energycane, and Panicum virgatum (switchgrass) are three potential biomass crops being evaluated for commercial cellulosic ethanol production. Viral diseases are potentially significant threats to these crops. Therefore, identification of viruses infecting these bioenergy crops is important for quarantine purposes, virus resistance breeding, and production of virus-free planting materials. Here, the application of sequence-independent amplification to the identification of RNA viruses in bioenergy crops is described. The method involves virus partial purification from a small amount of infected leaf tissue (miniprep), extraction of viral RNA, amplification of randomly primed cDNAs, cloning, sequencing, and BLAST searches for sequence homology in GenBank. This method has a distinct advantage over other virus characterization techniques in that it does not require reagents specific to target viruses. Using this method, a possible new species was identified in the genus Marafivirus in switchgrass related to Maize rayado fino virus, its closest relative currently in GenBank. Sugarcane mosaic virus (SCMV), genus Potyvirus, was identified in M. x giganteus, energycane, corn (Zea mays), and switchgrass. Other viruses identified were: Maize dwarf mosaic virus (MDMV), genus Potyvirus, in johnsongrass (Sorghum halepense); Soil borne wheat mosaic virus (SBWMV), genus Furovirus, in wheat (Triticum aestivum); and Bean pod mottle virus (BPMV), genus Comovirus, in soybean (Glycine max). The method was as sensitive as conventional RT-PCR. This is the first report of a Marafivirus infecting switchgrass, and of SCMV infecting both energycane and M. x giganteus.

  9. Novel haloarchaeal 16S rRNA gene sequences from Alpine Permo-Triassic rock salt.

    PubMed

    Radax, C; Gruber, C; Stan-Lotter, H

    2001-08-01

    Prokaryotic diversity in Alpine salt sediments was investigated by polymerase chain reaction (PCR) amplification of 16S rRNA genes, sequencing of cloned products, and comparisons with culturable strains. DNA was extracted from the residue following filtration of dissolved Permo-Triassic rock salt. Fifty-four haloarchaeal sequences were obtained, which could be grouped into at least five distinct clusters. Similarity values of three clusters to known 16S rRNA genes were less than 90%-95%, suggesting the presence of uncultured novel taxa; two clusters were 98% and 99% similar to isolates from Permo-Triassic or Miocene salt from England and Poland, and to Halobacterium salinarum, respectively. Some rock salt samples, including drilling cores, yielded no amplifiable DNA and no cells or only a few culturable cells. This result suggested a variable distribution of haloarchaea within different strata, probably consistent with the known geologic heterogeneity of Alpine salt deposits. We recently reported identical culturable Halococcus salifodinae strains in Permo-Triassic salt sediments from England, Germany, and Austria; together with the data presented here, those results suggest one plausible scenario to be an ancient continuous hypersaline ocean (Zechstein sea) populated by haloarchaea, whose descendants are found today in the salt sediments. The novelty of the sequences also suggested avoidance of haloarchaeal contaminants during our isolation of strains, preparation of DNA, and PCR reactions.

  10. Identification and characterization of rhizospheric microbial diversity by 16S ribosomal RNA gene sequencing

    PubMed Central

    Naveed, Muhammad; Mubeen, Samavia; Khan, SamiUllah; Ahmed, Iftikhar; Khalid, Nauman; Suleria, Hafiz Ansar Rasul; Bano, Asghari; Mumtaz, Abdul Samad

    2014-01-01

    In the present study, samples of rhizosphere and root nodules were collected from different areas of Pakistan to isolate plant growth promoting rhizobacteria. Identification of bacterial isolates was made by 16S rRNA gene sequence analysis and taxonomical confirmation on EzTaxon Server. The identified bacterial strains belonged to 5 genera, i.e. Ensifer, Bacillus, Pseudomonas, Leclercia and Rhizobium. Phylogenetic analysis inferred from 16S rRNA gene sequences showed the evolutionary relationship of bacterial strains with the respective genera. Based on phylogenetic analysis, some candidate novel species were also identified. The bacterial strains were also characterized by morphological, physiological and biochemical tests and for the glucose dehydrogenase (gdh) gene, which is involved in phosphate solubilization using the cofactor pyrroloquinoline quinone (PQQ). Seven rhizospheric and 3 root-nodulating strains were positive for the gdh gene. Furthermore, this study confirms a novel association between microbes and their hosts, such as field-grown crops and leguminous and non-leguminous plants. It was concluded that a diverse bacterial population exists in the rhizosphere and root nodules that might be useful in evaluating the mechanisms behind plant-microbe interactions, and that strains QAU-63 and QAU-68, which have sequence similarities of 97 and 95%, might be declared novel after further taxonomic characterization. PMID:25477935

  11. How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity

    NASA Technical Reports Server (NTRS)

    Fox, G. E.; Wisotzkey, J. D.; Jurtshuk, P. Jr

    1992-01-01

    16S rRNA (genes coding for rRNA) sequence comparisons were conducted with the following three psychrophilic strains: Bacillus globisporus W25T (T = type strain) and Bacillus psychrophilus W16AT, and W5. These strains exhibited more than 99.5% sequence identity and within experimental uncertainty could be regarded as identical. Their close taxonomic relationship was further documented by phenotypic similarities. In contrast, previously published DNA-DNA hybridization results have convincingly established that these strains do not belong to the same species if current standards are used. These results emphasize the important point that effective identity of 16S rRNA sequences is not necessarily a sufficient criterion to guarantee species identity. Thus, although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, very recently diverged species may not be recognizable.
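
As a small illustration of what a figure such as "more than 99.5% sequence identity" means in practice, the sketch below computes percent identity between two pre-aligned sequences. The function name, the gap convention and the toy sequences are assumptions for illustration; they are not the procedure or data used by the authors.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical positions between two aligned sequences.

    Columns in which either sequence has a gap are skipped (one common
    convention; other tools count gaps differently).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy example: two 1,500-nt sequences that differ at a single position.
seq_a = "ACGU" * 375
seq_b = seq_a[:-1] + "A"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.2f}% identity")
```

At roughly 1,500 nt, a 99.5% cut-off corresponds to only a handful of differing positions, which helps explain why very recently diverged species may be indistinguishable by 16S rRNA sequence alone.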

  12. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    PubMed

    Scolnick, Jonathan A; Dimon, Michelle; Wang, I-Ching; Huelga, Stephanie C; Amorese, Douglas A

    2015-01-01

    Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells.

  13. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples

    PubMed Central

    Scolnick, Jonathan A.; Dimon, Michelle; Wang, I-Ching; Huelga, Stephanie C.; Amorese, Douglas A.

    2015-01-01

    Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells. PMID:26132974

  14. Detecting morphological convergence in true fungi, using 18S rRNA gene sequence data.

    PubMed

    Berbee, M L; Taylor, J W

    1992-01-01

    For the true fungi, phylogenetic relationships inferred from 18S ribosomal DNA sequence data agree with morphology when (1) the fungi exhibit diagnostic morphological characters, (2) the sequence-based phylogenetic groups are statistically supported, and (3) the ribosomal DNA evolves at roughly the same rate in the lineages being compared. 18S ribosomal RNA gene sequence data and biochemical data provide a congruent definition of true fungi. Sequence data support the traditional fungal subdivisions Ascomycotina and Basidiomycotina. In conflict with morphology, some zygomycetes group with chytrid water molds rather than with other terrestrial fungi, possibly owing to unequal rates of nucleotide substitutions among zygomycete lineages. Within the ascomycetes, the taxonomic consequence of simple or reduced morphology has been a proliferation of mutually incongruent classification systems. Sequence data provide plausible resolution of relationships for some cases where reduced morphology has created confusion. For example, phylogenetic trees from rDNA indicate that those morphologically simple ascomycetes classified as yeasts are polyphyletic and that forcible spore discharge was lost convergently from three lineages of ascomycetes producing flask-like fruiting bodies.

  15. Single-cell RNA-sequencing: The future of genome biology is now.

    PubMed

    Picelli, Simone

    2016-07-21

    Genome-wide single-cell analysis represents the ultimate frontier of genomics research. In particular, single-cell RNA-sequencing (scRNA-seq) studies have been boosted in the last few years by an explosion of new technologies enabling the study of the transcriptomic landscape of thousands of single cells in complex multicellular organisms. More sensitive and automated methods are being continuously developed and promise to deliver better data quality and higher throughput with less hands-on time. The outstanding amount of knowledge that is going to be gained from present and future studies will have a profound impact on many aspects of our society, from the introduction of truly tailored cancer treatments to a better understanding of antibiotic resistance and host-pathogen interactions; from the discovery of the mechanisms regulating stem cell differentiation to the characterization of the early events of human embryogenesis.

  16. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity

    PubMed Central

    Yassour, Moran; Grabherr, Manfred; Blood, Philip D.; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D.; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N.; Henschel, Robert; LeDuc, Richard D.; Friedman, Nir; Regev, Aviv

    2013-01-01

    De novo assembly of RNA-Seq data allows us to study transcriptomes without the need for a genome sequence, such as in non-model organisms of ecological and evolutionary importance, cancer samples, or the microbiome. In this protocol, we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms. We also present Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples, and approaches to identify protein coding genes. In an included tutorial we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sf.net. PMID:23845962

  17. The transcriptome of Verticillium dahliae-infected Nicotiana benthamiana determined by deep RNA sequencing.

    PubMed

    Faino, Luigi; de Jonge, Ronnie; Thomma, Bart P H J

    2012-09-01

    Verticillium wilt disease is caused by fungi of the Verticillium genus that occur on a wide range of host plants, including Solanaceous species such as tomato and tobacco. Currently, the well characterized Ve1 gene of tomato is the only Verticillium wilt resistance gene cloned. During experiments to identify the Verticillium molecule that activates Ve1 resistance in tomato, RNA sequencing (RNA-Seq) of Verticillium-infected Nicotiana benthamiana was performed. In total, over 99% of the obtained reads were derived from N. benthamiana. Here, we report the assembly and annotation of the N. benthamiana transcriptome. In total, 142,738 transcripts > 100 bp were obtained, amounting to a total transcriptome size of 38.7 Mbp, which is comparable to the Arabidopsis transcriptome. About 30,282 transcripts could be annotated based on homology to Arabidopsis genes. By assembly of the N. benthamiana transcriptome, we provide a catalogue of transcripts of a Solanaceous model plant under pathogen stress.

  18. Use of 16S rRNA, 23S rRNA, and gyrB Gene Sequence Analysis To Determine Phylogenetic Relationships of Bacillus cereus Group Microorganisms

    PubMed Central

    Bavykin, Sergei G.; Lysov, Yuri P.; Zakhariev, Vladimir; Kelly, John J.; Jackman, Joany; Stahl, David A.; Cherni, Alexey

    2004-01-01

    In order to determine if variations in rRNA sequence could be used for discrimination of the members of the Bacillus cereus group, we analyzed 183 16S rRNA and 74 23S rRNA sequences for all species in the B. cereus group. We also analyzed 30 gyrB sequences for B. cereus group strains with published 16S rRNA sequences. Our findings indicated that the three most common species of the B. cereus group, B. cereus, Bacillus thuringiensis, and Bacillus mycoides, were each heterogeneous in all three gene sequences, while all analyzed strains of Bacillus anthracis were found to be homogeneous. Based on analysis of 16S and 23S rRNA sequence variations, the microorganisms within the B. cereus group were divided into seven subgroups, Anthracis, Cereus A and B, Thuringiensis A and B, and Mycoides A and B, and these seven subgroups were further organized into two distinct clusters. This classification of the B. cereus group conflicts with current taxonomic groupings, which are based on phenotypic traits. The presence of B. cereus strains in six of the seven subgroups and the presence of B. thuringiensis strains in three of the subgroups do not support the proposed unification of B. cereus and B. thuringiensis into one species. Analysis of the available phenotypic data for the strains included in this study revealed phenotypic traits that may be characteristic of several of the subgroups. Finally, our results demonstrated that rRNA and gyrB sequences may be used for discriminating B. anthracis from other microorganisms in the B. cereus group. PMID:15297521

  19. Use of 16S rRNA, 23S rRNA, and gyrB gene sequence analysis to determine phylogenetic relationships of Bacillus cereus group.

    SciTech Connect

    Bavykin, S. G.; Lysov, Y. P.; Zakhariev, V.; Kelly, J. J.; Jackman, J.; Stahl, D. A.; Cherni, A.; Engelhardt Inst. of Molecular Biology; Loyola Univ.; Johns Hopkins Univ.; Univ. of Washington

    2004-08-01

    In order to determine if variations in rRNA sequence could be used for discrimination of the members of the Bacillus cereus group, we analyzed 183 16S rRNA and 74 23S rRNA sequences for all species in the B. cereus group. We also analyzed 30 gyrB sequences for B. cereus group strains with published 16S rRNA sequences. Our findings indicated that the three most common species of the B. cereus group, B. cereus, Bacillus thuringiensis, and Bacillus mycoides, were each heterogeneous in all three gene sequences, while all analyzed strains of Bacillus anthracis were found to be homogeneous. Based on analysis of 16S and 23S rRNA sequence variations, the microorganisms within the B. cereus group were divided into seven subgroups, Anthracis, Cereus A and B, Thuringiensis A and B, and Mycoides A and B, and these seven subgroups were further organized into two distinct clusters. This classification of the B. cereus group conflicts with current taxonomic groupings, which are based on phenotypic traits. The presence of B. cereus strains in six of the seven subgroups and the presence of B. thuringiensis strains in three of the subgroups do not support the proposed unification of B. cereus and B. thuringiensis into one species. Analysis of the available phenotypic data for the strains included in this study revealed phenotypic traits that may be characteristic of several of the subgroups. Finally, our results demonstrated that rRNA and gyrB sequences may be used for discriminating B. anthracis from other microorganisms in the B. cereus group.

  20. Small RNA Library Preparation Method for Next-Generation Sequencing Using Chemical Modifications to Prevent Adapter Dimer Formation.

    PubMed

    Shore, Sabrina; Henderson, Jordana M; Lebedev, Alexandre; Salcedo, Michelle P; Zon, Gerald; McCaffrey, Anton P; Paul, Natasha; Hogrefe, Richard I

    2016-01-01

    For most sample types, the automation of RNA and DNA sample preparation workflows enables high throughput next-generation sequencing (NGS) library preparation. Greater adoption of small RNA (sRNA) sequencing has been hindered by high sample input requirements and inherent ligation side products formed during library preparation. These side products, known as adapter dimer, are very similar in size to the tagged library. Most sRNA library preparation strategies thus employ a gel purification step to isolate tagged library from adapter dimer contaminants. At very low sample inputs, adapter dimer side products dominate the reaction and limit the sensitivity of this technique. Here we address the need for improved specificity of sRNA library preparation workflows with a novel library preparation approach that uses modified adapters to suppress adapter dimer formation. This workflow allows for lower sample inputs and elimination of the gel purification step, which in turn allows for an automatable sRNA library preparation protocol.

  1. Analysis of the intestinal microbiota using SOLiD 16S rRNA gene sequencing and SOLiD shotgun sequencing

    PubMed Central

    2013-01-01

    Background: Metagenomics seeks to understand microbial communities and assemblages by DNA sequencing. Technological advances in next generation sequencing technologies are fuelling a rapid growth in the number and scope of projects aiming to analyze complex microbial environments such as marine, soil or the gut. Recent improvements such as longer read lengths and paired sequencing allow better resolution in profiling microbial communities. While both 454 sequencing and Illumina sequencing have been used in numerous metagenomic studies, SOLiD sequencing is not commonly used in this area, as it is believed to be more suitable in the context of reference-guided projects. Results: To investigate the performance of SOLiD sequencing in a metagenomic context, we compared taxonomic profiles of SOLiD mate-pair sequencing reads with Sanger paired reads and 454 single reads. All sequences were obtained from the bacterial 16S rRNA gene, which was amplified from microbial DNA extracted from a human fecal sample. Additionally, from the same fecal sample, complete genomic microbial DNA was extracted and shotgun sequenced using SOLiD sequencing to study the composition of the intestinal microbiota and the existing microbial metabolism. We found that the microbiota compositions inferred from 16S rRNA gene sequences obtained using Sanger, 454 and SOLiD sequencing are comparable to the result based on shotgun sequencing. Moreover, with SOLiD sequences we obtained more resolution down to the species level. In addition, the shotgun data allowed us to determine a functional profile using the databases SEED and KEGG. Conclusions: This study shows that SOLiD mate-pair sequencing is a viable and cost-efficient option for analyzing a complex microbiome. To the best of our knowledge, this is the first time that SOLiD sequencing has been used in a human sample. PMID:24564472

  2. Abiotrophia defectiva infection of a total hip arthroplasty diagnosed by 16S rRNA gene sequencing.

    PubMed

    Rozemeijer, Wouter; Jiya, Timothy U; Rijnsburger, Martine; Heddema, Edou; Savelkoul, Paul; Ang, Wim

    2011-05-01

    We describe a case of a total hip arthroplasty infection caused by Abiotrophia defectiva, identified by 16S rRNA gene sequencing. Removal of the prosthesis followed by antibiotic treatment resulted in a good clinical outcome. 16S rRNA gene sequencing can be a useful tool in diagnosing infection with this fastidious microorganism that can easily be misidentified using phenotypic identification methods.

  3. Ultra-Deep Sequencing Reveals the microRNA Expression Pattern of the Human Stomach

    PubMed Central

    Ribeiro-dos-Santos, Ândrea; Khayat, André S.; Silva, Artur; Alencar, Dayse O.; Lobato, Jessé; Luz, Larissa; Pinheiro, Daniel G.; Varuzza, Leonardo; Assumpção, Monica; Assumpção, Paulo; Santos, Sidney; Zanette, Dalila L.; Silva, Wilson A.; Burbano, Rommel; Darnet, Sylvain

    2010-01-01

    Background: While microRNAs (miRNAs) play important roles in tissue differentiation and in maintaining basal physiology, little is known about the miRNA expression levels in stomach tissue. Alterations in the miRNA profile can lead to cell deregulation, which can induce neoplasia. Methodology/Principal Findings: A small RNA library of stomach tissue was sequenced using high-throughput SOLiD sequencing technology. We obtained 261,274 quality reads with perfect matches to the human miRnome, and 42% of known miRNAs were identified. Digital Gene Expression profiling (DGE) was performed based on read abundance and showed that fifteen miRNAs were highly expressed in gastric tissue. Subsequently, the expression of these miRNAs was validated in 10 healthy individuals by RT-PCR, which showed a significant correlation of 83.97% (P<0.05). Six miRNAs showed a low variable pattern of expression (miR-29b, miR-29c, miR-19b, miR-31, miR-148a, miR-451) and could be considered part of the expression pattern of the healthy gastric tissue. Conclusions/Significance: This study aimed to validate normal miRNA profiles of human gastric tissue to establish a reference profile for healthy individuals. Determining the regulatory processes acting in the stomach will be important in the fight against gastric cancer, which is the second-leading cause of cancer mortality worldwide. PMID:20949028
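
    The abstract above ranks miRNAs by read abundance. A minimal Python sketch of that idea follows. The abstract does not state which normalization the authors used, so reads per million (RPM) is an assumption here, and the function names and toy read list are hypothetical.

```python
from collections import Counter

def reads_per_million(read_assignments):
    """Convert per-read miRNA assignments into reads-per-million (RPM) values."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {mirna: 1_000_000 * n / total for mirna, n in counts.items()}

def top_expressed(rpm, n):
    """Return the n miRNA names with the highest RPM, ties broken by name."""
    ranked = sorted(rpm.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]

# Toy input (made up): each element names the miRNA that one read matched.
toy_reads = ["miR-29b"] * 5 + ["miR-31"] * 3 + ["miR-451"] * 2
toy_rpm = reads_per_million(toy_reads)
toy_top = top_expressed(toy_rpm, 2)
```

    With real data, `read_assignments` would come from aligning reads to a miRNA reference; that alignment step is outside this sketch.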

  4. Unprecedented High-Resolution View of Bacterial Operon Architecture Revealed by RNA Sequencing

    PubMed Central

    Creecy, James P.; Maddox, Scott M.; Grissom, Joe E.; Conkle, Trevor L.; Shadid, Tyler M.; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada

    2014-01-01

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3′ transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5′ ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. PMID:25006232

  5. Sequence analysis of zein cDNAs obtained by an efficient mRNA cloning method.

    PubMed Central

    Heidecker, G; Messing, J

    1983-01-01

    A cDNA library was generated from mRNA isolated from the developing endosperm of W22 maize inbred. cDNA clones for zein, the maize storage protein family, were isolated and analyzed by DNA sequencing. The DNA sequences of four clones containing cDNA copies of mRNAs belonging to one zein subfamily were determined. The data support the following conclusions: a) genes encoding the larger of the two zein species contain eleven instead of nine repeat units within the coding sequence of the gene; b) transcription can be terminated at either of the two polyadenylation signals and c) transcription starts 31 basepairs downstream from the first T in the TATA box. To facilitate this analysis a new method for the construction of cDNA libraries was developed. The mRNA was annealed to linearized and oligo-dT tailed pUC9 plasmid DNA, which then primed synthesis of the first strand of the cDNA. Oligo-dG tails were added to the cDNA-plasmid molecules, which were then centrifuged through an alkaline sucrose gradient. The gradient step removed small molecules and separated the two cDNAs which were formerly attached to the same double stranded plasmid molecule. An excess of oligo-dC tailed denatured pUC9 DNA was added and the DNA was renatured under conditions that favor the circularization of monomers by the oligo-dC and oligo-dG tails. The oligo-dC tail served as primer for the synthesis of the second strand of the cDNA. The library was screened by colony hybridization using 32P-labelled cDNA and DNA from genomic zein clones as probes. We obtained 20,000 clones hybridizing total cDNA starting with 1 microgram of plasmid DNA and 1 microgram of mRNA. PMID:6688299

  6. DECIPHER, a search-based approach to chimera identification for 16S rRNA sequences.

    PubMed

    Wright, Erik S; Yilmaz, L Safak; Noguera, Daniel R

    2012-02-01

    DECIPHER is a new method for finding 16S rRNA chimeric sequences by the use of a search-based approach. The method is based upon detecting short fragments that are uncommon in the phylogenetic group where a query sequence is classified but frequently found in another phylogenetic group. The algorithm was calibrated for full sequences (fs_DECIPHER) and short sequences (ss_DECIPHER) and benchmarked against WigeoN (Pintail), ChimeraSlayer, and Uchime using artificially generated chimeras. Overall, ss_DECIPHER and Uchime provided the highest chimera detection for sequences 100 to 600 nucleotides long (79% and 81%, respectively), but Uchime's performance deteriorated for longer sequences, while ss_DECIPHER maintained a high detection rate (89%). Both methods had low false-positive rates (1.3% and 1.6%). The more conservative fs_DECIPHER, benchmarked only for sequences longer than 600 nucleotides, had an overall detection rate lower than that of ss_DECIPHER (75%) but higher than those of the other programs. In addition, fs_DECIPHER had the lowest false-positive rate among all the benchmarked programs (<0.20%). DECIPHER was outperformed only by ChimeraSlayer and Uchime when chimeras were formed from closely related parents (less than 10% divergence). Given the differences in the programs, it was possible to detect over 89% of all chimeras with just the combination of ss_DECIPHER and Uchime. Using fs_DECIPHER, we detected between 1% and 2% additional chimeras in the RDP, SILVA, and Greengenes databases from which chimeras had already been removed with Pintail or Bellerophon. DECIPHER was implemented in the R programming language and is directly accessible through a webpage or by downloading the program as an R package (http://DECIPHER.cee.wisc.edu).
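
    The search-based idea described above (short fragments that are uncommon in the query's own phylogenetic group but frequent in another group) can be illustrated with a toy Python sketch. This is not DECIPHER's implementation, which is an R package; the k-mer fragments, the `rare` and `common` thresholds, and all function names are assumptions chosen for illustration.

```python
from collections import Counter

def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def kmer_frequency(group_seqs, k):
    """Fraction of sequences in a (non-empty) group that contain each k-mer."""
    counts = Counter()
    for seq in group_seqs:
        counts.update(set(kmers(seq, k)))  # count each k-mer once per sequence
    return {kmer: n / len(group_seqs) for kmer, n in counts.items()}

def suspicious_fragments(query, own_group, other_group, k, rare=0.1, common=0.5):
    """Query k-mers that are rare in own_group but common in other_group."""
    own = kmer_frequency(own_group, k)
    other = kmer_frequency(other_group, k)
    hits = []
    for pos, kmer in enumerate(kmers(query, k)):
        if own.get(kmer, 0.0) <= rare and other.get(kmer, 0.0) >= common:
            hits.append((pos, kmer))
    return hits

# Toy data (made up): the query starts like group A and ends like group B.
group_a = ["AAAACCCC", "AAAACCCC"]
group_b = ["GGGGTTTT", "GGGGTTTT"]
chimera_hits = suspicious_fragments("AAAATTTT", group_a, group_b, k=4)
clean_hits = suspicious_fragments("AAAACCCC", group_a, group_b, k=4)
```

    In this toy case only the trailing `TTTT` fragment is flagged for the chimeric query, and nothing is flagged for the query that matches its own group. A real tool would also need the classification step that assigns the query to a group, which is omitted here.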

  7. Identification and characterization of an intervening sequence within the 23S ribosomal RNA genes of Edwardsiella ictaluri

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparison of the 23S rRNA gene sequences of Edwardsiella tarda and Edwardsiella ictaluri confirmed a close phylogenetic relationship between these two fish pathogen species and a distant relation with the 'core' members of the Enterobacteriaceae family. Analysis of the rrl gene for 23S rRNA in E. i...

  8. A combined sequence and structure based method for discovering enriched motifs in RNA from in vivo binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Kohen, Refael; Mesika, Rona; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2017-03-06

    RNA binding proteins (RBPs) play an important role in regulating many processes in the cell. RBPs often recognize their RNA targets in a specific manner. In addition to the RNA primary sequence, the structure of the RNA has been shown to play a central role in RNA recognition by RBPs. In recent years, many experimental approaches, both in vitro and in vivo, were developed and employed to identify and characterize RBP targets and extract their binding specificities. In vivo binding techniques, such as CrossLinking and ImmunoPrecipitation (CLIP)-based methods, enable the characterization of protein binding sites on RNA targets. However, these methods do not provide information regarding the structural preferences of the protein. While methods to obtain the structure of RNA are available, inferring both the sequence and the structure preferences of RBPs remains a challenge. Here we present SMARTIV, a novel computational tool for discovering combined sequence and structure binding motifs from in vivo RNA binding data relying on the sequences of the target sites, the ranking of their binding scores and their predicted secondary structure. The combined motifs are provided in a unified representation that is informative and easy for visual perception. We tested the method on CLIP-seq data from different platforms for a variety of RBPs. Overall, we show that our results are highly consistent with known binding motifs of RBPs, offering additional information on their structural preferences.
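
    To make the notion of a combined sequence and structure motif concrete, here is a rough Python sketch, not SMARTIV itself. It assumes target sites are already sorted by binding score and come with a dot-bracket predicted structure. The encoding (lowercase for paired bases, uppercase for unpaired), the add-one smoothing, and all function names are assumptions made for illustration.

```python
from collections import Counter

def combine(seq, structure):
    """Merge sequence and dot-bracket structure: paired bases become lowercase."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return "".join(b.upper() if s == "." else b.lower()
                   for b, s in zip(seq, structure))

def count_kmers(strings, k):
    """Number of strings that contain each k-mer at least once."""
    counts = Counter()
    for s in strings:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts

def enriched_kmers(ranked_sites, k, top_n):
    """Score combined k-mers by presence in the top_n sites relative to the rest."""
    combined = [combine(seq, struct) for seq, struct in ranked_sites]
    top, rest = combined[:top_n], combined[top_n:]
    top_counts, rest_counts = count_kmers(top, k), count_kmers(rest, k)
    scores = {}
    for kmer in top_counts:
        # Add-one smoothing keeps the ratio finite when a k-mer is absent below.
        top_frac = (top_counts[kmer] + 1) / (len(top) + 2)
        rest_frac = (rest_counts[kmer] + 1) / (len(rest) + 2)
        scores[kmer] = top_frac / rest_frac
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy sites (made up), best binding score first: (sequence, predicted structure).
toy_sites = [
    ("GGUGCACC", "((....))"),
    ("AAUGCAUU", "((....))"),
    ("CCCCGGGG", "........"),
    ("AAAAUUUU", "........"),
]
toy_motifs = enriched_kmers(toy_sites, k=4, top_n=2)
```

    Here the unpaired `UGCA` loop is the only 4-mer shared by both top-ranked sites, so it receives the highest score. A simple top-versus-rest split is used only to keep the example short; it does not reproduce how the published tool uses the ranking.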