Sample records for protein coding transcripts

  1. Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs

    PubMed Central

    2014-01-01

    Background: The genome is pervasively transcribed but most transcripts do not code for proteins, constituting non-protein-coding RNAs. Despite increasing numbers of functional reports of individual long non-coding RNAs (lncRNAs), assessing the extent of functionality among the non-coding transcriptional output of mammalian cells remains intricate. In the protein-coding world, transcripts differentially expressed in the context of processes essential for the survival of multicellular organisms have been instrumental in the discovery of functionally relevant proteins and their deregulation is frequently associated with diseases. We therefore systematically identified lncRNAs expressed differentially in response to oncologically relevant processes and cell-cycle, p53 and STAT3 pathways, using tiling arrays. Results: We found that up to 80% of the pathway-triggered transcriptional responses are non-coding. Among these we identified very large macroRNAs with pathway-specific expression patterns and demonstrated that these are likely continuous transcripts. MacroRNAs contain elements conserved in mammals and sauropsids, which in part exhibit conserved RNA secondary structure. Comparing evolutionary rates of a macroRNA to adjacent protein-coding genes suggests a local action of the transcript. Finally, in different grades of astrocytoma, a tumor disease unrelated to the initially used cell lines, macroRNAs are differentially expressed. Conclusions: It has been shown previously that the majority of expressed non-ribosomal transcripts are non-coding. We now conclude that differential expression triggered by signaling pathways gives rise to a similar abundance of non-coding content. It is thus unlikely that the prevalence of non-coding transcripts in the cell is a trivial consequence of leaky or random transcription events. PMID:24594072

  2. Network perturbation by recurrent regulatory variants in cancer

    PubMed Central

    Cho, Ara; Lee, Insuk; Choi, Jung Kyoon

    2017-01-01

    Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928

  3. Polymerization of non-complementary RNA: systematic symmetric nucleotide exchanges mainly involving uracil produce mitochondrial RNA transcripts coding for cryptic overlapping genes.

    PubMed

    Seligmann, Hervé

    2013-03-01

    Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential. 
    Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  4. Improving the genome annotation of the acarbose producer Actinoplanes sp. SE50/110 by sequencing enriched 5'-ends of primary transcripts.

    PubMed

    Schwientek, Patrick; Neshat, Armin; Kalinowski, Jörn; Klein, Andreas; Rückert, Christian; Schneiker-Bekel, Susanne; Wendler, Sergej; Stoye, Jens; Pühler, Alfred

    2014-11-20

    Actinoplanes sp. SE50/110 is the producer of the alpha-glucosidase inhibitor acarbose, which is an economically relevant and potent drug in the treatment of type-2 diabetes mellitus. In this study, we present the detection of transcription start sites on this genome by sequencing enriched 5'-ends of primary transcripts. Altogether, 1427 putative transcription start sites were initially identified. With help of the annotated genome sequence, 661 transcription start sites were found to belong to the leader region of protein-coding genes with the surprising result that roughly 20% of these genes rank among the class of leaderless transcripts. Next, conserved promoter motifs were identified for protein-coding genes with and without leader sequences. The mapped transcription start sites were finally used to improve the annotation of the Actinoplanes sp. SE50/110 genome sequence. Concerning protein-coding genes, 41 translation start sites were corrected and 9 novel protein-coding genes could be identified. In addition to this, 122 previously undetermined non-coding RNA (ncRNA) genes of Actinoplanes sp. SE50/110 were defined. Focusing on antisense transcription start sites located within coding genes or their leader sequences, it was discovered that 96 of those ncRNA genes belong to the class of antisense RNA (asRNA) genes. The remaining 26 ncRNA genes were found outside of known protein-coding genes. Four chosen examples of prominent ncRNA genes, namely the transfer messenger RNA gene ssrA, the ribonuclease P class A RNA gene rnpB, the cobalamin riboswitch RNA gene cobRS, and the selenocysteine-specific tRNA gene selC, are presented in more detail. This study demonstrates that sequencing of enriched 5'-ends of primary transcripts and the identification of transcription start sites are valuable tools for advanced genome annotation of Actinoplanes sp. SE50/110 and most probably also for other bacteria. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. Promoter analysis reveals globally differential regulation of human long non-coding RNA and protein-coding genes

    DOE PAGES

    Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui; ...

    2014-10-02

    Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.

  6. Promoter analysis reveals globally differential regulation of human long non-coding RNA and protein-coding genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui

    Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.

  7. Prediction of plant lncRNA by ensemble machine learning classifiers.

    PubMed

    Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian

    2018-05-02

    In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. 
    Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.

  8. microRNA in Cerebral Spinal Fluid as Biomarkers of Alzheimer’s Disease Risk After Brain Injury

    DTIC Science & Technology

    2016-08-01

    ...protein processing is a key feature of AD. MiRNAs are small non-coding RNAs that regulate mRNA transcription, and may be a significant cause of protein dysregulation. Our investigative team has generated...

  9. PAR-CLIP data indicate that Nrd1-Nab3-dependent transcription termination regulates expression of hundreds of protein coding genes in yeast

    PubMed Central

    2014-01-01

    Background: Nrd1 and Nab3 are essential sequence-specific yeast RNA binding proteins that function as a heterodimer in the processing and degradation of diverse classes of RNAs. These proteins also regulate several mRNA coding genes; however, it remains unclear exactly what percentage of the mRNA component of the transcriptome these proteins control. To address this question, we used the pyCRAC software package developed in our laboratory to analyze CRAC and PAR-CLIP data for Nrd1-Nab3-RNA interactions. Results: We generated high-resolution maps of Nrd1-Nab3-RNA interactions, from which we have uncovered hundreds of new Nrd1-Nab3 mRNA targets, representing between 20 and 30% of protein-coding transcripts. Although Nrd1 and Nab3 showed a preference for binding near 5′ ends of relatively short transcripts, they bound transcripts throughout coding sequences and 3′ UTRs. Moreover, our data for Nrd1-Nab3 binding to 3′ UTRs was consistent with a role for these proteins in the termination of transcription. Our data also support a tight integration of Nrd1-Nab3 with the nutrient response pathway. Finally, we provide experimental evidence for some of our predictions, using northern blot and RT-PCR assays. Conclusions: Collectively, our data support the notion that Nrd1 and Nab3 function is tightly integrated with the nutrient response and indicate a role for these proteins in the regulation of many mRNA coding genes. Further, we provide evidence to support the hypothesis that Nrd1-Nab3 represents a failsafe termination mechanism in instances of readthrough transcription. PMID:24393166

  10. GENCODE: the reference human genome annotation for The ENCODE Project.

    PubMed

    Harrow, Jennifer; Frankish, Adam; Gonzalez, Jose M; Tapanari, Electra; Diekhans, Mark; Kokocinski, Felix; Aken, Bronwen L; Barrell, Daniel; Zadissa, Amonida; Searle, Stephen; Barnes, If; Bignell, Alexandra; Boychenko, Veronika; Hunt, Toby; Kay, Mike; Mukherjee, Gaurab; Rajan, Jeena; Despacio-Reyes, Gloria; Saunders, Gary; Steward, Charles; Harte, Rachel; Lin, Michael; Howald, Cédric; Tanzer, Andrea; Derrien, Thomas; Chrast, Jacqueline; Walters, Nathalie; Balasubramanian, Suganthi; Pei, Baikang; Tress, Michael; Rodriguez, Jose Manuel; Ezkurdia, Iakes; van Baren, Jeltje; Brent, Michael; Haussler, David; Kellis, Manolis; Valencia, Alfonso; Reymond, Alexandre; Gerstein, Mark; Guigó, Roderic; Hubbard, Tim J

    2012-09-01

    The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.

  11. Long non-coding RNA and Polycomb: an intricate partnership in cancer biology.

    PubMed

    Achour, Cyrinne; Aguilo, Francesca

    2018-06-01

    High-throughput analyses have revealed that the vast majority of the transcriptome does not code for proteins. These non-translated transcripts, when larger than 200 nucleotides, are termed long non-coding RNAs (lncRNAs), and play fundamental roles in diverse cellular processes. LncRNAs are subject to dynamic chemical modification, adding another layer of complexity to our understanding of the potential roles that lncRNAs play in health and disease. Many lncRNAs regulate transcriptional programs by influencing the epigenetic state through direct interactions with chromatin-modifying proteins. Among these proteins, Polycomb repressive complexes 1 and 2 (PRC1 and PRC2) have been shown to be recruited by lncRNAs to silence target genes. Aberrant expression, deficiency or mutation of both lncRNA and Polycomb have been associated with numerous human diseases, including cancer. In this review, we have highlighted recent findings regarding the concerted mechanism of action of Polycomb group proteins (PcG), acting together with some classically defined lncRNAs including X-inactive specific transcript (XIST), antisense non-coding RNA in the INK4 locus (ANRIL), metastasis associated lung adenocarcinoma transcript 1 (MALAT1), and HOX transcript antisense RNA (HOTAIR).

  12. Evolution at protein ends: major contribution of alternative transcription initiation and termination to the transcriptome and proteome diversity in mammals

    PubMed Central

    Shabalina, Svetlana A.; Ogurtsov, Aleksey Y.; Spiridonov, Nikolay A.; Koonin, Eugene V.

    2014-01-01

    Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns. PMID:24792168

  13. Specificity Protein (Sp) Transcription Factors and Metformin Regulate Expression of the Long Non-coding RNA HULC

    EPA Science Inventory

    There is evidence that specificity protein 1 (Sp1) transcription factor (TF) regulates expression of long non-coding RNAs (lncRNAs) in hepatocellular carcinoma (HCC) cells. RNA interference (RNAi) studies showed that among several lncRNAs expressed in HepG2, SNU-449 and SK-Hep-1...

  14. LncRNApred: Classification of Long Non-Coding RNAs and Protein-Coding Transcripts by the Ensemble Algorithm with a New Hybrid Feature.

    PubMed

    Pian, Cong; Zhang, Guangle; Chen, Zhi; Chen, Yuanyuan; Zhang, Jin; Yang, Tao; Zhang, Liangyun

    2016-01-01

    As a novel class of noncoding RNAs, long noncoding RNAs (lncRNAs) have been verified to be associated with various diseases. As large scale transcripts are generated every year, it is significant to accurately and quickly identify lncRNAs from thousands of assembled transcripts. To accurately discover new lncRNAs, we develop a classification tool of random forest (RF) named LncRNApred based on a new hybrid feature. This hybrid feature set includes three new proposed features, which are MaxORF, RMaxORF and SNR. LncRNApred is effective for classifying lncRNAs and protein coding transcripts accurately and quickly. Moreover, our RF model only requires training on data from human coding and non-coding transcripts. Other species can also be predicted by using LncRNApred. The result shows that our method is more effective compared with the Coding Potential Calculator (CPC). The web server of LncRNApred is available for free at http://mm20132014.wicp.net:57203/LncRNApred/home.jsp.

  15. A repertoire of the dominant transcripts from the salivary glands of the blood-sucking bug, Triatoma dimidiata, a vector of Chagas disease

    PubMed Central

    Kato, Hirotomo; Jochim, Ryan C.; Gomez, Eduardo A.; Sakoda, Ryo; Iwata, Hiroyuki; Valenzuela, Jesus G.; Hashiguchi, Yoshihisa

    2010-01-01

    Triatoma (T.) dimidiata is a hematophagous Hemiptera and a main vector of Chagas disease. The saliva of this and other blood-sucking insects contains potent pharmacologically active components that assist them in counteracting the host hemostatic and inflammatory systems during blood feeding. To describe the repertoire of potential bioactive salivary molecules from this insect, a number of randomly selected transcripts from the salivary gland cDNA library of T. dimidiata were sequenced and analyzed. This analysis showed that 77.5% of the isolated transcripts coded for putative secreted proteins, and 89.9% of these coded for variants of the lipocalin family proteins. The most abundant transcript was a homologue of procalin, the major allergen of T. protracta saliva, and contributed more than 50% of the transcripts coding for putative secreted proteins, suggesting that it may play an important role in the blood-feeding process. Other salivary transcripts encoding lipocalin family proteins had homology to triabin (a thrombin inhibitor), triafestin (an inhibitor of kallikrein–kinin system), pallidipin (an inhibitor of collagen-induced platelet aggregation) and others with unknown function. PMID:19900580

  16. Role of Alternative Polyadenylation during Adipogenic Differentiation: An In Silico Approach

    PubMed Central

    Spangenberg, Lucía; Correa, Alejandro; Dallagiovanna, Bruno; Naya, Hugo

    2013-01-01

    Post-transcriptional regulation of stem cell differentiation is far from being completely understood. Changes in protein levels are not fully correlated with corresponding changes in mRNAs; the observed differences might be partially explained by post-transcriptional regulation mechanisms, such as alternative polyadenylation. This would involve changes in protein binding, transcript usage, miRNAs and other non-coding RNAs. In the present work we analyzed the distribution of alternative transcripts during adipogenic differentiation and the potential role of miRNAs in post-transcriptional regulation. Our in silico analysis suggests a modest, consistent, bias in 3′UTR lengths during differentiation enabling a fine-tuned transcript regulation via small non-coding RNAs. Including these effects in the analyses partially accounts for the observed discrepancies in relative abundance of protein and mRNA. PMID:24143171

  17. Long Non-Coding RNAs Differentially Expressed between Normal versus Primary Breast Tumor Tissues Disclose Converse Changes to Breast Cancer-Related Protein-Coding Genes

    PubMed Central

    Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.

    2014-01-01

    Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue, half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. 
    Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628

  18. Long non-coding RNAs differentially expressed between normal versus primary breast tumor tissues disclose converse changes to breast cancer-related protein-coding genes.

    PubMed

    Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O

    2014-01-01

    Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue, half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. 
    Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes.

  19. Activity-Dependent Human Brain Coding/Noncoding Gene Regulatory Networks

    PubMed Central

    Lipovich, Leonard; Dachet, Fabien; Cai, Juan; Bagla, Shruti; Balan, Karina; Jia, Hui; Loeb, Jeffrey A.

    2012-01-01

    While most gene transcription yields RNA transcripts that code for proteins, a sizable proportion of the genome generates RNA transcripts that do not code for proteins, but may have important regulatory functions. The brain-derived neurotrophic factor (BDNF) gene, a key regulator of neuronal activity, is overlapped by a primate-specific, antisense long noncoding RNA (lncRNA) called BDNFOS. We demonstrate reciprocal patterns of BDNF and BDNFOS transcription in highly active regions of human neocortex removed as a treatment for intractable seizures. A genome-wide analysis of activity-dependent coding and noncoding human transcription using a custom lncRNA microarray identified 1288 differentially expressed lncRNAs, of which 26 had expression profiles that matched activity-dependent coding genes and an additional 8 were adjacent to or overlapping with differentially expressed protein-coding genes. The functions of most of these protein-coding partner genes, such as ARC, include long-term potentiation, synaptic activity, and memory. The nuclear lncRNAs NEAT1, MALAT1, and RPPH1, composing an RNAse P-dependent lncRNA-maturation pathway, were also upregulated. As a means to replicate human neuronal activity, repeated depolarization of SY5Y cells resulted in sustained CREB activation and produced an inverse pattern of BDNF-BDNFOS co-expression that was not achieved with a single depolarization. RNAi-mediated knockdown of BDNFOS in human SY5Y cells increased BDNF expression, suggesting that BDNFOS directly downregulates BDNF. Temporal expression patterns of other lncRNA-messenger RNA pairs validated the effect of chronic neuronal activity on the transcriptome and implied various lncRNA regulatory mechanisms. lncRNAs, some of which are unique to primates, thus appear to have potentially important regulatory roles in activity-dependent human brain plasticity. PMID:22960213

  20. Identification of a functionally distinct truncated BDNF mRNA splice variant and protein in Trachemys scripta elegans.

    PubMed

    Ambigapathy, Ganesh; Zheng, Zhaoqing; Li, Wei; Keifer, Joyce

    2013-01-01

    Brain-derived neurotrophic factor (BDNF) has a diverse functional role and complex pattern of gene expression. Alternative splicing of mRNA transcripts leads to further diversity of mRNAs and protein isoforms. Here, we describe the regulation of BDNF mRNA transcripts in an in vitro model of eyeblink classical conditioning and a unique transcript that forms a functionally distinct truncated BDNF protein isoform. Nine different mRNA transcripts from the BDNF gene of the pond turtle Trachemys scripta elegans (tBDNF) are selectively regulated during classical conditioning: exon I mRNA transcripts show no change, exon II transcripts are downregulated, while exon III transcripts are upregulated. One unique transcript that codes from exon II, tBDNF2a, contains a 40 base pair deletion in the protein coding exon that generates a truncated tBDNF protein. The truncated transcript and protein are expressed in the naïve untrained state and are fully repressed during conditioning when full-length mature tBDNF is expressed, thereby having an alternate pattern of expression in conditioning. Truncated BDNF is not restricted to turtles as a truncated mRNA splice variant has been described for the human BDNF gene. Further studies are required to determine the ubiquity of truncated BDNF alternative splice variants across species and the mechanisms of regulation and function of this newly recognized BDNF protein.

  1. Identification of a Functionally Distinct Truncated BDNF mRNA Splice Variant and Protein in Trachemys scripta elegans

    PubMed Central

    Ambigapathy, Ganesh; Zheng, Zhaoqing; Li, Wei; Keifer, Joyce

    2013-01-01

    Brain-derived neurotrophic factor (BDNF) has a diverse functional role and complex pattern of gene expression. Alternative splicing of mRNA transcripts leads to further diversity of mRNAs and protein isoforms. Here, we describe the regulation of BDNF mRNA transcripts in an in vitro model of eyeblink classical conditioning and a unique transcript that forms a functionally distinct truncated BDNF protein isoform. Nine different mRNA transcripts from the BDNF gene of the pond turtle Trachemys scripta elegans (tBDNF) are selectively regulated during classical conditioning: exon I mRNA transcripts show no change, exon II transcripts are downregulated, while exon III transcripts are upregulated. One unique transcript that codes from exon II, tBDNF2a, contains a 40 base pair deletion in the protein coding exon that generates a truncated tBDNF protein. The truncated transcript and protein are expressed in the naïve untrained state and are fully repressed during conditioning when full-length mature tBDNF is expressed, thereby having an alternate pattern of expression in conditioning. Truncated BDNF is not restricted to turtles as a truncated mRNA splice variant has been described for the human BDNF gene. Further studies are required to determine the ubiquity of truncated BDNF alternative splice variants across species and the mechanisms of regulation and function of this newly recognized BDNF protein. PMID:23825634

  2. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data.

    PubMed

    Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2017-01-04

    The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents an expansion of about 20 times compared with the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed the 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulation of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Transcriptomes of six mutants in the Sen1 pathway reveal combinatorial control of transcription termination across the Saccharomyces cerevisiae genome

    PubMed Central

    Carver, Melissa N.; Müller, Ulrika; Bekiranov, Stefan; Auble, David T.

    2017-01-01

    Transcriptome studies on eukaryotic cells have revealed an unexpected abundance and diversity of noncoding RNAs synthesized by RNA polymerase II (Pol II), some of which influence the expression of protein-coding genes. Yet, much less is known about biogenesis of Pol II non-coding RNA than mRNAs. In the budding yeast Saccharomyces cerevisiae, initiation of non-coding transcripts by Pol II appears to be similar to that of mRNAs, but a distinct pathway is utilized for termination of most non-coding RNAs: the Sen1-dependent or “NNS” pathway. Here, we examine the effect on the S. cerevisiae transcriptome of conditional mutations in the genes encoding six different essential proteins that influence Sen1-dependent termination: Sen1, Nrd1, Nab3, Ssu72, Rpb11, and Hrp1. We observe surprisingly diverse effects on transcript abundance for the different proteins that cannot be explained simply by differing severity of the mutations. Rather, we infer from our results that termination of Pol II transcription of non-coding RNA genes is subject to complex combinatorial control that likely involves proteins beyond those studied here. Furthermore, we identify new targets and functions of Sen1-dependent termination, including a role in repression of meiotic genes in vegetative cells. In combination with other recent whole-genome studies on termination of non-coding RNAs, our results provide promising directions for further investigation. PMID:28665995

  4. Localization of TFIIB binding regions using serial analysis of chromatin occupancy

    PubMed Central

    Yochum, Gregory S; Rajaraman, Veena; Cleland, Ryan; McWeeney, Shannon

    2007-01-01

    Background: RNA Polymerase II (RNAP II) is recruited to core promoters by the pre-initiation complex (PIC) of general transcription factors. Within the PIC, transcription factor for RNA polymerase IIB (TFIIB) determines the start site of transcription. TFIIB binding has not been localized, genome-wide, in metazoans. Serial analysis of chromatin occupancy (SACO) is an unbiased methodology used to empirically identify transcription factor binding regions. In this report, we use TFIIB and SACO to localize TFIIB binding regions across the rat genome. Results: A sample of the TFIIB SACO library was sequenced and 12,968 TFIIB genomic signature tags (GSTs) were assigned to the rat genome. GSTs are 20–22 base pair fragments that are derived from TFIIB-bound chromatin. TFIIB localized to both non-protein coding and protein-coding loci. For 21% of the 1783 protein-coding genes in this sample of the SACO library, TFIIB binding mapped near the characterized 5' promoter that is upstream of the transcription start site (TSS). However, internal TFIIB binding positions were identified in 57% of the 1783 protein-coding genes. Internal positions are defined as those within an inclusive region greater than 2.5 kb downstream from the 5' TSS and 2.5 kb upstream from the transcription stop. We demonstrate that both TFIIB and TFIID (an additional component of PICs) bound to internal regions using chromatin immunoprecipitation (ChIP). The 5' caps of transcripts associated with internal TFIIB binding positions were identified using a cap-trapping assay. The 5' TSSs for internal transcripts were confirmed by primer extension. Additionally, an analysis of the functional annotation of mouse 3 (FANTOM3) databases indicates that internally initiated transcripts identified by TFIIB SACO in rat are conserved in mouse. Conclusion: Our finding that TFIIB binding is not restricted to the 5' upstream region indicates that the propensity for PIC to contribute to transcript diversity is far greater than previously appreciated. PMID:17997859
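
    The "internal position" definition in the abstract above can be read as a simple coordinate test. The following is a minimal sketch of one such reading, not the authors' pipeline: the function name `is_internal`, the use of plain integer coordinates, and the choice to treat both 2.5 kb boundaries as inclusive are assumptions made for illustration.

```python
FLANK = 2500  # 2.5 kb, the distance quoted in the abstract


def is_internal(position, tss, stop, flank=FLANK):
    """Return True if `position` is at least `flank` bp away from both gene ends.

    `tss` and `stop` are the transcription start and stop coordinates.
    Only distances to the two ends are used, so the test is the same
    for genes on either strand (hypothetical helper, inclusive bounds assumed).
    """
    lo, hi = sorted((tss, stop))
    if not lo <= position <= hi:
        return False  # outside the gene entirely
    return (position - lo) >= flank and (hi - position) >= flank


# Toy coordinates, not data from the study.
print(is_internal(5000, tss=1000, stop=20000))   # well inside the gene
print(is_internal(3000, tss=1000, stop=20000))   # only 2 kb from the TSS
print(is_internal(18000, tss=20000, stop=1000))  # minus strand, 2 kb from the TSS
```

    Under this reading, a gene shorter than 5 kb cannot contain any internal position, because no coordinate is 2.5 kb away from both ends.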

  5. Tuning of Recombinant Protein Expression in Escherichia coli by Manipulating Transcription, Translation Initiation Rates, and Incorporation of Noncanonical Amino Acids.

    PubMed

    Schlesinger, Orr; Chemla, Yonatan; Heltberg, Mathias; Ozer, Eden; Marshall, Ryan; Noireaux, Vincent; Jensen, Mogens Høgh; Alfonta, Lital

    2017-06-16

    Protein synthesis in cells has been thoroughly investigated and characterized over the past 60 years. However, some fundamental issues remain unresolved, including the reasons for genetic code redundancy and codon bias. In this study, we changed the kinetics of the Escherichia coli transcription and translation processes by mutating the promoter and ribosome binding domains and by using genetic code expansion. The results expose a counterintuitive phenomenon, whereby an increase in the initiation rates of transcription and translation leads to a decrease in protein expression. This effect can be rescued by introducing slow-translating codons into the beginning of the gene, by shortening gene length or by reducing initiation rates. On the basis of the results, we developed a biophysical model, which suggests that the density of co-transcriptional-translation plays a role in bacterial protein synthesis. These findings indicate how cells use codon bias to tune translation speed and protein synthesis.

  6. In silico methods for co-transcriptional RNA secondary structure prediction and for investigating alternative RNA structure expression.

    PubMed

    Meyer, Irmtraud M

    2017-05-01

    RNA transcripts are the primary products of active genes in any living organism, including many viruses. Their cellular destiny not only depends on primary sequence signals, but can also be determined by RNA structure. Recent experimental evidence shows that many transcripts can be assigned more than a single functional RNA structure throughout their cellular life and that structure formation happens co-transcriptionally, i.e. as the transcript is synthesised in the cell. Moreover, functional RNA structures are not limited to non-coding transcripts, but can also feature in coding transcripts. The picture that now emerges is that RNA structures constitute an additional layer of information that can be encoded in any RNA transcript (and on top of other layers of information such as protein-context) in order to exert a wide range of functional roles. Moreover, different encoded RNA structures can be expressed at different stages of a transcript's life in order to alter the transcript's behaviour depending on its actual cellular context. Similar to the concept of alternative splicing for protein-coding genes, where a single transcript can yield different proteins depending on cellular context, it is thus appropriate to propose the notion of alternative RNA structure expression for any given transcript. This review introduces several computational strategies that my group developed to detect different aspects of RNA structure expression in vivo. Two aspects are of particular interest to us: (1) RNA secondary structure features that emerge during co-transcriptional folding and (2) functional RNA structure features that are expressed at different times of a transcript's life and potentially mutually exclusive. Copyright © 2017. Published by Elsevier Inc.

  7. Forty-four novel protein-coding loci discovered using a proteomics informed by transcriptomics (PIT) approach in rat male germ cells.

    PubMed

    Chocu, Sophie; Evrard, Bertrand; Lavigne, Régis; Rolland, Antoine D; Aubry, Florence; Jégou, Bernard; Chalmel, Frédéric; Pineau, Charles

    2014-11-01

    Spermatogenesis is a complex process, dependent upon the successive activation and/or repression of thousands of gene products, and ends with the production of haploid male gametes. RNA sequencing of male germ cells in the rat identified thousands of novel testicular unannotated transcripts (TUTs). Although such RNAs are usually annotated as long noncoding RNAs (lncRNAs), it is possible that some of these TUTs code for protein. To test this possibility, we used a "proteomics informed by transcriptomics" (PIT) strategy combining RNA sequencing data with shotgun proteomics analyses of spermatocytes and spermatids in the rat. Among 3559 TUTs and 506 lncRNAs found in meiotic and postmeiotic germ cells, 44 encoded at least one peptide. We showed that these novel high-confidence protein-coding loci exhibit several genomic features intermediate between those of lncRNAs and mRNAs. We experimentally validated the testicular expression pattern of two of these novel protein-coding gene candidates, both highly conserved in mammals: one for a vesicle-associated membrane protein we named VAMP-9, and the other for an enolase domain-containing protein. This study confirms the potential of PIT approaches for the discovery of protein-coding transcripts initially thought to be untranslated or unknown transcripts. Our results contribute to the understanding of spermatogenesis by characterizing two novel proteins, implicated by their strong expression in germ cells. The mass spectrometry proteomics data have been deposited with the ProteomeXchange Consortium under the data set identifier PXD000872. © 2014 by the Society for the Study of Reproduction, Inc.

  8. Non-coding, mRNA-like RNAs database Y2K.

    PubMed

    Erdmann, V A; Szymanski, M; Hochberg, A; Groot, N; Barciszewski, J

    2000-01-01

    In the last few years much data has accumulated on various non-translatable RNA transcripts that are synthesised in different cells. They are lacking in protein coding capacity and it seems that they work mainly or exclusively at the RNA level. All known non-coding RNA transcripts are collected in the database: http://www.man.poznan.pl/5SData/ncRNA/index.html

  9. Non-coding, mRNA-like RNAs database Y2K

    PubMed Central

    Erdmann, Volker A.; Szymanski, Maciej; Hochberg, Abraham; Groot, Nathan de; Barciszewski, Jan

    2000-01-01

    In the last few years much data has accumulated on various non-translatable RNA transcripts that are synthesised in different cells. They are lacking in protein coding capacity and it seems that they work mainly or exclusively at the RNA level. All known non-coding RNA transcripts are collected in the database: http://www.man.poznan.pl/5SData/ncRNA/index.html PMID:10592224

  10. Temporal regulation of expression of immediate early and second phase transcripts by endothelin-1 in cardiomyocytes

    PubMed Central

    Cullingford, Timothy E; Markou, Thomais; Fuller, Stephen J; Giraldo, Alejandro; Pikkarainen, Sampsa; Zoumpoulidou, Georgia; Alsafi, Ali; Ekere, Collins; Kemp, Timothy J; Dennis, Jayne L; Game, Laurence; Sugden, Peter H; Clerk, Angela

    2008-01-01

    Background Endothelin-1 stimulates Gq protein-coupled receptors to promote proliferation in dividing cells or hypertrophy in terminally differentiated cardiomyocytes. In cardiomyocytes, endothelin-1 rapidly (within minutes) stimulates protein kinase signaling, including extracellular-signal regulated kinases 1/2 (ERK1/2; though not ERK5), with phenotypic/physiological changes developing from approximately 12 h. Hypertrophy is associated with changes in mRNA/protein expression, presumably consequent to protein kinase signaling, but the connections between early, transient signaling events and developed hypertrophy are unknown. Results Using microarrays, we defined the early transcriptional responses of neonatal rat cardiomyocytes to endothelin-1 over 4 h, differentiating between immediate early gene (IEG) and second phase RNAs with cycloheximide. IEGs exhibited differential temporal and transient regulation, with expression of second phase RNAs within 1 h. Of transcripts upregulated at 30 minutes encoding established proteins, 28 were inhibited >50% by U0126 (which inhibits ERK1/2/5 signaling), with 9 inhibited 25-50%. Expression of only four transcripts was not inhibited. At 1 h, most RNAs (approximately 67%) were equally changed in total and polysomal RNA with approximately 17% of transcripts increased to a greater extent in polysomes. Thus, changes in expression of most protein-coding RNAs should be reflected in protein synthesis. However, approximately 16% of transcripts were essentially excluded from the polysomes, including some protein-coding mRNAs, presumably inefficiently translated. Conclusion The phasic, temporal regulation of early transcriptional responses induced by endothelin-1 in cardiomyocytes indicates that, even in terminally differentiated cells, signals are propagated beyond the primary signaling pathways through transcriptional networks leading to phenotypic changes (that is, hypertrophy). 
    Furthermore, ERK1/2 signaling plays a major role in this response. PMID:18275597

  11. Transcription of the cottontail rabbit papillomavirus early region and identification of two E6 polypeptides in COS-7 cells.

    PubMed Central

    Barbosa, M S; Wettstein, F O

    1987-01-01

    Cottontail rabbit papillomavirus (CRPV) early proteins are present at very low levels in virus-induced tumors and cannot be detected by immunological methods. Furthermore, cells in culture are not readily transformed by the virus. To overcome these difficulties in identifying and characterizing the putative transforming protein(s) coded by the E6 open reading frame, the early cottontail rabbit papillomavirus region was expressed under the control of the late simian virus 40 promoter. Mapping of the transcripts in transiently transfected COS-7 cells indicated that transcription was initiated in the late region of simian virus 40. Two E6-coded polypeptides were identified, representing translation products initiated at the first and second AUG codons. PMID:3039182

  12. RNAi mediates post-transcriptional repression of gene expression in fission yeast Schizosaccharomyces pombe

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smialowska, Agata, E-mail: smialowskaa@gmail.com; School of Life Sciences, Södertörn Högskola, Huddinge 141-89; Djupedal, Ingela

    Highlights: • Protein coding genes accumulate anti-sense sRNAs in fission yeast S. pombe. • RNAi represses protein-coding genes in S. pombe. • RNAi-mediated gene repression is post-transcriptional. - Abstract: RNA interference (RNAi) is a gene silencing mechanism conserved from fungi to mammals. Small interfering RNAs are products and mediators of the RNAi pathway and act as specificity factors in recruiting effector complexes. The Schizosaccharomyces pombe genome encodes one of each of the core RNAi proteins, Dicer, Argonaute and RNA-dependent RNA polymerase (dcr1, ago1, rdp1). Even though the function of RNAi in heterochromatin assembly in S. pombe is established, its role in controlling gene expression is elusive. Here, we report the identification of small RNAs mapped anti-sense to protein coding genes in fission yeast. We demonstrate that these genes are up-regulated at the protein level in RNAi mutants, while their mRNA levels are not significantly changed. We show that the repression by RNAi is not a result of heterochromatin formation. Thus, we conclude that RNAi is involved in post-transcriptional gene silencing in S. pombe.

  13. Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.

    PubMed

    Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H

    2017-12-20

    Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.

  14. Long Non-coding RNAs and Their Biological Roles in Plants

    PubMed Central

    Liu, Xue; Hao, Lili; Li, Dayong; Zhu, Lihuang; Hu, Songnian

    2015-01-01

    With the development of genomics and bioinformatics, especially the extensive applications of high-throughput sequencing technology, more transcriptional units with little or no protein-coding potential have been discovered. Such RNA molecules are called non-protein-coding RNAs (npcRNAs or ncRNAs). Among them, long npcRNAs or ncRNAs (lnpcRNAs or lncRNAs) represent diverse classes of transcripts longer than 200 nucleotides. In recent years, the lncRNAs have been considered as important regulators in many essential biological processes. In plants, although a large number of lncRNA transcripts have been predicted and identified in few species, our current knowledge of their biological functions is still limited. Here, we have summarized recent studies on their identification, characteristics, classification, bioinformatics, resources, and current exploration of their biological functions in plants. PMID:25936895
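
    The length criterion quoted in the abstract above (transcripts longer than 200 nucleotides) is easy to express as a filter. This is only a sketch of that single criterion: the function name and the toy sequences are hypothetical, and the sketch makes no attempt to assess protein-coding potential, which the abstract also treats as part of the definition.

```python
MIN_LNCRNA_LENGTH = 200  # nucleotides, the threshold quoted in the abstract


def longer_than_threshold(transcripts, min_length=MIN_LNCRNA_LENGTH):
    """Return the {name: sequence} entries strictly longer than `min_length`.

    Length alone does not make a transcript a lncRNA; this covers only
    the size cut-off (hypothetical helper for illustration).
    """
    return {name: seq for name, seq in transcripts.items() if len(seq) > min_length}


# Toy sequences of 100, 200 and 320 nt, not real transcripts.
toy = {
    "tx_short": "ACGU" * 25,
    "tx_edge": "ACGU" * 50,
    "tx_long": "ACGU" * 80,
}
print(sorted(longer_than_threshold(toy)))
```

    A transcript of exactly 200 nt is excluded here because the abstract says "longer than 200 nucleotides".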

  15. De novo assembly and characterization of the Trichuris trichiura adult worm transcriptome using Ion Torrent sequencing.

    PubMed

    Santos, Leonardo N; Silva, Eduardo S; Santos, André S; De Sá, Pablo H; Ramos, Rommel T; Silva, Artur; Cooper, Philip J; Barreto, Maurício L; Loureiro, Sebastião; Pinheiro, Carina S; Alcantara-Neves, Neuza M; Pacheco, Luis G C

    2016-07-01

    Infection with helminthic parasites, including the soil-transmitted helminth Trichuris trichiura (human whipworm), has been shown to modulate host immune responses and, consequently, to have an impact on the development and manifestation of chronic human inflammatory diseases. De novo derivation of helminth proteomes from sequencing of transcriptomes will provide valuable data to aid identification of parasite proteins that could be evaluated as potential immunotherapeutic molecules in the near future. Herein, we characterized the transcriptome of the adult stage of the human whipworm T. trichiura, using next-generation sequencing technology and a de novo assembly strategy. Nearly 17.6 million high-quality clean reads were assembled into 6414 contiguous sequences, with an N50 of 1606 bp. In total, 5673 protein-encoding sequences were confidently identified in the T. trichiura adult worm transcriptome; of these, 1013 sequences represent potential newly discovered proteins for the species, most of which have orthologs already annotated in the related species T. suis. A number of transcripts representing probable novel non-coding transcripts for the species T. trichiura were also identified. Among the most abundant transcripts, we found sequences that code for proteins involved in lipid transport, such as vitellogenins, and several chitin-binding proteins. Through a cross-species expression analysis of gene orthologs shared by T. trichiura and the closely related parasites T. suis and T. muris it was possible to find twenty-six protein-encoding genes that are consistently highly expressed in the adult stages of the three helminth species. Additionally, twenty transcripts could be identified that code for proteins previously detected by mass spectrometry analysis of protein fractions of the whipworm somatic extract that present immunomodulatory activities. Five of these transcripts were amongst the most highly expressed protein-encoding sequences in the T. trichiura adult worm. Besides, orthologs of proteins demonstrated to have potent immunomodulatory properties in related parasitic helminths were also predicted from the T. trichiura de novo assembled transcriptome. Copyright © 2016. Published by Elsevier B.V.
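
    The abstract above reports an N50 for the assembled contigs. For readers unfamiliar with the statistic, the sketch below uses the common definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length. The function name and the toy lengths are illustrative assumptions; the study's actual contig lengths are not given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative sum first reaches half of the
    total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (total 30, half 15): 10 + 8 = 18 reaches 15 at length 8.
print(n50([10, 8, 6, 4, 2]))
```

    Because the statistic is weighted by length, a few long contigs can raise the N50 even when most contigs are short, so it is usually read alongside the contig count and total assembled length.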

  16. Quantitative Profiling of Peptides from RNAs classified as non-coding

    PubMed Central

    Prabakaran, Sudhakaran; Hemberg, Martin; Chauhan, Ruchi; Winter, Dominic; Tweedie-Cullen, Ry Y.; Dittrich, Christian; Hong, Elizabeth; Gunawardena, Jeremy; Steen, Hanno; Kreiman, Gabriel; Steen, Judith A.

    2014-01-01

    Only a small fraction of the mammalian genome codes for messenger RNAs destined to be translated into proteins, and it is generally assumed that a large portion of transcribed sequences - including introns and several classes of non-coding RNAs (ncRNAs) do not give rise to peptide products. A systematic examination of translation and physiological regulation of ncRNAs has not been conducted. Here, we use computational methods to identify the products of non-canonical translation in mouse neurons by analyzing unannotated transcripts in combination with proteomic data. This study supports the existence of non-canonical translation products from both intragenic and extragenic genomic regions, including peptides derived from anti-sense transcripts and introns. Moreover, the studied novel translation products exhibit temporal regulation similar to that of proteins known to be involved in neuronal activity processes. These observations highlight a potentially large and complex set of biologically regulated translational events from transcripts formerly thought to lack coding potential. PMID:25403355

  17. Prevalence of transcription promoters within archaeal operons and coding sequences.

    PubMed

    Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S

    2009-01-01

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of approximately 64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein-DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3' ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes, events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.

  18. A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N1-methyladenosine modification.

    PubMed

    Cenik, Can; Chua, Hon Nian; Singh, Guramrit; Akef, Abdalla; Snyder, Michael P; Palazzo, Alexander F; Moore, Melissa J; Roth, Frederick P

    2017-03-01

    Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions ("5IM" transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5' proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC. © 2017 Cenik et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  19. [Regulation of heat shock gene expression in response to stress].

    PubMed

    Garbuz, D G

    2017-01-01

    Heat shock (HS) genes, or stress genes, code for a number of proteins that collectively form the most ancient and universal stress defense system. The system determines the cell capability of adaptation to various adverse factors and performs a variety of auxiliary functions in normal physiological conditions. Common stress factors, such as higher temperatures, hypoxia, heavy metals, and others, suppress transcription and translation for the majority of genes, while HS genes are upregulated. Transcription of HS genes is controlled by transcription factors of the HS factor (HSF) family. Certain HSFs are activated on exposure to higher temperatures or other adverse factors to ensure stress-induced HS gene expression, while other HSFs are specifically activated at particular developmental stages. The regulation of the main mammalian stress-inducible factor HSF1 and Drosophila melanogaster HSF includes many components, such as a variety of early warning signals indicative of abnormal cell activity (e.g., increases in intracellular ceramide, cytosolic calcium ions, or partly denatured proteins); protein kinases, which phosphorylate HSFs at various Ser residues; acetyltransferases; and regulatory proteins, such as SUMO and HSBP1. Transcription factors other than HSFs are also involved in activating HS gene transcription; the set includes D. melanogaster GAF, mammalian Sp1 and NF-Y, and other factors. Transcription of several stress genes coding for molecular chaperones of the glucose-regulated protein (GRP) family is predominantly regulated by another stress-detecting system, which is known as the unfolded protein response (UPR) system and is activated in response to massive protein misfolding in the endoplasmic reticulum and mitochondrial matrix. A translational fine tuning of HS protein expression occurs via changing the phosphorylation status of several proteins involved in translation initiation. 
    In addition, specific signal sequences in the 5'-UTRs of some HS protein mRNAs ensure their preferential translation in stress.

  20. A lncRNA Perspective into (Re)Building the Heart.

    PubMed

    Frank, Stefan; Aguirre, Aitor; Hescheler, Juergen; Kurian, Leo

    2016-01-01

    Our conception of the human genome, long focused on the 2% that codes for proteins, has profoundly changed since its first draft assembly in 2001. Since then, an unanticipatedly expansive functionality and convolution has been attributed to the majority of the genome that is transcribed in a cell-type/context-specific manner into transcripts with no apparent protein coding ability. While the majority of these transcripts, currently annotated as long non-coding RNAs (lncRNAs), are functionally uncharacterized, their prominent role in embryonic development and tissue homeostasis, especially in the context of the heart, is emerging. In this review, we summarize and discuss the latest advances in understanding the relevance of lncRNAs in (re)building the heart.

  1. Long non-coding RNAs and their biological roles in plants.

    PubMed

    Liu, Xue; Hao, Lili; Li, Dayong; Zhu, Lihuang; Hu, Songnian

    2015-06-01

    With the development of genomics and bioinformatics, especially the extensive application of high-throughput sequencing technology, more transcriptional units with little or no protein-coding potential have been discovered. Such RNA molecules are called non-protein-coding RNAs (npcRNAs or ncRNAs). Among them, long npcRNAs or ncRNAs (lnpcRNAs or lncRNAs) represent diverse classes of transcripts longer than 200 nucleotides. In recent years, the lncRNAs have been considered important regulators in many essential biological processes. In plants, although a large number of lncRNA transcripts have been predicted and identified in a few species, our current knowledge of their biological functions is still limited. Here, we have summarized recent studies on their identification, characteristics, classification, bioinformatics, resources, and current exploration of their biological functions in plants. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

  2. Transcription and DNA Damage: Holding Hands or Crossing Swords?

    PubMed

    D'Alessandro, Giuseppina; d'Adda di Fagagna, Fabrizio

    2017-10-27

    Transcription has classically been considered a potential threat to genome integrity. Collision between transcription and DNA replication machinery, and retention of DNA:RNA hybrids, may result in genome instability. On the other hand, it has been proposed that active genes repair faster and preferentially via homologous recombination. Moreover, while canonical transcription is inhibited in the proximity of DNA double-strand breaks, a growing body of evidence supports active non-canonical transcription at DNA damage sites. Small non-coding RNAs accumulate at DNA double-strand break sites in mammals and other organisms, and are involved in DNA damage signaling and repair. Furthermore, RNA binding proteins are recruited to DNA damage sites and participate in the DNA damage response. Here, we discuss the impact of transcription on genome stability, the role of RNA binding proteins at DNA damage sites, and the function of small non-coding RNAs generated upon damage in the signaling and repair of DNA lesions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Identification of a novel herpes simplex virus type 1 transcript and protein (AL3) expressed during latency.

    PubMed

    Jaber, Tareq; Henderson, Gail; Li, Sumin; Perng, Guey-Chuen; Carpenter, Dale; Wechsler, Steven L; Jones, Clinton

    2009-10-01

    The herpes simplex virus type 1 (HSV-1) latency-associated transcript (LAT) is abundantly expressed in latently infected sensory neurons. In small animal models of infection, expression of the first 1.5 kb of LAT coding sequences is necessary and sufficient for wild-type reactivation from latency. The ability of LAT to inhibit apoptosis is important for reactivation from latency. Within the first 1.5 kb of LAT coding sequences and LAT promoter sequences, additional transcripts have been identified. For example, the anti-sense to LAT transcript (AL) is expressed in the opposite direction to LAT from the 5' end of LAT and LAT promoter sequences. In addition, the upstream of LAT (UOL) transcript is expressed in the LAT direction from sequences in the LAT promoter. Further examination of the first 1.5 kb of LAT coding sequences revealed two small ORFs that are anti-sense with respect to LAT (AL2 and AL3). A transcript spanning AL3 was detected in productively infected cells, mouse neuroblastoma cells stably expressing LAT and trigeminal ganglia (TG) of latently infected mice. Peptide-specific IgG directed against AL3 specifically recognized a protein migrating near 15 kDa in cells stably transfected with LAT, mouse neuroblastoma cells transfected with a plasmid containing the AL3 ORF and TG of latently infected mice. The inability to detect the AL3 protein during productive infection may have been because the 5' terminus of the AL3 transcript was downstream of the first in-frame methionine of the AL3 ORF during productive infection.

  4. Identification of Circular RNAs from the Parental Genes Involved in Multiple Aspects of Cellular Metabolism in Barley

    PubMed Central

    Darbani, Behrooz; Noeparvar, Shahin; Borg, Søren

    2016-01-01

    RNA circularization made by head-to-tail back-splicing events is involved in the regulation of gene expression from transcriptional to post-translational levels. By exploiting RNA-Seq data and down-stream analysis, we shed light on the importance of circular RNAs in plants. The results introduce circular RNAs as novel interactors in the regulation of gene expression in plants and imply the comprehensiveness of this regulatory pathway by identifying circular RNAs for a diverse set of genes. These genes are involved in several aspects of cellular metabolism, such as hormonal signaling, intracellular protein sorting, carbohydrate metabolism and cell-wall biogenesis, respiration, amino acid biosynthesis, transcription and translation, and protein ubiquitination. Additionally, these parental loci of circular RNAs, from both nuclear and mitochondrial genomes, encode different transcript classes including protein coding transcripts, microRNA, rRNA, and long non-coding/microprotein coding RNAs. The results shed light on the mitochondrial exonic circular RNAs and imply the importance of circular RNAs for regulation of mitochondrial genes. Importantly, we introduce circular RNAs in barley and elucidate their cellular-level alterations across tissues and in response to the micronutrients iron and zinc. In further support of circular RNAs' functional roles in plants, we report several cases where fluctuations of circRNAs do not correlate with the levels of their parental-loci encoded linear transcripts. PMID:27375638

  5. Biallelic insertion of a transcriptional terminator via the CRISPR/Cas9 system efficiently silences expression of protein-coding and non-coding RNA genes.

    PubMed

    Liu, Yangyang; Han, Xiao; Yuan, Junting; Geng, Tuoyu; Chen, Shihao; Hu, Xuming; Cui, Isabelle H; Cui, Hengmi

    2017-04-07

    The type II bacterial CRISPR/Cas9 system is a simple, convenient, and powerful tool for targeted gene editing. Here, we describe a CRISPR/Cas9-based approach for inserting a poly(A) transcriptional terminator into both alleles of a targeted gene to silence protein-coding and non-protein-coding genes, which often play key roles in gene regulation but are difficult to silence via insertion or deletion of short DNA fragments. The integration of 225 bp of bovine growth hormone poly(A) signals into either the first intron or the first exon or behind the promoter of target genes caused efficient termination of expression of PPP1R12C, NSUN2 (protein-coding genes), and MALAT1 (non-protein-coding gene). Both NeoR and PuroR were used as markers in the selection of clonal cell lines with biallelic integration of a poly(A) signal. Genotyping analysis indicated that the cell lines displayed the desired biallelic silencing after a brief selection period. These combined results indicate that this CRISPR/Cas9-based approach offers an easy, convenient, and efficient novel technique for gene silencing in cell lines, especially for those in which gene integration is difficult because of a low efficiency of homology-directed repair. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  6. Regulatory consequences of neuronal ELAV-like protein binding to coding and non-coding RNAs in human brain

    PubMed Central

    Scheckel, Claudia; Drapeau, Elodie; Frias, Maria A; Park, Christopher Y; Fak, John; Zucker-Scharff, Ilana; Kou, Yan; Haroutunian, Vahram; Ma'ayan, Avi

    2016-01-01

    Neuronal ELAV-like (nELAVL) RNA binding proteins have been linked to numerous neurological disorders. We performed crosslinking-immunoprecipitation and RNAseq on human brain, and identified nELAVL binding sites on 8681 transcripts. Using knockout mice and RNAi in human neuroblastoma cells, we showed that nELAVL intronic and 3' UTR binding regulates human RNA splicing and abundance. We validated hundreds of nELAVL targets among which were important neuronal and disease-associated transcripts, including Alzheimer's disease (AD) transcripts. We therefore investigated RNA regulation in AD brain, and observed differential splicing of 150 transcripts, which in some cases correlated with differential nELAVL binding. Unexpectedly, the most significant change of nELAVL binding was evident on non-coding Y RNAs. nELAVL/Y RNA complexes were specifically remodeled in AD and after acute UV stress in neuroblastoma cells. We propose that the increased nELAVL/Y RNA association during stress may lead to nELAVL sequestration, redistribution of nELAVL target binding, and altered neuronal RNA splicing. DOI: http://dx.doi.org/10.7554/eLife.10421.001 PMID:26894958

  7. The Mediator complex: a central integrator of transcription

    PubMed Central

    Allen, Benjamin L.; Taatjes, Dylan J.

    2016-01-01

    The RNA polymerase II (pol II) enzyme transcribes all protein-coding and most non-coding RNA genes and is globally regulated by Mediator, a large, conformationally flexible protein complex with variable subunit composition (for example, a four-subunit CDK8 module can reversibly associate). These biochemical characteristics are fundamentally important for Mediator's ability to control various processes important for transcription, including organization of chromatin architecture and regulation of pol II pre-initiation, initiation, re-initiation, pausing, and elongation. Although Mediator exists in all eukaryotes, a variety of Mediator functions appear to be specific to metazoans, indicative of more diverse regulatory requirements. PMID:25693131

  8. cncRNAs: Bi-functional RNAs with protein coding and non-coding functions

    PubMed Central

    Kumari, Pooja; Sampath, Karuna

    2015-01-01

    For many decades, the major function of mRNA was thought to be to provide protein-coding information embedded in the genome. The advent of high-throughput sequencing has led to the discovery of pervasive transcription of eukaryotic genomes and opened the world of RNA-mediated gene regulation. Many regulatory RNAs have been found to be incapable of protein coding and are hence termed non-coding RNAs (ncRNAs). However, studies in recent years have shown that several previously annotated non-coding RNAs have the potential to encode proteins, and conversely, some coding RNAs have regulatory functions independent of the protein they encode. Such bi-functional RNAs, with both protein coding and non-coding functions, which we term ‘cncRNAs’, have emerged as new players in cellular systems. Here, we describe the functions of some cncRNAs identified from bacteria to humans. Because the functions of many RNAs across genomes remain unclear, we propose that RNAs be classified as coding, non-coding or both only after careful analysis of their functions. PMID:26498036

  9. The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101

    NASA Astrophysics Data System (ADS)

    Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.

    2014-08-01

    Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.

  10. Dampening DNA binding: a common mechanism of transcriptional repression for both ncRNAs and protein domains.

    PubMed

    Goodrich, James A; Kugel, Jennifer F

    2010-01-01

    With eukaryotic non-coding RNAs (ncRNAs) now established as critical regulators of cellular transcription, the true diversity with which they can elicit biological effects is beginning to be appreciated. Two ncRNAs, mouse B2 RNA and human Alu RNA, have been found to repress mRNA transcription in response to heat shock. They do so by binding directly to RNA polymerase II, assembling into complexes on promoter DNA, and disrupting contacts between the polymerase and the DNA. Such a mechanism of repression had not previously been observed for a eukaryotic ncRNA; however, there are examples of eukaryotic protein domains that repress transcription by blocking essential protein-DNA interactions. Comparing the mechanism of transcriptional repression utilized by these protein domains to that used by B2 and Alu RNAs raises intriguing questions regarding transcriptional control, and how B2 and Alu RNAs might themselves be regulated.

  11. New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.

    PubMed

    Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja

    2017-02-01

    Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. Selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcription factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50 % in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for the protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60 % of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, present in the protein complex binding to VNTR3. Our study highlighted two novel promoter KLF1 and 3'-region C/EBPalpha motifs in the PAH gene which decrease transcription in vitro and, thus, could be considered as PAH expression modifiers. New transcription motifs in non-coding regions will contribute to better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.

  12. Transcriptome and gene expression profile of ovarian follicle tissue of the triatomine bug Rhodnius prolixus

    PubMed Central

    Medeiros, Marcelo N.; Logullo, Raquel; Ramos, Isabela B.; Sorgine, Marcos H. F.; Paiva-Silva, Gabriela O.; Mesquita, Rafael D.; Machado, Ednildo Alcantara; Coutinho, Maria Alice; Masuda, Hatisaburo; Capurro, Margareth L.; Ribeiro, José M.C.; Cardoso Braz, Glória Regina; Oliveira, Pedro L

    2013-01-01

    Insect oocytes grow in close association with the ovarian follicular epithelium (OFE), which escorts the oocyte during oogenesis and is responsible for synthesis and secretion of the eggshell. We describe a transcriptome of OFE of the triatomine bug Rhodnius prolixus, a vector of Chagas disease, to increase our knowledge of the role of FE in egg development. Random clones were sequenced from a cDNA library of different stages of follicle development. The transcriptome showed high commitment to transcription, protein synthesis, and secretion. The most abundant cDNA was a secreted (S) small, proline-rich protein with maximal expression in the vitellogenic follicle, suggesting a role in oocyte maturation. We also found Rp45, a chorion protein already described, and a putative chitin-associated cuticle protein that was an eggshell component candidate. Six transcripts coding for proteins related to the unfolded protein response (UPR) were chosen and their expression analyzed. Surprisingly, transcripts related to UPR showed higher expression during early stages of development and downregulation during late stages, when transcripts coding for S proteins participating in chorion formation were highly expressed. Several transcripts with potential roles in oogenesis and embryo development are also discussed. We propose that intense protein synthesis at the FE results in reticulum stress (RS) and that lowering expression of a set of genes related to cell survival should lead to degeneration of follicular cells at oocyte maturation. This paradoxical suppression of UPR suggests that ovarian follicles may represent an interesting model for studying control of RS and cell survival in professional S cell types. PMID:21736942

  13. Deep Sequencing Reveals Uncharted Isoform Heterogeneity of the Protein-Coding Transcriptome in Cerebral Ischemia.

    PubMed

    Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh

    2018-06-03

    Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.

  14. BRD4 assists elongation of both coding and enhancer RNAs guided by histone acetylation

    PubMed Central

    Kanno, Tomohiko; Kanno, Yuka; LeRoy, Gary; Campos, Eric; Sun, Hong-Wei; Brooks, Stephen R; Vahedi, Golnaz; Heightman, Tom D; Garcia, Benjamin A; Reinberg, Danny; Siebenlist, Ulrich; O’Shea, John J; Ozato, Keiko

    2016-01-01

    Small-molecule BET inhibitors interfere with the epigenetic interactions between acetylated histones and the bromodomains of the BET family proteins, including BRD4, and they potently inhibit growth of malignant cells by targeting cancer-promoting genes. BRD4 interacts with the pause-release factor P-TEFb, and has been proposed to release Pol II from promoter-proximal pausing. We show that BRD4 occupied widespread genomic regions in mouse cells, and directly stimulated elongation of both protein-coding transcripts and non-coding enhancer RNAs (eRNAs), dependent on the function of bromodomains. BRD4 interacted physically with elongating Pol II complexes, and assisted Pol II progression through hyper-acetylated nucleosomes by interacting with acetylated histones via bromodomains. On active enhancers, the BET inhibitor JQ1 antagonized BRD4-associated eRNA synthesis. Thus, BRD4 is involved in multiple steps of the transcription hierarchy, primarily by assisting transcript elongation both at enhancers and on gene bodies. PMID:25383670

  15. Transcriptional landscapes of Axolotl (Ambystoma mexicanum).

    PubMed

    Caballero-Pérez, Juan; Espinal-Centeno, Annie; Falcon, Francisco; García-Ortega, Luis F; Curiel-Quesada, Everardo; Cruz-Hernández, Andrés; Bako, Laszlo; Chen, Xuemei; Martínez, Octavio; Alberto Arteaga-Vázquez, Mario; Herrera-Estrella, Luis; Cruz-Ramírez, Alfredo

    2018-01-15

    The axolotl (Ambystoma mexicanum) is the vertebrate model system with the highest regeneration capacity. Experimental tools established over the past 100 years have been fundamental to start unraveling the cellular and molecular basis of tissue and limb regeneration. In the absence of a reference genome for the axolotl, transcriptomic analysis becomes fundamental to understanding the genetic basis of regeneration. Here we present one of the most diverse transcriptomic data sets for the axolotl by profiling coding and non-coding RNAs from diverse tissues. We reconstructed a population of 115,906 putative protein coding mRNAs as full ORFs (including isoforms). We also identified 352 conserved miRNAs and 297 novel putative mature miRNAs. Systematic enrichment analysis of gene expression allowed us to identify tissue-specific protein-coding transcripts. We also found putative novel and conserved microRNAs that potentially target mRNAs reported as important disease candidates in heart and liver. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Microprocessor mediates transcriptional termination in long noncoding microRNA genes

    PubMed Central

    Dhir, Ashish; Dhir, Somdutta; Proudfoot, Nick J.; Jopling, Catherine L.

    2015-01-01

    MicroRNA (miRNA) play a major role in the post-transcriptional regulation of gene expression. Mammalian miRNA biogenesis begins with co-transcriptional cleavage of RNA polymerase II (Pol II) transcripts by the Microprocessor complex. While most miRNA are located within introns of protein coding genes, a substantial minority of miRNA originate from long non-coding (lnc) RNA where transcript processing is largely uncharacterized. We show, by detailed characterization of liver-specific lnc-pri-miR-122 and genome-wide analysis in human cell lines, that most lnc-pri-miRNA do not use the canonical cleavage and polyadenylation (CPA) pathway, but instead use Microprocessor cleavage to terminate transcription. Microprocessor inactivation leads to extensive transcriptional readthrough of lnc-pri-miRNA and transcriptional interference with downstream genes. Consequently, we define a novel RNase III-mediated, polyadenylation-independent mechanism of Pol II transcription termination in mammalian cells. PMID:25730776

  17. Identification and Classification of New Transcripts in Dorper and Small-Tailed Han Sheep Skeletal Muscle Transcriptomes.

    PubMed

    Chao, Tianle; Wang, Guizhi; Wang, Jianmin; Liu, Zhaohua; Ji, Zhibin; Hou, Lei; Zhang, Chunlan

    2016-01-01

    High-throughput mRNA sequencing enables the discovery of new transcripts and additional parts of incompletely annotated transcripts. Compared with the human and cow genomes, the reference annotation level of the sheep genome is still low. An investigation of new transcripts in sheep skeletal muscle will improve our understanding of muscle development. Therefore, applying high-throughput sequencing, two cDNA libraries from the biceps brachii of small-tailed Han sheep and Dorper sheep were constructed, and whole-transcriptome analysis was performed to determine the unknown transcript catalogue of this tissue. In this study, 40,129 transcripts were finally mapped to the sheep genome. Among them, 3,467 transcripts were determined to be unannotated in the current reference sheep genome and were defined as new transcripts. Based on protein-coding capacity prediction and comparative analysis of sequence similarity, 246 transcripts were classified as portions of unannotated genes or incompletely annotated genes. Another 1,520 transcripts were predicted with high confidence to be long non-coding RNAs. Our analysis also revealed 334 new transcripts that displayed specific expression in ruminants and uncovered a number of new transcripts without intergenus homology but with specific expression in sheep skeletal muscle. The results confirmed a complex transcript pattern of coding and non-coding RNA in sheep skeletal muscle. This study provided important information concerning the sheep genome and transcriptome annotation, which could provide a basis for further study.

  18. Splicing factor SFRS1 recognizes a functionally diverse landscape of RNA transcripts.

    PubMed

    Sanford, Jeremy R; Wang, Xin; Mort, Matthew; Vanduyn, Natalia; Cooper, David N; Mooney, Sean D; Edenberg, Howard J; Liu, Yunlong

    2009-03-01

    Metazoan genes are encrypted with at least two superimposed codes: the genetic code to specify the primary structure of proteins and the splicing code to expand their proteomic output via alternative splicing. Here, we define the specificity of a central regulator of pre-mRNA splicing, the conserved, essential splicing factor SFRS1. Cross-linking immunoprecipitation and high-throughput sequencing (CLIP-seq) identified 23,632 binding sites for SFRS1 in the transcriptome of cultured human embryonic kidney cells. SFRS1 was found to engage many different classes of functionally distinct transcripts including mRNA, miRNA, snoRNAs, ncRNAs, and conserved intergenic transcripts of unknown function. The majority of these diverse transcripts share a purine-rich consensus motif corresponding to the canonical SFRS1 binding site. The consensus site was not only enriched in exons cross-linked to SFRS1 in vivo, but was also enriched in close proximity to splice sites. mRNAs encoding RNA processing factors were significantly overrepresented, suggesting that SFRS1 may broadly influence the post-transcriptional control of gene expression in vivo. Finally, a search for the SFRS1 consensus motif within the Human Gene Mutation Database identified 181 mutations in 82 different genes that disrupt predicted SFRS1 binding sites. This comprehensive analysis substantially expands the known roles of human SR proteins in the regulation of a diverse array of RNA transcripts.

  19. The transcriptional activator ZNF143 is essential for normal development in zebrafish

    PubMed Central

    2012-01-01

    Background ZNF143 is a sequence-specific DNA-binding protein that stimulates transcription of both small RNA genes by RNA polymerase II or III, or protein-coding genes by RNA polymerase II, using separable activating domains. We describe phenotypic effects following knockdown of this protein in developing Danio rerio (zebrafish) embryos by injection of morpholino antisense oligonucleotides that target znf143 mRNA. Results The loss of function phenotype is pleiotropic and includes a broad array of abnormalities including defects in heart, blood, ear and midbrain hindbrain boundary. Defects are rescued by coinjection of synthetic mRNA encoding full-length ZNF143 protein, but not by protein lacking the amino-terminal activation domains. Accordingly, expression of several marker genes is affected following knockdown, including GATA-binding protein 1 (gata1), cardiac myosin light chain 2 (cmlc2) and paired box gene 2a (pax2a). The zebrafish pax2a gene proximal promoter contains two binding sites for ZNF143, and reporter gene transcription driven by this promoter in transfected cells is activated by this protein. Conclusions Normal development of zebrafish embryos requires ZNF143. Furthermore, the pax2a gene is probably one example of many protein-coding gene targets of ZNF143 during zebrafish development. PMID:22268977

  20. The transcriptional activator ZNF143 is essential for normal development in zebrafish.

    PubMed

    Halbig, Kari M; Lekven, Arne C; Kunkel, Gary R

    2012-01-23

    ZNF143 is a sequence-specific DNA-binding protein that stimulates transcription of both small RNA genes by RNA polymerase II or III, or protein-coding genes by RNA polymerase II, using separable activating domains. We describe phenotypic effects following knockdown of this protein in developing Danio rerio (zebrafish) embryos by injection of morpholino antisense oligonucleotides that target znf143 mRNA. The loss of function phenotype is pleiotropic and includes a broad array of abnormalities including defects in heart, blood, ear and midbrain hindbrain boundary. Defects are rescued by coinjection of synthetic mRNA encoding full-length ZNF143 protein, but not by protein lacking the amino-terminal activation domains. Accordingly, expression of several marker genes is affected following knockdown, including GATA-binding protein 1 (gata1), cardiac myosin light chain 2 (cmlc2) and paired box gene 2a (pax2a). The zebrafish pax2a gene proximal promoter contains two binding sites for ZNF143, and reporter gene transcription driven by this promoter in transfected cells is activated by this protein. Normal development of zebrafish embryos requires ZNF143. Furthermore, the pax2a gene is probably one example of many protein-coding gene targets of ZNF143 during zebrafish development.

  1. The artificial zinc finger coding gene 'Jazz' binds the utrophin promoter and activates transcription.

    PubMed

    Corbi, N; Libri, V; Fanciulli, M; Tinsley, J M; Davies, K E; Passananti, C

    2000-06-01

    Up-regulation of utrophin gene expression is recognized as a plausible therapeutic approach in the treatment of Duchenne muscular dystrophy (DMD). We have designed and engineered new zinc finger-based transcription factors capable of binding and activating transcription from the promoter of the dystrophin-related gene, utrophin. Using the recognition 'code' that proposes specific rules between zinc finger primary structure and potential DNA binding sites, we engineered a new gene named 'Jazz' that encodes for a three-zinc finger peptide. Jazz belongs to the Cys2-His2 zinc finger type and was engineered to target the nine base pair DNA sequence: 5'-GCT-GCT-GCG-3', present in the promoter region of both the human and mouse utrophin gene. The entire zinc finger alpha-helix region, containing the amino acid positions that are crucial for DNA binding, was specifically chosen on the basis of the contacts more frequently represented in the available list of the 'code'. Here we demonstrate that Jazz protein binds specifically to the double-stranded DNA target, with a dissociation constant of about 32 nM. Band shift and super-shift experiments confirmed the high affinity and specificity of Jazz protein for its DNA target. Moreover, we show that chimeric proteins, named Gal4-Jazz and Sp1-Jazz, are able to drive the transcription of a test gene from the human utrophin promoter.
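    As a rough illustration of the target-site logic in the abstract above, the following sketch scans a promoter sequence for the nine base pair site 5'-GCT-GCT-GCG-3' on both strands and estimates fractional occupancy from the reported dissociation constant of about 32 nM. The function names, the toy sequences and the simple 1:1 binding model are assumptions made for illustration only; they are not taken from the paper.

```python
# Hypothetical helper names; only the 9-bp target and the ~32 nM Kd come from the abstract.
JAZZ_TARGET = "GCTGCTGCG"  # 5'-GCT-GCT-GCG-3'
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def find_target_sites(promoter: str, target: str = JAZZ_TARGET):
    """Return (0-based start, strand) for every exact match on either strand."""
    promoter = promoter.upper()
    target = target.upper()
    target_rc = reverse_complement(target)
    hits = []
    for start in range(len(promoter) - len(target) + 1):
        window = promoter[start:start + len(target)]
        if window == target:
            hits.append((start, "+"))
        if window == target_rc:
            hits.append((start, "-"))
    return hits


def fraction_bound(free_protein_nM: float, kd_nM: float = 32.0) -> float:
    """Fractional site occupancy, assuming simple 1:1 equilibrium binding."""
    return free_protein_nM / (free_protein_nM + kd_nM)


if __name__ == "__main__":
    toy_promoter = "AAGCTGCTGCGTT"  # made-up sequence, not the utrophin promoter
    print(find_target_sites(toy_promoter))
    print(fraction_bound(32.0))
```

    Under the 1:1 binding assumption, half of the sites are occupied when the free protein concentration equals the dissociation constant.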

  2. [Long non-coding RNAs in the pathophysiology of atherosclerosis].

    PubMed

    Novak, Jan; Vašků, Julie Bienertová; Souček, Miroslav

    2018-01-01

    The human genome contains about 22 000 protein-coding genes that are transcribed to an even larger amount of messenger RNAs (mRNA). Interestingly, the results of the project ENCODE from 2012 show that, despite up to 90 % of our genome being actively transcribed, protein-coding mRNAs make up only 2-3 % of the total amount of the transcribed RNA. The remaining RNA transcripts are not translated to proteins, which is why they are referred to as "non-coding RNAs". Earlier, non-coding RNA was considered "the dark matter of genome", or "the junk", whose genes have accumulated in our DNA during the course of evolution. Today we already know that non-coding RNAs fulfil a variety of regulatory functions in our body - they intervene into epigenetic processes from chromatin remodelling to histone methylation, or into the transcription process itself, or even post-transcription processes. Long non-coding RNAs (lncRNA) are one of the classes of non-coding RNAs that have more than 200 nucleotides in length (non-coding RNAs with less than 200 nucleotides in length are called small non-coding RNAs). lncRNAs represent a widely varied and large group of molecules with diverse regulatory functions. We can identify them in all conceivable cell types or tissues, or even in the extracellular space, which includes blood, specifically plasma. Their levels change during the course of organogenesis, they are specific to different tissues and their changes also occur along with the development of different illnesses, including atherosclerosis. This review article aims to present the topic of lncRNAs in general and then focuses on some of their specific representatives in relation to the process of atherosclerosis (i.e. we describe lncRNA involvement in the biology of endothelial cells, vascular smooth muscle cells or immune cells), and we further describe the possible clinical potential of lncRNA, whether in diagnostics or therapy of atherosclerosis and its clinical manifestations. Key words: atherosclerosis - lincRNA - lncRNA - MALAT - MIAT.

  3. Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs

    PubMed Central

    Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.

    2013-01-01

    Cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite the wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle against non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175

  4. Characteristics and significance of intergenic polyadenylated RNA transcription in Arabidopsis.

    PubMed

    Moghe, Gaurav D; Lehti-Shiu, Melissa D; Seddon, Alex E; Yin, Shan; Chen, Yani; Juntawong, Piyada; Brandizzi, Federica; Bailey-Serres, Julia; Shiu, Shin-Han

    2013-01-01

    The Arabidopsis (Arabidopsis thaliana) genome is the most well-annotated plant genome. However, transcriptome sequencing in Arabidopsis continues to suggest the presence of polyadenylated (polyA) transcripts originating from presumed intergenic regions. It is not clear whether these transcripts represent novel noncoding or protein-coding genes. To understand the nature of intergenic polyA transcription, we first assessed its abundance using multiple messenger RNA sequencing data sets. We found 6,545 intergenic transcribed fragments (ITFs) occupying 3.6% of Arabidopsis intergenic space. In contrast to transcribed fragments that map to protein-coding and RNA genes, most ITFs are significantly shorter, are expressed at significantly lower levels, and tend to be more data set specific. A surprisingly large number of ITFs (32.1%) may be protein coding based on evidence of translation. However, our results indicate that these "translated" ITFs tend to be close to and are likely associated with known genes. To investigate if ITFs are under selection and are functional, we assessed ITF conservation through cross-species as well as within-species comparisons. Our analysis reveals that 237 ITFs, including 49 with translation evidence, are under strong selective constraint and relatively distant from annotated features. These ITFs are likely parts of novel genes. However, the selective pressure imposed on most ITFs is similar to that of randomly selected, untranscribed intergenic sequences. Our findings indicate that despite the prevalence of ITFs, apart from the possibility of genomic contamination, many may be background or noisy transcripts derived from "junk" DNA, whose production may be inherent to the process of transcription and which, on rare occasions, may act as catalysts for the creation of novel genes.

  5. Dissecting the expression relationships between RNA-binding proteins and their cognate targets in eukaryotic post-transcriptional regulatory networks.

    PubMed

    Nishtala, Sneha; Neelamraju, Yaseswini; Janga, Sarath Chandra

    2016-05-10

    RNA-binding proteins (RBPs) are pivotal in orchestrating several steps in the metabolism of RNA in eukaryotes thereby controlling an extensive network of RBP-RNA interactions. Here, we employed CLIP (cross-linking immunoprecipitation)-seq datasets for 60 human RBPs and RIP-ChIP (RNP immunoprecipitation-microarray) data for 69 yeast RBPs to construct a network of genome-wide RBP- target RNA interactions for each RBP. We show in humans that majority (~78%) of the RBPs are strongly associated with their target transcripts at transcript level while ~95% of the studied RBPs were also found to be strongly associated with expression levels of target transcripts when protein expression levels of RBPs were employed. At transcript level, RBP - RNA interaction data for the yeast genome, exhibited a strong association for 63% of the RBPs, confirming the association to be conserved across large phylogenetic distances. Analysis to uncover the features contributing to these associations revealed the number of target transcripts and length of the selected protein-coding transcript of an RBP at the transcript level while intensity of the CLIP signal, number of RNA-Binding domains, location of the binding site on the transcript, to be significant at the protein level. Our analysis will contribute to improved modelling and prediction of post-transcriptional networks.

  6. Dissecting the expression relationships between RNA-binding proteins and their cognate targets in eukaryotic post-transcriptional regulatory networks

    NASA Astrophysics Data System (ADS)

    Nishtala, Sneha; Neelamraju, Yaseswini; Janga, Sarath Chandra

    2016-05-01

    RNA-binding proteins (RBPs) are pivotal in orchestrating several steps in the metabolism of RNA in eukaryotes thereby controlling an extensive network of RBP-RNA interactions. Here, we employed CLIP (cross-linking immunoprecipitation)-seq datasets for 60 human RBPs and RIP-ChIP (RNP immunoprecipitation-microarray) data for 69 yeast RBPs to construct a network of genome-wide RBP- target RNA interactions for each RBP. We show in humans that majority (~78%) of the RBPs are strongly associated with their target transcripts at transcript level while ~95% of the studied RBPs were also found to be strongly associated with expression levels of target transcripts when protein expression levels of RBPs were employed. At transcript level, RBP - RNA interaction data for the yeast genome, exhibited a strong association for 63% of the RBPs, confirming the association to be conserved across large phylogenetic distances. Analysis to uncover the features contributing to these associations revealed the number of target transcripts and length of the selected protein-coding transcript of an RBP at the transcript level while intensity of the CLIP signal, number of RNA-Binding domains, location of the binding site on the transcript, to be significant at the protein level. Our analysis will contribute to improved modelling and prediction of post-transcriptional networks.

  7. Transcriptator: An Automated Computational Pipeline to Annotate Assembled Reads and Identify Non Coding RNA.

    PubMed

    Tripathi, Kumar Parijat; Evangelista, Daniela; Zuccaro, Antonio; Guarracino, Mario Rosario

    2015-01-01

    RNA-seq is a new tool to measure RNA transcript counts, using high-throughput sequencing at an extraordinary accuracy. It provides quantitative means to explore the transcriptome of an organism of interest. However, interpreting this extremely large data into biological knowledge is a problem, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for BLAST (Basic Local Alignment Search Tool), QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery) tools. It offers a report on statistical analysis of functional and Gene Ontology (GO) annotation enrichment. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/protein features and information related to protein-protein interactions. It clusters the transcripts based on functional annotations and generates a tabular report for functional and gene ontology annotations for each transcript submitted to the web server. The implementation of QuickGO web services in our pipeline enables users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic non coding RNA (ncRNA) by ab initio methods) helps to identify the non coding RNAs and their regulatory role in the transcriptome. In summary, Transcriptator is a useful software for both NGS and array data. It helps users to characterize the de-novo assembled reads, obtained from NGS experiments for non-referenced organisms, while it also performs the functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and micro-array experiments. It generates easy to read tables and interactive charts for better understanding of the data. The pipeline is modular in nature, and provides an opportunity to add new plugins in the future. 
Web application is freely available at: http://www-labgtp.na.icar.cnr.it/Transcriptator.

  8. Microbial metatranscriptomics in a permanent marine oxygen minimum zone.

    PubMed

    Stewart, Frank J; Ulloa, Osvaldo; DeLong, Edward F

    2012-01-01

    Simultaneous characterization of taxonomic composition, metabolic gene content and gene expression in marine oxygen minimum zones (OMZs) has potential to broaden perspectives on the microbial and biogeochemical dynamics in these environments. Here, we present a metatranscriptomic survey of microbial community metabolism in the Eastern Tropical South Pacific OMZ off northern Chile. Community RNA was sampled in late austral autumn from four depths (50, 85, 110, 200 m) extending across the oxycline and into the upper OMZ. Shotgun pyrosequencing of cDNA yielded 180,000 to 550,000 transcript sequences per depth. Based on functional gene representation, transcriptome samples clustered apart from corresponding metagenome samples from the same depth, highlighting the discrepancies between metabolic potential and actual transcription. BLAST-based characterizations of non-ribosomal RNA sequences revealed a dominance of genes involved with both oxidative (nitrification) and reductive (anammox, denitrification) components of the marine nitrogen cycle. Using annotations of protein-coding genes as proxies for taxonomic affiliation, we observed depth-specific changes in gene expression by key functional taxonomic groups. Notably, transcripts most closely matching the genome of the ammonia-oxidizing archaeon Nitrosopumilus maritimus dominated the transcriptome in the upper three depths, representing one in five protein-coding transcripts at 85 m. In contrast, transcripts matching the anammox bacterium Kuenenia stuttgartiensis dominated at the core of the OMZ (200 m; 1 in 12 protein-coding transcripts). The distribution of N. maritimus-like transcripts paralleled that of transcripts matching ammonia monooxygenase genes, which, despite being represented by both bacterial and archaeal sequences in the community DNA, were dominated (> 99%) by archaeal sequences in the RNA, suggesting a substantial role for archaeal nitrification in the upper OMZ. 
These data, as well as those describing other key OMZ metabolic processes (e.g. sulfur oxidation), highlight gene-specific expression patterns in the context of the entire community transcriptome, as well as identify key functional groups for taxon-specific genomic profiling. © 2011 Society for Applied Microbiology and Blackwell Publishing Ltd.

  9. Antisense transcription is pervasive but rarely conserved in enteric bacteria.

    PubMed

    Raghavan, Rahul; Sloan, Daniel B; Ochman, Howard

    2012-01-01

    Noncoding RNAs, including antisense RNAs (asRNAs) that originate from the complementary strand of protein-coding genes, are involved in the regulation of gene expression in all domains of life. Recent application of deep-sequencing technologies has revealed that the transcription of asRNAs occurs genome-wide in bacteria. Although the role of the vast majority of asRNAs remains unknown, it is often assumed that their presence implies important regulatory functions, similar to those of other noncoding RNAs. Alternatively, many antisense transcripts may be produced by chance transcription events from promoter-like sequences that result from the degenerate nature of bacterial transcription factor binding sites. To investigate the biological relevance of antisense transcripts, we compared genome-wide patterns of asRNA expression in closely related enteric bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, by performing strand-specific transcriptome sequencing. Although antisense transcripts are abundant in both species, less than 3% of asRNAs are expressed at high levels in both species, and only about 14% appear to be conserved among species. And unlike the promoters of protein-coding genes, asRNA promoters show no evidence of sequence conservation between, or even within, species. Our findings suggest that many or even most bacterial asRNAs are nonadaptive by-products of the cell's transcription machinery. IMPORTANCE Application of high-throughput methods has revealed the expression throughout bacterial genomes of transcripts encoded on the strand complementary to protein-coding genes. Because transcription is costly, it is usually assumed that these transcripts, termed antisense RNAs (asRNAs), serve some function; however, the role of most asRNAs is unclear, raising questions about their relevance in cellular processes. 
Because natural selection conserves functional elements, comparisons between related species provide a method for assessing functionality genome-wide. Applying such an approach, we assayed all transcripts in two closely related bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, and demonstrate that, although the levels of genome-wide antisense transcription are similarly high in both bacteria, only a small fraction of asRNAs are shared across species. Moreover, the promoters associated with asRNAs show no evidence of sequence conservation between, or even within, species. These findings indicate that despite the genome-wide transcription of asRNAs, many of these transcripts are likely nonfunctional.

  10. Antisense Transcription Is Pervasive but Rarely Conserved in Enteric Bacteria

    PubMed Central

    Raghavan, Rahul; Sloan, Daniel B.; Ochman, Howard

    2012-01-01

    ABSTRACT Noncoding RNAs, including antisense RNAs (asRNAs) that originate from the complementary strand of protein-coding genes, are involved in the regulation of gene expression in all domains of life. Recent application of deep-sequencing technologies has revealed that the transcription of asRNAs occurs genome-wide in bacteria. Although the role of the vast majority of asRNAs remains unknown, it is often assumed that their presence implies important regulatory functions, similar to those of other noncoding RNAs. Alternatively, many antisense transcripts may be produced by chance transcription events from promoter-like sequences that result from the degenerate nature of bacterial transcription factor binding sites. To investigate the biological relevance of antisense transcripts, we compared genome-wide patterns of asRNA expression in closely related enteric bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, by performing strand-specific transcriptome sequencing. Although antisense transcripts are abundant in both species, less than 3% of asRNAs are expressed at high levels in both species, and only about 14% appear to be conserved among species. And unlike the promoters of protein-coding genes, asRNA promoters show no evidence of sequence conservation between, or even within, species. Our findings suggest that many or even most bacterial asRNAs are nonadaptive by-products of the cell’s transcription machinery. PMID:22872780

  11. A novel TBP-TAF complex on RNA polymerase II-transcribed snRNA genes.

    PubMed

    Zaborowska, Justyna; Taylor, Alice; Roeder, Robert G; Murphy, Shona

    2012-01-01

    Initiation of transcription of most human genes transcribed by RNA polymerase II (RNAP II) requires the formation of a preinitiation complex comprising TFIIA, B, D, E, F, H and RNAP II. The general transcription factor TFIID is composed of the TATA-binding protein and up to 13 TBP-associated factors. During transcription of snRNA genes, RNAP II does not appear to make the transition to long-range productive elongation, as happens during transcription of protein-coding genes. In addition, recognition of the snRNA gene-type specific 3' box RNA processing element requires initiation from an snRNA gene promoter. These characteristics may, at least in part, be driven by factors recruited to the promoter. For example, differences in the complement of TAFs might result in differential recruitment of elongation and RNA processing factors. As precedent, it already has been shown that the promoters of some protein-coding genes do not recruit all the TAFs found in TFIID. Although TAF5 has been shown to be associated with RNAP II-transcribed snRNA genes, the full complement of TAFs associated with these genes has remained unclear. Here we show, using a ChIP and siRNA-mediated approach, that the TBP/TAF complex on snRNA genes differs from that found on protein-coding genes. Interestingly, the largest TAF, TAF1, and the core TAFs, TAF10 and TAF4, are not detected on snRNA genes. We propose that this snRNA gene-specific TAF subset plays a key role in gene type-specific control of expression.

  12. Identification of Bombyx mori bidensovirus VD1-ORF4 reveals a novel protein associated with viral structural component.

    PubMed

    Li, Guohui; Hu, Zhaoyang; Guo, Xuli; Li, Guangtian; Tang, Qi; Wang, Peng; Chen, Keping; Yao, Qin

    2013-06-01

    Bombyx mori bidensovirus (BmBDV) VD1-ORF4 (open reading frame 4, ORF4) consists of 3,318 nucleotides, which codes for a predicted 1,105-amino acid protein containing a conserved DNA polymerase motif. However, its functions in viral propagation remain unknown. In the current study, the transcription of VD1-ORF4 was examined from 6 to 96 h postinfection (p.i.) by RT-PCR, 5'-RACE revealed the transcription initiation site of BmBDV ORF4 to be -16 nucleotides upstream from the start codon, and 3'-RACE revealed the transcription termination site of VD1-ORF4 to be +7 nucleotides downstream from termination codon. Three different proteins were examined in the extracts of BmBDV-infected silkworms midguts by Western blot using raised antibodies against VD1-ORF4 deduced amino acid, and a specific protein band about 53 kDa was further detected in purified virions using the same antibodies. Taken together, BmBDV VD1-ORF4 codes for three or more proteins during the viral life cycle, one of which is a 53 kDa protein and confirmed to be a component of BmBDV virion.

  13. Long non-coding RNA discovery across the genus anopheles reveals conserved secondary structures within and beyond the Gambiae complex.

    PubMed

    Jenkins, Adam M; Waterhouse, Robert M; Muskavitch, Marc A T

    2015-04-23

    Long non-coding RNAs (lncRNAs) have been defined as mRNA-like transcripts longer than 200 nucleotides that lack significant protein-coding potential, and many of them constitute scaffolds for ribonucleoprotein complexes with critical roles in epigenetic regulation. Various lncRNAs have been implicated in the modulation of chromatin structure, transcriptional and post-transcriptional gene regulation, and regulation of genomic stability in mammals, Caenorhabditis elegans, and Drosophila melanogaster. The purpose of this study is to identify the lncRNA landscape in the malaria vector An. gambiae and assess the evolutionary conservation of lncRNAs and their secondary structures across the Anopheles genus. Using deep RNA sequencing of multiple Anopheles gambiae life stages, we have identified 2,949 lncRNAs and more than 300 previously unannotated putative protein-coding genes. The lncRNAs exhibit differential expression profiles across life stages and adult genders. We find that across the genus Anopheles, lncRNAs display much lower sequence conservation than protein-coding genes. Additionally, we find that lncRNA secondary structure is highly conserved within the Gambiae complex, but diverges rapidly across the rest of the genus Anopheles. This study offers one of the first lncRNA secondary structure analyses in vector insects. Our description of lncRNAs in An. gambiae offers the most comprehensive genome-wide insights to date into lncRNAs in this vector mosquito, and defines a set of potential targets for the development of vector-based interventions that may further curb the human malaria burden in disease-endemic countries.

  14. Efficient analysis of mouse genome sequences reveal many nonsense variants

    PubMed Central

    Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

    2016-01-01

    Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
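    The in silico procedure described in the abstract above (substitute strain variants into the reference coding sequence, translate, and compare amino acid sequences) can be sketched as follows. This is a minimal sketch that handles single-nucleotide substitutions only, uses the standard genetic code, and relies on hypothetical function names and a toy sequence; it is not the authors' pipeline and does not model indels.

```python
# Standard genetic code, built with the bases in TCAG order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def apply_snps(cds: str, snps: dict) -> str:
    """Substitute alternative bases at 0-based CDS positions (SNPs only)."""
    bases = list(cds.upper())
    for position, alt_base in snps.items():
        bases[position] = alt_base.upper()
    return "".join(bases)


def gained_stop_codon(ref_cds: str, snps: dict):
    """Return the 0-based codon index of a premature stop introduced by the
    variants, or None if the first stop is not earlier than in the reference."""
    ref_protein = translate(ref_cds)
    alt_protein = translate(apply_snps(ref_cds, snps))
    ref_stop = ref_protein.find("*")
    alt_stop = alt_protein.find("*")
    if alt_stop != -1 and (ref_stop == -1 or alt_stop < ref_stop):
        return alt_stop
    return None


if __name__ == "__main__":
    toy_cds = "ATGCAGTGGTAA"  # made-up CDS: Met-Gln-Trp-stop
    print(gained_stop_codon(toy_cds, {3: "T"}))  # CAG -> TAG
    print(gained_stop_codon(toy_cds, {5: "A"}))  # CAG -> CAA, synonymous
```

    Comparing the position of the first stop codon, rather than the full protein strings, keeps the check focused on truncating variants of the kind the abstract highlights.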

  15. A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.

    PubMed

    Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor

    2017-08-30

    Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.

  16. Long Non-Coding RNAs Responsive to Salt and Boron Stress in the Hyper-Arid Lluteño Maize from Atacama Desert.

    PubMed

    Huanca-Mamani, Wilson; Arias-Carrasco, Raúl; Cárdenas-Ninasivincha, Steffany; Rojas-Herrera, Marcelo; Sepúlveda-Hermosilla, Gonzalo; Caris-Maldonado, José Carlos; Bastías, Elizabeth; Maracaja-Coutinho, Vinicius

    2018-03-20

    Long non-coding RNAs (lncRNAs) have been defined as transcripts longer than 200 nucleotides, which lack significant protein coding potential and possess critical roles in diverse cellular processes. Long non-coding RNAs have recently been functionally characterized in plant stress-response mechanisms. In the present study, we perform a comprehensive identification of lncRNAs in response to combined stress induced by salinity and excess of boron in the Lluteño maize, a tolerant maize landrace from Atacama Desert, Chile. We use deep RNA sequencing to identify a set of 48,345 different lncRNAs, of which 28,012 (58.1%) are conserved with other maize (B73, Mo17 or Palomero), with the remaining 41.9% belonging to potentially Lluteño exclusive lncRNA transcripts. According to B73 maize reference genome sequence, most Lluteño lncRNAs correspond to intergenic transcripts. Interestingly, Lluteño lncRNAs present an unusual overall higher expression compared to protein coding genes under exposure to stressed conditions. In total, we identified 1710 putatively responsive to the combined stressed conditions of salt and boron exposure. We also identified a set of 848 stress responsive potential trans natural antisense transcripts (trans-NAT) lncRNAs, which seem to be regulating genes associated with regulation of transcription, response to stress, response to abiotic stimulus and participating in the nicotianamine metabolic process. Reverse transcription-quantitative PCR (RT-qPCR) experiments were performed in a subset of lncRNAs, validating their existence and expression patterns. Our results suggest that a diverse set of maize lncRNAs from leaves and roots is responsive to combined salt and boron stress, being the first effort to identify lncRNAs from a maize landrace adapted to extreme conditions such as the Atacama Desert. 
The information generated is a starting point to understand the genomic adaptabilities suffered by this maize to surpass this extremely stressed environment.

  17. Long Non-Coding RNAs Responsive to Salt and Boron Stress in the Hyper-Arid Lluteño Maize from Atacama Desert

    PubMed Central

    Huanca-Mamani, Wilson; Arias-Carrasco, Raúl; Cárdenas-Ninasivincha, Steffany; Rojas-Herrera, Marcelo; Sepúlveda-Hermosilla, Gonzalo; Caris-Maldonado, José Carlos; Bastías, Elizabeth; Maracaja-Coutinho, Vinicius

    2018-01-01

    Long non-coding RNAs (lncRNAs) have been defined as transcripts longer than 200 nucleotides, which lack significant protein coding potential and possess critical roles in diverse cellular processes. Long non-coding RNAs have recently been functionally characterized in plant stress–response mechanisms. In the present study, we perform a comprehensive identification of lncRNAs in response to combined stress induced by salinity and excess of boron in the Lluteño maize, a tolerant maize landrace from Atacama Desert, Chile. We use deep RNA sequencing to identify a set of 48,345 different lncRNAs, of which 28,012 (58.1%) are conserved with other maize (B73, Mo17 or Palomero), with the remaining 41.9% belonging to potentially Lluteño exclusive lncRNA transcripts. According to B73 maize reference genome sequence, most Lluteño lncRNAs correspond to intergenic transcripts. Interestingly, Lluteño lncRNAs presents an unusual overall higher expression compared to protein coding genes under exposure to stressed conditions. In total, we identified 1710 putatively responsive to the combined stressed conditions of salt and boron exposure. We also identified a set of 848 stress responsive potential trans natural antisense transcripts (trans-NAT) lncRNAs, which seems to be regulating genes associated with regulation of transcription, response to stress, response to abiotic stimulus and participating of the nicotianamine metabolic process. Reverse transcription-quantitative PCR (RT-qPCR) experiments were performed in a subset of lncRNAs, validating their existence and expression patterns. Our results suggest that a diverse set of maize lncRNAs from leaves and roots is responsive to combined salt and boron stress, being the first effort to identify lncRNAs from a maize landrace adapted to extreme conditions such as the Atacama Desert. 
The information generated is a starting point for understanding the genomic adaptations that allow this maize to withstand this extremely stressful environment. PMID:29558449

  18. Transcription initiation complex structures elucidate DNA opening.

    PubMed

    Plaschka, C; Hantsche, M; Dienemann, C; Burzinski, C; Plitzko, J; Cramer, P

    2016-05-19

    Transcription of eukaryotic protein-coding genes begins with assembly of the RNA polymerase (Pol) II initiation complex and promoter DNA opening. Here we report cryo-electron microscopy (cryo-EM) structures of yeast initiation complexes containing closed and open DNA at resolutions of 8.8 Å and 3.6 Å, respectively. DNA is positioned and retained over the Pol II cleft by a network of interactions between the TATA-box-binding protein TBP and transcription factors TFIIA, TFIIB, TFIIE, and TFIIF. DNA opening occurs around the tip of the Pol II clamp and the TFIIE 'extended winged helix' domain, and can occur in the absence of TFIIH. Loading of the DNA template strand into the active centre may be facilitated by movements of obstructing protein elements triggered by allosteric binding of the TFIIE 'E-ribbon' domain. The results suggest a unified model for transcription initiation with a key event, the trapping of open promoter DNA by extended protein-protein and protein-DNA contacts.

  19. A Catalogue of Putative cis-Regulatory Interactions Between Long Non-coding RNAs and Proximal Coding Genes Based on Correlative Analysis Across Diverse Human Tumors.

    PubMed

    Basu, Swaraj; Larsson, Erik

    2018-05-31

    Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
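
As a rough illustration of the correlative screen described in this record, the sketch below flags lncRNA/coding-gene pairs whose expression is negatively correlated across samples. The abstract does not name the statistic or cutoff, so the use of Pearson correlation, the `threshold` value, and the function names `pearson` and `negative_pairs` are all assumptions made for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant expression: treat as uncorrelated
    return cov / sqrt(vx * vy)

def negative_pairs(expr, pairs, threshold=-0.5):
    """Return (lncRNA, gene, r) for proximal pairs with r <= threshold.

    expr maps a transcript name to its expression values across samples;
    pairs lists (lncRNA, proximal coding gene) tuples. The threshold is
    a hypothetical cutoff, not one taken from the study.
    """
    hits = []
    for lnc, gene in pairs:
        r = pearson(expr[lnc], expr[gene])
        if r <= threshold:
            hits.append((lnc, gene, r))
    return hits

# Toy data: lncA falls as geneA rises; lncB tracks geneB.
expr = {
    "lncA": [1, 2, 3, 4], "geneA": [8, 6, 4, 2],
    "lncB": [1, 2, 3, 4], "geneB": [2, 4, 6, 8],
}
hits = negative_pairs(expr, [("lncA", "geneA"), ("lncB", "geneB")])
```

A negative correlation of this kind is only compatible with cis-regulation; as the abstract notes, such pairs are hypotheses for experimental validation.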

  20. Variations in the non-coding transcriptome as a driver of inter-strain divergence and physiological adaptation in bacteria.

    PubMed

    Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R; Voß, Björn

    2015-04-22

    In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5'UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5'UTR. Such an sRNA/mRNA structure, which we name 'actuaton', represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation.

  1. Pervasive transcription: detecting functional RNAs in bacteria.

    PubMed

    Lybecker, Meghan; Bilusic, Ivana; Raghavan, Rahul

    2014-01-01

    Pervasive, or genome-wide, transcription has been reported in all domains of life. In bacteria, most pervasive transcription occurs antisense to protein-coding transcripts, although recently a new class of pervasive RNAs was identified that originates from within annotated genes. Initially considered to be non-functional transcriptional noise, pervasive transcription is increasingly being recognized as important in regulating gene expression. The function of pervasive transcription is an extensively debated question in the field of transcriptomics and regulatory RNA biology. Here, we highlight the most recent contributions addressing the purpose of pervasive transcription in bacteria and discuss their implications.

  2. Tumor hypoxia induces nuclear paraspeckle formation through HIF-2α dependent transcriptional activation of NEAT1 leading to cancer cell survival

    PubMed Central

    Choudhry, H; Albukhari, A; Morotti, M; Haider, S; Moralli, D; Smythies, J; Schödel, J; Green, C M; Camps, C; Buffa, F; Ratcliffe, P; Ragoussis, J; Harris, A L; Mole, D R

    2015-01-01

    Activation of cellular transcriptional responses, mediated by hypoxia-inducible factor (HIF), is common in many types of cancer, and generally confers a poor prognosis. Known to induce many hundreds of protein-coding genes, HIF has also recently been shown to be a key regulator of the non-coding transcriptional response. Here, we show that NEAT1 long non-coding RNA (lncRNA) is a direct transcriptional target of HIF in many breast cancer cell lines and in solid tumors. Unlike previously described lncRNAs, NEAT1 is regulated principally by HIF-2 rather than by HIF-1. NEAT1 is a nuclear lncRNA that is an essential structural component of paraspeckles and the hypoxic induction of NEAT1 induces paraspeckle formation in a manner that is dependent upon both NEAT1 and on HIF-2. Paraspeckles are multifunction nuclear structures that sequester transcriptionally active proteins as well as RNA transcripts that have been subjected to adenosine-to-inosine (A-to-I) editing. We show that the nuclear retention of one such transcript, F11R (also known as junctional adhesion molecule 1, JAM1), in hypoxia is dependent upon the hypoxic increase in NEAT1, thereby conferring a novel mechanism of HIF-dependent gene regulation. Induction of NEAT1 in hypoxia also leads to accelerated cellular proliferation, improved clonogenic survival and reduced apoptosis, all of which are hallmarks of increased tumorigenesis. Furthermore, in patients with breast cancer, high tumor NEAT1 expression correlates with poor survival. Taken together, these results indicate a new role for HIF transcriptional pathways in the regulation of nuclear structure and that this contributes to the pro-tumorigenic hypoxia-phenotype in breast cancer. PMID:25417700

  3. Transcription of a protein-coding gene on B chromosomes of the Siberian roe deer (Capreolus pygargus)

    PubMed Central

    2013-01-01

    Background Most eukaryotic species represent stable karyotypes with a particular diploid number. B chromosomes are additional to standard karyotypes and may vary in size, number and morphology even between cells of the same individual. For many years it was generally believed that B chromosomes found in some plant, animal and fungi species lacked active genes. Recently, molecular cytogenetic studies showed the presence of additional copies of protein-coding genes on B chromosomes. However, the transcriptional activity of these genes remained elusive. We studied karyotypes of the Siberian roe deer (Capreolus pygargus) that possess up to 14 B chromosomes to investigate the presence and expression of genes on supernumerary chromosomes. Results Here, we describe a 2 Mbp region homologous to cattle chromosome 3 and containing TNNI3K (partial), FPGT, LRRIQ3 and a large gene-sparse segment on B chromosomes of the Siberian roe deer. The presence of the copy of the autosomal region was demonstrated by B-specific cDNA analysis, PCR assisted mapping, cattle bacterial artificial chromosome (BAC) clone localization and quantitative polymerase chain reaction (qPCR). By comparative analysis of B-specific and non-B chromosomal sequences we discovered some B chromosome-specific mutations in protein-coding genes, which further enabled the detection of a FPGT-TNNI3K transcript expressed from duplicated genes located on B chromosomes in roe deer fibroblasts. Conclusions Discovery of a large autosomal segment in all B chromosomes of the Siberian roe deer further corroborates the view of an autosomal origin for these elements. Detection of a B-derived transcript in fibroblasts implies that the protein coding sequences located on Bs are not fully inactivated. 
The origin, evolution and effect on host of B chromosomal genes seem to be similar to autosomal segmental duplications, which reinforces the view that supernumerary chromosomal elements might play an important role in genome evolution. PMID:23915065

  4. Differential protein-coding gene and long noncoding RNA expression in smoking-related lung squamous cell carcinoma.

    PubMed

    Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie

    2017-11-01

    Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenic mechanism of tumor progression is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding genes, long noncoding RNAs, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify biological functions of the identified genes, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the construction of a protein-protein interaction network, transcription factor, and statistical analyses. Subsequent quantitative real-time PCR was utilized to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNAs were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that identified genes were enriched in the cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes identified in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 indicated poor prognosis in lung SCC. The protein-coding genes AURKA and BIRC5 and the long noncoding RNA LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.

  5. Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

    PubMed

    Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed

    2016-01-01

    The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which make their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the genome reference using the TopHat and Cufflinks software. Transcripts shorter than 200 nt, with an ORF longer than 83-100 amino acids, or with significant homology to the NCBI nr-protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates lncRNAs in the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
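
The length and ORF criteria in this record can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: it scans only the three forward-strand reading frames, uses 100 amino acids as one point in the stated 83-100 range, and omits the homology search and protein-coding-score steps. The function names `longest_orf_aa` and `is_lncrna_candidate` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf_aa(seq):
    """Longest ATG-to-stop ORF on the forward strand, in amino acids."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons from ATG up to, but not including, the stop.
                best = max(best, (i - start) // 3)
                start = None
    return best

def is_lncrna_candidate(seq, min_len=200, max_orf_aa=100):
    """Keep transcripts of at least min_len nt whose longest ORF is short.

    The defaults mirror the 200 nt cutoff and the upper end of the
    83-100 amino acid range quoted in the abstract (an assumption).
    """
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa

noncoding = "AC" * 150                      # 300 nt, no start codon
coding = "ATG" + "GCT" * 120 + "TAA"        # 121-codon ORF
short = "ACGT" * 10                         # 40 nt
kept = [is_lncrna_candidate(s) for s in (noncoding, coding, short)]
```

A real pipeline would also scan the reverse strand and handle ORFs that lack a stop codon within the transcript.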

  6. Single nucleotide polymorphism-specific regulation of matrix metalloproteinase-9 by multiple miRNAs targeting the coding exon

    PubMed Central

    Duellman, Tyler; Warren, Christopher; Yang, Jay

    2014-01-01

    Microribonucleic acids (miRNAs) work with exquisite specificity and are able to distinguish a target from a non-target based on a single nucleotide mismatch in the core nucleotide domain. We questioned whether miRNA regulation of gene expression could occur in a single nucleotide polymorphism (SNP)-specific manner, manifesting as a post-transcriptional control of expression of genetic polymorphisms. In our recent study of the functional consequences of matrix metalloproteinase (MMP)-9 SNPs, we discovered that expression of a coding exon SNP in the pro-domain of the protein resulted in a profound decrease in the secreted protein. This missense SNP results in the N38S amino acid change and a loss of an N-glycosylation site. A systematic study demonstrated that the loss of secreted protein was due not to the loss of an N-glycosylation site, but rather to SNP-specific targeting by miR-671-3p and miR-657. Bioinformatics analysis identified 41 SNP-specific miRNAs targeting MMP-9 SNPs, mostly in the coding exon, and an extension of the analysis to chromosome 20, where the MMP-9 gene is located, suggested that SNP-specific miRNAs targeting the coding exon are prevalent. This selective post-transcriptional regulation of a target messenger RNA harboring genetic polymorphisms by miRNAs offers an SNP-dependent post-transcriptional regulatory mechanism, allowing for polymorphic-specific differential gene regulation. PMID:24627221

  7. CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts.

    PubMed

    Testa, Alison C; Hane, James K; Ellwood, Simon R; Oliver, Richard P

    2015-03-11

    The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study. CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. 
Comparisons against whole genome Sc. pombe and S. cerevisiae annotations further substantiate a 4-5% improvement in the number of correctly predicted genes. We demonstrate the success of a novel method of incorporating RNA-seq data into GHMM fungal gene prediction. This shows that a high quality annotation can be achieved without relying on protein homology or a training set of genes. CodingQuarry is freely available ( https://sourceforge.net/projects/codingquarry/ ), and suitable for incorporation into genome annotation pipelines.

  8. Molecular interplay of pro-inflammatory transcription factors and non-coding RNAs in esophageal squamous cell carcinoma.

    PubMed

    Sundaram, Gopinath M; Veera Bramhachari, Pallaval

    2017-06-01

    Esophageal squamous cell carcinoma is the sixth most common cancer in the developing world. The aggressive nature of esophageal squamous cell carcinoma, its tendency for relapse, and the poor survival prospects of patients diagnosed at advanced stages, represent a pressing need for the development of new therapies for this disease. Chronic inflammation is known to have a causal link to cancer pre-disposition. Nuclear factor kappa B and signal transducer and activator of transcription 3 are transcription factors which regulate immunity and inflammation and are emerging as key regulators of tumor initiation, progression, and metastasis. Although these pro-inflammatory factors in esophageal squamous cell carcinoma have been well-characterized with reference to protein-coding targets, their functional interactions with non-coding RNAs have only recently been gaining attention. Non-coding RNAs, especially microRNAs and long non-coding RNAs demonstrate potential as biomarkers and alternative therapeutic targets. In this review, we summarize the recent literature and concepts on non-coding RNAs that are regulated by/regulate nuclear factor kappa B and signal transducer and activator of transcription 3 in esophageal cancer progression. We also discuss how these recent discoveries can pave way for future therapeutic options to treat esophageal squamous cell carcinoma.

  9. Identification of BSAP (Pax-5) target genes in early B-cell development by loss- and gain-of-function experiments.

    PubMed Central

    Nutt, S L; Morrison, A M; Dörfler, P; Rolink, A; Busslinger, M

    1998-01-01

    The Pax-5 gene codes for the transcription factor BSAP which is essential for the progression of adult B lymphopoiesis beyond an early progenitor (pre-BI) cell stage. Although several genes have been proposed to be regulated by BSAP, CD19 is to date the only target gene which has been genetically confirmed to depend on this transcription factor for its expression. We have now taken advantage of cultured pre-BI cells of wild-type and Pax-5 mutant bone marrow to screen a large panel of B lymphoid genes for additional BSAP target genes. Four differentially expressed genes were shown to be under the direct control of BSAP, as their expression was rapidly regulated in Pax-5-deficient pre-BI cells by a hormone-inducible BSAP-estrogen receptor fusion protein. The genes coding for the B-cell receptor component Ig-alpha (mb-1) and the transcription factors N-myc and LEF-1 are positively regulated by BSAP, while the gene coding for the cell surface protein PD-1 is efficiently repressed. Distinct regulatory mechanisms of BSAP were revealed by reconstituting Pax-5-deficient pre-BI cells with full-length BSAP or a truncated form containing only the paired domain. IL-7 signalling was able to efficiently induce the N-myc gene only in the presence of full-length BSAP, while complete restoration of CD19 synthesis was critically dependent on the BSAP protein concentration. In contrast, the expression of the mb-1 and LEF-1 genes was already reconstituted by the paired domain polypeptide lacking any transactivation function, suggesting that the DNA-binding domain of BSAP is sufficient to recruit other transcription factors to the regulatory regions of these two genes. In conclusion, these loss- and gain-of-function experiments demonstrate that BSAP regulates four newly identified target genes as a transcriptional activator, repressor or docking protein depending on the specific regulatory sequence context. PMID:9545244

  10. Long Noncoding RNAs in the Yeast S. cerevisiae.

    PubMed

    Niederer, Rachel O; Hass, Evan P; Zappulla, David C

    2017-01-01

    Long noncoding RNAs have recently been discovered to comprise a sizeable fraction of the RNA World. The scope of their functions, physical organization, and disease relevance remain in the early stages of characterization. Although many thousands of lncRNA transcripts recently have been found to emanate from the expansive DNA between protein-coding genes in animals, there are also hundreds that have been found in simple eukaryotes. Furthermore, lncRNAs have been found in the bacterial and archaeal branches of the tree of life, suggesting they are ubiquitous. In this chapter, we focus primarily on what has been learned so far about lncRNAs from the greatly studied single-celled eukaryote, the yeast Saccharomyces cerevisiae. Most lncRNAs examined in yeast have been implicated in transcriptional regulation of protein-coding genes-often in response to forms of stress-whereas a select few have been ascribed yet other functions. Of those known to be involved in transcriptional regulation of protein-coding genes, the vast majority function in cis. There are also some yeast lncRNAs identified that are not directly involved in regulation of transcription. Examples of these include the telomerase RNA and telomere-encoded transcripts. In addition to its role as a template-encoding telomeric DNA synthesis, telomerase RNA has been shown to function as a flexible scaffold for protein subunits of the RNP holoenzyme. The flexible scaffold model provides a specific mechanistic paradigm that is likely to apply to many other lncRNAs that assemble and orchestrate large RNP complexes, even in humans. Looking to the future, it is clear that considerable fundamental knowledge remains to be obtained about the architecture and functions of lncRNAs. Using genetically tractable unicellular model organisms should facilitate lncRNA characterization. The acquired basic knowledge will ultimately translate to better understanding of the growing list of lncRNAs linked to human maladies.

  11. Sialotranscriptomics of Rhipicephalus zambeziensis reveals intricate expression profiles of secretory proteins and suggests tight temporal transcriptional regulation during blood-feeding.

    PubMed

    de Castro, Minique Hilda; de Klerk, Daniel; Pienaar, Ronel; Rees, D Jasper G; Mans, Ben J

    2017-08-10

    Ticks secrete a diverse mixture of secretory proteins into the host to evade its immune response and facilitate blood-feeding, making secretory proteins attractive targets for the production of recombinant anti-tick vaccines. The largely neglected tick species, Rhipicephalus zambeziensis, is an efficient vector of Theileria parva in southern Africa but its available sequence information is limited. Next generation sequencing has advanced sequence availability for ticks in recent years and has assisted the characterisation of secretory proteins. This study focused on the de novo assembly and annotation of the salivary gland transcriptome of R. zambeziensis and the temporal expression of secretory protein transcripts in female and male ticks, before the onset of feeding and during early and late feeding. The sialotranscriptome of R. zambeziensis yielded 23,631 transcripts from which 13,584 non-redundant proteins were predicted. Eighty-six percent of these contained a predicted start and stop codon and were estimated to be putatively full-length proteins. A fifth (2569) of the predicted proteins were annotated as putative secretory proteins and explained 52% of the expression in the transcriptome. Expression analyses revealed that 2832 transcripts were differentially expressed among feeding time points and 1209 between the tick sexes. The expression analyses further indicated that 57% of the annotated secretory protein transcripts were differentially expressed. Dynamic expression profiles of secretory protein transcripts were observed during feeding of female ticks: a number of transcripts were upregulated during early feeding, presumably for feeding site establishment, and during late feeding 52% of these were downregulated, indicating that transcripts were required at specific feeding stages. This suggested that secretory proteins are under stringent transcriptional regulation that fine-tunes their expression in salivary glands during feeding. 
No open reading frames were predicted for 7947 transcripts. This class represented 17% of the differentially expressed transcripts, suggesting a potential transcriptional regulatory function of long non-coding RNA in tick blood-feeding. The assembled sialotranscriptome greatly expands the sequence availability of R. zambeziensis, assists in our understanding of the transcription of secretory proteins during blood-feeding and will be a valuable resource for future vaccine candidate selection.

  12. Mechanisms of radiation-induced gene responses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Woloschak, G.E.; Paunesku, T.

    1996-10-01

    In the process of identifying genes differentially expressed in cells exposed to ultraviolet radiation, we have identified a transcript having a 26-bp region that is highly conserved in a variety of species including Bacillus circulans, yeast, pumpkin, Drosophila, mouse, and man. When located in the 5' region (flanking region or UTR) of a gene, the sequence is predominantly in +/+ orientation with respect to the coding DNA strand, while in the coding region and the 3' region (UTR), the sequence is most frequently in the +/- orientation with respect to the coding DNA strand. In two genes, the element is split into two parts; however, in most cases, it is found only once but with a minimum of 11 consecutive nucleotides precisely depicting the original sequence. The element is found in a large number of different genes with diverse functions (from human ras p21 to B. circulans chitinase). Gel shift assays demonstrated the presence of a protein in HeLa cell extracts that binds to the sense and antisense single-stranded consensus oligomers, as well as to the double-stranded oligonucleotide. When double-stranded oligomer was used, the size shift demonstrated an additional protein-oligomer complex larger than the one bound to either sense or antisense single-stranded consensus oligomers alone. It is speculated either that this element binds to protein(s) important in maintaining DNA in a single-stranded orientation for transcription or, alternatively, that this element is important in the transcription-coupled DNA repair process.

  13. Differential regulation of transcription through distinct Suppressor of Hairless DNA binding site architectures during Notch signaling in proneural clusters.

    PubMed

    Cave, John W; Xia, Li; Caudy, Michael

    2011-01-01

    In Drosophila melanogaster, achaete (ac) and m8 are model basic helix-loop-helix activator (bHLH A) and repressor genes, respectively, that have the opposite cell expression pattern in proneural clusters during Notch signaling. Previous studies have shown that activation of m8 transcription in specific cells within proneural clusters by Notch signaling is programmed by a "combinatorial" and "architectural" DNA transcription code containing binding sites for the Su(H) and proneural bHLH A proteins. Here we show the novel result that the ac promoter contains a similar combinatorial code of Su(H) and bHLH A binding sites but contains a different Su(H) site architectural code that does not mediate activation during Notch signaling, thus programming a cell expression pattern opposite that of m8 in proneural clusters.

  14. Transcriptional regulation of coordinate changes in flagellar mRNAs during differentiation of Naegleria gruberi amoebae into flagellates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, J.H.; Walsh, C.J.

    1988-06-01

    The nuclear run-on technique was used to measure the rate of transcription of flagellar genes during the differentiation of Naegleria gruberi amoebae into flagellates. Synthesis of mRNAs for the axonemal proteins α- and β-tubulin and flagellar calmodulin, as well as a coordinately regulated poly(A)+ RNA that codes for an unidentified protein, showed transient increases averaging 22-fold. The rate of synthesis of two poly(A)+ RNAs common to amoebae and flagellates was low until the transcription of the flagellar genes began to decline, at which time synthesis of the RNAs found in amoebae increased 3- to 10-fold. The observed changes in the rate of transcription can account quantitatively for the 20-fold increase in flagellar mRNA concentration during the differentiation. The data for the flagellar calmodulin gene demonstrate transcriptional regulation for a nontubulin axonemal protein. The data also demonstrate at least two programs of transcriptional regulation during the differentiation and raise the intriguing possibility that some significant fraction of the nearly 200 different proteins of the flagellar axoneme is transcriptionally regulated during the 1 h it takes N. gruberi amoebae to form visible flagella.

  15. The Ever-Evolving Concept of the Gene: The Use of RNA/Protein Experimental Techniques to Understand Genome Functions

    PubMed Central

    Cipriano, Andrea; Ballarino, Monica

    2018-01-01

    The completion of the human genome sequence together with advances in sequencing technologies have shifted the paradigm of the genome, as composed of discrete and hereditable coding entities, and have shown the abundance of functional noncoding DNA. This part of the genome, previously dismissed as “junk” DNA, increases proportionally with organismal complexity and contributes to gene regulation beyond the boundaries of known protein-coding genes. Different classes of functionally relevant nonprotein-coding RNAs are transcribed from noncoding DNA sequences. Among them are the long noncoding RNAs (lncRNAs), which are thought to participate in the basal regulation of protein-coding genes at both transcriptional and post-transcriptional levels. Although knowledge of this field is still limited, the ability of lncRNAs to localize in different cellular compartments, to fold into specific secondary structures and to interact with different molecules (RNA or proteins) endows them with multiple regulatory mechanisms. It is becoming evident that lncRNAs may play a crucial role in most biological processes such as the control of development, differentiation and cell growth. This review places the evolution of the concept of the gene in its historical context, from Darwin's hypothetical mechanism of heredity to the post-genomic era. We discuss how the original idea of protein-coding genes as unique determinants of phenotypic traits has been reconsidered in light of the existence of noncoding RNAs. We summarize the technological developments which have been made in the genome-wide identification and study of lncRNAs and emphasize the methodologies that have aided our understanding of the complexity of lncRNA-protein interactions in recent years. PMID:29560353

  16. Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    PubMed Central

    2012-01-01

    Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826

  17. The Long Noncoding RNA Transcriptome of Dictyostelium discoideum Development.

    PubMed

    Rosengarten, Rafael D; Santhanam, Balaji; Kokosar, Janez; Shaulsky, Gad

    2017-02-09

    Dictyostelium discoideum live in the soil as single cells, engulfing bacteria and growing vegetatively. Upon starvation, tens of thousands of amoebae enter a developmental program that includes aggregation, multicellular differentiation, and sporulation. Major shifts across the protein-coding transcriptome accompany these developmental changes. However, no study has presented a global survey of long noncoding RNAs (ncRNAs) in D. discoideum. To characterize the antisense and long intergenic noncoding RNA (lncRNA) transcriptome, we analyzed previously published developmental time course samples using an RNA-sequencing (RNA-seq) library preparation method that selectively depletes ribosomal RNAs (rRNAs). We detected the accumulation of transcripts for 9833 protein-coding messenger RNAs (mRNAs), 621 lncRNAs, and 162 putative antisense RNAs (asRNAs). The noncoding RNAs were interspersed throughout the genome, and were distinct in expression level, length, and nucleotide composition. The noncoding transcriptome displayed a temporal profile similar to the coding transcriptome, with stages of gradual change interspersed with larger leaps. The transcription profiles of some noncoding RNAs were strongly correlated with known differentially expressed coding RNAs, hinting at a functional role for these molecules during development. Examining the mitochondrial transcriptome, we modeled two novel antisense transcripts. We applied yet another ribosomal depletion method to a subset of the samples to better retain transfer RNA (tRNA) transcripts. We observed polymorphisms in tRNA anticodons that suggested a post-transcriptional means by which D. discoideum compensates for codons missing in the genomic complement of tRNAs. We concluded that the prevalence and characteristics of long ncRNAs indicate that these molecules are relevant to the progression of molecular and cellular phenotypes during development. Copyright © 2017 Rosengarten et al.

  18. Variations in the non-coding transcriptome as a driver of inter-strain divergence and physiological adaptation in bacteria

    PubMed Central

    Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R.; Voß, Björn

    2015-01-01

    In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5′UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5′UTR. Such an sRNA/mRNA structure, which we name ‘actuaton’, represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation. PMID:25902393

  19. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-01-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163

  20. De Novo Origin of Human Protein-Coding Genes

    PubMed Central

    Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping

    2011-01-01

    The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831

  1. Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

    PubMed

    Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

    2015-12-11

    High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.

  2. Identification and characterization of long non-coding RNAs in rainbow trout eggs

    USDA-ARS's Scientific Manuscript database

    Long non-coding RNAs (lncRNAs) are in general considered as a diverse class of transcripts longer than 200 nucleotides that structurally resemble mRNAs but do not encode proteins. Recent advances in RNA sequencing (RNA-Seq) and bioinformatics methods have provided an opportunity to identify and ana...

  3. Differential expression of lncRNAs during the HIV replication cycle: an underestimated layer in the HIV-host interplay.

    PubMed

    Trypsteen, Wim; Mohammadi, Pejman; Van Hecke, Clarissa; Mestdagh, Pieter; Lefever, Steve; Saeys, Yvan; De Bleser, Pieter; Vandesompele, Jo; Ciuffi, Angela; Vandekerckhove, Linos; De Spiegelaere, Ward

    2016-10-26

    Studying the effects of HIV infection on the host transcriptome has typically focused on protein-coding genes. However, recent advances in the field of RNA sequencing revealed that long non-coding RNAs (lncRNAs) add an extensive additional layer to the cell's molecular network. Here, we performed transcriptome profiling throughout a primary HIV infection in vitro to investigate lncRNA expression at the different HIV replication cycle processes (reverse transcription, integration and particle production). Subsequently, guilt-by-association, transcription factor and co-expression analysis were performed to infer biological roles for the lncRNAs identified in the HIV-host interplay. Many lncRNAs were suggested to play a role in mechanisms relying on proteasomal and ubiquitination pathways, apoptosis, DNA damage responses and cell cycle regulation. Through transcription factor binding analysis, we found that lncRNAs display a distinct transcriptional regulation profile as compared to protein coding mRNAs, suggesting that mRNAs and lncRNAs are independently modulated. In addition, we identified five differentially expressed lncRNA-mRNA pairs with mRNA involvement in HIV pathogenesis with possible cis regulatory lncRNAs that control nearby mRNA expression and function. Altogether, the present study demonstrates that lncRNAs add a new dimension to the HIV-host interplay and should be further investigated as they may represent targets for controlling HIV replication.

  4. An essential role for the RNA-binding protein Smaug during the Drosophila maternal-to-zygotic transition.

    PubMed

    Benoit, Beatrice; He, Chun Hua; Zhang, Fan; Votruba, Sarah M; Tadros, Wael; Westwood, J Timothy; Smibert, Craig A; Lipshitz, Howard D; Theurkauf, William E

    2009-03-01

    Genetic control of embryogenesis switches from the maternal to the zygotic genome during the maternal-to-zygotic transition (MZT), when maternal mRNAs are destroyed, high-level zygotic transcription is initiated, the replication checkpoint is activated and the cell cycle slows. The midblastula transition (MBT) is the first morphological event that requires zygotic gene expression. The Drosophila MBT is marked by blastoderm cellularization and follows 13 cleavage-stage divisions. The RNA-binding protein Smaug is required for cleavage-independent maternal transcript destruction during the Drosophila MZT. Here, we show that smaug mutants also disrupt syncytial blastoderm stage cell-cycle delays, DNA replication checkpoint activation, cellularization, and high-level zygotic expression of protein coding and micro RNA genes. We also show that Smaug protein levels increase through the cleavage divisions and peak when the checkpoint is activated and zygotic transcription initiates, and that transgenic expression of Smaug in an anterior-to-posterior gradient produces a concomitant gradient in the timing of maternal transcript destruction, cleavage cell cycle delays, zygotic gene transcription, cellularization and gastrulation. Smaug accumulation thus coordinates progression through the MZT.

  5. Position specific variation in the rate of evolution in transcription factor binding sites

    PubMed Central

    Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B

    2003-01-01

    Background The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282

  6. miRNA-dependent gene silencing involving Ago2-mediated cleavage of a circular antisense RNA

    PubMed Central

    Hansen, Thomas B; Wiklund, Erik D; Bramsen, Jesper B; Villadsen, Sune B; Statham, Aaron L; Clark, Susan J; Kjems, Jørgen

    2011-01-01

    MicroRNAs (miRNAs) are ∼22 nt non-coding RNAs that typically bind to the 3′ UTR of target mRNAs in the cytoplasm, resulting in mRNA destabilization and translational repression. Here, we report that miRNAs can also regulate gene expression by targeting non-coding antisense transcripts in human cells. Specifically, we show that miR-671 directs cleavage of a circular antisense transcript of the Cerebellar Degeneration-Related protein 1 (CDR1) locus in an Ago2-slicer-dependent manner. The resulting downregulation of circular antisense has a concomitant decrease in CDR1 mRNA levels, independently of heterochromatin formation. This study provides the first evidence for non-coding antisense transcripts as functional miRNA targets, and a novel regulatory mechanism involving a positive correlation between mRNA and antisense circular RNA levels. PMID:21964070

  7. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.)

    PubMed Central

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-01-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073

  8. Deployment of the human immunodeficiency virus type 1 protein arsenal: combating the host to enhance viral transcription and providing targets for therapeutic development

    PubMed Central

    Dahiya, Satinder; Nonnemacher, Michael R.

    2012-01-01

    Despite the success of highly active antiretroviral therapy in combating human immunodeficiency virus type 1 (HIV-1) infection, the virus still persists in viral reservoirs, often in a state of transcriptional silence. This review focuses on the HIV-1 protein and regulatory machinery and how expanding knowledge of the function of individual HIV-1-coded proteins has provided valuable insights into understanding HIV transcriptional regulation in selected susceptible cell types. Historically, Tat has been the most studied primary transactivator protein, but emerging knowledge of HIV-1 transcriptional regulation in cells of the monocyte–macrophage lineage has more recently established that a number of the HIV-1 accessory proteins like Vpr may directly or indirectly regulate the transcriptional process. The viral proteins Nef and matrix play important roles in modulating the cellular activation pathways to facilitate viral replication. These observations highlight the cross talk between the HIV-1 transcriptional machinery and cellular activation pathways. The review also discusses the proposed transcriptional regulation mechanisms that intersect with the pathways regulated by microRNAs and how development of the knowledge of chromatin biology has enhanced our understanding of key protein–protein and protein–DNA interactions that form the HIV-1 transcriptome. Finally, we discuss the potential pharmacological approaches to target viral persistence and enhance effective transcription to purge the virus in cellular reservoirs, especially within the central nervous system, and the novel therapeutics that are currently in various stages of development to achieve a much superior prognosis for the HIV-1-infected population. PMID:22422068

  9. A long natural-antisense RNA is accumulated in the conidia of Aspergillus oryzae.

    PubMed

    Tsujii, Masaru; Okuda, Satoshi; Ishi, Kazutomo; Madokoro, Kana; Takeuchi, Michio; Yamagata, Youhei

    2016-01-01

    Analysis of expressed sequence tag libraries from various culture conditions revealed the existence of conidia-specific transcripts assembled to the putative conidiation-specific reductase gene (csrA) in Aspergillus oryzae. However, all of the transcripts were transcribed in the opposite direction to the csrA gene. Sequence analysis of the transcript revealed that the RNA overlapped the mRNA of csrA at the 3'-end, and did not code for a protein longer than 60 amino acid residues. We designated the transcript Conidia Specific Long Natural-antisense RNA (CSLNR). Real-time PCR analysis demonstrated that CSLNR is a conidia-specific transcript that cannot be transcribed in the absence of brlA, and that the amount of CSLNR was much greater than that of the csrA transcript in conidia. Furthermore, the csrA deletion, which also lacks the coding region of CSLNR, reduced the number of conidia in A. oryzae. Overexpression of CsrA inhibited growth and conidiation, while CSLNR did not affect conidiation.

  10. Epigenetic Regulation of Transcription in Trypanosomatid Protozoa.

    PubMed

    Martínez-Calvillo, Santiago; Romero-Meza, Gabriela; Vizuet-de-Rueda, Juan C; Florencio-Martínez, Luis E; Manning-Cela, Rebeca; Nepomuceno-Mejía, Tomás

    2018-02-01

    The Trypanosomatid family includes flagellated parasites that cause fatal human diseases. Remarkably, protein-coding genes in these organisms are positioned in long tandem arrays that are transcribed polycistronically. However, the knowledge about regulation of transcription initiation and termination in trypanosomatids is scarce. The importance of epigenetic regulation in these processes has become evident in the last years, as distinctive histone modifications and histone variants have been found in transcription initiation and termination regions. Moreover, multiple chromatin-related proteins have been identified and characterized in trypanosomatids, including histone-modifying enzymes, effector complexes, chromatin-remodelling enzymes and histone chaperones. Notably, base J, a modified thymine residue present in the nuclear DNA of trypanosomatids, has been implicated in transcriptional regulation. Here we review the current knowledge on epigenetic control of transcription by all three RNA polymerases in this group of early-diverged eukaryotes.

  11. Expression of metastasis suppressor gene AES driven by a Yin Yang (YY) element in a CpG island promoter and transcription factor YY2.

    PubMed

    Kakizaki, Fumihiko; Sonoshita, Masahiro; Miyoshi, Hiroyuki; Itatani, Yoshiro; Ito, Shinji; Kawada, Kenji; Sakai, Yoshiharu; Taketo, M Mark

    2016-11-01

    We recently found that the product of the AES gene functions as a metastasis suppressor of colorectal cancer (CRC) in both humans and mice. Expression of amino-terminal enhancer of split (AES) protein is significantly decreased in liver metastatic lesions compared with primary colon tumors. To investigate its downregulation mechanism in metastases, we searched for transcriptional regulators of AES in human CRC and found that its expression is reduced mainly by transcriptional dysregulation and, in some cases, by additional haploidization of its coding gene. The AES promoter-enhancer is in a typical CpG island, and contains a Yin-Yang transcription factor recognition sequence (YY element). In human epithelial cells of normal colon and primary tumors, transcription factor YY2, a member of the YY family, binds directly to the YY element, and stimulates expression of AES. In a transplantation mouse model of liver metastases, however, expression of Yy2 (and therefore of Aes) is downregulated. In human CRC metastases to the liver, the levels of AES protein are correlated with those of YY2. In addition, we noticed copy-number reduction for the AES coding gene in chromosome 19p13.3 in 12% (5/42) of human CRC cell lines. We excluded other mechanisms such as point or indel mutations in the coding or regulatory regions of the AES gene, CpG methylation in the AES promoter enhancer, expression of microRNAs, and chromatin histone modifications. These results indicate that Aes may belong to a novel family of metastasis suppressors with a CpG-island promoter enhancer, and it is regulated transcriptionally. © 2016 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.

  12. Analysis of the complete nucleotide sequence and functional organization of the genome of Streptococcus pneumoniae bacteriophage Cp-1.

    PubMed

    Martín, A C; López, R; García, P

    1996-06-01

    Cp-1, a bacteriophage infecting Streptococcus pneumoniae, has a linear double-stranded DNA genome, with a terminal protein covalently linked to its 5' ends, that replicates by the protein-priming mechanism. We describe here the complete DNA sequence and transcriptional map of the Cp-1 genome. These analyses have led to the firm assignment of 10 genes and the localization of 19 additional open reading frames in the 19,345-bp Cp-1 DNA. Striking similarities and differences between some of these proteins and those of the Bacillus subtilis phage phi 29, a system that also replicates its DNA by the protein-priming mechanism, have been revealed. The genes coding for structural proteins and assembly factors are located in the central part of the Cp-1 genome. Several proteins corresponding to the predicted gene products were identified by in vitro and in vivo expression of the cloned genes. Mature major head protein from the virion particles results from hydrolysis of the primary gene product at the His-49 residue, whereas the phage gene is expressed in Escherichia coli without modification. We have also identified two open reading frames coding for proteins that show high degrees of similarity to the N- and C-terminal regions, respectively, of the single tail protein identified in phi 29. Sequencing and primer extension analysis suggest transcription of a small RNA showing a secondary structure similar to that of the prohead RNA required for the ATP-dependent packaging of phi 29 DNA. On the basis of its temporal expression, transcription of the Cp-1 genome takes place in two stages, early and late. Combined Northern (RNA) blot and primer extension experiments allowed us to map the 5' initiation sites of the transcripts, and we found that only three genes were transcribed from right to left. These analyses reveal that there are also noticeable differences between Cp-1 and phi 29 in transcriptional organization. Considered together, the observations reported here provide new tangible evidence on phylogenetic relationships between B. subtilis and S. pneumoniae.

  13. DNA Dynamics.

    ERIC Educational Resources Information Center

    Warren, Michael D.

    1997-01-01

    Explains a method to enable students to understand DNA and protein synthesis using model-building and role-playing. Acquaints students with the triplet code and transcription. Includes copies of the charts used in this technique. (DDR)

  14. Proteomic Analysis of Mitotic RNA Polymerase II Reveals Novel Interactors and Association With Proteins Dysfunctional in Disease

    PubMed Central

    Möller, André; Xie, Sheila Q.; Hosp, Fabian; Lang, Benjamin; Phatnani, Hemali P.; James, Sonya; Ramirez, Francisco; Collin, Gayle B.; Naggert, Jürgen K.; Babu, M. Madan; Greenleaf, Arno L.; Selbach, Matthias; Pombo, Ana

    2012-01-01

    RNA polymerase II (RNAPII) transcribes protein-coding genes in eukaryotes and interacts with factors involved in chromatin remodeling, transcriptional activation, elongation, and RNA processing. Here, we present the isolation of native RNAPII complexes using mild extraction conditions and immunoaffinity purification. RNAPII complexes were extracted from mitotic cells, where they exist dissociated from chromatin. The proteomic content of native complexes in total and size-fractionated extracts was determined using highly sensitive LC-MS/MS. Protein associations with RNAPII were validated by high-resolution immunolocalization experiments in both mitotic cells and in interphase nuclei. Functional assays of transcriptional activity were performed after siRNA-mediated knockdown. We identify >400 RNAPII associated proteins in mitosis, among these previously uncharacterized proteins for which we show roles in transcriptional elongation. We also identify, as novel functional RNAPII interactors, two proteins involved in human disease, ALMS1 and TFG, emphasizing the importance of gene regulation for normal development and physiology. PMID:22199231

  15. Venom gland transcriptomic and venom proteomic analyses of the scorpion Megacormus gertschi Díaz-Najera, 1966 (Scorpiones: Euscorpiidae: Megacorminae).

    PubMed

    Santibáñez-López, Carlos E; Cid-Uribe, Jimena I; Zamudio, Fernando Z; Batista, Cesar V F; Ortiz, Ernesto; Possani, Lourival D

    2017-07-01

    The soluble venom from the Mexican scorpion Megacormus gertschi of the family Euscorpiidae was obtained and its biological effects were tested in several animal models. This venom is not toxic to mice at doses of 100 μg per 20 g of mouse weight, while being lethal to arthropods (insects and crustaceans), at doses of 20 μg (for crickets) and 100 μg (for shrimps) per animal. Samples of the venom were separated by high performance liquid chromatography and circa 80 distinct chromatographic fractions were obtained from which 67 components have had their molecular weights determined by mass spectrometry analysis. The N-terminal amino acid sequence of seven protein/peptides were obtained by Edman degradation and are reported. Among the high molecular weight components there are enzymes with experimentally-confirmed phospholipase activity. A pair of telsons from this scorpion species was dissected, from which total RNA was extracted and used for cDNA library construction. Massive sequencing by the Illumina protocol, followed by de novo assembly, resulted in a total of 110,528 transcripts. From those, we were able to annotate 182, which putatively code for peptides/proteins with sequence similarity to previously-reported venom components available from different protein databases. Transcripts seemingly coding for enzymes showed the richest diversity, with 52 sequences putatively coding for proteases, 20 for phospholipases, 8 for lipases and 5 for hyaluronidases. The number of different transcripts potentially coding for peptides with sequence similarity to those that affect ion channels was 19, for putative antimicrobial peptides 19, and for protease inhibitor-like peptides, 18. Transcripts seemingly coding for other venom components were identified and described. The LC/MS analysis of a trypsin-digested venom aliquot resulted in 23 matches with the translated transcriptome database, which validates the transcriptome. The proteomic and transcriptomic analyses reported here constitute the first approach to study the venom components from a scorpion species belonging to the family Euscorpiidae. The data certainly show that this venom is different from all the ones described thus far in the literature. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. RNA Nuclear Export: From Neurological Disorders to Cancer.

    PubMed

    Hautbergue, Guillaume M

    2017-01-01

    The presence of a nuclear envelope, also known as nuclear membrane, defines the structural framework of all eukaryotic cells by separating the nucleus, which contains the genetic material, from the cytoplasm where the synthesis of proteins takes place. Translation of proteins in Eukaryotes is thus dependent on the active transport of DNA-encoded RNA molecules through pores embedded within the nuclear membrane. Several mechanisms are involved in this process generally referred to as RNA nuclear export or nucleocytoplasmic transport of RNA. The regulated expression of genes requires the nuclear export of protein-coding messenger RNA molecules (mRNAs) as well as non-coding RNAs (ncRNAs) together with proteins and pre-assembled ribosomal subunits. The nuclear export of mRNAs is intrinsically linked to the co-transcriptional processing of nascent transcripts synthesized by the RNA polymerase II. This functional coupling is essential for the survival of cells allowing for timely nuclear export of fully processed transcripts, which could otherwise cause the translation of abnormal proteins such as the polymeric repeat proteins produced in some neurodegenerative diseases. Alterations of the mRNA nuclear export pathways can also lead to genome instability and to various forms of cancer. This chapter will describe the molecular mechanisms driving the nuclear export of RNAs with a particular emphasis on mRNAs. It will also review their known alterations in neurological disorders and cancer, and the recent opportunities they offer for the potential development of novel therapeutic strategies.

  17. Transcriptional regulation of decreased protein synthesis during skeletal muscle unloading

    NASA Technical Reports Server (NTRS)

    Howard, G.; Steffen, J. M.; Geoghegan, T. E.

    1989-01-01

    The regulatory role of transcriptional alterations in unloaded skeletal muscles was investigated by determining levels of total muscle RNA and mRNA fractions in soleus, gastrocnemius, and extensor digitorum longus (EDL) of rats subjected to whole-body suspension for up to 7 days. After 7 days, total RNA and mRNA contents were lower in soleus and gastrocnemius, compared with controls, but the concentrations of both RNAs per g muscle were unaltered. Alpha-actin mRNA (assessed by dot hybridization) was significantly reduced in soleus after 1, 3, and 7 days of suspension and in gastrocnemius after 3 and 7 days, but was unchanged in EDL. Protein synthesis directed by RNA extracted from soleus and EDL indicated marked alteration in mRNAs coding for several small proteins. Results suggest that altered transcription and availability of specific mRNAs contribute significantly to the regulation of protein synthesis during skeletal muscle unloading.

  18. Controlling nuclear RNA levels.

    PubMed

    Schmid, Manfred; Jensen, Torben Heick

    2018-05-10

    RNA turnover is an integral part of cellular RNA homeostasis and gene expression regulation. Whereas the cytoplasmic control of protein-coding mRNA is often the focus of study, we discuss here the less appreciated role of nuclear RNA decay systems in controlling RNA polymerase II (RNAPII)-derived transcripts. Historically, nuclear RNA degradation was found to be essential for the functionalization of transcripts through their proper maturation. Later, it was discovered to also be an important caretaker of nuclear hygiene by removing aberrant and unwanted transcripts. Recent years have now seen a set of new protein complexes handling a variety of new substrates, revealing functions beyond RNA processing and the decay of non-functional transcripts. This includes an active contribution of nuclear RNA metabolism to the overall cellular control of RNA levels, with mechanistic implications during cellular transitions.

  19. Gene end-like sequences within the 3' non-coding region of the Nipah virus genome attenuate viral gene transcription.

    PubMed

    Sugai, Akihiro; Sato, Hiroki; Yoneda, Misako; Kai, Chieko

    2017-08-01

    The regulation of transcription during Nipah virus (NiV) replication is poorly understood. Using a bicistronic minigenome system, we investigated the involvement of non-coding regions (NCRs) in the transcriptional re-initiation efficiency of NiV RNA polymerase. Reporter assays revealed that attenuation of NiV gene expression was not constant at each gene junction, and that the attenuating property was controlled by the 3' NCR. However, this regulation was independent of the gene-end, gene-start and intergenic regions. Northern blot analysis indicated that regulation of viral gene expression by the phosphoprotein (P) and large protein (L) 3' NCRs occurred at the transcription level. We identified uridine-rich tracts within the L 3' NCR that are similar to gene-end signals. These gene-end-like sequences were recognized as weak transcription termination signals by the viral RNA polymerase, thereby reducing downstream gene transcription. Thus, we suggest that NiV has a unique mechanism of transcriptional regulation. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

    PubMed

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-06-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. The histone modifications governing TFF1 transcription mediated by estrogen receptor.

    PubMed

    Li, Yanyan; Sun, Luyang; Zhang, Yu; Wang, Dandan; Wang, Feng; Liang, Jing; Gui, Bin; Shang, Yongfeng

    2011-04-22

    Transcription regulation by histone modifications is a major contributing factor to the structural and functional diversity in biology. These modifications are encrypted as histone codes or histone languages and function to establish and maintain heritable epigenetic codes that define the identity and the fate of the cell. Despite recent advances revealing numerous histone modifications associated with transcription regulation, how such modifications dictate the process of transcription is not fully understood. Here we describe spatial and temporal analyses of the histone modifications that are introduced during estrogen receptor α (ERα)-activated transcription. We demonstrated that aborting RNA polymerase II caused a disruption of the histone modifications that are associated with transcription elongation but had a minimal effect on modifications deposited during transcription initiation. We also found that the histone H3S10 phosphorylation mark is catalyzed by mitogen- and stress-activated protein kinase 1 (MSK1) and is recognized by a 14-3-3ζ/14-3-3ε heterodimer through its interaction with H3K4 trimethyltransferase SMYD3 and the p52 subunit of TFIIH. We showed that H3S10 phosphorylation is a prerequisite for H3K4 trimethylation. In addition, we demonstrated that SET8/PR-Set7/KMT5A is required for ERα-regulated transcription and its catalyzed H4K20 monomethylation is implicated in both transcription initiation and elongation. Our experiments provide a relatively comprehensive analysis of histone modifications associated with ERα-regulated transcription and define the biological meaning of several key components of the histone code that governs ERα-regulated transcription.

  2. Regulation of expression of two LY-6 family genes by intron retention and transcription induced chimerism

    PubMed Central

    Calvanese, Vincenzo; Mallya, Meera; Campbell, R Duncan; Aguado, Begoña

    2008-01-01

    Background Regulation of the expression of particular genes can rely on mechanisms that are different from classical transcriptional and translational control. The LY6G5B and LY6G6D genes encode LY-6 domain proteins, whose expression seems to be regulated in an original fashion, consisting of an intron retention event which generates, through an early premature stop codon, a non-coding transcript, preventing expression in most cell lines and tissues. Results The MHC LY-6 non-coding transcripts have been shown to be stable and very abundant in the cell, and not subject to Nonsense Mediated Decay (NMD). This retention event appears not to be solely dependent on intron features, because in the case of LY6G5B, when the intron is inserted in the artificial context of a luciferase expression plasmid, it is fully spliced but strongly stabilises the resulting luciferase transcript. In addition, by quantitative PCR we found that the retained and spliced forms are differentially expressed in tissues indicating an active regulation of the non-coding transcript. EST database analysis revealed that these genes have an alternative expression pathway with the formation of Transcription Induced Chimeras (TIC). This data was confirmed by RT-PCR, revealing the presence of different transcripts that would encode the chimeric proteins CSNKβ-LY6G5B and G6F-LY6G6D, in which the LY-6 domain would join to a kinase domain and an Ig-like domain, respectively. Conclusion In conclusion, the LY6G5B and LY6G6D intron-retained transcripts are not subjected to NMD and are more abundant than the properly spliced forms. In addition, these genes form chimeric transcripts with their neighbouring same orientation 5' genes. Of interest is the fact that the 5' genes (CSNKβ or G6F) undergo differential splicing only in the context of the chimera (CSNKβ-LY6G5B or G6F-LY6G6D) and not on their own. PMID:18817541

  3. The majority of total nuclear-encoded non-ribosomal RNA in a human cell is 'dark matter' un-annotated RNA.

    PubMed

    Kapranov, Philipp; St Laurent, Georges; Raz, Tal; Ozsolak, Fatih; Reynolds, C Patrick; Sorensen, Poul H B; Reaman, Gregory; Milos, Patrice; Arceci, Robert J; Thompson, John F; Triche, Timothy J

    2010-12-21

    Discovery that the transcriptional output of the human genome is far more complex than predicted by the current set of protein-coding annotations and that most RNAs produced do not appear to encode proteins has transformed our understanding of genome complexity and suggests new paradigms of genome regulation. However, the fraction of all cellular RNA whose function we do not understand and the fraction of the genome that is utilized to produce that RNA remain controversial. This is not simply a bookkeeping issue because the degree to which this un-annotated transcription is present has important implications with respect to its biologic function and to the general architecture of genome regulation. For example, efforts to elucidate how non-coding RNAs (ncRNAs) regulate genome function will be compromised if that class of RNAs is dismissed as simply 'transcriptional noise'. We show that the relative mass of RNA whose function and/or structure we do not understand (the so called 'dark matter' RNAs), as a proportion of all non-ribosomal, non-mitochondrial human RNA (mt-RNA), can be greater than that of protein-encoding transcripts. This observation is obscured in studies that focus only on polyA-selected RNA, a method that enriches for protein coding RNAs and at the same time discards the vast majority of RNA prior to analysis. We further show the presence of a large number of very long, abundantly-transcribed regions (100's of kb) in intergenic space and further show that expression of these regions is associated with neoplastic transformation. These overlap some regions found previously in normal human embryonic tissues and raise an interesting hypothesis as to the function of these ncRNAs in both early development and neoplastic transformation.
    We conclude that 'dark matter' RNA can constitute the majority of non-ribosomal, non-mitochondrial-RNA and a significant fraction arises from numerous very long, intergenic transcribed regions that could be involved in neoplastic transformation.

  4. Pan-cancer transcriptomic analysis associates long non-coding RNAs with key mutational driver events

    PubMed Central

    Ashouri, Arghavan; Sayin, Volkan I.; Van den Eynden, Jimmy; Singh, Simranjit X.; Papagiannakopoulos, Thales; Larsson, Erik

    2016-01-01

    Thousands of long non-coding RNAs (lncRNAs) lie interspersed with coding genes across the genome, and a small subset has been implicated as downstream effectors in oncogenic pathways. Here we make use of transcriptome and exome sequencing data from thousands of tumours across 19 cancer types, to identify lncRNAs that are induced or repressed in relation to somatic mutations in key oncogenic driver genes. Our screen confirms known coding and non-coding effectors and also associates many new lncRNAs to relevant pathways. The associations are often highly reproducible across cancer types, and while many lncRNAs are co-expressed with their protein-coding hosts or neighbours, some are intergenic and independent. We highlight lncRNAs with possible functions downstream of the tumour suppressor TP53 and the master antioxidant transcription factor NFE2L2. Our study provides a comprehensive overview of lncRNA transcriptional alterations in relation to key driver mutational events in human cancers. PMID:28959951

  5. De Novo ORFs in Drosophila Are Important to Organismal Fitness and Evolved Rapidly from Previously Non-coding Sequences

    PubMed Central

    Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.

    2013-01-01

    How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629

  6. Role of genomic architecture in the expression dynamics of long noncoding RNAs during differentiation of human neuroblastoma cells.

    PubMed

    Batagov, Arsen O; Yarmishyn, Aliaksandr A; Jenjaroenpun, Piroon; Tan, Jovina Z; Nishida, Yuichiro; Kurochkin, Igor V

    2013-10-16

    Mammalian genomes are extensively transcribed producing thousands of long non-protein-coding RNAs (lncRNAs). The biological significance and function of the vast majority of lncRNAs remain unclear. Recent studies have implicated several lncRNAs as playing important roles in embryonic development and cancer progression. LncRNAs are characterized by different genomic architectures in relation to their associated protein-coding genes. Our study aimed at bridging lncRNA architecture with dynamical patterns of their expression using a model of differentiating human neuroblastoma cells. LncRNA expression was studied in a 120-hour time course of differentiation of human neuroblastoma SH-SY5Y cells into neurons upon treatment with retinoic acid (RA), the compound used for the treatment of neuroblastoma. A custom microarray chip was utilized to interrogate expression levels of 9,267 lncRNAs in the course of differentiation. We categorized lncRNAs into 19 architecture classes according to their position relative to protein-coding genes. For each architecture class, dynamics of expression of lncRNAs was studied in association with their protein-coding partners. It allowed us to demonstrate positive correlation of lncRNAs with their associated protein-coding genes at bidirectional promoters and for sense-antisense transcript pairs. In contrast, lncRNAs located in the introns and downstream of the protein-coding genes were characterized by negative correlation modes. We further classified the lncRNAs by the temporal patterns of their expression dynamics. We found that intronic and bidirectional promoter architectures are associated with rapid RA-dependent induction or repression of the corresponding lncRNAs, followed by their constant expression. At the same time, lncRNAs expressed downstream of protein-coding genes are characterized by rapid induction, followed by transcriptional repression.
    Quantitative RT-PCR analysis confirmed the discovered functional modes for several selected lncRNAs associated with proteins involved in cancer and embryonic development. This is the first report detailing dynamical changes of multiple lncRNAs during RA-induced neuroblastoma differentiation. Integration of genomic and transcriptomic levels of information allowed us to demonstrate specific behavior of lncRNAs organized in different genomic architectures. This study also provides a list of lncRNAs with possible roles in neuroblastoma.

  7. Efficiency of VIGS and gene expression in a novel bipartite potexvirus vector delivery system as a function of strength of TGB1 silencing suppression.

    PubMed

    Lim, Hyoun-Sub; Vaira, Anna Maria; Domier, Leslie L; Lee, Sung Chul; Kim, Hong Gi; Hammond, John

    2010-06-20

    We have developed plant virus-based vectors for virus-induced gene silencing (VIGS) and protein expression, based on Alternanthera mosaic virus (AltMV), for infection of a wide range of host plants including Nicotiana benthamiana and Arabidopsis thaliana by either mechanical inoculation of in vitro transcripts or via agroinfiltration. In vivo transcripts produced by co-agroinfiltration of bacteriophage T7 RNA polymerase resulted in T7-driven AltMV infection from a binary vector in the absence of the Cauliflower mosaic virus 35S promoter. An artificial bipartite viral vector delivery system was created by separating the AltMV RNA-dependent RNA polymerase and Triple Gene Block (TGB)123-Coat protein (CP) coding regions into two constructs each bearing the AltMV 5' and 3' non-coding regions, which recombined in planta to generate a full-length AltMV genome. Substitution of TGB1 L(88)P, and equivalent changes in other potexvirus TGB1 proteins, affected RNA silencing suppression efficacy and suitability of the vectors from protein expression to VIGS. Published by Elsevier Inc.

  8. Deep sequencing approaches for the analysis of prokaryotic transcriptional boundaries and dynamics.

    PubMed

    James, Katherine; Cockell, Simon J; Zenkin, Nikolay

    2017-05-01

    The identification of the protein-coding regions of a genome is straightforward due to the universality of start and stop codons. However, the boundaries of the transcribed regions, conditional operon structures, non-coding RNAs and the dynamics of transcription, such as pausing of elongation, are non-trivial to identify, even in the comparatively simple genomes of prokaryotes. Traditional methods for the study of these areas, such as tiling arrays, are noisy, labour-intensive and lack the resolution required for densely-packed bacterial genomes. Recently, deep sequencing has become increasingly popular for the study of the transcriptome due to its lower costs, higher accuracy and single nucleotide resolution. These methods have revolutionised our understanding of prokaryotic transcriptional dynamics. Here, we review the deep sequencing and data analysis techniques that are available for the study of transcription in prokaryotes, and discuss the bioinformatic considerations of these analyses. Copyright © 2017 Elsevier Inc. All rights reserved.
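
    The abstract's opening claim, that protein-coding regions can be located from start and stop codons, can be illustrated with a minimal sketch. The function name `find_orfs` and the `min_codons` parameter are hypothetical and not taken from the review. The sketch scans only the forward strand, assumes ATG as the sole start codon, and assumes the standard stop codons; practical gene finders also scan the reverse strand and allow alternative bacterial start codons such as GTG and TTG.

```python
# Illustrative only: a naive forward-strand ORF scan, not the method of the review.
START = "ATG"                   # assumption: canonical start codon only
STOPS = {"TAA", "TAG", "TGA"}   # standard stop codons

def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs, 0-based and end-exclusive,
    for ORFs that begin with ATG and end at the first in-frame stop."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):                      # three forward reading frames
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i                       # open a candidate ORF
            elif start is not None and codon in STOPS:
                # codons before the stop, including the start codon
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # close it and keep scanning
    return sorted(orfs)

if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

    As the abstract goes on to note, the same approach does not carry over to transcript boundaries, operon structure or non-coding RNAs, which have no comparable universal signal.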

  9. A Rapid, Extensive, and Transient Transcriptional Response to Estrogen Signaling in Breast Cancer Cells

    PubMed Central

    Hah, Nasun; Danko, Charles G.; Core, Leighton; Waterfall, Joshua J.; Siepel, Adam; Lis, John T.; Kraus, W. Lee

    2011-01-01

    Summary We report the immediate effects of estrogen signaling on the transcriptome of breast cancer cells using Global Run-On and sequencing (GRO-seq). The data were analyzed using a new bioinformatic approach that allowed us to identify transcripts directly from the GRO-seq data. We found that estrogen signaling directly regulates a strikingly large fraction of the transcriptome in a rapid, robust, and unexpectedly transient manner. In addition to protein coding genes, estrogen regulates the distribution and activity of all three RNA polymerases, and virtually every class of non-coding RNA that has been described to date. We also identified a large number of previously undetected estrogen-regulated intergenic transcripts, many of which are found proximal to estrogen receptor binding sites. Collectively, our results provide the most comprehensive measurement of the primary and immediate estrogen effects to date and a resource for understanding rapid signal-dependent transcription in other systems. PMID:21549415

  10. Regulatory coding of lymphoid lineage choice by hematopoietic transcription factors

    NASA Technical Reports Server (NTRS)

    Warren, Luigi A.; Rothenberg, Ellen V.

    2003-01-01

    During lymphopoiesis, precursor cells negotiate a complex regulatory space, defined by the levels of several competing and cross-regulating transcription factors, before arriving at stable states of commitment to the B-, T- and NK-specific developmental programs. Recent perturbation experiments provide evidence that this space has three major axes, corresponding to the PU.1 versus GATA-1 balance, the intensity of Notch signaling through the CSL pathway, and the ratio of E-box transcription factors to their Id protein antagonists.

  11. The RNA Exosome Adaptor ZFC3H1 Functionally Competes with Nuclear Export Activity to Retain Target Transcripts.

    PubMed

    Silla, Toomas; Karadoulama, Evdoxia; Mąkosa, Dawid; Lubas, Michal; Jensen, Torben Heick

    2018-05-15

    Mammalian genomes are promiscuously transcribed, yielding protein-coding and non-coding products. Many transcripts are short lived due to their nuclear degradation by the ribonucleolytic RNA exosome. Here, we show that abolished nuclear exosome function causes the formation of distinct nuclear foci, containing polyadenylated (pA+) RNA secluded from nucleocytoplasmic export. We asked whether exosome co-factors could serve such nuclear retention. Co-localization studies revealed the enrichment of pA+ RNA foci with "pA-tail exosome targeting (PAXT) connection" components MTR4, ZFC3H1, and PABPN1 but no overlap with known nuclear structures such as Cajal bodies, speckles, paraspeckles, or nucleoli. Interestingly, ZFC3H1 is required for foci formation, and in its absence, selected pA+ RNAs, including coding and non-coding transcripts, are exported to the cytoplasm in a process dependent on the mRNA export factor AlyREF. Our results establish ZFC3H1 as a central nuclear pA+ RNA retention factor, counteracting nuclear export activity. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  12. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

    PubMed

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-02-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  13. Long Non-coding RNA, PANDA, Contributes to the Stabilization of p53 Tumor Suppressor Protein.

    PubMed

    Kotake, Yojiro; Kitagawa, Kyoko; Ohhata, Tatsuya; Sakai, Satoshi; Uchida, Chiharu; Niida, Hiroyuki; Naemura, Madoka; Kitagawa, Masatoshi

    2016-04-01

    P21-associated noncoding RNA DNA damage-activated (PANDA) is induced in response to DNA damage and represses apoptosis by inhibiting the function of nuclear transcription factor Y subunit alpha (NF-YA) transcription factor. Herein, we report that PANDA affects regulation of p53 tumor-suppressor protein. U2OS cells were transfected with PANDA siRNAs. At 72 h post-transfection, cells were subjected to immunoblotting and quantitative reverse transcription-polymerase chain reaction. Depletion of PANDA was associated with decreased levels of p53 protein, but not p53 mRNA. The stability of p53 protein was markedly reduced by PANDA silencing. Degradation of p53 protein by silencing PANDA was prevented by treatment of MG132, a proteasome inhibitor. Moreover, depletion of PANDA prevented accumulation of p53 protein, as a result of DNA damage, induced by the genotoxic agent etoposide. These results suggest that PANDA stabilizes p53 protein in response to DNA damage, and provide new insight into the regulatory mechanisms of p53. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.

  14. PRIC320, a transcription coactivator, isolated from peroxisome proliferator-binding protein complex.

    PubMed

    Surapureddi, Sailesh; Viswakarma, Navin; Yu, Songtao; Guo, Dongsheng; Rao, M Sambasiva; Reddy, Janardan K

    2006-05-05

    Ciprofibrate, a potent peroxisome proliferator, induces pleiotropic responses in liver by activating peroxisome proliferator-activated receptor alpha (PPARalpha), a nuclear receptor. Transcriptional regulation by liganded nuclear receptors involves the participation of coregulators that form multiprotein complexes possibly to achieve cell and gene specific transcription. SDS-PAGE and matrix-assisted laser desorption/ionization reflection time-of-flight mass spectrometric analyses of ciprofibrate-binding proteins from liver nuclear extracts obtained using ciprofibrate-Sepharose affinity matrix resulted in the identification of a new high molecular weight nuclear receptor coactivator, which we designated PRIC320. The full-length human cDNA encoding this protein has an open-reading frame that codes for a 320kDa protein containing 2882 amino acids. PRIC320 contains five LXXLL signature motifs that mediate interaction with nuclear receptors. PRIC320 binds avidly to nuclear receptors PPARalpha, CAR, ERalpha, and RXR, but only minimally with PPARgamma. PRIC320 also interacts with transcription cofactors CBP, PRIP, and PBP. Immunoprecipitation-immunoblotting as well as cellular localization studies confirmed the interaction between PPARalpha and PRIC320. PRIC320 acts as a transcription coactivator by stimulating PPARalpha-mediated transcription. We conclude that ciprofibrate, a PPARalpha ligand, binds a multiprotein complex and PRIC320 cloned from this complex functions as a nuclear receptor coactivator.

  15. Quality control of mRNP biogenesis: networking at the transcription site.

    PubMed

    Eberle, Andrea B; Visa, Neus

    2014-08-01

    Eukaryotic cells carry out quality control (QC) over the processes of RNA biogenesis to inactivate or eliminate defective transcripts, and to avoid their production. In the case of protein-coding transcripts, the quality controls can sense defects in the assembly of mRNA-protein complexes, in the processing of the precursor mRNAs, and in the sequence of open reading frames. Different types of defect are monitored by different specialized mechanisms. Some of them involve dedicated factors whose function is to identify faulty molecules and target them for degradation. Others are the result of a more subtle balance in the kinetics of opposing activities in the mRNA biogenesis pathway. One way or another, all such mechanisms hinder the expression of the defective mRNAs through processes as diverse as rapid degradation, nuclear retention and transcriptional silencing. Three major degradation systems are responsible for the destruction of the defective transcripts: the exosome, the 5'-3' exoribonucleases, and the nonsense-mediated mRNA decay (NMD) machinery. This review summarizes recent findings on the cotranscriptional quality control of mRNA biogenesis, and speculates that a protein-protein interaction network integrates multiple mRNA degradation systems with the transcription machinery. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. The gene coding for the B cell surface protein CD19 is localized on human chromosome 16p11.

    PubMed

    Stapleton, P; Kozmik, Z; Weith, A; Busslinger, M

    1995-02-01

    The CD19 gene codes for one of the earliest markers of the human B cell lineage and is a target for the B lymphoid-specific transcription factor BSAP (Pax-5). The transmembrane protein CD19 has been implicated in controlling proliferation of mature B lymphocytes by modulating signal transduction through the antigen receptor. In this study, we have employed Southern blot and fluorescence in situ hybridization analyses to localize the CD19 gene to human chromosome 16p11.

  17. A Change in SHATTERPROOF Protein Lies at the Origin of a Fruit Morphological Novelty and a New Strategy for Seed Dispersal in Medicago Genus

    PubMed Central

    Fourquin, Chloé; del Cerro, Carolina; Victoria, Filipe C.; Vialette-Guiraud, Aurélie; de Oliveira, Antonio C.; Ferrándiz, Cristina

    2013-01-01

    Angiosperms are the most diverse and numerous group of plants, and it is generally accepted that this evolutionary success owes in part to the diversity found in fruits, key for protecting the developing seeds and ensuring seed dispersal. Although studies on the molecular basis of morphological innovations are few, they all illustrate the central role played by transcription factors acting as developmental regulators. Here, we show that a small change in the protein sequence of a MADS-box transcription factor correlates with the origin of a highly modified fruit morphology and the change in seed dispersal strategies that occurred in Medicago, a genus belonging to the large legume family. This protein sequence modification alters the functional properties of the protein, affecting the affinities for other protein partners involved in high-order complexes. Our work illustrates that variation in coding regions can generate evolutionary novelties not based on gene duplication/subfunctionalization but by interactions in complex networks, contributing also to the current debate on the relative importance of changes in regulatory or coding regions of master regulators in generating morphological novelties. PMID:23640757

  18. Post-transcriptional Regulation of Genes Related to Biological Behaviors of Gastric Cancer by Long Noncoding RNAs and MicroRNAs

    PubMed Central

    Liu, Wenjing; Ma, Rui; Yuan, Yuan

    2017-01-01

    Noncoding RNAs play critical roles in regulating protein-coding genes and comprise two major classes: long noncoding RNAs (lncRNAs) and microRNAs (miRNAs). LncRNAs regulate gene expression at transcriptional, post-transcriptional, and epigenetic levels via multiple action modes. LncRNAs can also function as endogenous competitive RNAs for miRNAs and indirectly regulate gene expression post-transcriptionally. By binding to the 3'-untranslated regions (3'-UTR) of target genes, miRNAs post-transcriptionally regulate gene expression. Herein, we conducted a review of post-transcriptional regulation by lncRNAs and miRNAs of genes associated with biological behaviors of gastric cancer. PMID:29187891

  19. Dynamic Modeling of GAIT System Reveals Transcriptome Expansion and Translational Trickle Control Device

    PubMed Central

    Yao, Peng; Potdar, Alka A.; Arif, Abul; Ray, Partho Sarothi; Mukhopadhyay, Rupak; Willard, Belinda; Xu, Yichi; Yan, Jun; Saidel, Gerald M.; Fox, Paul L.

    2012-01-01

    SUMMARY Post-transcriptional regulatory mechanisms superimpose “fine-tuning” control upon “on-off” switches characteristic of gene transcription. We have exploited computational modeling with experimental validation to resolve an anomalous relationship between mRNA expression and protein synthesis. Differential GAIT (Gamma-interferon Activated Inhibitor of Translation) complex activation repressed VEGF-A synthesis to a low, constant rate despite high, variable VEGFA mRNA expression. Dynamic model simulations indicated the presence of an unidentified, inhibitory GAIT element-interacting factor. We discovered a truncated form of glutamyl-prolyl tRNA synthetase (EPRS), the GAIT constituent that binds the 3’-UTR GAIT element in target transcripts. The truncated protein, EPRSN1, prevents binding of functional GAIT complex. EPRSN1 mRNA is generated by a remarkable polyadenylation-directed conversion of a Tyr codon in the EPRS coding sequence to a stop codon (PAY*). By low-level protection of GAIT element-bearing transcripts, EPRSN1 imposes a robust “translational trickle” of target protein expression. Genome-wide analysis shows PAY* generates multiple truncated transcripts thereby contributing to transcriptome expansion. PMID:22386318

  20. An RNA motif advances transcription by preventing Rho-dependent termination

    PubMed Central

    Sevostyanova, Anastasia; Groisman, Eduardo A.

    2015-01-01

    The transcription termination factor Rho associates with most nascent bacterial RNAs as they emerge from RNA polymerase. However, pharmacological inhibition of Rho derepresses only a small fraction of these transcripts. What, then, determines the specificity of Rho-dependent transcription termination? We now report the identification of a Rho-antagonizing RNA element (RARE) that hinders Rho-dependent transcription termination. We establish that RARE traps Rho in an inactive complex but does not prevent Rho binding to its recruitment sites. Although translating ribosomes normally block Rho access to an mRNA, inefficient translation of an open reading frame in the leader region of the Salmonella mgtCBR operon actually enables transcription of its associated coding region by favoring an RNA conformation that sequesters RARE. The discovery of an RNA element that inactivates Rho signifies that the specificity of nucleic-acid binding proteins is defined not only by the sequences that recruit these proteins but also by sequences that antagonize their activity. PMID:26630006

  1. Functional transcriptomics of wild-caught Lutzomyia intermedia salivary glands: identification of a protective salivary protein against Leishmania braziliensis infection.

    PubMed

    de Moura, Tatiana R; Oliveira, Fabiano; Carneiro, Marcia W; Miranda, José Carlos; Clarêncio, Jorge; Barral-Netto, Manoel; Brodskyn, Cláudia; Barral, Aldina; Ribeiro, José M C; Valenzuela, Jesus G; de Oliveira, Camila I

    2013-01-01

    Leishmania parasites are transmitted in the presence of sand fly saliva. Together with the parasite, the sand fly injects salivary components that change the environment at the feeding site. Mice immunized with Phlebotomus papatasi salivary gland (SG) homogenate are protected against Leishmania major infection, while immunity to Lutzomyia intermedia SG homogenate exacerbated experimental Leishmania braziliensis infection. In humans, antibodies to Lu. intermedia saliva are associated with risk of acquiring L. braziliensis infection. Despite these important findings, there is no information regarding the repertoire of Lu. intermedia salivary proteins. A cDNA library from the Salivary Glands (SGs) of wild-caught Lu. intermedia was constructed, sequenced, and complemented by a proteomic approach based on 1D SDS PAGE and mass/mass spectrometry to validate the transcripts present in this cDNA library. We identified the most abundant transcripts and proteins reported in other sand fly species as well as novel proteins such as neurotoxin-like proteins, peptides with ML domain, and three small peptides found so far only in this sand fly species. DNA plasmids coding for ten selected transcripts were constructed and used to immunize BALB/c mice to study their immunogenicity. Plasmid Linb-11--coding for a 4.5-kDa protein--induced a cellular immune response and conferred protection against L. braziliensis infection. This protection correlated with a decreased parasite load and an increased frequency of IFN-γ-producing cells. We identified the most abundant and novel proteins present in the SGs of Lu. intermedia, a vector of cutaneous leishmaniasis in the Americas. We also show for the first time that immunity to a single salivary protein from Lu. intermedia can protect against cutaneous leishmaniasis caused by L. braziliensis.

  2. A combinatorial code for pattern formation in Drosophila oogenesis.

    PubMed

    Yakoby, Nir; Bristow, Christopher A; Gong, Danielle; Schafer, Xenia; Lembong, Jessica; Zartman, Jeremiah J; Halfon, Marc S; Schüpbach, Trudi; Shvartsman, Stanislav Y

    2008-11-01

    Two-dimensional patterning of the follicular epithelium in Drosophila oogenesis is required for the formation of three-dimensional eggshell structures. Our analysis of a large number of published gene expression patterns in the follicle cells suggests that they follow a simple combinatorial code based on six spatial building blocks and the operations of union, difference, intersection, and addition. The building blocks are related to the distribution of inductive signals, provided by the highly conserved epidermal growth factor receptor and bone morphogenetic protein signaling pathways. We demonstrate the validity of the code by testing it against a set of patterns obtained in a large-scale transcriptional profiling experiment. Using the proposed code, we distinguish 36 distinct patterns for 81 genes expressed in the follicular epithelium and characterize their joint dynamics over four stages of oogenesis. The proposed combinatorial framework allows systematic analysis of the diversity and dynamics of two-dimensional transcriptional patterns and guides future studies of gene regulation.

  3. The Canonical Immediate Early 3 Gene Product pIE611 of Mouse Cytomegalovirus Is Dispensable for Viral Replication but Mediates Transcriptional and Posttranscriptional Regulation of Viral Gene Products.

    PubMed

    Rattay, Stephanie; Trilling, Mirko; Megger, Dominik A; Sitek, Barbara; Meyer, Helmut E; Hengel, Hartmut; Le-Trilling, Vu Thuy Khanh

    2015-08-01

    Transcription of mouse cytomegalovirus (MCMV) immediate early ie1 and ie3 is controlled by the major immediate early promoter/enhancer (MIEP) and requires differential splicing. Based on complete loss of genome replication of an MCMV mutant carrying a deletion of the ie3-specific exon 5, the multifunctional IE3 protein (611 amino acids; pIE611) is considered essential for viral replication. Our analysis of ie3 transcription resulted in the identification of novel ie3 isoforms derived from alternatively spliced ie3 transcripts. Construction of an IE3-hemagglutinin (IE3-HA) virus by insertion of an in-frame HA epitope sequence allowed detection of the IE3 isoforms in infected cells, verifying that the newly identified transcripts code for proteins. This prompted the construction of an MCMV mutant lacking ie611 but retaining the coding capacity for the newly identified isoforms ie453 and ie310. Using Δie611 MCMV, we demonstrated the dispensability of the canonical ie3 gene product pIE611 for viral replication. To determine the role of pIE611 for viral gene expression during MCMV infection in an unbiased global approach, we used label-free quantitative mass spectrometry to delineate pIE611-dependent changes of the MCMV proteome. Interestingly, further analysis revealed transcriptional as well as posttranscriptional regulation of MCMV gene products by pIE611. Cytomegaloviruses are pathogenic betaherpesviruses persisting in a lifelong latency from which reactivation can occur under conditions of immunosuppression, immunoimmaturity, or inflammation. The switch from latency to reactivation requires expression of immediate early genes. Therefore, understanding of immediate early gene regulation might add insights into viral pathogenesis. The mouse cytomegalovirus (MCMV) immediate early 3 protein (611 amino acids; pIE611) is considered essential for viral replication. 
    The identification of novel protein isoforms derived from alternatively spliced ie3 transcripts prompted the construction of an MCMV mutant lacking ie611 but retaining the coding capacity for the newly identified isoforms ie453 and ie310. Using Δie611 MCMV, we demonstrated the dispensability of the canonical ie3 gene product pIE611 for viral replication and delineated pIE611-dependent changes of the MCMV proteome. Our findings have fundamental implications for the interpretation of earlier studies on pIE3 functions and highlight the complex orchestration of MCMV gene regulation. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  4. Light-Regulated Transcription of Genes Encoding Peridinin Chlorophyll a Proteins and the Major Intrinsic Light-Harvesting Complex Proteins in the Dinoflagellate Amphidinium carterae Hulburt (Dinophycae)

    PubMed Central

    ten Lohuis, Michael R.; Miller, David J.

    1998-01-01

    In the dinoflagellate Amphidinium carterae, photoadaptation involves changes in the transcription of genes encoding both of the major classes of light-harvesting proteins, the peridinin chlorophyll a proteins (PCPs) and the major a/c-containing intrinsic light-harvesting proteins (LHCs). PCP and LHC transcript levels increased up to 86- and 6-fold, respectively, under low-light conditions relative to cells grown at high illumination. These increases in transcript abundance were accompanied by decreases in the extent of methylation of CpG and CpNpG motifs within or near PCP- and LHC-coding regions. Cytosine methylation levels in A. carterae are therefore nonstatic and may vary with environmental conditions in a manner suggestive of involvement in the regulation of gene expression. However, chemically induced undermethylation was insufficient to activate transcription, because treatment with two methylation inhibitors had no effect on PCP mRNA or protein levels. Regulation of gene activity through changes in DNA methylation has traditionally been assumed to be restricted to higher eukaryotes (deuterostomes and green plants); however, the atypically large genomes of dinoflagellates may have generated the requirement for systems of this type in a relatively “primitive” organism. Dinoflagellates may therefore provide a unique perspective on the evolution of eukaryotic DNA-methylation systems. PMID:9576788

  5. Origins of De Novo Genes in Human and Chimpanzee.

    PubMed

    Ruiz-Orera, Jorge; Hernandez-Rodriguez, Jessica; Chiva, Cristina; Sabidó, Eduard; Kondova, Ivanela; Bontrop, Ronald; Marqués-Bonet, Tomàs; Albà, M Mar

    2015-12-01

    The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, it is yet unclear which is the prevalence and underlying mechanisms of de novo gene emergence. In order to obtain a comprehensive view of this process, we have performed in-depth sequencing of the transcriptomes of four mammalian species--human, chimpanzee, macaque, and mouse--and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This has resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the rest of species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes which have evidence of protein translation. Taken together, the data support a model in which frequently-occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins.

  6. Origins of De Novo Genes in Human and Chimpanzee

    PubMed Central

    Ruiz-Orera, Jorge; Hernandez-Rodriguez, Jessica; Chiva, Cristina; Sabidó, Eduard; Kondova, Ivanela; Bontrop, Ronald; Marqués-Bonet, Tomàs; Albà, M. Mar

    2015-01-01

    The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, it is yet unclear which is the prevalence and underlying mechanisms of de novo gene emergence. In order to obtain a comprehensive view of this process, we have performed in-depth sequencing of the transcriptomes of four mammalian species—human, chimpanzee, macaque, and mouse—and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This has resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the rest of species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes which have evidence of protein translation. Taken together, the data support a model in which frequently-occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins. PMID:26720152

  7. Opposite GC skews at the 5' and 3' ends of genes in unicellular fungi

    PubMed Central

    2011-01-01

    Background GC-skews have previously been linked to transcription in some eukaryotes. They have been associated with transcription start sites, with the coding strand G-biased in mammals and C-biased in fungi and invertebrates. Results We show a consistent and highly significant pattern of GC-skew within genes of almost all unicellular fungi. The pattern of GC-skew is asymmetrical: the coding strand of genes is typically C-biased at the 5' ends but G-biased at the 3' ends, with intermediate skews at the middle of genes. Thus, the initiation, elongation, and termination phases of transcription are associated with different skews. This pattern influences the encoded proteins by generating differential usage of amino acids at the 5' and 3' ends of genes. These biases also affect fourfold-degenerate positions and extend into promoters and 3' UTRs, indicating that skews cannot be accounted by selection for protein function or translation. Conclusions We propose two explanations, the mutational pressure hypothesis, and the adaptive hypothesis. The mutational pressure hypothesis is that different co-factors bind to RNA pol II at different phases of transcription, producing different mutational regimes. The adaptive hypothesis is that cytidine triphosphate deficiency may lead to C-avoidance at the 3' ends of transcripts to control the flow of RNA pol II molecules and reduce their frequency of collisions. PMID:22208287

  8. Measles virus minigenomes encoding two autofluorescent proteins reveal cell-to-cell variation in reporter expression dependent on viral sequences between the transcription units.

    PubMed

    Rennick, Linda J; Duprex, W Paul; Rima, Bert K

    2007-10-01

    Transcription from morbillivirus genomes commences at a single promoter in the 3' non-coding terminus, with the six genes being transcribed sequentially. The 3' and 5' untranslated regions (UTRs) of the genes (mRNA sense), together with the intergenic trinucleotide spacer, comprise the non-coding sequences (NCS) of the virus and contain the conserved gene end and gene start signals, respectively. Bicistronic minigenomes containing transcription units (TUs) encoding autofluorescent reporter proteins separated by measles virus (MV) NCS were used to give a direct estimation of gene expression in single, living cells by assessing the relative amounts of each fluorescent protein in each cell. Initially, five minigenomes containing each of the MV NCS were generated. Assays were developed to determine the amount of each fluorescent protein in cells at both cell population and single-cell levels. This revealed significant variations in gene expression between cells expressing the same NCS-containing minigenome. The minigenome containing the M/F NCS produced significantly lower amounts of fluorescent protein from the second TU (TU2), compared with the other minigenomes. A minigenome with a truncated F 5' UTR had increased expression from TU2. This UTR is 524 nt longer than the other MV 5' UTRs. Insertions into the 5' UTR of the enhanced green fluorescent protein gene in the minigenome containing the N/P NCS showed that specific sequences, rather than just the additional length of F 5' UTR, govern this decreased expression from TU2.

  9. A comprehensive catalog of human KRAB-associated zinc finger genes: Insights into the evolutionary history of a large family of transcriptional repressors

    PubMed Central

    Huntley, Stuart; Baggott, Daniel M.; Hamilton, Aaron T.; Tran-Gyamfi, Mary; Yang, Shan; Kim, Joomyeong; Gordon, Laurie; Branscomb, Elbert; Stubbs, Lisa

    2006-01-01

    Krüppel-type zinc finger (ZNF) motifs are prevalent components of transcription factor proteins in all eukaryotes. KRAB-ZNF proteins, in which a potent repressor domain is attached to a tandem array of DNA-binding zinc-finger motifs, are specific to tetrapod vertebrates and represent the largest class of ZNF proteins in mammals. To define the full repertoire of human KRAB-ZNF proteins, we searched the genome sequence for key motifs and then constructed and manually curated gene models incorporating those sequences. The resulting gene catalog contains 423 KRAB-ZNF protein-coding loci, yielding alternative transcripts that altogether predict at least 742 structurally distinct proteins. Active rounds of segmental duplication, involving single genes or larger regions and including both tandem and distributed duplication events, have driven the expansion of this mammalian gene family. Comparisons between the human genes and ZNF loci mined from the draft mouse, dog, and chimpanzee genomes not only identified 103 KRAB-ZNF genes that are conserved in mammals but also highlighted a substantial level of lineage-specific change; at least 136 KRAB-ZNF coding genes are primate specific, including many recent duplicates. KRAB-ZNF genes are widely expressed and clustered genes are typically not coregulated, indicating that paralogs have evolved to fill roles in many different biological processes. To facilitate further study, we have developed a Web-based public resource with access to gene models, sequences, and other data, including visualization tools to provide genomic context and interaction with other public data sets. PMID:16606702

  10. Modulation of Gene Expression in Contextual Fear Conditioning in the Rat

    PubMed Central

    Macchi, Monica; Ciampini, Cristina; Bernardi, Rodolfo; Baldi, Elisabetta; Bucherelli, Corrado; Brunelli, Marcello; Scuri, Rossana

    2013-01-01

    In contextual fear conditioning (CFC) a single training leads to long-term memory of context-aversive electrical foot-shocks association. Mid-temporal regions of the brain of trained and naive rats were obtained 2 days after conditioning and screened by two-directional suppression subtractive hybridization. A pool of differentially expressed genes was identified and some of them were randomly selected and confirmed with qRT-PCR assay. These transcripts showed high homology for rat gene sequences coding for proteins involved in different cellular processes. The expression of the selected transcripts was also tested in rats which had freely explored the experimental apparatus (exploration) and in rats to which the same number of aversive shocks had been administered in the same apparatus, but temporally compressed so as to make the association between painful stimuli and the apparatus difficult (shock-only). Some genes were differentially expressed only in the rats subjected to CFC, others only in exploration or shock-only rats, whereas the genes coding for translocase of outer mitochondrial membrane 20 protein and nardilysin were differentially expressed in both CFC and exploration rats. For example, the expression of stathmin 1, whose transcripts were upregulated, was also tested to evaluate the transduction and protein localization after conditioning. PMID:24278235

  11. MicroRNAome genome: a treasure for cancer diagnosis and therapy

    PubMed Central

    Berindan-Neagoe, Ioana; Monroig, Paloma; Pasculli, Barbara; Calin, George A.

    2015-01-01

    The interplay between abnormalities in genes coding for proteins and microRNAs (miRNAs) has been among the most exciting yet unexpected discoveries in oncology over the last decade. The complexity of this network has redefined cancer research as these molecules, produced from what was once considered “genomic trash”, have been shown to be crucial for cancer initiation, progression, and dissemination. Naturally occurring miRNAs are very short transcripts that never produce a protein or amino acid chain, but act by regulating protein expression during cellular processes such as growth, development and differentiation at the transcriptional, post-transcriptional and/or translational level. In this review article we present miRNAs as ubiquitous players involved in all cancer hallmarks. We also describe the most used methods to detect their expression, which have revealed through gene expression studies the identity of hundreds of miRNAs dysregulated in cancer cells or tumor microenvironment cells. Furthermore, we discuss the role of miRNAs as hormones and as reliable cancer biomarkers and predictors of treatment-response. Along with this, we explore current strategies in designing miRNA-targeting therapeutics, as well as the associated challenges that research envisions to overcome. Finally, we introduce a new wave in molecular oncology translational research, the study of long non-coding RNAs. PMID:25104502

  12. RNA G-quadruplexes: emerging mechanisms in disease

    PubMed Central

    Cammas, Anne

    2017-01-01

    RNA G-quadruplexes (G4s) are formed by G-rich RNA sequences in protein-coding (mRNA) and non-coding (ncRNA) transcripts that fold into a four-stranded conformation. Experimental studies and bioinformatic predictions support the view that these structures are involved in different cellular functions associated to both DNA processes (telomere elongation, recombination and transcription) and RNA post-transcriptional mechanisms (including pre-mRNA processing, mRNA turnover, targeting and translation). An increasing number of different diseases have been associated with the inappropriate regulation of RNA G4s exemplifying the potential importance of these structures on human health. Here, we review the different molecular mechanisms underlying the link between RNA G4s and human diseases by proposing several overlapping models of deregulation emerging from recent research, including (i) sequestration of RNA-binding proteins, (ii) aberrant expression or localization of RNA G4-binding proteins, (iii) repeat associated non-AUG (RAN) translation, (iv) mRNA translational blockade and (v) disabling of protein–RNA G4 complexes. This review also provides a comprehensive survey of the functional RNA G4 and their mechanisms of action. Finally, we highlight future directions for research aimed at improving our understanding on RNA G4-mediated regulatory mechanisms linked to diseases. PMID:28013268

  13. Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type of C-terminus.

    PubMed

    Hansen, T S; Andreasen, P H; Dreisig, H; Højrup, P; Nielsen, H; Engberg, J; Kristiansen, K

    1991-09-15

    We have cloned and characterized a Tetrahymena thermophila macronuclear gene (L37) encoding the acidic ribosomal protein (A-protein) L37. The gene contains a single intron located in the 3'-part of the coding region. Two major and three minor transcription start points (tsp) were mapped 39 to 63 nucleotides upstream from the translational start codon. The uppermost tsp mapped to the first T in a putative T. thermophila RNA polymerase II initiator element, TATAA. The coding region of L37 predicts a protein of 109 amino acid (aa) residues. A substantial part of the deduced aa sequence was verified by protein sequencing. The T. thermophila L37 clearly belongs to the P1-type family of eukaryotic A-proteins, but the C-terminal region has the hallmarks of archaebacterial A-proteins.

  14. Isolated Fungal Promoters and Gene Transcription Terminators and Methods of Protein and Chemical Production in a Fungus

    DOEpatents

    Dai, Ziyu; Lasure, Linda L.; Magnuson, Jon K.

    2008-11-11

    The present invention encompasses isolated gene regulatory elements and gene transcription terminators that are differentially expressed in a native fungus exhibiting a first morphology relative to the native fungus exhibiting a second morphology. The invention also encompasses a method of utilizing a fungus for protein or chemical production. A transformed fungus is produced by transforming a fungus with a recombinant polynucleotide molecule. The recombinant polynucleotide molecule contains an isolated polynucleotide sequence linked operably to another molecule comprising a coding region of a gene of interest. The gene regulatory element and gene transcription terminator may temporally and spatially regulate expression of particular genes for optimum production of compounds of interest in a transgenic fungus.

  15. Isolated fungal promoters and gene transcription terminators and methods of protein and chemical production in a fungus

    DOEpatents

    Dai, Ziyu; Lasure, Linda L.; Magnuson, Jon K.

    2008-11-11

    The present invention encompasses isolated gene regulatory elements and gene transcription terminators that are differentially expressed in a native fungus exhibiting a first morphology relative to the native fungus exhibiting a second morphology. The invention also encompasses a method of utilizing a fungus for protein or chemical production. A transformed fungus is produced by transforming a fungus with a recombinant polynucleotide molecule. The recombinant polynucleotide molecule contains an isolated polynucleotide sequence linked operably to another molecule comprising a coding region of a gene of interest. The gene regulatory element and gene transcription terminator may temporally and spatially regulate expression of particular genes for optimum production of compounds of interest in a transgenic fungus.

  16. Isolated fungal promoters and gene transcription terminators and methods of protein and chemical production in a fungus

    DOEpatents

    Dai, Ziyu; Lasure, Linda L; Magnuson, Jon K

    2014-05-27

    The present invention encompasses isolated gene regulatory elements and gene transcription terminators that are differentially expressed in a native fungus exhibiting a first morphology relative to the native fungus exhibiting a second morphology. The invention also encompasses a method of utilizing a fungus for protein or chemical production. A transformed fungus is produced by transforming a fungus with a recombinant polynucleotide molecule. The recombinant polynucleotide molecule contains an isolated polynucleotide sequence linked operably to another molecule comprising a coding region of a gene of interest. The gene regulatory element and gene transcription terminator may temporally and spatially regulate expression of particular genes for optimum production of compounds of interest in a transgenic fungus.

  17. Extracellular Vesicle-Associated RNA as a Carrier of Epigenetic Information

    PubMed Central

    2017-01-01

    Post-transcriptional regulation of messenger RNA (mRNA) metabolism and subcellular localization is of the utmost importance both during development and in cell differentiation. Besides carrying genetic information, mRNAs contain cis-acting signals (zip codes), usually present in their 5′- and 3′-untranslated regions (UTRs). By binding to these signals, trans-acting factors, such as RNA-binding proteins (RBPs), and/or non-coding RNAs (ncRNAs), control mRNA localization, translation and stability. RBPs can also form complexes with non-coding RNAs of different sizes. The release of extracellular vesicles (EVs) is a conserved process that allows both normal and cancer cells to horizontally transfer molecules, and hence properties, to neighboring cells. By interacting with proteins that are specifically sorted to EVs, mRNAs as well as ncRNAs can be transferred from cell to cell. In this review, we discuss the mechanisms underlying the sorting to EVs of different classes of molecules, as well as the role of extracellular RNAs and the associated proteins in altering gene expression in the recipient cells. Importantly, if, on the one hand, RBPs play a critical role in transferring RNAs through EVs, RNA itself could, on the other hand, function as a carrier to transfer proteins (i.e., chromatin modifiers, and transcription factors) that, once transferred, can alter the cell’s epigenome. PMID:28937658

  18. Molecular cloning and analysis of Schizosaccharomyces pombe Reb1p: sequence-specific recognition of two sites in the far upstream rDNA intergenic spacer.

    PubMed Central

    Zhao, A; Guo, A; Liu, Z; Pape, L

    1997-01-01

    The coding sequences for a Schizosaccharomyces pombe sequence-specific DNA binding protein, Reb1p, have been cloned. The predicted S. pombe Reb1p is 24-29% identical to mouse TTF-1 (transcription termination factor-1) and Saccharomyces cerevisiae REB1 protein, both of which direct termination of RNA polymerase I catalyzed transcripts. The S. pombe Reb1 cDNA encodes a predicted polypeptide of 504 amino acids with a predicted molecular weight of 58.4 kDa. The S. pombe Reb1p is unusual in that the bipartite DNA binding motif identified originally in S. cerevisiae and Kluyveromyces lactis REB1 proteins is uninterrupted and thus S. pombe Reb1p may contain the smallest natural REB1 homologous DNA binding domain. Its genomic coding sequences were shown to be interrupted by two introns. A recombinant histidine-tagged Reb1 protein bearing the rDNA binding domain has two homologous, sequence-specific binding sites in the S. pombe rDNA intergenic spacer, located between 289 and 480 nt downstream of the end of the approximately 25S rRNA coding sequences. Each binding site is 13-14 bp downstream of two of the three proposed in vivo termination sites. The core of this 17 bp site, AGGTAAGGGTAATGCAC, is specifically protected by Reb1p in footprinting analysis. PMID:9016645

  19. A dehydration-inducible gene in the truffle Tuber borchii identifies a novel group of dehydrins

    PubMed Central

    Abba', Simona; Ghignone, Stefano; Bonfante, Paola

    2006-01-01

    Background The expressed sequence tag M6G10 was originally isolated from a screening for differentially expressed transcripts during the reproductive stage of the white truffle Tuber borchii. mRNA levels for M6G10 increased dramatically during fruiting body maturation compared to the vegetative mycelial stage. Results Bioinformatics tools, phylogenetic analysis and expression studies were used to support the hypothesis that this sequence, named TbDHN1, is the first dehydrin (DHN)-like coding gene isolated in fungi. Homologs of this gene, all defined as "coding for hypothetical proteins" in public databases, were exclusively found in ascomycetous fungi and in plants. Although complete (or almost complete) fungal genomes and EST collections of some Basidiomycota and Glomeromycota are already available, DHN-like proteins appear to be represented only in Ascomycota. A new and previously uncharacterized conserved signature pattern was identified and proposed to the UniProt database as the main distinguishing feature of this new group of DHNs. Expression studies provide experimental evidence of a transcript induction of TbDHN1 during cellular dehydration. Conclusion Expression pattern and sequence similarities to known plant DHNs indicate that TbDHN1 is the first characterized DHN-like protein in fungi. The high similarity of TbDHN1 with homolog coding sequences implies the existence of a novel fungal/plant group of LEA Class II proteins characterized by a previously undescribed signature pattern. PMID:16512918

  20. Analysis of Antisense Expression by Whole Genome Tiling Microarrays and siRNAs Suggests Mis-Annotation of Arabidopsis Orphan Protein-Coding Genes

    PubMed Central

    Richardson, Casey R.; Luo, Qing-Jun; Gontcharova, Viktoria; Jiang, Ying-Wen; Samanta, Manoj; Youn, Eunseog; Rock, Christopher D.

    2010-01-01

    Background MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20–22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. Principal Findings We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis ‘orphan’ hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the “ancient” (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for “new” rapidly-evolving MIRNA genes. 
    Conclusions Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other kingdoms, which can provide insight into antisense transcription, miRNA evolution, and post-transcriptional gene regulation. PMID:20520764

  1. Tudor-staphylococcal nuclease regulates the expression and biological function of alkylglycerone phosphate synthase via nuclear factor-κB and microRNA-127 in human glioma U87MG cells.

    PubMed

    Zhang, Yongqiang; Jia, Jun; Li, Ying; Chen, Yan-Ge; Huang, Huan; Qiao, Yang; Zhu, Yu

    2018-06-01

    Glioma is one of the malignant tumor types detrimental to human health; therefore, it is important to find novel targets and therapeutics for this tumor. The downregulated expression of Tudor-staphylococcal nuclease (SN) and alkylglycerone phosphate synthase (AGPS) can decrease cancer malignancy, and their overexpression can increase the viability and migration potential of various tumor cell types; however, the role of AGPS in the proliferation and migration of glioma, and the association of Tudor-SN and AGPS in human glioma, are not clear. In the present study, it was determined that AGPS silencing suppressed the proliferation and migration potential of glioma U87MG cells, and suppressed the expression of the circular RNAs circ-ubiquitin-associated protein 2, circ-zinc finger protein 292 and circ-homeodomain-interacting protein kinase 3, and the long non-coding RNAs H19 imprinted maternally expressed transcript (non-protein coding), colon cancer-associated transcript 1 (non-protein coding) and hepatocellular carcinoma upregulated long non-coding RNA. Furthermore, Tudor-SN silencing suppressed the expression of AGPS; however, nuclear factor (NF)-κB and microRNA (miR)-127 retrieval experiments partially reduced the expression of AGPS. Additionally, it was determined that Tudor-SN silencing suppressed the activity of the mechanistic target of rapamycin (mTOR) signaling pathway, and NF-κB and miR-127 retrieval experiments partially reduced the activity of mTOR. Therefore, it was considered that NF-κB and miR-127 may be the mediators of Tudor-SN-regulated AGPS via the mTOR signaling pathway. These results improve our knowledge of the mechanisms underlying Tudor-SN and AGPS in human glioma.

  2. Quantifying the Effect of DNA Packaging on Gene Expression Level

    NASA Astrophysics Data System (ADS)

    Kim, Harold

    2010-10-01

    Gene expression, the process by which the genetic code comes alive in the form of proteins, is one of the most important biological processes in living cells, and begins when transcription factors bind to specific DNA sequences in the promoter region upstream of a gene. The relationship between gene expression output and transcription factor input, which is termed the gene regulation function, is specific to each promoter, and predicting this gene regulation function from the locations of transcription factor binding sites is one of the challenges in biology. In eukaryotic organisms (for example, animals, plants, and fungi), DNA is highly compacted into nucleosomes, 147-bp segments of DNA tightly wrapped around a histone protein core; therefore, the accessibility of transcription factor binding sites depends on their locations with respect to nucleosomes: sites inside nucleosomes are less accessible than those outside nucleosomes. To understand how transcription factor binding sites contribute to gene expression in a quantitative manner, we obtain gene regulation functions of promoters with various configurations of transcription factor binding sites by using fluorescent protein reporters to measure transcription factor input and gene expression output in single yeast cells. In this talk, I will show that the affinity of a transcription factor binding site inside and outside the nucleosome controls different aspects of the gene regulation function, and explain this finding based on a mass-action kinetic model that includes competition between nucleosomes and transcription factors.
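
The abstract above names a mass-action model with competition between nucleosomes and transcription factors but gives no equations. The following is only a minimal equilibrium sketch of that kind of competition, not the model presented in the talk. It assumes a single binding site with three mutually exclusive states (free, transcription-factor-bound, nucleosome-covered). The function name `tf_occupancy` and the parameters `tf_conc`, `kd_tf`, and `nucleosome_weight` are hypothetical and chosen for illustration.

```python
def tf_occupancy(tf_conc, kd_tf, nucleosome_weight):
    """Equilibrium probability that a single site is bound by the transcription factor.

    Assumed (hypothetical) three-state model with statistical weights:
      free site          -> 1
      TF-bound site      -> tf_conc / kd_tf
      nucleosome-covered -> nucleosome_weight
    """
    tf_weight = tf_conc / kd_tf
    return tf_weight / (1.0 + tf_weight + nucleosome_weight)


# Illustrative values only: the same site with increasing nucleosome competition.
for weight in (0.0, 1.0, 10.0):
    print(weight, tf_occupancy(tf_conc=1.0, kd_tf=1.0, nucleosome_weight=weight))
```

In this sketch, a larger `nucleosome_weight` lowers occupancy at a fixed transcription factor concentration, which is one simple way to express the statement that sites inside nucleosomes are less accessible.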

  3. The evolution of transcriptional regulation in eukaryotes

    NASA Technical Reports Server (NTRS)

    Wray, Gregory A.; Hahn, Matthew W.; Abouheif, Ehab; Balhoff, James P.; Pizer, Margaret; Rockman, Matthew V.; Romano, Laura A.

    2003-01-01

    Gene expression is central to the genotype-phenotype relationship in all organisms, and it is an important component of the genetic basis for evolutionary change in diverse aspects of phenotype. However, the evolution of transcriptional regulation remains understudied and poorly understood. Here we review the evolutionary dynamics of promoter, or cis-regulatory, sequences and the evolutionary mechanisms that shape them. Existing evidence indicates that populations harbor extensive genetic variation in promoter sequences, that a substantial fraction of this variation has consequences for both biochemical and organismal phenotype, and that some of this functional variation is sorted by selection. As with protein-coding sequences, rates and patterns of promoter sequence evolution differ considerably among loci and among clades for reasons that are not well understood. Studying the evolution of transcriptional regulation poses empirical and conceptual challenges beyond those typically encountered in analyses of coding sequence evolution: promoter organization is much less regular than that of coding sequences, and sequences required for the transcription of each locus reside at multiple other loci in the genome. Because of the strong context-dependence of transcriptional regulation, sequence inspection alone provides limited information about promoter function. Understanding the functional consequences of sequence differences among promoters generally requires biochemical and in vivo functional assays. Despite these challenges, important insights have already been gained into the evolution of transcriptional regulation, and the pace of discovery is accelerating.

  4. Dissecting the chromatin interactome of microRNA genes.

    PubMed

    Chen, Dijun; Fu, Liang-Yu; Zhang, Zhao; Li, Guoliang; Zhang, Hang; Jiang, Li; Harrison, Andrew P; Shanahan, Hugh P; Klukas, Christian; Zhang, Hong-Yu; Ruan, Yijun; Chen, Ling-Ling; Chen, Ming

    2014-03-01

    Our knowledge of the role of higher-order chromatin structures in transcription of microRNA genes (MIRs) is evolving rapidly. Here we investigate the effect of 3D architecture of chromatin on the transcriptional regulation of MIRs. We demonstrate that MIRs have transcriptional features that are similar to protein-coding genes. RNA polymerase II-associated ChIA-PET data reveal that many groups of MIRs and protein-coding genes are organized into functionally compartmentalized chromatin communities and undergo coordinated expression when their genomic loci are spatially colocated. We observe that MIRs display widespread communication in those transcriptionally active communities. Moreover, miRNA-target interactions are significantly enriched among communities with functional homogeneity while depleted from the same community from which they originated, suggesting that MIRs coordinate function-related pathways at the posttranscriptional level. Further investigation demonstrates the existence of spatial MIR-MIR chromatin interacting networks. We show that groups of spatially coordinated MIRs are frequently from the same family and involved in the same disease category. The spatial interaction network possesses both common and cell-specific subnetwork modules that result from the spatial organization of chromatin within different cell types. Together, our study unveils an entirely unexplored layer of MIR regulation throughout the human genome that links the spatial coordination of MIRs to their co-expression and function.

  5. Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome.

    PubMed

    Bush, Stephen J; Muriuki, Charity; McCulloch, Mary E B; Farquhar, Iseabail L; Clark, Emily L; Hume, David A

    2018-04-24

    mRNA-like long non-coding RNAs (lncRNAs) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. Thus, in many cases lncRNA detection by RNA-sequencing (RNA-seq) is compromised by stochastic sampling. To account for this and create a catalogue of ruminant lncRNAs, we compared de novo assembled lncRNAs derived from large RNA-seq datasets in transcriptional atlas projects for sheep and goats with previous lncRNAs assembled in cattle and human. We then combined the novel lncRNAs with the sheep transcriptional atlas to identify co-regulated sets of protein-coding and non-coding loci. Few lncRNAs could be reproducibly assembled from a single dataset, even with deep sequencing of the same tissues from multiple animals. Furthermore, there was little sequence overlap between lncRNAs that were assembled from pooled RNA-seq data. We combined positional conservation (synteny) with cross-species mapping of candidate lncRNAs to identify a consensus set of ruminant lncRNAs and then used the RNA-seq data to demonstrate detectable and reproducible expression in each species. In sheep, 20 to 30% of lncRNAs were located close to protein-coding genes with which they are strongly co-expressed, which is consistent with the evolutionary origin of some ncRNAs in enhancer sequences. Nevertheless, most of the lncRNAs are not co-expressed with neighbouring protein-coding genes. Alongside substantially expanding the ruminant lncRNA repertoire, the outcomes of our analysis demonstrate that stochastic sampling can be partly overcome by combining RNA-seq datasets from related species. This has practical implications for the future discovery of lncRNAs in other species.

  6. Differential Expression Profile of lncRNAs from Primary Human Hepatocytes Following DEET and Fipronil Exposure

    PubMed Central

    Wallace, Andrew D.; Hodgson, Ernest; Roe, R. Michael

    2017-01-01

    While the synthesis and use of new chemical compounds is at an all-time high, the study of their potential impact on human health is quickly falling behind, and new methods are needed to assess their impact. We chose to examine the effects of two common environmental chemicals, the insect repellent N,N-diethyl-m-toluamide (DEET) and the insecticide fluocyanobenpyrazole (fipronil), on transcript levels of long non-protein coding RNAs (lncRNAs) in primary human hepatocytes using a global RNA-Seq approach. While lncRNAs are believed to play a critical role in numerous important biological processes, many still remain uncharacterized, and their functions and modes of action remain largely unclear, especially in relation to environmental chemicals. RNA-Seq showed that 100 µM DEET significantly increased transcript levels for 2 lncRNAs and lowered transcript levels for 18 lncRNAs, while fipronil at 10 µM increased transcript levels for 76 lncRNAs and decreased levels for 193 lncRNAs. A mixture of 100 µM DEET and 10 µM fipronil increased transcript levels for 75 lncRNAs and lowered transcript levels for 258 lncRNAs. This indicates a more-than-additive effect on lncRNA transcript expression when the two chemicals were presented in combination versus each chemical alone. Differentially expressed lncRNA genes were mapped to chromosomes, analyzed by proximity to neighboring protein-coding genes, and functionally characterized via gene ontology and molecular mapping algorithms. While further testing is required to assess the organismal impact of changes in transcript levels, this initial analysis links several of the dysregulated lncRNAs to processes and pathways critical to proper cellular function, such as the innate and adaptive immune response and the p53 signaling pathway. PMID:28991164
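
The "more-than-additive" statement in the abstract above can be checked with simple arithmetic on the reported counts. The sketch below uses one plain reading of "additive" (the sum of the lncRNAs changed by each chemical alone); the abstract does not state the authors' own criterion, so the comparison is an assumption. The names `counts` and `total_changed` are introduced here for illustration.

```python
# Counts of lncRNAs with changed transcript levels, as reported in the abstract.
counts = {
    "DEET": {"up": 2, "down": 18},
    "fipronil": {"up": 76, "down": 193},
    "mixture": {"up": 75, "down": 258},
}


def total_changed(treatment):
    """Total number of lncRNAs increased or decreased under one treatment."""
    entry = counts[treatment]
    return entry["up"] + entry["down"]


# Assumed additive expectation: single-chemical totals summed, ignoring any overlap.
additive_expectation = total_changed("DEET") + total_changed("fipronil")
observed_mixture = total_changed("mixture")

print("additive expectation:", additive_expectation)
print("observed in mixture:", observed_mixture)
print("excess over additive:", observed_mixture - additive_expectation)
```

Under this reading the mixture changes 333 lncRNAs against an additive expectation of 289. The excess comes from the decreased set (258 versus 18 + 193 = 211), while the increased set is slightly smaller than the sum (75 versus 78). Any overlap between the two single-chemical sets would lower the additive expectation further.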

  7. TFIIS-Dependent Non-coding Transcription Regulates Developmental Genome Rearrangements

    PubMed Central

    Maliszewska-Olejniczak, Kamila; Gruchota, Julita; Gromadka, Robert; Denby Wilkes, Cyril; Arnaiz, Olivier; Mathy, Nathalie; Duharcourt, Sandra; Bétermier, Mireille; Nowak, Jacek K.

    2015-01-01

    Because of their nuclear dimorphism, ciliates provide a unique opportunity to study the role of non-coding RNAs (ncRNAs) in the communication between germline and somatic lineages. In these unicellular eukaryotes, a new somatic nucleus develops at each sexual cycle from a copy of the zygotic (germline) nucleus, while the old somatic nucleus degenerates. In the ciliate Paramecium tetraurelia, the genome is massively rearranged during this process through the reproducible elimination of repeated sequences and the precise excision of over 45,000 short, single-copy Internal Eliminated Sequences (IESs). Different types of ncRNAs resulting from genome-wide transcription were shown to be involved in the epigenetic regulation of genome rearrangements. To understand how ncRNAs are produced from the entire genome, we have focused on a homolog of the TFIIS elongation factor, which regulates RNA polymerase II transcriptional pausing. Six TFIIS-paralogs, representing four distinct families, can be found in the P. tetraurelia genome. Using RNA interference, we showed that TFIIS4, which encodes a development-specific TFIIS protein, is essential for the formation of a functional somatic genome. Molecular analyses and high-throughput DNA sequencing upon TFIIS4 RNAi demonstrated that TFIIS4 is involved in all kinds of genome rearrangements, including excision of ~48% of IESs. Localization of a GFP-TFIIS4 fusion revealed that TFIIS4 appears specifically in the new somatic nucleus at an early developmental stage, before IES excision. RT-PCR experiments showed that TFIIS4 is necessary for the synthesis of IES-containing non-coding transcripts. We propose that these IES+ transcripts originate from the developing somatic nucleus and serve as pairing substrates for germline-specific short RNAs that target elimination of their homologous sequences. Our study, therefore, connects the onset of zygotic non-coding transcription to the control of genome plasticity in Paramecium, and establishes for the first time a specific role of TFIIS in non-coding transcription in eukaryotes. PMID:26177014

  8. Functional Transcriptomics of Wild-Caught Lutzomyia intermedia Salivary Glands: Identification of a Protective Salivary Protein against Leishmania braziliensis Infection

    PubMed Central

    Carneiro, Marcia W.; Miranda, José Carlos; Clarêncio, Jorge; Barral-Netto, Manoel; Brodskyn, Cláudia; Barral, Aldina; Ribeiro, José M. C.; Valenzuela, Jesus G.; de Oliveira, Camila I.

    2013-01-01

    Background Leishmania parasites are transmitted in the presence of sand fly saliva. Together with the parasite, the sand fly injects salivary components that change the environment at the feeding site. Mice immunized with Phlebotomus papatasi salivary gland (SG) homogenate are protected against Leishmania major infection, while immunity to Lutzomyia intermedia SG homogenate exacerbated experimental Leishmania braziliensis infection. In humans, antibodies to Lu. intermedia saliva are associated with risk of acquiring L. braziliensis infection. Despite these important findings, there is no information regarding the repertoire of Lu. intermedia salivary proteins. Methods and Findings A cDNA library from the Salivary Glands (SGs) of wild-caught Lu. intermedia was constructed, sequenced, and complemented by a proteomic approach based on 1D SDS PAGE and mass/mass spectrometry to validate the transcripts present in this cDNA library. We identified the most abundant transcripts and proteins reported in other sand fly species as well as novel proteins such as neurotoxin-like proteins, peptides with ML domain, and three small peptides found so far only in this sand fly species. DNA plasmids coding for ten selected transcripts were constructed and used to immunize BALB/c mice to study their immunogenicity. Plasmid Linb-11—coding for a 4.5-kDa protein—induced a cellular immune response and conferred protection against L. braziliensis infection. This protection correlated with a decreased parasite load and an increased frequency of IFN-γ-producing cells. Conclusions We identified the most abundant and novel proteins present in the SGs of Lu. intermedia, a vector of cutaneous leishmaniasis in the Americas. We also show for the first time that immunity to a single salivary protein from Lu. intermedia can protect against cutaneous leishmaniasis caused by L. braziliensis. PMID:23717705

  9. Identification of a New Human Adenovirus Protein Encoded by a Novel Late l-Strand Transcription Unit

    PubMed Central

    Tollefson, Ann E.; Ying, Baoling; Doronin, Konstantin; Sidor, Peter D.; Wold, William S. M.

    2007-01-01

    A short open reading frame named the “U exon,” located on the adenovirus (Ad) l-strand (for leftward transcription) between the early E3 region and the fiber gene, is conserved in mastadenoviruses. We have observed that Ad5 mutants with large deletions in E3 that infringe on the U exon display a mild growth defect, as well as an aberrant Ad E2 DNA-binding protein (DBP) intranuclear localization pattern and an apparent failure to organize replication centers during late infection. Mutants in which the U exon DNA is reconstructed have a reversed phenotype. Chow et al. (L. T. Chow et al., J. Mol. Biol. 134:265-303, 1979) described mRNAs initiating in the region of the U exon and spliced to downstream sequences in the late DBP mRNA leader and the DBP-coding region. We have cloned this mRNA (as cDNA) from Ad5 late mRNA; the predicted protein is 217 amino acids, initiating in the U exon and continuing in frame in the DBP leader and in the DBP-coding region but in a different reading frame from DBP. Polyclonal and monoclonal antibodies generated against the predicted U exon protein (UXP) showed that UXP is ∼24K in size by immunoblot and is a late protein. At 18 to 24 h postinfection, UXP is strongly associated with nucleoli and is found throughout the nucleus; later, UXP is associated with the periphery of replication centers, suggesting a function relevant to Ad DNA replication or RNA transcription. UXP is expressed by all four species C Ads. When expressed in transient transfections, UXP complements the aberrant DBP localization pattern of UXP-negative Ad5 mutants. Our data indicate that UXP is a previously unrecognized protein derived from a novel late l-strand transcription unit. PMID:17881437

  10. Human Immunodeficiency Virus-Type 1 LTR DNA contains an intrinsic gene producing antisense RNA and protein products

    PubMed Central

    Ludwig, Linda B; Ambrus, Julian L; Krawczyk, Kristie A; Sharma, Sanjay; Brooks, Stephen; Hsiao, Chiu-Bin; Schwartz, Stanley A

    2006-01-01

    Background While viruses have long been shown to capitalize on their limited genomic size by utilizing both strands of DNA or complementary DNA/RNA intermediates to code for viral proteins, it has been assumed that human retroviruses have all their major proteins translated only from the plus or sense strand of RNA, despite their requirement for a dsDNA proviral intermediate. Several studies, however, have suggested the presence of antisense transcription for both HIV-1 and HTLV-1. More recently an antisense transcript responsible for the HTLV-1 bZIP factor (HBZ) protein has been described. In this study we investigated the possibility of an antisense gene contained within the human immunodeficiency virus type 1 (HIV-1) long terminal repeat (LTR). Results Inspection of published sequences revealed a potential transcription initiator element (INR) situated downstream of, and in reverse orientation to, the usual HIV-1 promoter and transcription start site. This antisense initiator (HIVaINR) suggested the possibility of an antisense gene responsible for RNA and protein production. We show that antisense transcripts are generated, in vitro and in vivo, originating from the TAR DNA of the HIV-1 LTR. To test the possibility that protein(s) could be translated from this novel HIV-1 antisense RNA, recombinant HIV antisense gene-FLAG vectors were designed. Recombinant protein(s) were produced and isolated utilizing carboxy-terminal FLAG epitope (DYKDDDDK) sequences. In addition, affinity-purified antisera to an internal peptide derived from the HIV antisense protein (HAP) sequences identified HAPs from HIV+ human peripheral blood lymphocytes. Conclusion HIV-1 contains an antisense gene in the U3-R regions of the LTR responsible for both an antisense RNA transcript and proteins. This antisense transcript has tremendous potential for intrinsic RNA regulation because of its overlap with the beginning of all HIV-1 sense RNA transcripts by 25 nucleotides. 
    The novel HAPs are encoded in a region of the LTR that has already been shown to be deleted in some HIV-infected long-term survivors and represent new potential targets for vaccine development. PMID:17090330

  11. Microprocessor mediates transcriptional termination of long noncoding RNA transcripts hosting microRNAs.

    PubMed

    Dhir, Ashish; Dhir, Somdutta; Proudfoot, Nick J; Jopling, Catherine L

    2015-04-01

    MicroRNAs (miRNAs) play a major part in the post-transcriptional regulation of gene expression. Mammalian miRNA biogenesis begins with cotranscriptional cleavage of RNA polymerase II (Pol II) transcripts by the Microprocessor complex. Although most miRNAs are located within introns of protein-coding transcripts, a substantial minority of miRNAs originate from long noncoding (lnc) RNAs, for which transcript processing is largely uncharacterized. We show, by detailed characterization of liver-specific lnc-pri-miR-122 and genome-wide analysis in human cell lines, that most lncRNA transcripts containing miRNAs (lnc-pri-miRNAs) do not use the canonical cleavage-and-polyadenylation pathway but instead use Microprocessor cleavage to terminate transcription. Microprocessor inactivation leads to extensive transcriptional readthrough of lnc-pri-miRNA and transcriptional interference with downstream genes. Consequently we define a new RNase III-mediated, polyadenylation-independent mechanism of Pol II transcription termination in mammalian cells.

  12. Chemical Approaches to Control Gene Expression

    PubMed Central

    Gottesfeld, Joel M.; Turner, James M.; Dervan, Peter B.

    2000-01-01

    A current goal in molecular medicine is the development of new strategies to interfere with gene expression in living cells in the hope that novel therapies for human disease will result from these efforts. This review focuses on small-molecule or chemical approaches to manipulate gene expression by modulating either transcription of messenger RNA-coding genes or protein translation. The molecules under study include natural products, designed ligands, and compounds identified through functional screens of combinatorial libraries. The cellular targets for these molecules include DNA, messenger RNA, and the protein components of the transcription, RNA processing, and translational machinery. Studies with model systems have shown promise in the inhibition of both cellular and viral gene transcription and mRNA utilization. Moreover, strategies for both repression and activation of gene transcription have been described. These studies offer promise for treatment of diseases of pathogenic (viral, bacterial, etc.) and cellular origin (cancer, genetic diseases, etc.). PMID:11097426

  13. RNAi screening of subtracted transcriptomes reveals tumor suppression by taurine-activated GABAA receptors involved in volume regulation

    PubMed Central

    van Nierop, Pim; Vormer, Tinke L.; Foijer, Floris; Verheij, Joanne; Lodder, Johannes C.; Andersen, Jesper B.; Mansvelder, Huibert D.; te Riele, Hein

    2018-01-01

    To identify coding and non-coding suppressor genes of anchorage-independent proliferation by efficient loss-of-function screening, we have developed a method for enzymatic production of low complexity shRNA libraries from subtracted transcriptomes. We produced and screened two LEGO (Low-complexity by Enrichment for Genes shut Off) shRNA libraries that were enriched for shRNA vectors targeting coding and non-coding polyadenylated transcripts that were reduced in transformed Mouse Embryonic Fibroblasts (MEFs). The LEGO shRNA libraries included ~25 shRNA vectors per transcript which limited off-target artifacts. Our method identified 79 coding and non-coding suppressor transcripts. We found that taurine-responsive GABAA receptor subunits, including GABRA5 and GABRB3, were induced during the arrest of non-transformed anchor-deprived MEFs and prevented anchorless proliferation. We show that taurine activates chloride currents through GABAA receptors on MEFs, causing seclusion of cell volume in large membrane protrusions. Volume seclusion from cells by taurine correlated with reduced proliferation and, conversely, suppression of this pathway allowed anchorage-independent proliferation. In human cholangiocarcinomas, we found that several proteins involved in taurine signaling via GABAA receptors were repressed. Low GABRA5 expression typified hyperproliferative tumors, and loss of taurine signaling correlated with reduced patient survival, suggesting this tumor suppressive mechanism operates in vivo. PMID:29787571

  14. Long non-coding RNAs and mRNAs profiling during spleen development in pig.

    PubMed

    Che, Tiandong; Li, Diyan; Jin, Long; Fu, Yuhua; Liu, Yingkai; Liu, Pengliang; Wang, Yixin; Tang, Qianzi; Ma, Jideng; Wang, Xun; Jiang, Anan; Li, Xuewei; Li, Mingzhou

    2018-01-01

    Genome-wide transcriptomic studies in humans and mice have become extensive and mature. However, a comprehensive and systematic understanding of protein-coding genes and long non-coding RNAs (lncRNAs) expressed during pig spleen development has not been achieved. LncRNAs are known to participate in regulatory networks for an array of biological processes. Here, we constructed 18 RNA libraries from developing fetal pig spleen (55 days before birth), postnatal pig spleens (0, 30, 180 days and 2 years after birth), and the samples from the 2-year-old Wild Boar. A total of 15,040 lncRNA transcripts were identified among these samples. We found that the temporal expression pattern of lncRNAs was more restricted than observed for protein-coding genes. Time-series analysis showed two large modules for protein-coding genes and lncRNAs. The up-regulated module was enriched for genes related to immune and inflammatory function, while the down-regulated module was enriched for cell proliferation processes such as cell division and DNA replication. Co-expression networks indicated the functional relatedness between protein-coding genes and lncRNAs, which were enriched for similar functions over the series of time points examined. We identified numerous differentially expressed protein-coding genes and lncRNAs in all five developmental stages. Notably, ceruloplasmin precursor (CP), a protein-coding gene participating in antioxidant and iron transport processes, was differentially expressed in all stages. This study provides the first catalog of the developing pig spleen, and contributes to a fuller understanding of the molecular mechanisms underpinning mammalian spleen development.

  15. Serine/Threonine kinase dependent transcription from the polyhedrin promoter of SpltNPV-I

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mishra, Gourav; Gautam, Hemant K.; Das, Rakha H.

    2007-07-06

    Polyhedrin (polh) and p10 are the two hyper-expressed very late genes of nucleopolyhedroviruses. Alpha amanitin resistant transcription from Spodoptera litura nucleopolyhedrovirus (SpltNPV-I) polyhedrin promoter was observed with virus infected nuclear extract of NIV-HA-197 cells but not with that from uninfected nuclear extract. Anti-protein kinase-1 (pk1) antibody inhibited the transcription, and the inhibition was reversed on addition of pk1; however, the pk1 mutant protein K50M, which has no phosphorylation activity, did not overcome the transcription inhibition. Chromatin immuno-precipitation assays with viral anti-pk1 antibody showed the interaction of pk1 with the polh while electrophoretic mobility shift assays indicated the strong binding affinity (K_d ≈ 5.5 × 10^-11) of purified pk1 with the polh promoter. These results suggested that the viral coded pk1 acts as a transcription factor in transcribing baculovirus very late genes.

  16. Intergenic disease-associated regions are abundant in novel transcripts.

    PubMed

    Bartonicek, N; Clark, M B; Quek, X C; Torpy, J R; Pritchard, A L; Maag, J L V; Gloss, B S; Crawford, J; Taft, R J; Hayward, N K; Montgomery, G W; Mattick, J S; Mercer, T R; Dinger, M E

    2017-12-28

    Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.

  17. Translation elicits a growth rate-dependent, genome-wide, differential protein production in Bacillus subtilis.

    PubMed

    Borkowski, Olivier; Goelzer, Anne; Schaffer, Marc; Calabre, Magali; Mäder, Ulrike; Aymerich, Stéphane; Jules, Matthieu; Fromion, Vincent

    2016-05-17

    Complex regulatory programs control cell adaptation to environmental changes by setting condition-specific proteomes. In balanced growth, bacterial protein abundances depend on the dilution rate, transcript abundances and transcript-specific translation efficiencies. We revisited the current theory claiming the invariance of bacterial translation efficiency. By integrating genome-wide transcriptome datasets and datasets from a library of synthetic gfp-reporter fusions, we demonstrated that translation efficiencies in Bacillus subtilis decreased up to fourfold from slow to fast growth. The translation initiation regions elicited a growth rate-dependent, differential production of proteins without regulators, hence revealing a unique, hard-coded, growth rate-dependent mode of regulation. We combined model-based data analyses of transcript and protein abundances genome-wide and revealed that this global regulation is extensively used in B. subtilis. We eventually developed a knowledge-based, three-step translation initiation model, experimentally challenged the model predictions and proposed that a growth rate-dependent drop in free ribosome abundance accounted for the differential protein production. © 2016 The Authors. Published under the terms of the CC BY 4.0 license.
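
    The dependence stated in this abstract (protein abundance set by dilution rate, transcript abundance and translation efficiency) can be illustrated with a minimal sketch. It assumes a simple balance in which synthesis equals dilution by growth and protein degradation is negligible; that simplification, the function name and the numbers are illustrative assumptions and are not taken from the study's own model.

```python
def steady_state_protein(transcript_abundance, translation_efficiency, growth_rate):
    """Steady-state protein abundance under balanced growth.

    Assumed balance (hypothetical simplification, no active degradation):
        dP/dt = translation_efficiency * transcript_abundance - growth_rate * P = 0
    so P = translation_efficiency * transcript_abundance / growth_rate.
    """
    if growth_rate <= 0:
        raise ValueError("growth_rate must be positive")
    return translation_efficiency * transcript_abundance / growth_rate


# Illustrative numbers only: same transcript level, but a fourfold lower
# translation efficiency and a fourfold higher growth (dilution) rate.
slow_growth = steady_state_protein(10.0, 4.0, 0.5)
fast_growth = steady_state_protein(10.0, 1.0, 2.0)
print(slow_growth, fast_growth)
```

    Under these assumptions, a drop in translation efficiency and a rise in dilution rate both lower the protein level for an unchanged transcript abundance.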

  18. Comparative transcriptomics of two environmentally relevant cyanobacteria reveals unexpected transcriptome diversity

    PubMed Central

    Voigt, Karsten; Sharma, Cynthia M; Mitschke, Jan; Joke Lambrecht, S; Voß, Björn; Hess, Wolfgang R; Steglich, Claudia

    2014-01-01

    Prochlorococcus is a genus of abundant and ecologically important marine cyanobacteria. Here, we present a comprehensive comparison of the structure and composition of the transcriptomes of two Prochlorococcus strains, which, despite their similarities, have adapted their gene pool to specific environmental constraints. We present genome-wide maps of transcriptional start sites (TSS) for both organisms, which are representatives of the two most diverse clades within the two major ecotypes adapted to high- and low-light conditions, respectively. Our data suggest antisense transcription for three-quarters of all genes, which is substantially more than that observed in other bacteria. We discovered hundreds of TSS within genes, most notably within 16 of the 29 prochlorosin genes, in strain MIT9313. A direct comparison revealed very little conservation in the location of TSS and the nature of non-coding transcripts between both strains. We detected extremely short 5′ untranslated regions with a median length of only 27 and 29 nt for MED4 and MIT9313, respectively, and for 8% of all protein-coding genes the median distance to the start codon is only 10 nt or even shorter. These findings and the absence of an obvious Shine–Dalgarno motif suggest that leaderless translation and ribosomal protein S1-dependent translation constitute alternative mechanisms for translation initiation in Prochlorococcus. We conclude that genome-wide antisense transcription is a major component of the transcriptional output from these relatively small genomes and that a hitherto unrecognized high degree of complexity and variability of gene expression exists in their transcriptional architecture. PMID:24739626

  19. Identification of quantitative trait loci affecting ectomycorrhizal symbiosis in an interspecific F1 poplar cross and differential expression of genes in ectomycorrhizas of the two parents: Populus deltoides and Populus trichocarpa

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Labbe, Jessy L; Jorge, Veronique; Vion, Patrice

    A Populus deltoides × Populus trichocarpa F1 pedigree was analyzed for quantitative trait loci (QTLs) affecting ectomycorrhizal development and for microarray characterization of gene networks involved in this symbiosis. A 300 genotype progeny set was evaluated for its ability to form ectomycorrhiza with the basidiomycete Laccaria bicolor. The percentage of mycorrhizal root tips was determined on the root systems of all 300 progeny and their two parents. QTL analysis identified four significant QTLs, one on the P. deltoides and three on the P. trichocarpa genetic maps. These QTLs were aligned to the P. trichocarpa genome and each contained several megabases and encompassed numerous genes. NimbleGen whole-genome microarray, using cDNA from RNA extracts of ectomycorrhizal root tips from the parental genotypes P. trichocarpa and P. deltoides, was used to narrow the candidate gene list. Among the 1,543 differentially expressed genes (p value 0.05; 5.0-fold change in transcript level) having different transcript levels in mycorrhiza of the two parents, 41 transcripts were located in the QTL intervals: 20 in Myc_d1, 14 in Myc_t1, and seven in Myc_t2, while no significant differences among transcripts were found in Myc_t3. Among these 41 transcripts, 25 were overrepresented in P. deltoides relative to P. trichocarpa; 16 were overrepresented in P. trichocarpa. The transcript showing the highest overrepresentation in P. trichocarpa mycorrhiza libraries compared to P. deltoides mycorrhiza codes for an ethylene-sensitive EREBP-4 protein which may repress defense mechanisms in P. trichocarpa while the highest overrepresented transcripts in P. deltoides code for proteins/genes typically associated with pathogen resistance.

  20. Transcriptional regulation of the human mitochondrial peptide deformylase (PDF).

    PubMed

    Pereira-Castro, Isabel; Costa, Luís Teixeira da; Amorim, António; Azevedo, Luisa

    2012-05-18

    The last years of research have been particularly dynamic in establishing the importance of peptide deformylase (PDF), a protein of the N-terminal methionine excision (NME) pathway that removes formyl-methionine from mitochondrial-encoded proteins. The genomic sequence of the human PDF gene is shared with the COG8 gene, which encodes a component of the oligomeric Golgi complex, a very unusual case in eukaryotic genomes. Since PDF is crucial in maintaining mitochondrial function and given the atypical short distance between the end of COG8 coding sequence and the PDF initiation codon, we investigated whether the regulation of the human PDF is affected by the COG8 overlapping partner. Our data reveal that PDF has several transcription start sites, the most important of which is only 18 bp from the initiation codon. Furthermore, luciferase-activation assays using differently-sized fragments defined a 97 bp minimal promoter region for human PDF, which is capable of very strong transcriptional activity. This fragment contains a potential Sp1 binding site highly conserved in mammalian species. We show that this binding site, whose mutation significantly reduces transcription activation, is a target for the Sp1 transcription factor, and possibly of other members of the Sp family. Importantly, the entire minimal promoter region is located after the end of COG8's coding region, strongly suggesting that the human PDF preserves an independent regulation from its overlapping partner. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. The agents of natural genome editing.

    PubMed

    Witzany, Guenther

    2011-06-01

    The DNA serves as a stable information storage medium and every protein which is needed by the cell is produced from this blueprint via an RNA intermediate code. More recently it was found that an abundance of various RNA elements cooperate in a variety of steps and substeps as regulatory and catalytic units with multiple competencies to act on RNA transcripts. Natural genome editing on one side is the competent agent-driven generation and integration of meaningful DNA nucleotide sequences into pre-existing genomic content arrangements, and the ability to (re-)combine and (re-)regulate them according to context-dependent (i.e. adaptational) purposes of the host organism. Natural genome editing on the other side designates the integration of all RNA activities acting on RNA transcripts without altering DNA-encoded genes. If we take the genetic code seriously as a natural code, there must be agents that are competent to act on this code because no natural code codes itself as no natural language speaks itself. As code editing agents, viral and subviral agents have been suggested because there are several indicators that demonstrate viruses competent in both RNA and DNA natural genome editing.

  2. A chromatin activity based chemoproteomic approach reveals a transcriptional repressome for gene-specific silencing

    PubMed Central

    Liu, Cui; Yu, Yanbao; Liu, Feng; Wei, Xin; Wrobel, John A.; Gunawardena, Harsha P.; Zhou, Li; Jin, Jian; Chen, Xian

    2015-01-01

    Immune cells develop endotoxin tolerance (ET) after prolonged stimulation. ET increases the level of a repression mark H3K9me2 in the transcriptional-silent chromatin specifically associated with pro-inflammatory genes. However, it is not clear what proteins are functionally involved in this process. Here we show that a novel chromatin activity based chemoproteomic (ChaC) approach can dissect the functional chromatin protein complexes that regulate ET-associated inflammation. Using UNC0638 that binds the enzymatically active H3K9-specific methyltransferase G9a/GLP, ChaC reveals that G9a is constitutively active at a G9a-dependent mega-dalton repressome in primary endotoxin-tolerant macrophages. G9a/GLP broadly impacts the ET-specific reprogramming of the histone code landscape, chromatin remodeling, and the activities of select transcription factors. We discover that the G9a-dependent epigenetic environment promotes the transcriptional repression activity of c-Myc for gene-specific co-regulation of chronic inflammation. ChaC may be also applicable to dissect other functional protein complexes in the context of phenotypic chromatin architectures. PMID:25502336

  3. Expression and regulation of long noncoding RNAs during the osteogenic differentiation of periodontal ligament stem cells in the inflammatory microenvironment.

    PubMed

    Zhang, Qingbin; Chen, Li; Cui, Shiman; Li, Yan; Zhao, Qi; Cao, Wei; Lai, Shixiang; Yin, Sanjun; Zuo, Zhixiang; Ren, Jian

    2017-10-25

    Although long noncoding RNAs (lncRNAs) have been emerging as critical regulators in various tissues and biological processes, little is known about their expression and regulation during the osteogenic differentiation of periodontal ligament stem cells (PDLSCs) in an inflammatory microenvironment. In this study, we have identified 63 lncRNAs that are not annotated in previous databases. These novel lncRNAs were not randomly located in the genome but preferentially located near protein-coding genes related to particular functions and diseases, such as stem cell maintenance and differentiation, developmental disorders and inflammatory diseases. Moreover, we have identified 650 differentially expressed lncRNAs among different subsets of PDLSCs. Pathway enrichment analysis for neighboring protein-coding genes of these differentially expressed lncRNAs revealed stem cell differentiation related functions. Many of these differentially expressed lncRNAs function as competing endogenous RNAs that regulate protein-coding transcripts by competing for shared miRNAs.

  4. Decoding the non-coding RNAs in Alzheimer's disease.

    PubMed

    Schonrock, Nicole; Götz, Jürgen

    2012-11-01

    Non-coding RNAs (ncRNAs) are integral components of biological networks with fundamental roles in regulating gene expression. They can integrate sequence information from the DNA code, epigenetic regulation and functions of multimeric protein complexes to potentially determine the epigenetic status and transcriptional network in any given cell. Humans potentially contain more ncRNAs than any other species, especially in the brain, where they may well play a significant role in human development and cognitive ability. This review discusses their emerging role in Alzheimer's disease (AD), a human pathological condition characterized by the progressive impairment of cognitive functions. We discuss the complexity of the ncRNA world and how this is reflected in the regulation of the amyloid precursor protein and Tau, two proteins with central functions in AD. By understanding this intricate regulatory network, there is hope for a better understanding of disease mechanisms and ultimately developing diagnostic and therapeutic tools.

  5. Genetic Code Expansion as a Tool to Study Regulatory Processes of Transcription

    NASA Astrophysics Data System (ADS)

    Schmidt, Moritz; Summerer, Daniel

    2014-02-01

    The expansion of the genetic code with noncanonical amino acids (ncAA) enables the chemical and biophysical properties of proteins to be tailored, inside cells, with a previously unattainable level of precision. A wide range of ncAA with functions not found in canonical amino acids have been genetically encoded in recent years and have delivered insights into biological processes that would be difficult to access with traditional approaches of molecular biology. A major field for the development and application of novel ncAA-functions has been transcription and its regulation. This is particularly attractive, since advanced DNA sequencing- and proteomics-techniques continue to deliver vast information on these processes on a global level, but complementing methodologies to study them on a detailed, molecular level and in living cells have been comparably scarce. In a growing number of studies, genetic code expansion has now been applied to precisely control the chemical properties of transcription factors, RNA polymerases and histones, and this has enabled new insights into their interactions, conformational changes, cellular localizations and the functional roles of posttranslational modifications.

  6. Expression profiles of long non-coding RNAs located in autoimmune disease-associated regions reveal immune cell-type specificity.

    PubMed

    Hrdlickova, Barbara; Kumar, Vinod; Kanduri, Kartiek; Zhernakova, Daria V; Tripathi, Subhash; Karjalainen, Juha; Lund, Riikka J; Li, Yang; Ullah, Ubaid; Modderman, Rutger; Abdulahad, Wayel; Lähdesmäki, Harri; Franke, Lude; Lahesmaa, Riitta; Wijmenga, Cisca; Withoff, Sebo

    2014-01-01

    Although genome-wide association studies (GWAS) have identified hundreds of variants associated with a risk for autoimmune and immune-related disorders (AID), our understanding of the disease mechanisms is still limited. In particular, more than 90% of the risk variants lie in non-coding regions, and almost 10% of these map to long non-coding RNA transcripts (lncRNAs). lncRNAs are known to show more cell-type specificity than protein-coding genes. We aimed to characterize lncRNAs and protein-coding genes located in loci associated with nine AIDs which have been well-defined by Immunochip analysis and by transcriptome analysis across seven populations of peripheral blood leukocytes (granulocytes, monocytes, natural killer (NK) cells, B cells, memory T cells, naive CD4(+) and naive CD8(+) T cells) and four populations of cord blood-derived T-helper cells (precursor, primary, and polarized (Th1, Th2) T-helper cells). We show that lncRNAs mapping to loci shared between AID are significantly enriched in immune cell types compared to lncRNAs from the whole genome (α <0.005). We were not able to prioritize single cell types relevant for specific diseases, but we observed five different cell types enriched (α <0.005) in five AID (NK cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, and psoriasis; memory T and CD8(+) T cells in juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis; Th0 and Th2 cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis). Furthermore, we show that co-expression analyses of lncRNAs and protein-coding genes can predict the signaling pathways in which these AID-associated lncRNAs are involved. 
    The observed enrichment of lncRNA transcripts in AID loci implies lncRNAs play an important role in AID etiology and suggests that lncRNA genes should be studied in more detail to interpret GWAS findings correctly. The co-expression results strongly support a model in which the lncRNA and protein-coding genes function together in the same pathways.

  7. Keeping abreast with long non-coding RNAs in mammary gland development and breast cancer

    PubMed Central

    Hansji, Herah; Leung, Euphemia Y.; Baguley, Bruce C.; Finlay, Graeme J.; Askarian-Amiri, Marjan E.

    2014-01-01

    The majority of the human genome is transcribed, even though only 2% of transcripts encode proteins. Non-coding transcripts were originally dismissed as evolutionary junk or transcriptional noise, but with the development of whole genome technologies, these non-coding RNAs (ncRNAs) are emerging as molecules with vital roles in regulating gene expression. While shorter ncRNAs have been extensively studied, the functional roles of long ncRNAs (lncRNAs) are still being elucidated. Studies over the last decade show that lncRNAs are emerging as new players in a number of diseases including cancer. Potential roles in both oncogenic and tumor suppressive pathways in cancer have been elucidated, but the biological functions of the majority of lncRNAs remain to be identified. Accumulated data are identifying the molecular mechanisms by which lncRNA mediates both structural and functional roles. LncRNA can regulate gene expression at both transcriptional and post-transcriptional levels, including splicing and regulating mRNA processing, transport, and translation. Much current research is aimed at elucidating the function of lncRNAs in breast cancer and mammary gland development, and at identifying the cellular processes influenced by lncRNAs. In this paper we review current knowledge of lncRNAs contributing to these processes and present lncRNA as a new paradigm in breast cancer development. PMID:25400658

  8. High-resolution transcriptional analysis of the regulatory influence of cell-to-cell signalling reveals novel genes that contribute to Xanthomonas phytopathogenesis

    PubMed Central

    An, Shi-Qi; Febrer, Melanie; McCarthy, Yvonne; Tang, Dong-Jie; Clissold, Leah; Kaithakottil, Gemy; Swarbreck, David; Tang, Ji-Liang; Rogers, Jane; Dow, J Maxwell; Ryan, Robert P

    2013-01-01

    The bacterium Xanthomonas campestris is an economically important pathogen of many crop species and a model for the study of bacterial phytopathogenesis. In X. campestris, a regulatory system mediated by the signal molecule DSF controls virulence to plants. The synthesis and recognition of the DSF signal depends upon different Rpf proteins. DSF signal generation requires RpfF whereas signal perception and transduction depends upon a system comprising the sensor RpfC and regulator RpfG. Here we have addressed the action and role of Rpf/DSF signalling in phytopathogenesis by high-resolution transcriptional analysis coupled to functional genomics. We detected transcripts for many genes that were unidentified by previous computational analysis of the genome sequence. Novel transcribed regions included intergenic transcripts predicted as coding or non-coding as well as those that were antisense to coding sequences. In total, mutation of rpfF, rpfG and rpfC led to alteration in transcript levels (more than fourfold) of approximately 480 genes. The regulatory influence of RpfF and RpfC demonstrated considerable overlap. Contrary to expectation, the regulatory influence of RpfC and RpfG had limited overlap, indicating complexities of the Rpf signalling system. Importantly, functional analysis revealed over 160 new virulence factors within the group of Rpf-regulated genes. PMID:23617851

  9. A systemic identification approach for primary transcription start site of Arabidopsis miRNAs from multidimensional omics data.

    PubMed

    You, Qi; Yan, Hengyu; Liu, Yue; Yi, Xin; Zhang, Kang; Xu, Wenying; Su, Zhen

    2017-05-01

    The 22-nucleotide non-coding microRNAs (miRNAs) are mostly transcribed by RNA polymerase II and are similar to protein-coding genes. Unlike the clear process from stem-loop precursors to mature miRNAs, the primary transcriptional regulation of miRNA, especially in plants, still needs to be further clarified, including the original transcription start site, functional cis-elements and primary transcript structures. Due to several well-characterized transcription signals in the promoter region, we proposed a systemic approach integrating multidimensional "omics" (including genomics, transcriptomics, and epigenomics) data to improve the genome-wide identification of primary miRNA transcripts. Here, we used the model plant Arabidopsis thaliana to improve the ability to identify candidate promoter locations in intergenic miRNAs and to determine rules for identifying primary transcription start sites of miRNAs by integrating high-throughput omics data, such as the DNase I hypersensitive sites, chromatin immunoprecipitation-sequencing of polymerase II and H3K4me3, as well as high throughput transcriptomic data. As a result, 93% of refined primary transcripts could be confirmed by the primer pairs from a previous study. Cis-element and secondary structure analyses also supported the feasibility of our results. This work will contribute to the primary transcriptional regulatory analysis of miRNAs, and the conserved regulatory pattern may be a suitable miRNA characteristic in other plant species.

  10. [Neuromuscular system and aging: involutions and implications].

    PubMed

    Paillard, Thierry

    2013-12-01

    In aged humans, the number of muscle fibers and motor units decreases. The remaining motor units lose functionality (lower discharge frequency, greater fluctuation of the discharge), particularly those that contain type II fibers. The renewal of intracellular proteins declines, which creates a negative balance between daily protein losses and the capacity to renew them. The activity of the protein kinase (Akt) that stimulates the synthesis of regulation proteins (mTOR, p70S6, IGFBP-5) declines, whereas the factors of protein degradation (NF-kappa B) are activated. In addition, the activation and proliferation of satellite cells are affected, and the production of anabolic hormones and local factors is decreased. After a strength training program, muscle hypertrophy in older subjects is linked to protein synthesis at the level of myosin heavy chain (MHC) isoforms. However, the transcription of the genes that code for MHC-I (slow form) increases and the transcription of the genes that code for MHC-II (fast form) decreases. Thus, the transition of the phenotype towards a slower form cannot be reversed by strength training at an advanced age. Moreover, strength training decreases the proportion of fibers containing hybrid forms of MHC that are undergoing transition. Hence, strength training can stabilize the muscular phenotype, i.e. the different isoforms of MHC. In addition, strength training counteracts the deleterious effects mentioned above by generating muscular hypertrophy through a reactive increase in the production of anabolic hormones. A program of aerobic training can induce an increase in the synthesis of messenger RNAs coding for isoforms related to oxidative metabolism (MHC-I and, to a lesser extent, MHC-IIa), while the transcripts for MHC-IIx decrease.

  11. Context influences on TALE–DNA binding revealed by quantitative profiling

    PubMed Central

    Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

    2015-01-01

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805

  12. Context influences on TALE-DNA binding revealed by quantitative profiling.

    PubMed

    Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

    2015-06-11

    Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.

  13. Trypanosome RNA polymerases and transcription factors: sensible trypanocidal drug targets?

    PubMed

    Vanhamme, Luc

    2008-11-01

    Trypanosomes and Leishmaniae are the agents of several important parasitic diseases threatening hundreds of millions of people worldwide. As they diverged early in evolution, they display original molecular characteristics. Each of these peculiarities defines a putative specific target for anti-parasitic drugs. Transcription has its share of unique characteristics in trypanosomes and is taken here as an example to uncover these targets. Unique features of transcription in trypanosomes include constitutive and polycistronic transcription by RNA polymerase II as well as transcription of protein-coding genes by RNA polymerase I. It is becoming clear that these unique mechanisms are performed by dedicated molecular players. The first of them have recently been characterized. They are reviewed here, and their suitability as drug targets is discussed.

  14. Transimulation - protein biosynthesis web service.

    PubMed

    Siwiak, Marlena; Zielenkiewicz, Piotr

    2013-01-01

    Although translation is the key step during gene expression, it remains poorly characterized at the level of individual genes. For this reason, we developed Transimulation - a web service measuring translational activity of genes in three model organisms: Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The calculations are based on our previous computational model of translation and experimental data sets. Transimulation quantifies mean translation initiation and elongation time (expressed in SI units), and the number of proteins produced per transcript. It also approximates the number of ribosomes that typically occupy a transcript during translation, and simulates their propagation. The simulation of ribosomes' movement is interactive and allows modifying the coding sequence on the fly. It also enables uploading any coding sequence and simulating its translation in one of three model organisms. In such a case, ribosomes propagate according to mean codon elongation times of the host organism, which may prove useful for heterologous expression. Transimulation was used to examine evolutionary conservation of translational parameters of orthologous genes. Transimulation may be accessed at http://nexus.ibb.waw.pl/Transimulation (requires Java version 1.7 or higher). Its manual and source code, distributed under the GPL-2.0 license, are freely available at the website.
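    The idea that ribosomes propagate according to mean codon elongation times of the host organism can be sketched as a sum of per-codon times over a coding sequence. This is a minimal sketch under stated assumptions, not Transimulation's actual model: the function names, the `codon_times` table, and the `default_time` fallback are all hypothetical, and any numeric values supplied by a caller are placeholders rather than measured data.

```python
def split_codons(cds):
    """Split a coding sequence into consecutive three-nucleotide codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def elongation_time(cds, codon_times, default_time):
    """Sum per-codon elongation times (in seconds) over every codon in the sequence.

    codon_times is a hypothetical host-specific table mapping codons to mean
    elongation times; codons absent from the table fall back to default_time.
    """
    return sum(codon_times.get(codon, default_time) for codon in split_codons(cds.upper()))
```

    Swapping in a different `codon_times` table for the same sequence gives a rough comparison between hosts, which is the heterologous-expression use case the abstract mentions.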

  15. Microarray-based transcriptome of Listeria monocytogenes adapted to sublethal concentrations of acetic acid, lactic acid, and hydrochloric acid.

    PubMed

    Tessema, Girum Tadesse; Møretrø, Trond; Snipen, Lars; Heir, Even; Holck, Askild; Naterstad, Kristine; Axelsson, Lars

    2012-09-01

    Listeria monocytogenes, an important foodborne pathogen, commonly encounters organic acids in food-related environments. The transcriptome of L. monocytogenes L502 was analyzed after adaptation to pH 5 in the presence of acetic acid, lactic acid, or hydrochloric acid (HCl) at 25 °C, representing a condition encountered in mildly acidic ready-to-eat food kept at room temperature. The acid-treated cells were compared with a reference culture with a pH of 6.7 at the time of RNA harvesting. The number of genes and magnitude of transcriptional responses were higher for the organic acids than for HCl. Protein coding genes described for low pH stress, energy transport and metabolism, virulence determinants, and acid tolerance response were commonly regulated in the 3 acid-stressed cultures. Interestingly, the transcriptional levels of histidine and cell wall biosynthetic operons were upregulated, indicating a possible universal response against low pH stress in L. monocytogenes. The opuCABCD operon, coding proteins for compatible solutes transport, and the transcriptional regulator sigL were significantly induced in the organic acids, strongly suggesting key roles during organic acid stress. The present study revealed the complex transcriptional responses of L. monocytogenes towards food-related acidulants and opens the roadmap for more specific and in-depth future studies.

  16. Auto-Regulatory RNA Editing Fine-Tunes mRNA Re-Coding and Complex Behaviour in Drosophila

    PubMed Central

    Savva, Yiannis A.; Jepson, James E.C; Sahin, Asli; Sugden, Arthur U.; Dorsky, Jacquelyn S.; Alpert, Lauren; Lawrence, Charles; Reenan, Robert A.

    2014-01-01

    Auto-regulatory feedback loops are a common molecular strategy used to optimize protein function. In Drosophila many mRNAs involved in neuro-transmission are re-coded at the RNA level by the RNA editing enzyme dADAR, leading to the incorporation of amino acids that are not directly encoded by the genome. dADAR also re-codes its own transcript, but the consequences of this auto-regulation in vivo are unclear. Here we show that hard-wiring or abolishing endogenous dADAR auto-regulation dramatically remodels the landscape of re-coding events in a site-specific manner. These molecular phenotypes correlate with altered localization of dADAR within the nuclear compartment. Furthermore, auto-editing exhibits sexually dimorphic patterns of spatial regulation and can be modified by abiotic environmental factors. Finally, we demonstrate that modifying dAdar auto-editing affects adaptive complex behaviors. Our results reveal the in vivo relevance of auto-regulatory control over post-transcriptional mRNA re-coding events in fine-tuning brain function and organismal behavior. PMID:22531175

  17. Comparative genomics and transcriptional profiles of Saccharopolyspora erythraea NRRL 2338 and a classically improved erythromycin over-producing strain

    PubMed Central

    2012-01-01

    Background The molecular mechanisms altered by the traditional mutation and screening approach during the improvement of antibiotic-producing microorganisms are still poorly understood although this information is essential to design rational strategies for industrial strain improvement. In this study, we applied comparative genomics to identify all genetic changes occurring during the development of an erythromycin overproducer obtained using the traditional mutate-and-screen method. Results Compared with the parental Saccharopolyspora erythraea NRRL 2338, the genome of the overproducing strain presents 117 deletion, 78 insertion and 12 transposition sites, with 71 insertion/deletion sites mapping within coding sequences (CDSs) and generating frame-shift mutations. Single nucleotide variations are present in 144 CDSs. Overall, the genomic variations affect 227 proteins of the overproducing strain and a considerable number of mutations alter genes of key enzymes in the central carbon and nitrogen metabolism and in the biosynthesis of secondary metabolites, resulting in the redirection of common precursors toward erythromycin biosynthesis. Interestingly, several mutations inactivate genes coding for proteins that play fundamental roles in basic transcription and translation machineries including the transcription anti-termination factor NusB and the transcription elongation factor Efp. These mutations, along with those affecting genes coding for pleiotropic or pathway-specific regulators, affect global expression profile as demonstrated by a comparative analysis of the parental and overproducer expression profiles. Genomic data, finally, suggest that the mutate-and-screen process might have been accelerated by mutations in DNA repair genes. Conclusions This study helps to clarify the mechanisms underlying antibiotic overproduction providing valuable information about new possible molecular targets for rational strain improvement. PMID:22401291

  18. Repression of YdaS Toxin Is Mediated by Transcriptional Repressor RacR in the Cryptic rac Prophage of Escherichia coli K-12.

    PubMed

    Krishnamurthi, Revathy; Ghosh, Swagatha; Khedkar, Supriya; Seshasayee, Aswin Sai Narain

    2017-01-01

    Horizontal gene transfer is a major driving force behind the genomic diversity seen in prokaryotes. The cryptic rac prophage in Escherichia coli K-12 carries the gene for a putative transcription factor RacR, whose deletion is lethal. We have shown that the essentiality of racR in E. coli K-12 is attributed to its role in transcriptionally repressing toxin gene(s) called ydaS and ydaT, which are adjacent to and coded divergently to racR. IMPORTANCE Transcription factors in the bacterium E. coli are rarely essential, and when they are essential, they are largely toxin-antitoxin systems. While studying transcription factors encoded in horizontally acquired regions in E. coli, we realized that the protein RacR, a putative transcription factor encoded by a gene on the rac prophage, is an essential protein. Here, using genetics, biochemistry, and bioinformatics, we show that its essentiality derives from its role as a transcriptional repressor of the ydaS and ydaT genes, whose products are toxic to the cell. Unlike type II toxin-antitoxin systems in which transcriptional regulation involves complexes of the toxin and antitoxin, repression by RacR is sufficient to keep ydaS transcriptionally silent.

  19. Maternal transcription of non-protein coding RNAs from the PWS-critical region rescues growth retardation in mice.

    PubMed

    Rozhdestvensky, Timofey S; Robeck, Thomas; Galiveti, Chenna R; Raabe, Carsten A; Seeger, Birte; Wolters, Anna; Gubar, Leonid V; Brosius, Jürgen; Skryabin, Boris V

    2016-02-05

    Prader-Willi syndrome (PWS) is a neurogenetic disorder caused by loss of paternally expressed genes on chromosome 15q11-q13. The PWS-critical region (PWScr) contains an array of non-protein coding IPW-A exons hosting intronic SNORD116 snoRNA genes. Deletion of PWScr is associated with PWS in humans and growth retardation in mice exhibiting ~15% postnatal lethality in C57BL/6 background. Here we analysed a knock-in mouse containing a 5'HPRT-LoxP-Neo(R) cassette (5'LoxP) inserted upstream of the PWScr. When the insertion was inherited maternally in a paternal PWScr-deletion mouse model (PWScr(p-/m5'LoxP)), we observed compensation of growth retardation and postnatal lethality. Genomic methylation pattern and expression of protein-coding genes remained unaltered at the PWS-locus of PWScr(p-/m5'LoxP) mice. Interestingly, ubiquitous Snord116 and IPW-A exon transcription from the originally silent maternal chromosome was detected. In situ hybridization indicated that PWScr(p-/m5'LoxP) mice expressed Snord116 in brain areas similar to wild type animals. Our results suggest that the lack of PWScr RNA expression in certain brain areas could be a primary cause of the growth retardation phenotype in mice. We propose that activation of disease-associated genes on imprinted regions could lead to general therapeutic strategies in associated diseases.

  20. Deciphering Mineral Homeostasis in Barley Seed Transfer Cells at Transcriptional Level.

    PubMed

    Darbani, Behrooz; Noeparvar, Shahin; Borg, Søren

    2015-01-01

    In addition to the micronutrient inadequacy of staple crops for optimal human nutrition, a global downtrend in crop quality has emerged from intensive breeding for yield. This trend will be aggravated by elevated levels of the greenhouse gas carbon dioxide. Therefore, crop biofortification is inevitable to ensure a sustainable supply of minerals to the large part of the human population that is dietarily dependent on staple crops. This requires a thorough understanding of plant-mineral interactions due to the complexity of mineral homeostasis. Employing RNA sequencing, we here communicate transfer-cell-specific effects of excess iron and zinc during grain filling in our model crop plant barley. We found a wide range of genes and transcripts responding to alterations in mineral contents. Among them, it is worth highlighting the auxin and ethylene signaling factors Arfs, Abcbs, Cand1, Hps4, Hac1, Ecr1, and Ctr1, diurnal fluctuation components Sdg2, Imb1, Lip1, and PhyC, retroelements, sulfur homeostasis components Amp1, Hmt3, Eil3, and Vip1, mineral trafficking components Med16, Cnnm4, Aha2, Clpc1, and Pcbps, and vacuole organization factors Ymr155W, RabG3F, Vps4, and Cbl3. Our analysis introduces new interactors and signifies a broad spectrum of regulatory levels from chromatin remodeling to intracellular protein sorting mechanisms active in the plant mineral homeostasis. The results highlight the importance of storage proteins in metal ion toxicity-resistance and chelation. Interestingly, the protein sorting and recycling factors Exoc7, Cdc1, Sec23A, and Rab11A contributed to the response as well as the polar distributors of metal transporters ensuring the directional flow of minerals. Alternative isoform switching was found important for plant adaptation and occurred among transcripts coding for identical proteins as well as transcripts coding for protein isoforms. We also identified differences in the alternative-isoform preference between the treatments, indicating metal-affinity shifts among isoforms of metal transporters. Most importantly, we found the zinc treatment to impair both photosynthesis and respiration. A wide range of transcriptional changes including stress-related genes and negative feedback loops emphasize the importance of keeping mineral contents below certain cellular levels, which otherwise might lead to agronomically impeding side effects. By illustrating new mechanisms, genes, and transcripts, this report provides a solid platform towards understanding the complex network of plant mineral homeostasis.

  1. Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function

    PubMed Central

    Ezkurdia, Iakes; del Pozo, Angela; Frankish, Adam; Rodriguez, Jose Manuel; Harrow, Jennifer; Ashman, Keith; Valencia, Alfonso; Tress, Michael L.

    2012-01-01

    Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of “novel” and “putative” protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints. PMID:22446687

  2. Complex alternative splicing of acetylcholinesterase transcripts in Torpedo electric organ; primary structure of the precursor of the glycolipid-anchored dimeric form.

    PubMed Central

    Sikorav, J L; Duval, N; Anselmet, A; Bon, S; Krejci, E; Legay, C; Osterlund, M; Reimund, B; Massoulié, J

    1988-01-01

    In this paper, we show the existence of alternative splicing in the 3' region of the coding sequence of Torpedo acetylcholinesterase (AChE). We describe two cDNA structures which both diverge from the previously described coding sequence of the catalytic subunit of asymmetric (A) forms (Schumacher et al., 1986; Sikorav et al., 1987). They both contain a coding sequence followed by a non-coding sequence and a poly(A) stretch. Both of these structures were shown to exist in poly(A)+ RNAs, by S1 mapping experiments. The divergent region encoded by the first sequence corresponds to the precursor of the globular dimeric form (G2a), since it contains the expected C-terminal amino acids, Ala-Cys. These amino acids are followed by a 29 amino acid extension which contains a hydrophobic segment and must be replaced by a glycolipid in the mature protein. Analyses of intact G2a AChE showed that the common domain of the protein contains intersubunit disulphide bonds. The divergent region of the second type of cDNA consists of an adjacent genomic sequence, which is removed as an intron in A and Ga mRNAs, but may encode a distinct, less abundant catalytic subunit. The structures of the cDNA clones indicate that they are derived from minor mRNAs, shorter than the three major transcripts which have been described previously (14.5, 10.5 and 5.5 kb). Oligonucleotide probes specific for the asymmetric and globular terminal regions hybridize with the three major transcripts, indicating that their size is determined by 3'-untranslated regions which are not related to the differential splicing leading to A and Ga forms. PMID:3181125

  3. Gene expression of galectin-9/ecalectin, a potent eosinophil chemoattractant, and/or the insertional isoform in human colorectal carcinoma cell lines and detection of frame-shift mutations for protein sequence truncations in the second functional lectin domain.

    PubMed

    Lahm, H; Hoeflich, A; Andre, S; Sordat, B; Kaltner, H; Wolf, E; Gabius, H J

    2000-09-01

    The family of Ca2+-independent galactoside-binding lectins with the beta-strand topology of the jelly-roll, referred to as galectins, is known to mediate and modulate a variety of cellular activities. Their functional versatility explains the current interest in monitoring their expression in cancer research, so far primarily focused on galectin-1 and -3. Tandem-repeat-type galectin-9 and its (most probably) allelic variant ecalectin, a potent eosinophil chemoattractant, are known to be human leukocyte products. We show by RT-PCR with primers specific for both that their mRNA is expressed in 17 of 21 human colorectal cancer lines. As also indicated by restriction analysis, in addition to the expected transcript of 571 bp an otherwise identical isoform coding for a 32-amino acid extension of the link peptide was detected. Positive cell lines differentially expressed either one (7 lines) or both transcripts (10 lines). Sequence analysis of RT-PCR products, performed in four cases, allowed to assign the standard transcript to ecalectin in the case of SW480 cells and detected two point mutations in the insert of the link peptide-coding sequence in WiDr and Colo205. Furthermore, this analysis identified the insertion of a single nucleotide into the coding sequence generating a frame-shift mutation, an event which has so far not been reported for any galectin. This alteration encountered in both transcripts of the WiDr line and the isoform transcript of Colo205 cells will most likely truncate the protein part within the second (C-terminal) carbohydrate recognition domain. Our results thus reveal the presence of mRNA for a galectin-9-isoform or a potent eosinophil chemoattractant (ecalectin) or a truncated version thereof with preserved N-terminal carbohydrate recognition domain in established human colon cancer cell lines.

  4. The central nervous system transcriptome of the weakly electric brown ghost knifefish (Apteronotus leptorhynchus): de novo assembly, annotation, and proteomics validation.

    PubMed

    Salisbury, Joseph P; Sîrbulescu, Ruxandra F; Moran, Benjamin M; Auclair, Jared R; Zupanc, Günther K H; Agar, Jeffrey N

    2015-03-11

    The brown ghost knifefish (Apteronotus leptorhynchus) is a weakly electric teleost fish of particular interest as a versatile model system for a variety of research areas in neuroscience and biology. The comprehensive information available on the neurophysiology and neuroanatomy of this organism has enabled significant advances in such areas as the study of the neural basis of behavior, the development of adult-born neurons in the central nervous system and their involvement in the regeneration of nervous tissue, as well as brain aging and senescence. Despite substantial scientific interest in this species, no genomic resources are currently available. Here, we report the de novo assembly and annotation of the A. leptorhynchus transcriptome. After evaluating several trimming and transcript reconstruction strategies, de novo assembly using Trinity uncovered 42,459 unique contigs containing at least a partial protein-coding sequence based on alignment to a reference set of known Actinopterygii sequences. As many as 11,847 of these contigs contained full or near-full length protein sequences, providing broad coverage of the proteome. A variety of non-coding RNA sequences were also identified and annotated, including conserved long intergenic non-coding RNA and other long non-coding RNA observed previously to be expressed in adult zebrafish (Danio rerio) brain, as well as a variety of miRNA, snRNA, and snoRNA. Shotgun proteomics confirmed translation of open reading frames from over 2,000 transcripts, including alternative splice variants. Assignment of tandem mass spectra was greatly improved by use of the assembly compared to databases of sequences from closely related organisms. The assembly and raw reads have been deposited at DDBJ/EMBL/GenBank under the accession number GBKR00000000. Tandem mass spectrometry data is available via ProteomeXchange with identifier PXD001285. Presented here is the first release of an annotated de novo transcriptome assembly from Apteronotus leptorhynchus, providing a broad overview of RNA expressed in central nervous system tissue. The assembly, which includes substantial coverage of a wide variety of both protein coding and non-coding transcripts, will allow the development of better tools to understand the mechanisms underlying unique characteristics of the knifefish model system, such as their tremendous regenerative capacity and negligible brain senescence.

  5. The heat-shock protein Apg-2 binds to the tight junction protein ZO-1 and regulates transcriptional activity of ZONAB.

    PubMed

    Tsapara, Anna; Matter, Karl; Balda, Maria S

    2006-03-01

    The tight junction adaptor protein ZO-1 regulates intracellular signaling and cell proliferation. Its Src homology 3 (SH3) domain is required for the regulation of proliferation and binds to the Y-box transcription factor ZO-1-associated nucleic acid binding protein (ZONAB). Binding of ZO-1 to ZONAB results in cytoplasmic sequestration and hence inhibition of ZONAB's transcriptional activity. Here, we identify a new binding partner of the SH3 domain that modulates ZO-1-ZONAB signaling. Expression screening of a cDNA library with a fusion protein containing the SH3 domain yielded a cDNA coding for Apg-2, a member of the heat-shock protein 110 (Hsp 110) subfamily of Hsp70 heat-shock proteins, which is overexpressed in carcinomas. Regulated depletion of Apg-2 in Madin-Darby canine kidney cells inhibits G(1)/S phase progression. Apg-2 coimmunoprecipitates with ZO-1 and partially localizes to intercellular junctions. Junctional recruitment and coimmunoprecipitation with ZO-1 are stimulated by heat shock. Apg-2 competes with ZONAB for binding to the SH3 domain in vitro and regulates ZONAB's transcriptional activity in reporter gene assays. Our data hence support a model in which Apg-2 regulates ZONAB function by competing for binding to the SH3 domain of ZO-1 and suggest that Apg-2 functions as a regulator of ZO-1-ZONAB signaling in epithelial cells in response to cellular stress.

  6. The Heat-Shock Protein Apg-2 Binds to the Tight Junction Protein ZO-1 and Regulates Transcriptional Activity of ZONAB

    PubMed Central

    Tsapara, Anna; Matter, Karl; Balda, Maria S.

    2006-01-01

    The tight junction adaptor protein ZO-1 regulates intracellular signaling and cell proliferation. Its Src homology 3 (SH3) domain is required for the regulation of proliferation and binds to the Y-box transcription factor ZO-1-associated nucleic acid binding protein (ZONAB). Binding of ZO-1 to ZONAB results in cytoplasmic sequestration and hence inhibition of ZONAB's transcriptional activity. Here, we identify a new binding partner of the SH3 domain that modulates ZO-1–ZONAB signaling. Expression screening of a cDNA library with a fusion protein containing the SH3 domain yielded a cDNA coding for Apg-2, a member of the heat-shock protein 110 (Hsp 110) subfamily of Hsp70 heat-shock proteins, which is overexpressed in carcinomas. Regulated depletion of Apg-2 in Madin-Darby canine kidney cells inhibits G1/S phase progression. Apg-2 coimmunoprecipitates with ZO-1 and partially localizes to intercellular junctions. Junctional recruitment and coimmunoprecipitation with ZO-1 are stimulated by heat shock. Apg-2 competes with ZONAB for binding to the SH3 domain in vitro and regulates ZONAB's transcriptional activity in reporter gene assays. Our data hence support a model in which Apg-2 regulates ZONAB function by competing for binding to the SH3 domain of ZO-1 and suggest that Apg-2 functions as a regulator of ZO-1–ZONAB signaling in epithelial cells in response to cellular stress. PMID:16407410

  7. Single-Nucleosome Mapping of Histone Modifications in S. cerevisiae

    PubMed Central

    Kim, Minkyu; Buratowski, Stephen; Schreiber, Stuart L; Friedman, Nir

    2005-01-01

    Covalent modification of histone proteins plays a role in virtually every process on eukaryotic DNA, from transcription to DNA repair. Many different residues can be covalently modified, and it has been suggested that these modifications occur in a great number of independent, meaningful combinations. Published low-resolution microarray studies on the combinatorial complexity of histone modification patterns suffer from confounding effects caused by the averaging of modification levels over multiple nucleosomes. To overcome this problem, we used a high-resolution tiled microarray with single-nucleosome resolution to investigate the occurrence of combinations of 12 histone modifications on thousands of nucleosomes in actively growing S. cerevisiae. We found that histone modifications do not occur independently; there are roughly two groups of co-occurring modifications. One group of lysine acetylations shows a sharply defined domain of two hypo-acetylated nucleosomes, adjacent to the transcriptional start site, whose occurrence does not correlate with transcription levels. The other group consists of modifications occurring in gradients through the coding regions of genes in a pattern associated with transcription. We found no evidence for a deterministic code of many discrete states, but instead we saw blended, continuous patterns that distinguish nucleosomes at one location (e.g., promoter nucleosomes) from those at another location (e.g., over the 3′ ends of coding regions). These results are consistent with the idea of a simple, redundant histone code, in which multiple modifications share the same role. PMID:16122352

  8. Pervasive Transcription of a Herpesvirus Genome Generates Functionally Important RNAs

    PubMed Central

    Canny, Susan P.; Reese, Tiffany A.; Johnson, L. Steven; Zhang, Xin; Kambal, Amal; Duan, Erning; Liu, Catherine Y.; Virgin, Herbert W.

    2014-01-01

    ABSTRACT Pervasive transcription is observed in a wide range of organisms, including humans, mice, and viruses, but the functional significance of the resulting transcripts remains uncertain. Current genetic approaches are often limited by their emphasis on protein-coding open reading frames (ORFs). We previously identified extensive pervasive transcription from the murine gammaherpesvirus 68 (MHV68) genome outside known ORFs and antisense to known genes (termed expressed genomic regions [EGRs]). Similar antisense transcripts have been identified in many other herpesviruses, including Kaposi’s sarcoma-associated herpesvirus and human and murine cytomegalovirus. Despite their prevalence, whether these RNAs have any functional importance in the viral life cycle is unknown, and one interpretation is that these are merely “noise” generated by functionally unimportant transcriptional events. To determine whether pervasive transcription of a herpesvirus genome generates RNA molecules that are functionally important, we used a strand-specific functional approach to target transcripts from thirteen EGRs in MHV68. We found that targeting transcripts from six EGRs reduced viral protein expression, proving that pervasive transcription can generate functionally important RNAs. We characterized transcripts emanating from EGRs 26 and 27 in detail using several methods, including RNA sequencing, and identified several novel polyadenylated transcripts that were enriched in the nuclei of infected cells. These data provide the first evidence of the functional importance of regions of pervasive transcription emanating from MHV68 EGRs. Therefore, studies utilizing mutation of a herpesvirus genome must account for possible effects on RNAs generated by pervasive transcription. PMID:24618256

  9. Molecular cloning and functional characterization of an antifungal PR-5 protein from Ocimum basilicum.

    PubMed

    Rather, Irshad Ahmad; Awasthi, Praveen; Mahajan, Vidushi; Bedi, Yashbir S; Vishwakarma, Ram A; Gandhi, Sumit G

    2015-03-01

    Pathogenesis-related (PR) proteins are involved in biotic and abiotic stress responses of plants and are grouped into 17 families (PR-1 to PR-17). The PR-5 family includes proteins related to thaumatin and osmotin, with several members possessing antimicrobial properties. In this study, a PR-5 gene showing a high degree of homology with osmotin-like protein was isolated from sweet basil (Ocimum basilicum L.). A complete open reading frame consisting of 675 nucleotides, coding for a precursor protein, was obtained by PCR amplification. Based on sequence comparisons with tobacco osmotin and other osmotin-like proteins (OLPs), this protein was named ObOLP. The predicted mature protein is 225 amino acids in length and contains 16 cysteine residues that may potentially form eight disulfide bonds, a signature common to most PR-5 proteins. Among the various abiotic stress treatments tested, including high salt, mechanical wounding and exogenous phytohormone/elicitor treatments, methyl jasmonate (MeJA) and mechanical wounding significantly induced the expression of the ObOLP gene. The coding sequence of ObOLP was cloned and expressed in a bacterial host, resulting in a 25 kDa recombinant His-tagged protein displaying antifungal activity. The ObOLP protein sequence appears to contain an N-terminal signal peptide with signatures of the secretory pathway. Further, our experimental data show that ObOLP expression is regulated transcriptionally, and in silico analysis suggests that it may be post-transcriptionally and post-translationally regulated through microRNAs and post-translational protein modifications, respectively. This study appears to be the first report of isolation and characterization of an osmotin-like protein gene from O. basilicum. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Comparative architecture of silks, fibrous proteins and their encoding genes in insects and spiders.

    PubMed

    Craig, Catherine L; Riekel, Christian

    2002-12-01

    The known silk fibroins and fibrous glues are thought to be encoded by members of the same gene family. All silk fibroins sequenced to date contain regions of long-range order (crystalline regions) and/or short-range order (non-crystalline regions). All of the sequenced fibroin silks (Flag or silk from flagelliform gland in spiders; Fhc or heavy chain fibroin silks produced by Lepidoptera larvae) are made up of hierarchically organized, repetitive arrays of amino acids. Fhc fibroin genes are characterized by a similar molecular genetic architecture of two exons and one intron, but the organization and size of these units differs. The Flag, Ser (sericin gene) and BR (Balbiani ring genes; both fibrous proteins) genes are made up of multiple exons and introns. Sequences coding for crystalline and non-crystalline protein domains are integrated in the repetitive regions of Fhc and MA exons, but not in the protein glues Ser1 and BR-1. Genetic 'hot-spots' promote recombination errors in Fhc, MA, and Flag. Codon bias, structural constraint, point mutations, and shortened coding arrays may be alternative means of stabilizing precursor mRNA transcripts. Differential regulation of gene expression and selective splicing of the mRNA transcript may allow rapid adaptation of silk functional properties to different physical environments.

  11. RNAP-II transcribes two small RNAs at the promoter and terminator regions of the RNAP-I gene in Saccharomyces cerevisiae.

    PubMed

    Mayán, Maria D

    2013-01-01

    Three RNA polymerases coexist in the ribosomal DNA of Saccharomyces cerevisiae. RNAP-I transcribes the 35S rRNA, RNAP-III transcribes the 5S rRNA and RNAP-II is found in both intergenic non-coding regions. Previously, we demonstrated that RNAP-II molecules bound to the intergenic non-coding regions (IGS) of the ribosomal locus are mainly found in a stalled conformation, and the stalled polymerase mediates chromatin interactions, which isolate RNAP-I from the RNAP-III transcriptional domain. Besides, RNAP-II transcribes both IGS regions at low levels, using different cryptic promoters. This report demonstrates that RNAP-II also transcribes two sequences located in the 5'- and 3'-ends of the 35S rRNA gene that overlap with the sequences of the 35S rRNA precursor transcribed by RNAP-I. The sequence located at the promoter region of RNAP-I, called the p-RNA transcript, binds to the transcription termination-related protein, Reb1p, while the T-RNA sequence, located in the termination sites of RNAP-I gene, contains the stem-loop recognized by Rtn1p, which is necessary for proper termination of RNAP-I. Because of their location, these small RNAs may play a key role in the initiation and termination of RNAP-I transcription. To correctly synthesize proteins, eukaryotic cells may retain a mechanism that connects the three main polymerases. This report suggests that cryptic transcription by RNAP-II may be required for normal transcription by RNAP-I in the ribosomal locus of S. cerevisiae. Copyright © 2012 John Wiley & Sons, Ltd.

  12. Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    MacArthur, Stewart; Li, Xiao-Yong; Li, Jingyi

    2009-05-15

    BACKGROUND: We previously established that six sequence-specific transcription factors that initiate anterior/posterior patterning in Drosophila bind to overlapping sets of thousands of genomic regions in blastoderm embryos. While regions bound at high levels include known and probable functional targets, more poorly bound regions are preferentially associated with housekeeping genes and/or genes not transcribed in the blastoderm, and are frequently found in protein coding sequences or in less conserved non-coding DNA, suggesting that many are likely non-functional. RESULTS: Here we show that an additional 15 transcription factors that regulate other aspects of embryo patterning show a similar quantitative continuum of function and binding to thousands of genomic regions in vivo. Collectively, the 21 regulators show a surprisingly high overlap in the regions they bind given that they belong to 11 DNA binding domain families, specify distinct developmental fates, and can act via different cis-regulatory modules. We demonstrate, however, that quantitative differences in relative levels of binding to shared targets correlate with the known biological and transcriptional regulatory specificities of these factors. CONCLUSIONS: It is likely that the overlap in binding of biochemically and functionally unrelated transcription factors arises from the high concentrations of these proteins in nuclei, which, coupled with their broad DNA binding specificities, directs them to regions of open chromatin. We suggest that most animal transcription factors will be found to show a similar broad overlapping pattern of binding in vivo, with specificity achieved by modulating the amount, rather than the identity, of bound factor.

  13. Identification of unannotated exons of low abundance transcripts in Drosophila melanogaster and cloning of a new serine protease gene upregulated upon injury.

    PubMed

    Maia, Rafaela M; Valente, Valeria; Cunha, Marco A V; Sousa, Josane F; Araujo, Daniela D; Silva, Wilson A; Zago, Marco A; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew J G; Monesi, Nadia; Ramos, Ricardo G P; Espreafico, Enilza M; Paçó-Larson, Maria L

    2007-07-24

    The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~ 14,000), indicating that mechanisms acting on generation of transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that 1/3 of this is estimated as missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome. Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis. In agreement with the proposal that this locus is co-regulated in response to microorganism infection, we show here that SP212 is also up-regulated upon injury. Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.

  14. Identification of unannotated exons of low abundance transcripts in Drosophila melanogaster and cloning of a new serine protease gene upregulated upon injury

    PubMed Central

    Maia, Rafaela M; Valente, Valeria; Cunha, Marco AV; Sousa, Josane F; Araujo, Daniela D; Silva, Wilson A; Zago, Marco A; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew JG; Monesi, Nadia; Ramos, Ricardo GP; Espreafico, Enilza M; Paçó-Larson, Maria L

    2007-01-01

    Background The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~ 14,000), indicating that mechanisms acting on generation of transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that 1/3 of this is estimated as missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome. Results Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis. In agreement with the proposal that this locus is co-regulated in response to microorganism infection, we show here that SP212 is also up-regulated upon injury. Conclusion Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data. PMID:17650329

  15. cDNA cloning and characterization of the human THRAP2 gene which maps to chromosome 12q24, and its mouse ortholog Thrap2.

    PubMed

    Musante, Luciana; Bartsch, Oliver; Ropers, Hans-Hilger; Kalscheuer, Vera M

    2004-05-12

    Characterization of a balanced t(2;12)(q37;q24) translocation in a patient with suspected Noonan syndrome revealed that the chromosome 12 breakpoint lies in the vicinity of a novel human gene, thyroid hormone receptor-associated protein 2 (THRAP2). We therefore characterized this gene and its mouse counterpart in more detail. Human and mouse THRAP2/Thrap2 span a genomic region of about 310 and >170 kilobases (kb), respectively, and both contain 31 exons. Corresponding transcripts are approximately 9.5 kb long. Their open reading frames code for proteins of 2210 and 2203 amino acids, which are 93% identical. By northern blot analysis, human and mouse THRAP2/Thrap2 genes showed ubiquitous expression. Transcripts were most abundant in human skeletal muscle and in mouse heart. THRAP2 protein is 56% identical to human TRAP240, which belongs to the thyroid hormone receptor associated protein (TRAP) complex and is evolutionarily conserved up to yeast. This complex is involved in transcriptional regulation and is believed to serve as an adapting interface between regulatory proteins bound to specific DNA sequences and RNA polymerase II.

  16. Analysis of the TCP genes expressed in the inflorescence of the orchid Orchis italica

    PubMed Central

    De Paolo, Sofia; Gaudio, Luciano; Aceto, Serena

    2015-01-01

    TCP proteins are plant-specific transcription factors involved in many different processes. Because of their involvement in a large number of developmental pathways, their roles have been investigated in various plant species. However, there are almost no studies of this transcription factor family in orchids. Based on the available transcriptome of the inflorescence of the orchid Orchis italica, in the present study we identified 12 transcripts encoding TCP proteins. The phylogenetic analysis showed that they belong to different TCP classes (I and II) and groups (PCF, CIN and CYC/TB1), and that they display a number of conserved motifs when compared with the TCPs of Arabidopsis and Oryza. The presence of a specific cleavage site for the microRNA miRNA319, an important post-transcriptional regulator of several TCP genes in other species, was demonstrated for one transcript of O. italica, and the analysis of the expression pattern of the TCP transcripts in different inflorescence organs and in leaf tissue suggests that some TCP transcripts of O. italica exert their role only in specific tissues, while others may play multiple roles in different tissues. In addition, the evolutionary analysis showed a general purifying selection acting on the coding region of these transcripts. PMID:26531864

  17. Analysis of the TCP genes expressed in the inflorescence of the orchid Orchis italica.

    PubMed

    De Paolo, Sofia; Gaudio, Luciano; Aceto, Serena

    2015-11-04

    TCP proteins are plant-specific transcription factors involved in many different processes. Because of their involvement in a large number of developmental pathways, their roles have been investigated in various plant species. However, there are almost no studies of this transcription factor family in orchids. Based on the available transcriptome of the inflorescence of the orchid Orchis italica, in the present study we identified 12 transcripts encoding TCP proteins. The phylogenetic analysis showed that they belong to different TCP classes (I and II) and groups (PCF, CIN and CYC/TB1), and that they display a number of conserved motifs when compared with the TCPs of Arabidopsis and Oryza. The presence of a specific cleavage site for the microRNA miRNA319, an important post-transcriptional regulator of several TCP genes in other species, was demonstrated for one transcript of O. italica, and the analysis of the expression pattern of the TCP transcripts in different inflorescence organs and in leaf tissue suggests that some TCP transcripts of O. italica exert their role only in specific tissues, while others may play multiple roles in different tissues. In addition, the evolutionary analysis showed a general purifying selection acting on the coding region of these transcripts.

  18. Genome-wide transcriptional analysis of flagellar regeneration in Chlamydomonas reinhardtii identifies orthologs of ciliary disease genes

    NASA Technical Reports Server (NTRS)

    Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Marshall, Wallace F.

    2005-01-01

    The important role that cilia and flagella play in human disease creates an urgent need to identify genes involved in ciliary assembly and function. The strong and specific induction of flagellar-coding genes during flagellar regeneration in Chlamydomonas reinhardtii suggests that transcriptional profiling of such cells would reveal new flagella-related genes. We have conducted a genome-wide analysis of RNA transcript levels during flagellar regeneration in Chlamydomonas by using DNA oligonucleotide microarrays, produced by a maskless photolithography method, with unique probe sequences for all exons of the 19,803 predicted genes. This analysis represents a previously uncharacterized whole-genome transcriptional activity profiling study in this important model organism. Analysis of strongly induced genes reveals a large set of known flagellar components and also identifies a number of important disease-related proteins as being involved with cilia and flagella, including the zebrafish polycystic kidney genes Qilin, Reptin, and Pontin, as well as the testis-expressed tubby-like protein TULP2.

  19. Human-specific protein isoforms produced by novel splice sites in the human genome after the human-chimpanzee divergence.

    PubMed

    Kim, Dong Seon; Hahn, Yoonsoo

    2012-11-13

    Evolution of splice sites is a well-known phenomenon that results in transcript diversity during human evolution. Many novel splice sites are derived from repetitive elements and may not contribute to protein products. Here, we analyzed annotated human protein-coding exons and identified human-specific splice sites that arose after the human-chimpanzee divergence. We analyzed multiple alignments of the annotated human protein-coding exons and their respective orthologous mammalian genome sequences to identify 85 novel splice sites (50 splice acceptors and 35 donors) in the human genome. The novel protein-coding exons, which are expressed either constitutively or alternatively, produce novel protein isoforms by insertion, deletion, or frameshift. We found three cases in which the human-specific isoform conferred novel molecular function in the human cells: the human-specific IMUP protein isoform induces apoptosis of the trophoblast and is implicated in pre-eclampsia; the intronization of a part of SMOX gene exon produces inactive spermine oxidase; the human-specific NUB1 isoform shows reduced interaction with ubiquitin-like proteins, possibly affecting ubiquitin pathways. Although the generation of novel protein isoforms does not equate to adaptive evolution, we propose that these cases are useful candidates for a molecular functional study to identify proteomic changes that might bring about novel phenotypes during human evolution.

  20. Structural and functional studies of a family of Dictyostelium discoideum developmentally regulated, prestalk genes coding for small proteins.

    PubMed

    Vicente, Juan J; Galardi-Castilla, María; Escalante, Ricardo; Sastre, Leandro

    2008-01-03

    The social amoeba Dictyostelium discoideum executes a multicellular development program upon starvation. This morphogenetic process requires the differential regulation of a large number of genes and is coordinated by extracellular signals. The MADS-box transcription factor SrfA is required for several stages of development, including slug migration and spore terminal differentiation. Subtractive hybridization allowed the isolation of a gene, sigN (SrfA-induced gene N), that was dependent on the transcription factor SrfA for expression at the slug stage of development. Homology searches detected the existence of a large family of sigN-related genes in the Dictyostelium discoideum genome. The 13 most similar genes are grouped in two regions of chromosome 2 and have been named Group1 and Group2 sigN genes. The putative encoded proteins are 87-89 amino acids long. All these genes have a similar structure, composed of a first exon containing a 13-nucleotide-long open reading frame and a second exon comprising the remainder of the putative coding region. The expression of these genes is induced at 10 hours of development. Analyses of their promoter regions indicate that these genes are expressed in the prestalk region of developing structures. The addition of antibodies raised against SigN Group 2 proteins induced disintegration of multicellular structures at the mound stage of development. A large family of genes coding for small proteins has been identified in D. discoideum. Two groups of very similar genes from this family have been shown to be specifically expressed in prestalk cells during development. Functional studies using antibodies raised against Group 2 SigN proteins indicate that these genes could play a role during multicellular development.

  1. Identification and characterization of moonlighting long non-coding RNAs based on RNA and protein interactome.

    PubMed

    Cheng, Lixin; Leung, Kwong-Sak

    2018-05-16

    Moonlighting proteins are a class of proteins having multiple distinct functions, which play essential roles in a variety of cellular and enzymatic functioning systems. Although there have long been calls for computational algorithms for the identification of moonlighting proteins, research on approaches to identify moonlighting long non-coding RNAs (lncRNAs) has never been undertaken. Here, we introduce a novel methodology, MoonFinder, for the identification of moonlighting lncRNAs. MoonFinder is a statistical algorithm identifying moonlighting lncRNAs without a priori knowledge through the integration of protein interactome, RNA-protein interactions, and functional annotation of proteins. We identify 155 moonlighting lncRNA candidates and uncover that they are a distinct class of lncRNAs characterized by specific sequence and cellular localization features. The non-coding genes that transcribe moonlighting lncRNAs tend to have shorter but more exons, and the moonlighting lncRNAs have a variable localization pattern with a high chance of residing in the cytoplasmic compartment in comparison to the other lncRNAs. Moreover, moonlighting lncRNAs and moonlighting proteins are rather mutually exclusive in terms of both their direct interactions and interacting partners. Our results also shed light on how the moonlighting candidates and their interacting proteins are implicated in the formation and development of cancers and other diseases. The code implementing MoonFinder is supplied as an R package in the supplementary material. Contact: lxcheng@cse.cuhk.edu.hk or ksleung@cse.cuhk.edu.hk. Supplementary data are available at Bioinformatics online.

  2. Nodeomics: Pathogen Detection in Vertebrate Lymph Nodes Using Meta-Transcriptomics

    USGS Publications Warehouse

    Wittekindt, Nicola E.; Padhi, Abinash; Schuster, Stephan C.; Qi, Ji; Zhao, Fangqing; Tomsho, Lynn P.; Kasson, Lindsay R.; Packard, Michael; Cross, Paul C.; Poss, Mary

    2010-01-01

    The ongoing emergence of human infections originating from wildlife highlights the need for better knowledge of the microbial community in wildlife species where traditional diagnostic approaches are limited. Here we evaluate the microbial biota in healthy mule deer (Odocoileus hemionus) by analyses of lymph node meta-transcriptomes. cDNA libraries from five individuals and two pools of samples were prepared from retropharyngeal lymph node RNA enriched for polyadenylated RNA and sequenced using Roche-454 Life Sciences technology. Protein-coding and 16S ribosomal RNA (rRNA) sequences were taxonomically profiled using protein and rRNA specific databases. Representatives of all bacterial phyla were detected in the seven libraries based on protein-coding transcripts, indicating that viable microbiota were present in lymph nodes. Residents of skin and rumen, and those ubiquitous in mule deer habitat, dominated classifiable bacterial species. Based on detection of both rRNA and protein-coding transcripts, we identified two new proteobacterial species: a Helicobacter closely related to Helicobacter cetorum in the Helicobacter pylori/Helicobacter acinonychis complex and an Acinetobacter related to Acinetobacter schindleri. Among viruses, a novel gamma retrovirus and other members of the Poxviridae and Retroviridae were identified. We additionally evaluated bacterial diversity by amplicon sequencing the hypervariable V6 region of 16S rRNA and demonstrate that overall taxonomic diversity is higher with the meta-transcriptomic approach. These data provide the most complete picture to date of the microbial diversity within a wildlife host. Our research advances the use of meta-transcriptomics to study microbiota in wildlife tissues, which will facilitate detection of novel organisms with pathogenic potential to humans and animals.

  3. Positions of Trp Codons in the Leader Peptide-Coding Region of the at Operon Influence Anti-TRAP Synthesis and trp Operon Expression in Bacillus licheniformis

    PubMed Central

    Levitin, Anastasia; Yanofsky, Charles

    2010-01-01

    Tryptophan, phenylalanine, tyrosine, and several other metabolites are all synthesized from a common precursor, chorismic acid. Since tryptophan is a product of an energetically expensive biosynthetic pathway, bacteria have developed sensing mechanisms to downregulate synthesis of the enzymes of tryptophan formation when synthesis of the amino acid is not needed. In Bacillus subtilis and some other Gram-positive bacteria, trp operon expression is regulated by two proteins, TRAP (the tryptophan-activated RNA binding protein) and AT (the anti-TRAP protein). TRAP is activated by bound tryptophan, and AT synthesis is increased upon accumulation of uncharged tRNA(Trp). Tryptophan-activated TRAP binds to trp operon leader RNA, generating a terminator structure that promotes transcription termination. AT binds to tryptophan-activated TRAP, inhibiting its RNA binding ability. In B. subtilis, AT synthesis is upregulated both transcriptionally and translationally in response to the accumulation of uncharged tRNA(Trp). In this paper, we focus on explaining the differences in organization and regulatory functions of the at operon's leader peptide-coding region, rtpLP, of B. subtilis and Bacillus licheniformis. Our objective was to correlate the greater growth sensitivity of B. licheniformis to tryptophan starvation with the spacing of the three Trp codons in its at operon leader peptide-coding region. Our findings suggest that the Trp codon location in rtpLP of B. licheniformis is designed to allow a mild charged-tRNA(Trp) deficiency to expose the Shine-Dalgarno sequence and start codon for the AT protein, leading to increased AT synthesis. PMID:20061467

  4. Different domains of the murine RNA polymerase I-specific termination factor mTTF-I serve distinct functions in transcription termination.

    PubMed

    Evers, R; Smid, A; Rudloff, U; Lottspeich, F; Grummt, I

    1995-03-15

    Termination of mouse ribosomal gene transcription by RNA polymerase I (Pol I) requires the specific interaction of a DNA binding protein, mTTF-I, with an 18 bp sequence element located downstream of the rRNA coding region. Here we describe the molecular cloning and functional characterization of the cDNA encoding this transcription termination factor. Recombinant mTTF-I binds specifically to the murine terminator elements and terminates Pol I transcription in a reconstituted in vitro system. Deletion analysis has defined a modular structure of mTTF-I comprising a dispensable N-terminal half, a large C-terminal DNA binding region and an internal domain which is required for transcription termination. Significantly, the C-terminal region of mTTF-I reveals striking homology to the DNA binding domains of the proto-oncogene c-Myb and the yeast transcription factor Reb1p. Site-directed mutagenesis of one of the tryptophan residues that is conserved in the homology region of c-Myb, Reb1p and mTTF-I abolishes specific DNA binding, a finding which underscores the functional relevance of these residues in DNA-protein interactions.

  5. Different domains of the murine RNA polymerase I-specific termination factor mTTF-I serve distinct functions in transcription termination.

    PubMed Central

    Evers, R; Smid, A; Rudloff, U; Lottspeich, F; Grummt, I

    1995-01-01

    Termination of mouse ribosomal gene transcription by RNA polymerase I (Pol I) requires the specific interaction of a DNA binding protein, mTTF-I, with an 18 bp sequence element located downstream of the rRNA coding region. Here we describe the molecular cloning and functional characterization of the cDNA encoding this transcription termination factor. Recombinant mTTF-I binds specifically to the murine terminator elements and terminates Pol I transcription in a reconstituted in vitro system. Deletion analysis has defined a modular structure of mTTF-I comprising a dispensable N-terminal half, a large C-terminal DNA binding region and an internal domain which is required for transcription termination. Significantly, the C-terminal region of mTTF-I reveals striking homology to the DNA binding domains of the proto-oncogene c-Myb and the yeast transcription factor Reb1p. Site-directed mutagenesis of one of the tryptophan residues that is conserved in the homology region of c-Myb, Reb1p and mTTF-I abolishes specific DNA binding, a finding which underscores the functional relevance of these residues in DNA-protein interactions. PMID:7720715

  6. Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript

    PubMed Central

    Rose, Dominic; Stadler, Peter F.

    2011-01-01

    Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny and analyze patterns of sequence conservation and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364

  7. Fluconazole Resistance Associated with Drug Efflux and Increased Transcription of a Drug Transporter Gene, PDH1, in Candida glabrata

    PubMed Central

    Miyazaki, Haruko; Miyazaki, Yoshitsugu; Geber, Antonia; Parkinson, Tanya; Hitchcock, Christopher; Falconer, Derek J.; Ward, Douglas J.; Marsden, Katherine; Bennett, John E.

    1998-01-01

    Sequential Candida glabrata isolates were obtained from the mouth of a patient infected with human immunodeficiency virus type 1 who was receiving high doses of fluconazole for oropharyngeal thrush. Fluconazole-susceptible colonies were replaced by resistant colonies that exhibited both increased fluconazole efflux and increased transcripts of a gene which codes for a protein with 72.5% identity to Pdr5p, an ABC multidrug transporter in Saccharomyces cerevisiae. The deduced protein had a molecular mass of 175 kDa and was composed of two homologous halves, each with six putative transmembrane domains and highly conserved sequences of ATP-binding domains. When the earliest and most azole-susceptible isolate of C. glabrata from this patient was exposed to fluconazole, increased transcripts of the PDR5 homolog appeared, linking azole exposure to regulation of this gene. PMID:9661006

  8. Expression of the Caulobacter heat shock gene dnaK is developmentally controlled during growth at normal temperatures.

    PubMed Central

    Gomes, S L; Gober, J W; Shapiro, L

    1990-01-01

    Caulobacter crescentus has a single dnaK gene that is highly homologous to the hsp70 family of heat shock genes. Analysis of the cloned and sequenced dnaK gene has shown that the deduced amino acid sequence could encode a protein of 67.6 kilodaltons that is 68% identical to the DnaK protein of Escherichia coli and 49% identical to the Drosophila and human hsp70 protein family. A partial open reading frame 165 base pairs 3' to the end of dnaK encodes a peptide of 190 amino acids that is 59% identical to DnaJ of E. coli. Northern blot analysis revealed a single 4.0-kilobase mRNA homologous to the cloned fragment. Since the dnaK coding region is 1.89 kilobases, dnaK and dnaJ may be transcribed as a polycistronic message. S1 mapping and primer extension experiments showed that transcription initiated at two sites 5' to the dnaK coding sequence. A single start site of transcription was identified during heat shock at 42 degrees C, and the predicted promoter sequence conformed to the consensus heat shock promoters of E. coli. At normal growth temperature (30 degrees C), a different start site was identified 3' to the heat shock start site that conformed to the E. coli sigma 70 promoter consensus sequence. S1 protection assays and analysis of expression of the dnaK gene fused to the lux transcription reporter gene showed that expression of dnaK is temporally controlled under normal physiological conditions and that transcription occurs just before the initiation of DNA replication. Thus, in both human cells (I. K. L. Milarski and R. I. Morimoto, Proc. Natl. Acad. Sci. USA 83:9517-9521, 1986) and in a simple bacterium, the transcription of a hsp70 gene is temporally controlled as a function of the cell cycle under normal growth conditions. PMID:2345134

  9. The Fragmented Mitochondrial Ribosomal RNAs of Plasmodium falciparum

    PubMed Central

    Feagin, Jean E.; Harrell, Maria Isabel; Lee, Jung C.; Coe, Kevin J.; Sands, Bryan H.; Cannone, Jamie J.; Tami, Germaine; Schnare, Murray N.; Gutell, Robin R.

    2012-01-01

    Background The mitochondrial genome in the human malaria parasite Plasmodium falciparum is most unusual. Over half the genome is composed of the genes for three classic mitochondrial proteins: cytochrome oxidase subunits I and III and apocytochrome b. The remainder encodes numerous small RNAs, ranging in size from 23 to 190 nt. Previous analysis revealed that some of these transcripts have significant sequence identity with highly conserved regions of large and small subunit rRNAs, and can form the expected secondary structures. However, these rRNA fragments are not encoded in linear order; instead, they are intermixed with one another and the protein coding genes, and are coded on both strands of the genome. This unorthodox arrangement hindered the identification of transcripts corresponding to other regions of rRNA that are highly conserved and/or are known to participate directly in protein synthesis. Principal Findings The identification of 14 additional small mitochondrial transcripts from P. falciparum and the assignment of 27 small RNAs (12 SSU RNAs totaling 804 nt, 15 LSU RNAs totaling 1233 nt) to specific regions of rRNA are supported by multiple lines of evidence. The regions now represented are highly similar to those of the small but contiguous mitochondrial rRNAs of Caenorhabditis elegans. The P. falciparum rRNA fragments cluster on the interfaces of the two ribosomal subunits in the three-dimensional structure of the ribosome. Significance All of the rRNA fragments are now presumed to have been identified with experimental methods, and nearly all of these have been mapped onto the SSU and LSU rRNAs. Conversely, all regions of the rRNAs that are known to be directly associated with protein synthesis have been identified in the P. falciparum mitochondrial genome and RNA transcripts. The fragmentation of the rRNA in the P. falciparum mitochondrion is the most extreme example of any rRNA fragmentation discovered. PMID:22761677

  10. Diversity of Antisense and Other Non-Coding RNAs in Archaea Revealed by Comparative Small RNA Sequencing in Four Pyrobaculum Species

    PubMed Central

    Bernick, David L.; Dennis, Patrick P.; Lui, Lauren M.; Lowe, Todd M.

    2012-01-01

    A great diversity of small, non-coding RNA (ncRNA) molecules with roles in gene regulation and RNA processing have been intensely studied in eukaryotic and bacterial model organisms, yet our knowledge of possible parallel roles for small RNAs (sRNA) in archaea is limited. We employed RNA-seq to identify novel sRNA across multiple species of the hyperthermophilic genus Pyrobaculum, known for unusual RNA gene characteristics. By comparing transcriptional data collected in parallel among four species, we were able to identify conserved RNA genes fitting into known and novel families. Among our findings, we highlight three novel cis-antisense sRNAs encoded opposite to key regulatory (ferric uptake regulator), metabolic (triose-phosphate isomerase), and core transcriptional apparatus genes (transcription factor B). We also found a large increase in the number of conserved C/D box sRNA genes over what had been previously recognized; many of these genes are encoded antisense to protein coding genes. The conserved opposition to orthologous genes across the Pyrobaculum genus suggests similarities to other cis-antisense regulatory systems. Furthermore, the genus-specific nature of these sRNAs indicates they are relatively recent, stable adaptations. PMID:22783241

  11. ICAM-1-related long non-coding RNA: promoter analysis and expression in human retinal endothelial cells.

    PubMed

    Lumsden, Amanda L; Ma, Yuefang; Ashander, Liam M; Stempel, Andrew J; Keating, Damien J; Smith, Justine R; Appukuttan, Binoy

    2018-05-09

    Regulation of intercellular adhesion molecule (ICAM)-1 in retinal endothelial cells is a promising druggable target for retinal vascular diseases. The ICAM-1-related (ICR) long non-coding RNA stabilizes ICAM-1 transcript, increasing protein expression. However, studies of ICR involvement in disease have been limited as the promoter is uncharacterized. To address this issue, we undertook a comprehensive in silico analysis of the human ICR gene promoter region. We used genomic evolutionary rate profiling to identify a 115 base pair (bp) sequence within 500 bp upstream of the transcription start site of the annotated human ICR gene that was conserved across 25 eutherian genomes. A second constrained sequence upstream of the orthologous mouse gene (68 bp; conserved across 27 Eutherian genomes including human) was also discovered. Searching these elements identified 33 matrices predictive of binding sites for transcription factors known to be responsive to a broad range of pathological stimuli, including hypoxia, and metabolic and inflammatory proteins. Five phenotype-associated single nucleotide polymorphisms (SNPs) in the immediate vicinity of these elements included four SNPs (i.e. rs2569693, rs281439, rs281440 and rs11575074) predicted to impact binding motifs of transcription factors, and thus the expression of ICR and ICAM-1 genes, with potential to influence disease susceptibility. We verified that human retinal endothelial cells expressed ICR, and observed induction of expression by tumor necrosis factor-α.

  12. Nuclear cereblon modulates transcriptional activity of Ikaros and regulates its downstream target, enkephalin, in human neuroblastoma cells

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wada, Takeyoshi; Asahi, Toru; Research Organization for Nano & Life Innovation, Waseda University #03C309, TWIns, 2-2 Wakamatsu, Shinjuku, Tokyo, 162-8480

    2016-08-26

    The gene coding cereblon (CRBN) was originally identified in genetic linkage analysis of mild autosomal recessive nonsyndromic intellectual disability. CRBN has broad localization in both the cytoplasm and nucleus. However, the significance of nuclear CRBN remains unknown. In the present study, we aimed to elucidate the role of CRBN in the nucleus. First, we generated a series of CRBN deletion mutants and determined the regions responsible for the nuclear localization. Only CRBN protein lacking the N-terminal region was localized outside of the nucleus, suggesting that the N-terminal region is important for its nuclear localization. CRBN was also identified as a thalidomide-binding protein and component of the cullin-4-containing E3 ubiquitin ligase complex. Thalidomide has been reported to be involved in the regulation of the transcription factor Ikaros by CRBN-mediated degradation. To investigate the nuclear functions of CRBN, we performed co-immunoprecipitation experiments and evaluated the binding of CRBN to Ikaros. As a result, we found that CRBN was associated with Ikaros protein, and the N-terminal region of CRBN was required for Ikaros binding. In luciferase reporter gene experiments, CRBN modulated transcriptional activity of Ikaros. Furthermore, we found that CRBN modulated Ikaros-mediated transcriptional repression of the proenkephalin gene by binding to its promoter region. These results suggest that CRBN binds to Ikaros via its N-terminal region and regulates transcriptional activities of Ikaros and its downstream target, enkephalin. Highlights: • We found that CRBN is a nucleocytoplasmic shuttling protein and identified the key domain for nucleocytoplasmic shuttling. • CRBN associates with the transcription factor Ikaros via the N-terminal domain. • CRBN modulates Ikaros-mediated transcriptional regulation and its downstream target, enkephalin.

  13. A Noncoding, Regulatory Mutation Implicates HCFC1 in Nonsyndromic Intellectual Disability

    PubMed Central

    Huang, Lingli; Jolly, Lachlan A.; Willis-Owen, Saffron; Gardner, Alison; Kumar, Raman; Douglas, Evelyn; Shoubridge, Cheryl; Wieczorek, Dagmar; Tzschach, Andreas; Cohen, Monika; Hackett, Anna; Field, Michael; Froyen, Guy; Hu, Hao; Haas, Stefan A.; Ropers, Hans-Hilger; Kalscheuer, Vera M.; Corbett, Mark A.; Gecz, Jozef

    2012-01-01

    The discovery of mutations causing human disease has so far been biased toward protein-coding regions. Having excluded all annotated coding regions, we performed targeted massively parallel resequencing of the nonrepetitive genomic linkage interval at Xq28 of family MRX3. We identified in the binding site of transcription factor YY1 a regulatory mutation that leads to overexpression of the chromatin-associated transcriptional regulator HCFC1. When tested on embryonic murine neural stem cells and embryonic hippocampal neurons, HCFC1 overexpression led to a significant increase of the production of astrocytes and a considerable reduction in neurite growth. Two other nonsynonymous, potentially deleterious changes have been identified by X-exome sequencing in individuals with intellectual disability, implicating HCFC1 in normal brain function. PMID:23000143

  14. Genome-Wide Spectra of Transcription Insertions and Deletions Reveal That Slippage Depends on RNA:DNA Hybrid Complementarity.

    PubMed

    Traverse, Charles C; Ochman, Howard

    2017-08-29

    Advances in sequencing technologies have enabled direct quantification of genome-wide errors that occur during RNA transcription. These errors occur at rates that are orders of magnitude higher than rates during DNA replication, but due to technical difficulties such measurements have been limited to single-base substitutions and have not yet quantified the scope of transcription insertions and deletions. Previous reporter gene assay findings suggested that transcription indels are produced exclusively by elongation complex slippage at homopolymeric runs, so we enumerated indels across the protein-coding transcriptomes of Escherichia coli and Buchnera aphidicola, which differ widely in their genomic base compositions and incidence of repeat regions. As anticipated from prior assays, transcription insertions prevailed in homopolymeric runs of A and T; however, transcription deletions arose in much more complex sequences and were rarely associated with homopolymeric runs. By reconstructing the relocated positions of the elongation complex as inferred from the sequences inserted or deleted during transcription, we show that continuation of transcription after slippage hinges on the degree of nucleotide complementarity within the RNA:DNA hybrid at the new DNA template location. IMPORTANCE The high level of mistakes generated during transcription can result in the accumulation of malfunctioning and misfolded proteins which can alter global gene regulation and in the expenditure of energy to degrade these nonfunctional proteins. The transcriptome-wide occurrence of base substitutions has been elucidated in bacteria, but information on transcription insertions and deletions (errors that potentially have more dire effects on protein function) is limited to reporter gene constructs. Here, we capture the transcriptome-wide spectrum of insertions and deletions in Escherichia coli and Buchnera aphidicola and show that they occur at rates approaching those of base substitutions. Knowledge of the full extent of sequences subject to transcription indels supports a new model of bacterial transcription slippage, one that relies on the number of complementary bases between the transcript and the DNA template to which it slipped. Copyright © 2017 Traverse and Ochman.
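
    The abstract turns on whether indels coincide with homopolymeric runs. As a minimal sketch of how such runs might be located in a coding sequence, the snippet below scans a DNA string with a regular expression. The function name, the `min_len` threshold of 4 and the example sequence are all assumptions made for illustration; this is not the authors' pipeline.

```python
import re
from typing import List, Tuple


def homopolymer_runs(seq: str, min_len: int = 4) -> List[Tuple[int, int, str]]:
    """Return (start, length, base) for every run of a single base
    that is at least min_len long. Positions are 0-based."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), len(m.group(0)), m.group(1))
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical sequence, chosen only to show the output format.
example = "GCAAAAATCGTTTTGCC"
for start, length, base in homopolymer_runs(example):
    print(f"{base} x {length} at position {start}")
```

    Positions of observed insertions or deletions could then be compared against these intervals to ask how often each class of error falls inside a run.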

  15. Replication of poliovirus RNA and subgenomic RNA transcripts in transfected cells.

    PubMed Central

    Collis, P S; O'Donnell, B J; Barton, D J; Rogers, J A; Flanegan, J B

    1992-01-01

    Full-length and subgenomic poliovirus RNAs were transcribed in vitro and transfected into HeLa cells to study viral RNA replication in vivo. RNAs with deletion mutations were analyzed for the ability to replicate in either the absence or the presence of helper RNA by using a cotransfection procedure and Northern (RNA) blot analysis. An advantage of this approach was that viral RNA replication and genetic complementation could be characterized without first isolating conditional-lethal mutants. A subgenomic RNA with a large in-frame deletion in the capsid coding region (P1) replicated more efficiently than full-length viral RNA transcripts. In cotransfection experiments, both the full-length and subgenomic RNAs replicated at slightly reduced levels and appeared to interfere with each other's replication. In contrast, a subgenomic RNA with a similarly sized out-of-frame deletion in P1 did not replicate in transfected cells, either alone or in the presence of helper RNA. Similar results were observed with an RNA transcript containing a large in-frame deletion spanning the P1, P2, and P3 coding regions. A mutant RNA with an in-frame deletion in the P1-2A coding sequence was self-replicating but at a significantly reduced level. The replication of this RNA was fully complemented after cotransfection with a helper RNA that provided 2A in trans. A P1-2A-2B in-frame deletion, however, totally blocked RNA replication and was not complemented. Control experiments showed that all of the expected viral proteins were both synthesized and processed when the RNA transcripts were translated in vitro. Thus, our results indicated that 2A was a trans-acting protein and that 2B and perhaps other viral proteins were cis acting during poliovirus RNA replication in vivo. Our data support a model for poliovirus RNA replication which directly links the translation of a molecule of plus-strand RNA with the formation of a replication complex for minus-strand RNA synthesis. 
    PMID:1328676

  16. Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations

    PubMed Central

    Araya, Carlos L.; Cenik, Can; Reuter, Jason A.; Kiss, Gert; Pande, Vijay S.; Snyder, Michael P.; Greenleaf, William J.

    2015-01-01

    Cancer sequencing studies have primarily identified cancer-driver genes by the accumulation of protein-altering mutations. An improved method would be annotation-independent, sensitive to unknown distributions of functions within proteins, and inclusive of non-coding drivers. We employed density-based clustering methods in 21 tumor types to detect variably-sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and non-coding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs reveal spatial clustering of mutations at molecular domains and interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated among tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally-agnostic driver identification. PMID:26691984
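
    The idea of finding variably sized mutated regions without relying on gene annotation can be illustrated with a much-reduced stand-in for density-based clustering: group mutation coordinates whose neighbors lie within a fixed gap, and keep only groups with enough mutations. The function name, the `max_gap` and `min_mutations` parameters and the coordinates below are hypothetical, and this simple gap rule is not the method used in the study.

```python
from typing import Iterable, List, Tuple


def dense_regions(
    positions: Iterable[int], max_gap: int = 10, min_mutations: int = 3
) -> List[Tuple[int, int, int]]:
    """Group sorted coordinates into runs whose consecutive members are
    at most max_gap apart; return (start, end, count) for runs that
    contain at least min_mutations positions."""
    regions: List[Tuple[int, int, int]] = []
    run: List[int] = []
    for pos in sorted(positions):
        # A gap larger than max_gap closes the current run.
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_mutations:
                regions.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    # Flush the final run.
    if len(run) >= min_mutations:
        regions.append((run[0], run[-1], len(run)))
    return regions


# Hypothetical mutation coordinates along one chromosome.
mutations = [100, 103, 105, 400, 900, 904, 909, 915]
print(dense_regions(mutations))
```

    Because the region boundaries come from the data rather than from annotated gene models, the same procedure applies equally to coding and non-coding sequence, which is the property the abstract emphasizes.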

  17. Maternal transcription of non-protein coding RNAs from the PWS-critical region rescues growth retardation in mice

    PubMed Central

    Rozhdestvensky, Timofey S.; Robeck, Thomas; Galiveti, Chenna R.; Raabe, Carsten A.; Seeger, Birte; Wolters, Anna; Gubar, Leonid V.; Brosius, Jürgen; Skryabin, Boris V.

    2016-01-01

    Prader-Willi syndrome (PWS) is a neurogenetic disorder caused by loss of paternally expressed genes on chromosome 15q11-q13. The PWS-critical region (PWScr) contains an array of non-protein coding IPW-A exons hosting intronic SNORD116 snoRNA genes. Deletion of PWScr is associated with PWS in humans and growth retardation in mice exhibiting ~15% postnatal lethality in C57BL/6 background. Here we analysed a knock-in mouse containing a 5′HPRT-LoxP-NeoR cassette (5′LoxP) inserted upstream of the PWScr. When the insertion was inherited maternally in a paternal PWScr-deletion mouse model (PWScrp−/m5′LoxP), we observed compensation of growth retardation and postnatal lethality. Genomic methylation pattern and expression of protein-coding genes remained unaltered at the PWS-locus of PWScrp−/m5′LoxP mice. Interestingly, ubiquitous Snord116 and IPW-A exon transcription from the originally silent maternal chromosome was detected. In situ hybridization indicated that PWScrp−/m5′LoxP mice expressed Snord116 in brain areas similar to wild type animals. Our results suggest that the lack of PWScr RNA expression in certain brain areas could be a primary cause of the growth retardation phenotype in mice. We propose that activation of disease-associated genes on imprinted regions could lead to general therapeutic strategies in associated diseases. PMID:26848093

  18. Non-coding RNAs—Novel targets in neurotoxicity

    PubMed Central

    Tal, Tamara L.; Tanguay, Robert L.

    2012-01-01

    Over the past ten years non-coding RNAs (ncRNAs) have emerged as pivotal players in fundamental physiological and cellular processes and have been increasingly implicated in cancer, immune disorders, and cardiovascular, neurodegenerative, and metabolic diseases. MicroRNAs (miRNAs) represent a class of ncRNA molecules that function as negative regulators of post-transcriptional gene expression. miRNAs are predicted to regulate 60% of all human protein-coding genes and as such, play key roles in cellular and developmental processes, human health, and disease. Relative to counterparts that lack binding sites for miRNAs, genes encoding proteins that are post-transcriptionally regulated by miRNAs are twice as likely to be sensitive to environmental chemical exposure. Not surprisingly, miRNAs have been recognized as targets or effectors of nervous system, developmental, hepatic, and carcinogenic toxicants, and have been identified as putative regulators of phase I xenobiotic-metabolizing enzymes. In this review, we give an overview of the types of ncRNAs and highlight their roles in neurodevelopment, neurological disease, activity-dependent signaling, and drug metabolism. We then delve into specific examples that illustrate their importance as mediators, effectors, or adaptive agents of neurotoxicants or neuroactive pharmaceutical compounds. Finally, we identify a number of outstanding questions regarding ncRNAs and neurotoxicity. PMID:22394481

  19. Tenebrio molitor antifreeze protein gene identification and regulation.

    PubMed

    Qin, Wensheng; Walker, Virginia K

    2006-02-15

    The yellow mealworm, Tenebrio molitor, is a freeze susceptible, stored product pest. Its winter survival is facilitated by the accumulation of antifreeze proteins (AFPs), encoded by a small gene family. We have now isolated 11 different AFP genomic clones from 3 genomic libraries. All the clones had a single coding sequence, with no evidence of intervening sequences. Three genomic clones were further characterized. All have putative TATA box sequences upstream of the coding regions and multiple potential poly(A) signal sequences downstream of the coding regions. A TmAFP regulatory region, B1037, conferred transcriptional activity when ligated to a luciferase reporter sequence and after transfection into an insect cell line. A 143 bp core promoter including a TATA box sequence was identified. Its promoter activity was increased 4.4 times by inserting an exotic 245 bp intron into the construct, similar to the enhancement of transgenic expression seen in several other systems. The addition of a duplication of the first 120 bp sequence from the 143 bp core promoter decreased promoter activity by half. Although putative hormonal response sequences were identified, none of the five hormones tested enhanced reporter activity. These studies on the mechanisms of AFP transcriptional control are important for the consideration of any transfer of freeze-resistance phenotypes to beneficial hosts.

  20. Regulatory variation: an emerging vantage point for cancer biology.

    PubMed

    Li, Luolan; Lorzadeh, Alireza; Hirst, Martin

    2014-01-01

    Transcriptional regulation involves complex and interdependent interactions of noncoding and coding regions of the genome with proteins that interact and modify them. Genetic variation/mutation in coding and noncoding regions of the genome can drive aberrant transcription and disease. In spite of accounting for nearly 98% of the genome comparatively little is known about the contribution of noncoding DNA elements to disease. Genome-wide association studies of complex human diseases including cancer have revealed enrichment for variants in the noncoding genome. A striking finding of recent cancer genome re-sequencing efforts has been the previously underappreciated frequency of mutations in epigenetic modifiers across a wide range of cancer types. Taken together these results point to the importance of dysregulation in transcriptional regulatory control in genesis of cancer. Powered by recent technological advancements in functional genomic profiling, exploration of normal and transformed regulatory networks will provide novel insight into the initiation and progression of cancer and open new windows to future prognostic and diagnostic tools. © 2013 Wiley Periodicals, Inc.

  1. Perspectives on the mechanism of transcriptional regulation by long non-coding RNAs.

    PubMed

    Roberts, Thomas C; Morris, Kevin V; Weinberg, Marc S

    2014-01-01

    Long non-coding RNAs (lncRNAs) are increasingly being recognized as epigenetic regulators of gene transcription. The diversity and complexity of lncRNA genes means that they exert their regulatory effects by a variety of mechanisms. Although there is still much to be learned about the mechanism of lncRNA function, general principles are starting to emerge. In particular, the application of high throughput (deep) sequencing methodologies has greatly advanced our understanding of lncRNA gene function. lncRNAs function as adaptors that link specific chromatin loci with chromatin-remodeling complexes and transcription factors. lncRNAs can act in cis or trans to guide epigenetic-modifier complexes to distinct genomic sites, or act as scaffolds which recruit multiple proteins simultaneously, thereby coordinating their activities. In this review we discuss the genomic organization of lncRNAs, the importance of RNA secondary structure to lncRNA functionality, the multitude of ways in which they interact with the genome, and what evolutionary conservation tells us about their function.

  2. Long Noncoding RNAs: New Players in the Osteogenic Differentiation of Bone Marrow- and Adipose-Derived Mesenchymal Stem Cells.

    PubMed

    Yang, Qiaolin; Jia, Lingfei; Li, Xiaobei; Guo, Runzhi; Huang, Yiping; Zheng, Yunfei; Li, Weiran

    2018-06-01

    Mesenchymal stem cells (MSCs) are an important population of multipotent stem cells that differentiate into multiple lineages and display great potential in bone regeneration and repair. Although the role of protein-coding genes in the osteogenic differentiation of MSCs has been extensively studied, the functions of noncoding RNAs in the osteogenic differentiation of MSCs are unclear. The recent application of next-generation sequencing to MSC transcriptomes has revealed that long noncoding RNAs (lncRNAs) are associated with the osteogenic differentiation of MSCs. LncRNAs are a class of non-coding transcripts of more than 200 nucleotides in length. Noncoding RNAs are thought to play a key role in osteoblast differentiation through various regulatory mechanisms including chromatin modification, transcription factor binding, competent endogenous mechanism, and other post-transcriptional mechanisms. Here, we review the roles of lncRNAs in the osteogenic differentiation of bone marrow- and adipose-derived stem cells and provide a theoretical foundation for future research.

  3. Advances in esophageal cancer: A new perspective on pathogenesis associated with long non-coding RNAs.

    PubMed

    Huang, Xiaomei; Zhou, Xi; Hu, Qing; Sun, Binyu; Deng, Mingming; Qi, Xiaolong; Lü, Muhan

    2018-01-28

    Esophageal cancer is a malignant digestive tract cancer with high mortality. Although studies have found that esophageal cancer is involved in a complex and important gene regulation network, the pathogenesis remains unclear. The recently described long non-coding RNAs (lncRNAs) are one effective part of the gene regulation network. However, in past decades, lncRNAs were thought to be "transcript noise" or "pseudogenes" and were thus ignored. Early studies indicated that lncRNAs play pivotal roles during evolution. However, in recent years, increasing research has revealed that many lncRNAs are associated with tumorigenesis. In particular, lncRNAs may act as important elements for epigenetic regulation, transcription, post-transcriptional regulation and post-translational modification of proteins. Additionally, they may be novel biomarkers for tumors and therapeutic targets in cancer. Here, we summarize the functions of lncRNAs in esophageal cancer, with an emphasis on lncRNA-mediated regulatory mechanisms that affect the biological characteristics of esophageal cancer. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.

    PubMed Central

    Ohno, S; Epplen, J T

    1983-01-01

    Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
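
    The central claim, that a tetranucleotide repeat read in any of its three frames gives a polypeptide with tetrapeptide periodicity, follows from the repeat length (4) and the codon length (3) sharing no common factor, so the reading frame realigns with the repeat every 12 bases, or 4 codons. The sketch below illustrates this. It assumes the vertebrate mitochondrial code (NCBI translation table 2, with four stop triplets) as a stand-in for the "primitive mitochondria-like code", and the repeat unit TGTC is a hypothetical choice, not the mouse sequence reported in the paper.

```python
from itertools import product

BASES = "TCAG"
# Vertebrate mitochondrial code (NCBI translation table 2), assumed here as
# a stand-in for the abstract's "primitive mitochondria-like code".
# Stops ("*") are TAA, TAG, AGA and AGG.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate every complete codon from the given frame offset,
    reading straight through stop codons (shown as '*')."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


REPEAT_UNIT = "TGTC"            # hypothetical unit, for illustration only
transcript = REPEAT_UNIT * 60   # 240 bases

peptides = [translate(transcript, frame) for frame in range(3)]
for frame, peptide in enumerate(peptides):
    print(frame, len(peptide), peptide[:12], "stops:", peptide.count("*"))
```

    For this particular unit none of the three frames contains a stop triplet, so each frame reads through the whole repeat array; other repeat units can place a stop in every frame, which is why the choice of unit matters for the argument.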

  5. Identification of the Operon for the Sorbitol (Glucitol) Phosphoenolpyruvate:Sugar Phosphotransferase System in Streptococcus mutans

    PubMed Central

    Boyd, David A.; Thevenot, Tracy; Gumbmann, Markus; Honeyman, Allen L.; Hamilton, Ian R.

    2000-01-01

    Transposon mutagenesis and marker rescue were used to isolate and identify an 8.5-kb contiguous region containing six open reading frames constituting the operon for the sorbitol P-enolpyruvate phosphotransferase transport system (PTS) of Streptococcus mutans LT11. The first gene, srlD, codes for sorbitol-6-phosphate dehydrogenase, followed downstream by srlR, coding for a transcriptional regulator; srlM, coding for a putative activator; and the srlA, srlE, and srlB genes, coding for the EIIC, EIIBC, and EIIA components of the sorbitol PTS, respectively. Among all sorbitol PTS operons characterized to date, the srlD gene is found after the genes coding for the EII components; thus, the location of the gene in S. mutans is unique. The SrlR protein is similar to several transcriptional regulators found in Bacillus spp. that contain PTS regulator domains (J. Stülke, M. Arnaud, G. Rapoport, and I. Martin-Verstraete, Mol. Microbiol. 28:865–874, 1998), and its gene overlaps the srlM gene by 1 bp. The arrangement of these two regulatory genes is unique, having not been reported for other bacteria. PMID:10639465

  6. Long Non-Coding RNAs: A Novel Paradigm for Toxicology

    PubMed Central

    Dempsey, Joseph L.; Cui, Julia Yue

    2017-01-01

    Long non-coding RNAs (lncRNAs) are over 200 nucleotides in length and are transcribed from the mammalian genome in a tissue-specific and developmentally regulated pattern. There is growing recognition that lncRNAs are novel biomarkers and/or key regulators of toxicological responses in humans and animal models. Lacking protein-coding capacity, the numerous types of lncRNAs possess a myriad of transcriptional regulatory functions that include cis and trans gene expression, transcription factor activity, chromatin remodeling, imprinting, and enhancer up-regulation. LncRNAs also influence mRNA processing, post-transcriptional regulation, and protein trafficking. Dysregulation of lncRNAs has been implicated in various human health outcomes such as various cancers, Alzheimer’s disease, cardiovascular disease, autoimmune diseases, as well as intermediary metabolism such as glucose, lipid, and bile acid homeostasis. Interestingly, emerging evidence in the literature over the past five years has shown that lncRNA regulation is impacted by exposures to various chemicals such as polycyclic aromatic hydrocarbons, benzene, cadmium, chlorpyrifos-methyl, bisphenol A, phthalates, phenols, and bile acids. Recent technological advancements, including next-generation sequencing technologies and novel computational algorithms, have enabled the profiling and functional characterizations of lncRNAs on a genomic scale. In this review, we summarize the biogenesis and general biological functions of lncRNAs, highlight the important roles of lncRNAs in human diseases and especially during the toxicological responses to various xenobiotics, evaluate current methods for identifying aberrant lncRNA expression and molecular target interactions, and discuss the potential to implement these tools to address fundamental questions in toxicology. PMID:27864543

  7. The Mediator complex and transcription regulation

    PubMed Central

    Poss, Zachary C.; Ebmeier, Christopher C.

    2013-01-01

    The Mediator complex is a multi-subunit assembly that appears to be required for regulating expression of most RNA polymerase II (pol II) transcripts, which include protein-coding and most non-coding RNA genes. Mediator and pol II function within the pre-initiation complex (PIC), which consists of Mediator, pol II, TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH and is approximately 4.0 MDa in size. Mediator serves as a central scaffold within the PIC and helps regulate pol II activity in ways that remain poorly understood. Mediator is also generally targeted by sequence-specific, DNA-binding transcription factors (TFs) that work to control gene expression programs in response to developmental or environmental cues. At a basic level, Mediator functions by relaying signals from TFs directly to the pol II enzyme, thereby facilitating TF-dependent regulation of gene expression. Thus, Mediator is essential for converting biological inputs (communicated by TFs) to physiological responses (via changes in gene expression). In this review, we summarize an expansive body of research on the Mediator complex, with an emphasis on yeast and mammalian complexes. We focus on the basics that underlie Mediator function, such as its structure and subunit composition, and describe its broad regulatory influence on gene expression, ranging from chromatin architecture to transcription initiation and elongation, to mRNA processing. We also describe factors that influence Mediator structure and activity, including TFs, non-coding RNAs and the CDK8 module. PMID:24088064

  8. Relationships between Translation and Transcription Processes during fMRI Connectivity Scanning and Coded Translation and Transcription in Writing Products after Scanning in Children with and without Transcription Disabilities

    PubMed Central

    Wallis, Peter; Richards, Todd; Boord, Peter; Abbott, Robert; Berninger, Virginia

    2018-01-01

    Students with transcription disabilities (dysgraphia/impaired handwriting, n = 13 or dyslexia/impaired word spelling, n = 16) or without transcription disabilities (controls) completed transcription and translation (idea generating, planning, and creating) writing tasks during fMRI connectivity scanning and compositions after scanning, which were coded for transcription and translation variables. Compositions in both groups showed diversity in genre beyond usual narrative-expository distinction; groups differed in coded transcription but not translation variables. For the control group specific transcription or translation tasks during scanning correlated with corresponding coded transcription or translation skills in composition, but connectivity during scanning was not correlated with coded handwriting during composing in dysgraphia group and connectivity during translating was not correlated with any coded variable during composing in dyslexia group. Results are discussed in reference to the trend in neuroscience to use connectivity from relevant seed points while performing tasks and trends in education to recognize the generativity (creativity) of composing at both the genre and syntax levels. PMID:29600113

  9. A human haploid gene trap collection to study lncRNAs with unusual RNA biology.

    PubMed

    Kornienko, Aleksandra E; Vlatkovic, Irena; Neesen, Jürgen; Barlow, Denise P; Pauler, Florian M

    2016-01-01

    Many thousand long non-coding (lnc) RNAs are mapped in the human genome. Time consuming studies using reverse genetic approaches by post-transcriptional knock-down or genetic modification of the locus demonstrated diverse biological functions for a few of these transcripts. The Human Gene Trap Mutant Collection in haploid KBM7 cells is a ready-to-use tool for studying protein-coding gene function. As lncRNAs show remarkable differences in RNA biology compared to protein-coding genes, it is unclear if this gene trap collection is useful for functional analysis of lncRNAs. Here we use the uncharacterized LOC100288798 lncRNA as a model to answer this question. Using public RNA-seq data we show that LOC100288798 is ubiquitously expressed, but inefficiently spliced. The minor spliced LOC100288798 isoforms are exported to the cytoplasm, whereas the major unspliced isoform is nuclear localized. This shows that LOC100288798 RNA biology differs markedly from typical mRNAs. De novo assembly from RNA-seq data suggests that LOC100288798 extends 289kb beyond its annotated 3' end and overlaps the downstream SLC38A4 gene. Three cell lines with independent gene trap insertions in LOC100288798 were available from the KBM7 gene trap collection. RT-qPCR and RNA-seq confirmed successful lncRNA truncation and its extended length. Expression analysis from RNA-seq data shows significant deregulation of 41 protein-coding genes upon LOC100288798 truncation. Our data shows that gene trap collections in human haploid cell lines are useful tools to study lncRNAs, and identifies the previously uncharacterized LOC100288798 as a potential gene regulator.

  10. The role of alternative Polyadenylation in regulation of rhythmic gene expression.

    PubMed

    Ptitsyna, Natalia; Boughorbel, Sabri; El Anbari, Mohammed; Ptitsyn, Andrey

    2017-08-04

    Alternative transcription is common in eukaryotic cells and plays an important role in regulation of cellular processes. Alternative polyadenylation results from ambiguous PolyA signals in the 3' untranslated region (UTR) of a gene. Such alternative transcripts share the same coding part, but differ by a stretch of UTR that may contain important functional sites. The methodology of this study is based on mathematical modeling, analytical solution, and subsequent validation by data mining in multiple independent experimental data from previously published studies. In this study we propose a mathematical model that describes the population dynamics of alternatively polyadenylated transcripts in conjunction with rhythmic expression such as transcription oscillation driven by circadian or metabolic oscillators. Analysis of the model shows that alternative transcripts with different turnover rates acquire a phase shift if the transcript decay rate is different. Difference in decay rate is one of the consequences of alternative polyadenylation. Phase shift can reach values equal to half the period of oscillation, which makes alternative transcripts oscillate in abundance in counter-phase to each other. Since counter-phased transcripts share the coding part, the rate of translation becomes constant. We have analyzed a few data sets collected in circadian timeline for the occurrence of transcript behavior that fits the mathematical model. Alternative transcripts with different turnover rate create the effect of a rectifier. This "molecular diode" moderates or completely eliminates oscillation of individual transcripts and stabilizes overall protein production rate. In our observation this phenomenon is very common in different tissues in plants, mice, and humans. The occurrence of counter-phased alternative transcripts is also tissue-specific and affects functions of multiple biological pathways. Accounting for this mechanism is important for understanding natural cellular circuits and engineering synthetic ones.

  11. Biosynthesis and expression of ependymin homologous sequences in zebrafish brain.

    PubMed

    Sterrer, S; Königstorfer, A; Hoffmann, W

    1990-01-01

    Ependymins are unique, brain specific glycoproteins, which are major constituents of the cerebrospinal fluid. Originally, they were discovered in goldfish and are thought to be involved in synaptic plasticity. In the present study two transcripts were characterized in Brachydanio rerio originating from a single gene possibly by alternative splicing. These transcripts differ only in the length of their 3'-non-coding-regions and the encoded protein shares 90 and 88% homology with the two corresponding goldfish proteins, respectively. In situ hybridization revealed the expression of ependymins exclusively in the leptomeninx including its invaginations but not at all in the ependymal layer surrounding the ventricles. An initial developmental profile showed that ependymins first appear before hatching, i.e. between 48 and 72 h postfertilization.

  12. The Nrf2-antioxidant response element pathway: a target for regulating energy metabolism

    USDA-ARS's Scientific Manuscript database

    The nuclear factor E2-related factor 2 (Nrf2) is a transcription factor that responds to oxidative stress by binding to the antioxidant response element (ARE) in the promoter of genes coding for antioxidant enzymes like NAD(P)H:quinone oxidoreductase 1 (NQO1) and proteins for glutathione synthesis. ...

  13. Evidence for Phex haploinsufficiency in murine X-linked hypophosphatemia.

    PubMed

    Wang, L; Du, L; Ecarot, B

    1999-04-01

    Mutations in the PHEX gene (phosphate-regulating gene with homology to endopeptidases on the X-chromosome) are responsible for X-linked hypophosphatemia (HYP). We previously reported the full-length coding sequence of murine Phex cDNA and provided evidence of Phex expression in bone and tooth. Here, we report the cloning of the entire 3.5-kb 3'UTR of the Phex gene, yielding a total of 6248 bp for the Phex transcript. Southern blot and RT-PCR analyses revealed that the 3' end of the coding sequence and the 3'UTR of the Phex gene, spanning exons 16 to 22, are deleted in Hyp, the mouse model for HYP. Northern blot analysis of bone revealed lack of expression of stable Phex mRNA from the mutant allele and expression of Phex transcripts from the wild-type allele in Hyp heterozygous females. Expression of the Phex protein in heterozygotes was confirmed by Western analysis with antibodies raised against a COOH-terminal peptide of the mouse Phex protein. Taken together, these results indicate that the dominant pattern of Hyp inheritance in mice is due to Phex haploinsufficiency.

  14. Control of Fur synthesis by the non-coding RNA RyhB and iron-responsive decoding.

    PubMed

    Vecerek, Branislav; Moll, Isabella; Bläsi, Udo

    2007-02-21

    The Fe2+-dependent Fur protein serves as a negative regulator of iron uptake in bacteria. As only metallo-Fur acts as an autogeneous repressor, Fe2+scarcity would direct fur expression when continued supply is not obviously required. We show that in Escherichia coli post-transcriptional regulatory mechanisms ensure that Fur synthesis remains steady in iron limitation. Our studies revealed that fur translation is coupled to that of an upstream open reading frame (uof), translation of which is downregulated by the non-coding RNA (ncRNA) RyhB. As RyhB transcription is negatively controlled by metallo-Fur, iron depletion creates a negative feedback loop. RyhB-mediated regulation of uof-fur provides the first example for indirect translational regulation by a trans-encoded ncRNA. In addition, we present evidence for an iron-responsive decoding mechanism of the uof-fur entity. It could serve as a backup mechanism of the RyhB circuitry, and represents the first link between iron availability and synthesis of an iron-containing protein.

  15. Deciphering Mineral Homeostasis in Barley Seed Transfer Cells at Transcriptional Level

    PubMed Central

    Borg, Søren

    2015-01-01

    In addition to the micronutrient inadequacy of staple crops for optimal human nutrition, a global downtrend in crop-quality has emerged from intensive breeding for yield. This trend will be aggravated by elevated levels of the greenhouse gas carbon dioxide. Therefore, crop biofortification is inevitable to ensure a sustainable supply of minerals to the large part of the human population that is dietarily dependent on staple crops. This requires a thorough understanding of plant-mineral interactions due to the complexity of mineral homeostasis. Employing RNA sequencing, we here communicate transfer cell specific effects of excess iron and zinc during grain filling in our model crop plant barley. Responding to alterations in mineral contents, we found a long range of different genes and transcripts. Among them, it is worth highlighting the auxin and ethylene signaling factors Arfs, Abcbs, Cand1, Hps4, Hac1, Ecr1, and Ctr1, diurnal fluctuation components Sdg2, Imb1, Lip1, and PhyC, retroelements, sulfur homeostasis components Amp1, Hmt3, Eil3, and Vip1, mineral trafficking components Med16, Cnnm4, Aha2, Clpc1, and Pcbps, and vacuole organization factors Ymr155W, RabG3F, Vps4, and Cbl3. Our analysis introduces new interactors and signifies a broad spectrum of regulatory levels from chromatin remodeling to intracellular protein sorting mechanisms active in the plant mineral homeostasis. The results highlight the importance of storage proteins in metal ion toxicity-resistance and chelation. Interestingly, the protein sorting and recycling factors Exoc7, Cdc1, Sec23A, and Rab11A contributed to the response as well as the polar distributors of metal-transporters ensuring the directional flow of minerals. Alternative isoform switching was found important for plant adaptation and occurred among transcripts coding for identical proteins as well as transcripts coding for protein isoforms. We also identified differences in the alternative-isoform preference between the treatments, indicating metal-affinity shifts among isoforms of metal transporters. Most important, we found the zinc treatment to impair both photosynthesis and respiration. A wide range of transcriptional changes including stress-related genes and negative feedback loops emphasize the importance of keeping mineral contents below certain cellular levels, which otherwise might lead to agronomically impeding side-effects. By illustrating new mechanisms, genes, and transcripts, this report provides a solid platform towards understanding the complex network of plant mineral homeostasis. PMID:26536247

  16. Genomic localization of the human gene encoding Dr1, a negative modulator of transcription of class II and class III genes.

    PubMed

    Purrello, M; Di Pietro, C; Rapisarda, A; Viola, A; Corsaro, C; Motta, S; Grzeschik, K H; Sichel, G

    1996-01-01

    Dr1 is a nuclear protein of 19 kDa that exists in the nucleoplasm as a homotetramer. By binding to TBP (the DNA-binding subunit of TFIID, and also a subunit of SL1 and TFIIIB), the protein blocks class II and class III preinitiation complex assembly, thus repressing the activity of the corresponding promoters. Since transcription of class I genes is unaffected by Dr1, it has been proposed that the protein may coordinate the expression of class I, class II and class III genes. By somatic cell genetics and fluorescence in situ hybridization, we have localized the gene (DR1), present in the genome of higher eukaryotes as a single copy, to human chromosome region 1p21-->p13. The nucleotide sequence conservation of the coding segment of the gene, as determined by Noah's ark blot analysis, and its ubiquitous transcription suggest that Dr1 has an important biological role, which could be related to the negative control of cell proliferation.

  17. SMN control of RNP assembly: from post-transcriptional gene regulation to motor neuron disease

    PubMed Central

    Li, Darrick K.; Tisdale, Sarah; Lotti, Francesco; Pellizzoni, Livio

    2014-01-01

    At the post-transcriptional level, expression of protein-coding genes is controlled by a series of RNA regulatory events including nuclear processing of primary transcripts, transport of mature mRNAs to specific cellular compartments, translation and ultimately, turnover. These processes are orchestrated through the dynamic association of mRNAs with RNA binding proteins and ribonucleoprotein (RNP) complexes. Accurate formation of RNPs in vivo is fundamentally important to cellular development and function, and its impairment often leads to human disease. The survival motor neuron (SMN) protein is key to this biological paradigm: SMN is essential for the biogenesis of various RNPs that function in mRNA processing, and genetic mutations leading to SMN deficiency cause the neurodegenerative disease spinal muscular atrophy. Here we review the expanding role of SMN in the regulation of gene expression through its multiple functions in RNP assembly. We discuss advances in our understanding of SMN activity as a chaperone of RNPs and how disruption of SMN-dependent RNA pathways can cause motor neuron disease. PMID:24769255

  18. Human-specific protein isoforms produced by novel splice sites in the human genome after the human-chimpanzee divergence

    PubMed Central

    2012-01-01

    Background Evolution of splice sites is a well-known phenomenon that results in transcript diversity during human evolution. Many novel splice sites are derived from repetitive elements and may not contribute to protein products. Here, we analyzed annotated human protein-coding exons and identified human-specific splice sites that arose after the human-chimpanzee divergence. Results We analyzed multiple alignments of the annotated human protein-coding exons and their respective orthologous mammalian genome sequences to identify 85 novel splice sites (50 splice acceptors and 35 donors) in the human genome. The novel protein-coding exons, which are expressed either constitutively or alternatively, produce novel protein isoforms by insertion, deletion, or frameshift. We found three cases in which the human-specific isoform conferred novel molecular function in the human cells: the human-specific IMUP protein isoform induces apoptosis of the trophoblast and is implicated in pre-eclampsia; the intronization of a part of SMOX gene exon produces inactive spermine oxidase; the human-specific NUB1 isoform shows reduced interaction with ubiquitin-like proteins, possibly affecting ubiquitin pathways. Conclusions Although the generation of novel protein isoforms does not equate to adaptive evolution, we propose that these cases are useful candidates for a molecular functional study to identify proteomic changes that might bring about novel phenotypes during human evolution. PMID:23148531

  19. Dissecting non-coding RNA mechanisms in cellulo by single-molecule high-resolution localization and counting

    PubMed Central

    Pitchiaya, Sethuramasundaram; Krishnan, Vishalakshi; Custer, Thomas C.; Walter, Nils G.

    2013-01-01

    Non-coding RNAs (ncRNAs) recently were discovered to outnumber their protein-coding counterparts, yet their diverse functions are still poorly understood. Here we report on a method for the intracellular Single-molecule High Resolution Localization and Counting (iSHiRLoC) of microRNAs (miRNAs), a conserved, ubiquitous class of regulatory ncRNAs that controls the expression of over 60% of all mammalian protein coding genes post-transcriptionally, by a mechanism shrouded by seemingly contradictory observations. We present protocols to execute single particle tracking (SPT) and single-molecule counting of functional microinjected, fluorophore-labeled miRNAs and thereby extract diffusion coefficients and molecular stoichiometries of micro-ribonucleoprotein (miRNP) complexes from living and fixed cells, respectively. This probing of miRNAs at the single molecule level sheds new light on the intracellular assembly/disassembly of miRNPs, thus beginning to unravel the dynamic nature of this important gene regulatory pathway and facilitating the development of a parsimonious model for their obscured mechanism of action. PMID:23820309

  20. Identification of Putative Olfactory Genes from the Oriental Fruit Moth Grapholita molesta via an Antennal Transcriptome Analysis

    PubMed Central

    Li, Yiping; Wu, Junxiang

    2015-01-01

    Background The oriental fruit moth, Grapholita molesta, is an extremely important oligophagous pest species of stone and pome fruits throughout the world. As a host-switching species, adult moths, especially females, depend on olfactory cues to a large extent in locating host plants, finding mates, and selecting oviposition sites. The identification of olfactory genes can facilitate investigation on mechanisms for chemical communications. Methodology/Principal Finding We generated a transcriptome of female antennae of G. molesta using the next-generation sequencing technique, and assembled transcripts from RNA-seq reads using Trinity, SOAPdenovo-trans and Abyss-trans assemblers. We identified 124 putative olfactory genes. Among the identified olfactory genes, 118 were novel to this species, including 28 transcripts encoding for odorant binding proteins, 17 chemosensory proteins, 48 odorant receptors, four gustatory receptors, 24 ionotropic receptors, two sensory neuron membrane proteins, and one odor degrading enzyme. The identified genes were further confirmed through semi-quantitative reverse transcription PCR for transcripts coding for 26 OBPs and 17 CSPs. OBP transcripts showed an obvious antenna bias, whereas CSP transcripts were detected in different tissues. Conclusion Antennal transcriptome data derived from the oriental fruit moth constituted an abundant molecular resource for the identification of genes potentially involved in the olfaction process of the species. This study provides a foundation for future research on the molecules involved in olfactory recognition of this insect pest, and in particular, the feasibility of using semiochemicals to control this pest. PMID:26540284

  1. A Deeper Examination of Thorellius atrox Scorpion Venom Components with Omic Technologies.

    PubMed

    Romero-Gutierrez, Teresa; Peguero-Sanchez, Esteban; Cevallos, Miguel A; Batista, Cesar V F; Ortiz, Ernesto; Possani, Lourival D

    2017-12-12

    This communication reports a further examination of venom gland transcripts and venom composition of the Mexican scorpion Thorellius atrox using RNA-seq and tandem mass spectrometry. The RNA-seq, which was performed with the Illumina protocol, yielded more than 20,000 assembled transcripts. Following a database search and annotation strategy, 160 transcripts were identified, potentially coding for venom components. A novel sequence was identified that potentially codes for a peptide with similarity to spider ω-agatoxins, which act on voltage-gated calcium channels, not known before to exist in scorpion venoms. Analogous transcripts were found in other scorpion species. They could represent members of a new scorpion toxin family, here named omegascorpins. The mass fingerprint by LC-MS identified 135 individual venom components, five of which matched with the theoretical masses of putative peptides translated from the transcriptome. The LC-MS/MS de novo sequencing allowed to reconstruct and identify 42 proteins encoded by assembled transcripts, thus validating the transcriptome analysis. Earlier studies conducted with this scorpion venom permitted the identification of only twenty putative venom components. The present work performed with more powerful and modern omic technologies demonstrates the capacity of accomplishing a deeper characterization of scorpion venom components and the identification of novel molecules with potential applications in biomedicine and the study of ion channel physiology.

  2. Constitutive Expression of a Transcription Termination Factor by a Repressed Prophage: Promoters for Transcribing the Phage HK022 nun Gene

    PubMed Central

    King, Rodney A.; Madsen, Peter L.; Weisberg, Robert A.

    2000-01-01

    Lysogens of phage HK022 are resistant to infection by phage λ. Lambda resistance is caused by the action of the HK022 Nun protein, which prematurely terminates early λ transcripts. We report here that transcription of the nun gene initiates at a constitutive prophage promoter, PNun, located just upstream of the protein coding sequence. The 5′ end of the transcript was determined by primer extension analysis of RNA isolated from HK022 lysogens or RNA made in vitro by transcribing a template containing the promoter with purified Escherichia coli RNA polymerase. Inactivation of PNun by mutation greatly reduced Nun activity and Nun antigen in an HK022 lysogen. However, a low level of residual activity was detected, suggesting that a secondary promoter also contributes to nun expression. We found one possible secondary promoter, PNun′, just upstream of PNun. Neither promoter is likely to increase the expression of other phage genes in a lysogen because their transcripts should be terminated downstream of nun. We estimate that HK022 lysogens in stationary phase contain several hundred molecules of Nun per cell and that cells in exponential phase probably contain fewer. PMID:10629193

  3. The metazoan Mediator co-activator complex as an integrative hub for transcriptional regulation.

    PubMed

    Malik, Sohail; Roeder, Robert G

    2010-11-01

    The Mediator is an evolutionarily conserved, multiprotein complex that is a key regulator of protein-coding genes. In metazoan cells, multiple pathways that are responsible for homeostasis, cell growth and differentiation converge on the Mediator through transcriptional activators and repressors that target one or more of the almost 30 subunits of this complex. Besides interacting directly with RNA polymerase II, Mediator has multiple functions and can interact with and coordinate the action of numerous other co-activators and co-repressors, including those acting at the level of chromatin. These interactions ultimately allow the Mediator to deliver outputs that range from maximal activation of genes to modulation of basal transcription to long-term epigenetic silencing.

  4. GATA: A graphic alignment tool for comparative sequence analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nix, David A.; Eisen, Michael B.

    2005-01-01

    Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dot plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.

  5. Integration of multi-omics data of a genome-reduced bacterium: Prevalence of post-transcriptional regulation and its correlation with protein abundances

    PubMed Central

    Chen, Wei-Hua; van Noort, Vera; Lluch-Senar, Maria; Hennrich, Marco L.; H. Wodke, Judith A.; Yus, Eva; Alibés, Andreu; Roma, Guglielmo; Mende, Daniel R.; Pesavento, Christina; Typas, Athanasios; Gavin, Anne-Claude; Serrano, Luis; Bork, Peer

    2016-01-01

    We developed a comprehensive resource for the genome-reduced bacterium Mycoplasma pneumoniae comprising 1748 consistently generated ‘-omics’ data sets, and used it to quantify the power of antisense non-coding RNAs (ncRNAs), lysine acetylation, and protein phosphorylation in predicting protein abundance (11%, 24% and 8%, respectively). These factors taken together are four times more predictive of the proteome abundance than of mRNA abundance. In bacteria, post-translational modifications (PTMs) and ncRNA transcription were both found to increase with decreasing genomic GC-content and genome size. Thus, the evolutionary forces constraining genome size and GC-content modify the relative contributions of the different regulatory layers to proteome homeostasis, and impact more genomic and genetic features than previously appreciated. Indeed, these scaling principles will enable us to develop more informed approaches when engineering minimal synthetic genomes. PMID:26773059

  6. Divergent transcription is associated with promoters of transcriptional regulators

    PubMed Central

    2013-01-01

    Background Divergent transcription is a wide-spread phenomenon in mammals. For instance, short bidirectional transcripts are a hallmark of active promoters, while longer transcripts can be detected antisense from active genes in conditions where the RNA degradation machinery is inhibited. Moreover, many described long non-coding RNAs (lncRNAs) are transcribed antisense from coding gene promoters. However, the general significance of divergent lncRNA/mRNA gene pair transcription is still poorly understood. Here, we used strand-specific RNA-seq with high sequencing depth to thoroughly identify antisense transcripts from coding gene promoters in primary mouse tissues. Results We found that a substantial fraction of coding-gene promoters sustain divergent transcription of long non-coding RNA (lncRNA)/mRNA gene pairs. Strikingly, upstream antisense transcription is significantly associated with genes related to transcriptional regulation and development. Their promoters share several characteristics with those of transcriptional developmental genes, including very large CpG islands, high degree of conservation and epigenetic regulation in ES cells. In-depth analysis revealed a unique GC skew profile at these promoter regions, while the associated coding genes were found to have large first exons, two genomic features that might enforce bidirectional transcription. Finally, genes associated with antisense transcription harbor specific H3K79me2 epigenetic marking and RNA polymerase II enrichment profiles linked to an intensified rate of early transcriptional elongation. Conclusions We concluded that promoters of a class of transcription regulators are characterized by a specialized transcriptional control mechanism, which is directly coupled to relaxed bidirectional transcription. PMID:24365181

  7. An upstream open reading frame represses expression of Lc, a member of the R/B family of maize transcriptional activators

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Damiani, R.D. Jr.; Wessler, S.R.

    1993-09-01

    The R/B genes of maize encode a family of basic helix-loop-helix proteins that determine where and when the anthocyanin-pigment pathway will be expressed in the plant. Previous studies showed that allelic diversity among family members reflects differences in gene expression, specifically in transcription initiation. The authors present evidence that the R gene Lc is under translational control. They demonstrate that the 235-nt transcript leader of Lc represses expression 25- to 30-fold in an in vivo assay. Repression is mediated by the presence in cis of a 38-codon upstream open reading frame. Furthermore, the coding capacity of the upstream open reading frame influences the magnitude of repression. It is proposed that translational control does not contribute to tissue specificity but prevents overexpression of the Lc protein. The diversity of promoter and 5' untranslated leader sequences among the R/B genes provides an opportunity to study the coevolution of transcriptional and translational mechanisms of gene regulation. 36 refs., 5 figs.

  8. Cloning, sequencing, disruption and phenotypic analysis of uvsC, an Aspergillus nidulans homologue of yeast RAD51.

    PubMed

    van Heemst, D; Swart, K; Holub, E F; van Dijk, R; Offenberg, H H; Goosen, T; van den Broek, H W; Heyting, C

    1997-05-01

    We have cloned the uvsC gene of Aspergillus nidulans by complementation of the A. nidulans uvsC114 mutant. The predicted protein UVSC shows 67.4% sequence identity to the Saccharomyces cerevisiae Rad51 protein and 27.4% sequence identity to the Escherichia coli RecA protein. Transcription of uvsC is induced by methyl-methane sulphonate (MMS), as is transcription of RAD51 of yeast. Similar levels of uvsC transcription were observed after MMS induction in a uvsC+ strain and the uvsC114 mutant. The coding sequence of the uvsC114 allele has a deletion of 6 bp, which results in deletion of two amino acids and replacement of one amino acid in the translation product. In order to gain more insight into the biological function of the uvsC gene, a uvsC null mutant was constructed, in which the entire uvsC coding sequence was replaced by a selectable marker gene. Meiotic and mitotic phenotypes of a uvsC+ strain, the uvsC114 mutant and the uvsC null mutant were compared. The uvsC null mutant was more sensitive to both UV and MMS than the uvsC114 mutant. The uvsC114 mutant arrested in meiotic prophase-I. The uvsC null mutant arrested at an earlier stage, before the onset of meiosis. One possible interpretation of these meiotic phenotypes is that the A. nidulans homologue of Rad51 of yeast has a role both in the specialized processes preceding meiosis and in meiotic prophase I.

  9. Next stop for the CRISPR revolution: RNA-guided epigenetic regulators.

    PubMed

    Vora, Suhani; Tuttle, Marcelle; Cheng, Jenny; Church, George

    2016-09-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) and CRISPR-associated (Cas) proteins offer a breakthrough platform for cheap, programmable, and effective sequence-specific DNA targeting. The CRISPR-Cas system is naturally equipped for targeted DNA cutting through its native nuclease activity. As such, groups researching a broad spectrum of biological organisms have quickly adopted the technology with groundbreaking applications to genomic sequence editing in over 20 different species. However, the biological code of life is not only encoded in genetics but also in epigenetics as well. While genetic sequence editing is a powerful ability, we must also be able to edit and regulate transcriptional and epigenetic code. Taking inspiration from work on earlier sequence-specific targeting technologies such as zinc fingers (ZFs) and transcription activator-like effectors (TALEs), researchers quickly expanded the CRISPR-Cas toolbox to include transcriptional activation, repression, and epigenetic modification. In this review, we highlight advances that extend the CRISPR-Cas toolkit for transcriptional and epigenetic regulation, as well as best practice guidelines for these tools, and a perspective on future applications. © 2016 The Authors. The FEBS Journal published by John Wiley & Sons Ltd on behalf of Federation of European Biochemical Societies.

  10. Genetic coding and gene expression - new Quadruplet genetic coding model

    NASA Astrophysics Data System (ADS)

    Shankar Singh, Rama

    2012-07-01

    Successful demonstration of the human genome project has opened the door not only to developing personalized medicine and cures for genetic diseases, but may also help answer the complex and difficult question of the origin of life. It may also lead to making the 21st century a century of Biological Sciences. Based on the central dogma of Biology, genetic codons in conjunction with tRNA play a key role in translating the RNA bases into the sequence of amino acids that forms a synthesized protein. This is the most critical step in synthesizing the right protein needed for personalized medicine and curing genetic diseases. So far, only triplet codons involving three bases of RNA, transcribed from DNA bases, have been used. Since this approach has several inconsistencies and limitations, even the promise of personalized medicine has not been realized. The new Quadruplet genetic coding model proposed and developed here involves all four RNA bases which in conjunction with tRNA will synthesize the right protein. The transcription and translation process used will be the same, but the Quadruplet codons will help overcome most of the inconsistencies and limitations of the triplet codes. Details of this new Quadruplet genetic coding model and its subsequent potential applications including relevance to the origin of life will be presented.

  11. Identification of Putative Nuclear Receptors and Steroidogenic Enzymes in Murray-Darling Rainbowfish (Melanotaenia fluviatilis) Using RNA-Seq and De Novo Transcriptome Assembly.

    PubMed

    Bain, Peter A; Papanicolaou, Alexie; Kumar, Anupama

    2015-01-01

    Murray-Darling rainbowfish (Melanotaenia fluviatilis [Castelnau, 1878]; Atheriniformes: Melanotaeniidae) is a small-bodied teleost currently under development in Australasia as a test species for aquatic toxicological studies. To date, efforts towards the development of molecular biomarkers of contaminant exposure have been hindered by the lack of available sequence data. To address this, we sequenced messenger RNA from brain, liver and gonads of mature male and female fish and generated a high-quality draft transcriptome using a de novo assembly approach. 149,742 clusters of putative transcripts were obtained, encompassing 43,841 non-redundant protein-coding regions. Deduced amino acid sequences were annotated by functional inference based on similarity with sequences from manually curated protein sequence databases. The draft assembly contained protein-coding regions homologous to 95.7% of the complete cohort of predicted proteins from the taxonomically related species, Oryzias latipes (Japanese medaka). The mean length of rainbowfish protein-coding sequences relative to their medaka homologues was 92.1%, indicating that despite the limited number of tissues sampled a large proportion of the total expected number of protein-coding genes was captured in the study. Because of our interest in the effects of environmental contaminants on endocrine pathways, we manually curated subsets of coding regions for putative nuclear receptors and steroidogenic enzymes in the rainbowfish transcriptome, revealing 61 candidate nuclear receptors encompassing all known subfamilies, and 41 putative steroidogenic enzymes representing all major steroidogenic enzymes occurring in teleosts. The transcriptome presented here will be a valuable resource for researchers interested in biomarker development, protein structure and function, and contaminant-response genomics in Murray-Darling rainbowfish.

  12. A Comprehensive Analysis of Transcript-Supported De Novo Genes in Saccharomyces sensu stricto Yeasts

    PubMed Central

    Lu, Tzu-Chiao; Leu, Jun-Yi; Lin, Wen-Chang

    2017-01-01

    Novel genes arising from random DNA sequences (de novo genes) have been suggested to be widespread in the genomes of different organisms. However, our knowledge about the origin and evolution of de novo genes is still limited. To systematically understand the general features of de novo genes, we established a robust pipeline to analyze >20,000 transcript-supported coding sequences (CDSs) from the budding yeast Saccharomyces cerevisiae. Our analysis pipeline combined phylogeny, synteny, and sequence alignment information to identify possible orthologs across 20 Saccharomycetaceae yeasts and discovered 4,340 S. cerevisiae-specific de novo genes and 8,871 S. sensu stricto-specific de novo genes. We further combine information on CDS positions and transcript structures to show that >65% of de novo genes arose from transcript isoforms of ancient genes, especially in the upstream and internal regions of ancient genes. Fourteen identified de novo genes with high transcript levels were chosen to verify their protein expressions. Ten of them, including eight transcript isoform-associated CDSs, showed translation signals and five proteins exhibited specific cytosolic localizations. Our results suggest that de novo genes frequently arise in the S. sensu stricto complex and have the potential to be quickly integrated into ancient cellular networks. PMID:28981695

  13. Modulation of yeast genome expression in response to defective RNA polymerase III-dependent transcription.

    PubMed

    Conesa, Christine; Ruotolo, Roberta; Soularue, Pascal; Simms, Tiffany A; Donze, David; Sentenac, André; Dieci, Giorgio

    2005-10-01

    We used genome-wide expression analysis in Saccharomyces cerevisiae to explore whether and how the expression of protein-coding, RNA polymerase (Pol) II-transcribed genes is influenced by a decrease in RNA Pol III-dependent transcription. The Pol II transcriptome was characterized in four thermosensitive, slow-growth mutants affected in different components of the RNA Pol III transcription machinery. Unexpectedly, we found only a modest correlation between altered expression of Pol II-transcribed genes and their proximity to class III genes, a result also confirmed by the analysis of single tRNA gene deletants. Instead, the transcriptome of all of the four mutants was characterized by increased expression of genes known to be under the control of the Gcn4p transcriptional activator. Indeed, GCN4 was found to be translationally induced in the mutants, and deleting the GCN4 gene eliminated the response. The Gcn4p-dependent expression changes did not require the Gcn2 protein kinase and could be specifically counteracted by an increased gene dosage of initiator tRNA(Met). Initiator tRNA(Met) depletion thus triggers a GCN4-dependent reprogramming of genome expression in response to decreased Pol III transcription. Such an effect might represent a key element in the coordinated transcriptional response of yeast cells to environmental changes.

  14. Surfactant Protein-C Promoter Variants Associated with Neonatal Respiratory Distress Syndrome Reduce Transcription

    PubMed Central

    Wambach, Jennifer A.; Yang, Ping; Wegner, Daniel J.; An, Ping; Hackett, Brian P.; Cole, F. S.; Hamvas, Aaron

    2010-01-01

    Dominant mutations in coding regions of the surfactant protein-C gene (SFTPC) cause respiratory distress syndrome (RDS) in infants. However, the contribution of variants in noncoding regions of SFTPC to pulmonary phenotypes is unknown. Using a case-control group of infants ≥34 weeks gestation (n=538), we used complete resequencing of SFTPC and its promoter, genotyping, and logistic regression to identify 80 single nucleotide polymorphisms (SNPs). Three promoter SNPs were statistically associated with neonatal RDS among infants of European descent. To assess the transcriptional effects of these three promoter SNPs, we selectively mutated the SFTPC promoter and performed transient transfection using MLE-15 cells and a firefly luciferase reporter vector. Each promoter SNP decreased SFTPC transcription. The combination of two variants in high linkage disequilibrium also decreased SFTPC transcription. In silico evaluation of transcription factor binding demonstrated that the rare allele at g.-1167 disrupts a SOX (SRY-related high mobility group box) consensus motif and introduces a GATA-1 site, at g.-2385 removes a MZF-1 (myeloid zinc finger) binding site, and at g.-1647 removes a potential methylation site. This combined statistical, in vitro, and in silico approach suggests that reduced SFTPC transcription contributes to the genetic risk for neonatal RDS in developmentally susceptible infants. PMID:20539253

  15. Nitrite reductase expression is regulated at the post-transcriptional level by the nitrogen source in Nicotiana plumbaginifolia and Arabidopsis thaliana.

    PubMed

    Crété, P; Caboche, M; Meyer, C

    1997-04-01

    Higher plant nitrite reductase (NiR) is a monomeric chloroplastic protein catalysing the reduction of nitrite, the product of nitrate reduction, to ammonium. The expression of this enzyme is controlled at the transcriptional level by light and by the nitrogen source. In order to study the post-transcriptional regulation of NiR, Nicotiana plumbaginifolia and Arabidopsis thaliana were transformed with a chimaeric NiR construct containing the tobacco leaf NiR1 coding sequence driven by the CaMV 35S RNA promoter. Transformed plants did not show any phenotypic difference when compared with the wild-type, although they overexpressed NiR activity in the leaves. When these plants were grown in vitro on media containing either nitrate or ammonium as sole nitrogen source, NiR mRNA derived from transgene expression was constitutively expressed, whereas NiR activity and protein level were strongly reduced on ammonium-containing medium. These results suggest that, together with transcriptional control, post-transcriptional regulation by the nitrogen source is operating on NiR expression. This post-transcriptional regulation of tobacco leaf NiR1 expression was observed not only in the closely related species N. plumbaginifolia but also in the more distant species A. thaliana.

  16. Pervasive and largely lineage-specific adaptive protein evolution in the dosage compensation complex of Drosophila melanogaster.

    PubMed

    Levine, Mia T; Holloway, Alisha K; Arshad, Umbreen; Begun, David J

    2007-11-01

    Dosage compensation refers to the equalization of X-linked gene transcription among heterogametic and homogametic sexes. In Drosophila, the dosage compensation complex (DCC) mediates the twofold hypertranscription of the single male X chromosome. Loss-of-function mutations at any DCC protein-coding gene are male lethal. Here we report a population genetic analysis suggesting that four of the five core DCC proteins--MSL1, MSL2, MSL3, and MOF--are evolving under positive selection in D. melanogaster. Within these four proteins, several domains that range in function from X chromosome localization to protein-protein interactions have elevated, D. melanogaster-specific, amino acid divergence.

  17. De Novo Assembly of the Whole Transcriptome of the Wild Embryo, Preleptocephalus, Leptocephalus, and Glass Eel of Anguilla japonica and Deciphering the Digestive and Absorptive Capacities during Early Development.

    PubMed

    Hsu, Hsiang-Yi; Chen, Shu-Hwa; Cha, Yuh-Ru; Tsukamoto, Katsumi; Lin, Chung-Yen; Han, Yu-San

    2015-01-01

    Natural stocks of Japanese eel (Anguilla japonica) have decreased drastically because of overfishing, habitat destruction, and changes in the ocean environment over the past few decades. However, to date, artificial mass production of glass eels is far from reality because of the lack of appropriate feed for the eel larvae. In this study, wild glass eel, leptocephali, preleptocephali, and embryos were collected to conduct RNA-seq. Approximately 279 million reads were generated and assembled into 224,043 transcripts. The transcript levels of genes coding for digestive enzymes and nutrient transporters were investigated to estimate the capacities for nutrient digestion and absorption during early development. The results showed that the transcript levels of protein digestion enzymes were higher than those of carbohydrate and lipid digestion enzymes in the preleptocephali and leptocephali, and the transcript levels of amino acid transporters were also higher than those of glucose and fructose transporters and the cholesterol transporter. In addition, the transcript levels of glucose and fructose transporters rose significantly in the leptocephali. Moreover, the transcript levels of protein, carbohydrate, and lipid digestion enzymes were balanced in the glass eel, but the transcript levels of amino acid transporters were higher than those of glucose and cholesterol transporters. These findings implied that preleptocephali and leptocephali prefer high-protein food, and the nutritional requirements of monosaccharides and lipids for the eel larvae vary with growth. An online database (http://molas.iis.sinica.edu.tw/jpeel/) that will provide the sequences and the annotated results of assembled transcripts was established for the eel research community.

  18. De Novo Assembly of the Whole Transcriptome of the Wild Embryo, Preleptocephalus, Leptocephalus, and Glass Eel of Anguilla japonica and Deciphering the Digestive and Absorptive Capacities during Early Development

    PubMed Central

    Cha, Yuh-Ru; Tsukamoto, Katsumi; Lin, Chung-Yen; Han, Yu-San

    2015-01-01

    Natural stocks of Japanese eel (Anguilla japonica) have decreased drastically because of overfishing, habitat destruction, and changes in the ocean environment over the past few decades. However, to date, artificial mass production of glass eels is far from reality because of the lack of appropriate feed for the eel larvae. In this study, wild glass eel, leptocephali, preleptocephali, and embryos were collected to conduct RNA-seq. Approximately 279 million reads were generated and assembled into 224,043 transcripts. The transcript levels of genes coding for digestive enzymes and nutrient transporters were investigated to estimate the capacities for nutrient digestion and absorption during early development. The results showed that the transcript levels of protein digestion enzymes were higher than those of carbohydrate and lipid digestion enzymes in the preleptocephali and leptocephali, and the transcript levels of amino acid transporters were also higher than those of glucose and fructose transporters and the cholesterol transporter. In addition, the transcript levels of glucose and fructose transporters rose significantly in the leptocephali. Moreover, the transcript levels of protein, carbohydrate, and lipid digestion enzymes were balanced in the glass eel, but the transcript levels of amino acid transporters were higher than those of glucose and cholesterol transporters. These findings implied that preleptocephali and leptocephali prefer high-protein food, and the nutritional requirements of monosaccharides and lipids for the eel larvae vary with growth. An online database (http://molas.iis.sinica.edu.tw/jpeel/) that will provide the sequences and the annotated results of assembled transcripts was established for the eel research community. PMID:26406914

  19. Modular Evolution of DNA-Binding Preference of a Tbrain Transcription Factor Provides a Mechanism for Modifying Gene Regulatory Networks

    PubMed Central

    Cheatle Jarvela, Alys M.; Brubaker, Lisa; Vedenko, Anastasia; Gupta, Anisha; Armitage, Bruce A.; Bulyk, Martha L.; Hinman, Veronica F.

    2014-01-01

    Gene regulatory networks (GRNs) describe the progression of transcriptional states that take a single-celled zygote to a multicellular organism. It is well documented that GRNs can evolve extensively through mutations to cis-regulatory modules (CRMs). Transcription factor proteins that bind these CRMs may also evolve to produce novelty. Coding changes are considered to be rarer, however, because transcription factors are multifunctional and hence are more constrained to evolve in ways that will not produce widespread detrimental effects. Recent technological advances have unearthed a surprising variation in DNA-binding abilities, such that individual transcription factors may recognize both a preferred primary motif and an additional secondary motif. This provides a source of modularity in function. Here, we demonstrate that orthologous transcription factors can also evolve a changed preference for a secondary binding motif, thereby offering an unexplored mechanism for GRN evolution. Using protein-binding microarray, surface plasmon resonance, and in vivo reporter assays, we demonstrate an important difference in DNA-binding preference between Tbrain protein orthologs in two species of echinoderms, the sea star, Patiria miniata, and the sea urchin, Strongylocentrotus purpuratus. Although both orthologs recognize the same primary motif, only the sea star Tbr also has a secondary binding motif. Our in vivo assays demonstrate that this difference may allow for greater evolutionary change in timing of regulatory control. This uncovers a layer of transcription factor binding divergence that could exist for many pairs of orthologs. We hypothesize that this divergence provides modularity that allows orthologous transcription factors to evolve novel roles in GRNs through modification of binding to secondary sites. PMID:25016582

  20. Intron-exon organization of the active human protein S gene PSα and its pseudogene PSβ: Duplication and silencing during primate evolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ploos van Amstel, H.; Reitsma, P.H.; van der Logt, C.P.

    The human protein S locus on chromosome 3 consists of two protein S genes, PSα and PSβ. Here the authors report the cloning and characterization of both genes. Fifteen exons of the PSα gene were identified that together code for protein S mRNA as derived from the reported protein S cDNAs. Analysis by primer extension of liver protein S mRNA, however, reveals the presence of two mRNA forms that differ in the length of their 5′-noncoding region. Both transcripts contain a 5′-noncoding region longer than found in the protein S cDNAs. The two products may arise from alternative splicing of an additional intron in this region or from the usage of two start sites for transcription. The intron-exon organization of the PSα gene fully supports the hypothesis that the protein S gene is the product of an evolutional assembling process in which gene modules coding for structural/functional protein units also found in other coagulation proteins have been put upstream of the ancestral gene of a steroid hormone binding protein. The PSβ gene is identified as a pseudogene. It contains a large variety of detrimental aberrations, viz., the absence of exon I, a splice site mutation, three stop codons, and a frame shift mutation. Overall the two genes PSα and PSβ show between their exonic sequences 96.5% homology. Southern analysis of primate DNA showed that the duplication of the ancestral protein S gene has occurred after the branching of the orangutan from the African apes. A nonsense mutation that is present in the pseudogene of man also could be identified in one of the two protein S genes of both chimpanzee and gorilla. This implicates that silencing of one of the two protein S genes must have taken place before the divergence of the three African apes.

  1. Expression, regulation and functional assessment of the 80 amino acid Small Adipocyte Factor 1 (Smaf1) protein in adipocytes.

    PubMed

    Ren, Gang; Eskandari, Parisa; Wang, Siqian; Smas, Cynthia M

    2016-01-15

    The gene for Small Adipocyte Factor 1, Smaf1 (also known as adipogenin, ADIG), encodes a ∼600 base transcript that is highly upregulated during 3T3-L1 in vitro adipogenesis and markedly enriched in adipose tissues. Based on the lack of an obvious open reading frame in the Smaf1 transcript, it is not known if the Smaf1 gene is protein coding or non-coding RNA. Using a peptide from a putative open reading frame of Smaf1 as antigen, we generated antibodies for western analysis. Our studies prove that Smaf1 encodes an adipose-enriched protein which in western blot analysis migrates at ∼10 kDa. Rapid induction of Smaf1 protein occurs during in vitro adipogenesis and its expression in 3T3-L1 adipocytes is positively regulated by insulin and glucose. Moreover, siRNA studies reveal that expression of Smaf1 in adipocytes is wholly dependent on PPARγ. On the other hand, use of siRNA for Smaf1 to nearly abolish its protein expression in adipocytes revealed that Smaf1 does not have a major role in adipocyte triglyceride accumulation, lipolysis or insulin-stimulated pAkt induction. However, immunolocalization studies using HA-tagged Smaf1 reveal enrichment at adipocyte lipid droplets. Together our findings show that Smaf1 is a novel small protein endogenous to adipocytes and that Smaf1 expression is closely tied to PPARγ-mediated signals and the adipocyte phenotype. Copyright © 2015 Elsevier Inc. All rights reserved.

  2. Comprehensive Identification of mRNA-Binding Proteins of Leishmania donovani by Interactome Capture.

    PubMed

    Nandan, Devki; Thomas, Sneha A; Nguyen, Anne; Moon, Kyung-Mee; Foster, Leonard J; Reiner, Neil E

    2017-01-01

    Leishmania are unicellular eukaryotes responsible for leishmaniasis in humans. Like other trypanosomatids, leishmania regulate protein coding gene expression almost exclusively at the post-transcriptional level with the help of RNA binding proteins (RBPs). Due to the presence of polycistronic transcription units, leishmania do not regulate RNA polymerase II-dependent transcription initiation. Recent evidence suggests that the main control points in gene expression are mRNA degradation and translation. Protein-RNA interactions are involved in every aspect of RNA biology, such as mRNA splicing, polyadenylation, localization, degradation, and translation. A detailed picture of these interactions would likely prove to be highly informative in understanding leishmania biology and virulence. We developed a strategy involving covalent UV cross-linking of RBPs to mRNA in vivo, followed by interactome capture using oligo(dT) magnetic beads to define comprehensively the mRNA interactome of growing L. donovani amastigotes. The protein mass spectrometry analysis of captured proteins identified 79 mRNA interacting proteins which withstood very stringent washing conditions. Strikingly, we found that 49 of these mRNA interacting proteins had no orthologs or homologs in the human genome. Consequently, these may represent high quality candidates for selective drug targeting leading to novel therapeutics. These results show that this unbiased, systematic strategy has the promise to be applicable to study the mRNA interactome during various biological settings such as metabolic changes, stress (low pH environment, oxidative stress and nutrient deprivation) or drug treatment.

  3. Integration of mRNP formation and export.

    PubMed

    Björk, Petra; Wieslander, Lars

    2017-08-01

    Expression of protein-coding genes in eukaryotes relies on the coordinated action of many sophisticated molecular machineries. Transcription produces precursor mRNAs (pre-mRNAs) and the active gene provides an environment in which the pre-mRNAs are processed, folded, and assembled into RNA-protein (RNP) complexes. The dynamic pre-mRNPs incorporate the growing transcript, proteins, and the processing machineries, as well as the specific protein marks left after processing that are essential for export and the cytoplasmic fate of the mRNPs. After release from the gene, the mRNPs move by diffusion within the interchromatin compartment, making up pools of mRNPs. Here, splicing and polyadenylation can be completed and the mRNPs recruit the major export receptor NXF1. Export competent mRNPs interact with the nuclear pore complex, leading to export, concomitant with compositional and conformational changes of the mRNPs. We summarize the integrated nuclear processes involved in the formation and export of mRNPs.

  4. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

    PubMed Central

    Pruitt, Kim D.; Tatusova, Tatiana; Maglott, Donna R.

    2005-01-01

    The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses. Nucleotide and protein sequences are explicitly linked, and the sequences are linked to other resources including the NCBI Map Viewer and Gene. Sequences are annotated to include coding regions, conserved domains, variation, references, names, database cross-references, and other features using a combined approach of collaboration and other input from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff. PMID:15608248

  5. A purified truncated form of yeast Gal4 expressed in Escherichia coli and used to functionalize poly(lactic acid) nanoparticle surface is transcriptionally active in cellulo.

    PubMed

    Legaz, Sophie; Exposito, Jean-Yves; Borel, Agnès; Candusso, Marie-Pierre; Megy, Simon; Montserret, Roland; Lahaye, Vincent; Terzian, Christophe; Verrier, Bernard

    2015-09-01

    The Gal4/UAS system is a powerful tool for the analysis of numerous biological processes. Gal4 is a large yeast transcription factor that activates genes including UAS sequences in their promoter. Here, we have synthesized a minimal form of Gal4 DNA sequence coding for the binding and dimerization regions, but also part of the transcriptional activation domain. This truncated Gal4 protein was expressed as inclusion bodies in Escherichia coli. A structured and active form of this recombinant protein was purified and used to cover poly(lactic acid) (PLA) nanoparticles. In cellulo, these Gal4-vehicles were able to activate the expression of a Green Fluorescent Protein (GFP) gene under the control of UAS sequences, demonstrating that the decorated Gal4 variant can be delivered into cells where it still retains its transcription factor capacities. Thus, we have produced in E. coli and purified a short active form of Gal4 that retains its functions at the surface of PLA-nanoparticles in cellular assay. These decorated Gal4-nanoparticles will be useful to decipher their tissue distribution and their potential after ingestion or injection in UAS-GFP recombinant animal models. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. cDNA cloning and characterization of mouse DTEF-1 and ETF, members of the TEA/ATTS family of transcription factors.

    PubMed

    Yockey, C E; Shimizu, N

    1998-02-01

    Members of the TEA/ATTS family of transcription factors have been found in most representative eukaryotic organisms. In vertebrates, the TEA family contains at least four members, which share overlapping DNA-binding specificity and have similar transcriptional activation properties. In this article, we describe the cDNA cloning and characterization of the murine TEA proteins DTEF-1 (mDTEF-1) and ETF. Using in situ hybridization analysis of mouse embryos, we found that mDTEF-1 and ETF transcript distributions substantially overlap. ETF is expressed throughout the embryo except in the myocardium early in development, whereas late in development, it is enriched in lung and neuroectoderm. Mouse DTEF-1 is expressed at a much lower level throughout development and is substantially enriched in ectoderm and skin, as well as in the developing pituitary at midgestation. Northern blot analysis of adult mouse tissue total RNA showed that both ETF and mDTEF-1 are abundant in uterus and lung relative to other tissues. Using gel mobility shift assays and GAL4-fusion protein analysis, we demonstrated that the full coding sequences of ETF and mDTEF-1 encode M-CAT/GT-IIC-binding proteins containing activation domains.

  7. MicroRNAs: regulators of gene expression and cell differentiation

    PubMed Central

    Shivdasani, Ramesh A.

    2006-01-01

    The existence and roles of a class of abundant regulatory RNA molecules have recently come into sharp focus. Micro-RNAs (miRNAs) are small (approximately 22 bases), non–protein-coding RNAs that recognize target sequences of imperfect complementarity in cognate mRNAs and either destabilize them or inhibit protein translation. Although mechanisms of miRNA biogenesis have been elucidated in some detail, there is limited appreciation of their biological functions. Reported examples typically focus on miRNA regulation of a single tissue-restricted transcript, often one encoding a transcription factor, that controls a specific aspect of development, cell differentiation, or physiology. However, computational algorithms predict up to hundreds of putative targets for individual miRNAs, single transcripts may be regulated by multiple miRNAs, and miRNAs may either eliminate target gene expression or serve to finetune transcript and protein levels. Theoretical considerations and early experimental results hence suggest diverse roles for miRNAs as a class. One appealing possibility, that miRNAs eliminate low-level expression of unwanted genes and hence refine unilineage gene expression, may be especially amenable to evaluation in models of hematopoiesis. This review summarizes current understanding of miRNA mechanisms, outlines some of the important outstanding questions, and describes studies that attempt to define miRNA functions in hematopoiesis. PMID:16882713

  8. Transcripts of the NADH-dehydrogenase subunit 3 gene are differentially edited in Oenothera mitochondria.

    PubMed Central

    Schuster, W; Wissinger, B; Unseld, M; Brennicke, A

    1990-01-01

    A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications can be found at 16 different sites within the nad3 coding region. Most of these alterations in the mRNA sequence change codon identities to specify amino acids better conserved in evolution. Individual cDNA clones differ in their degree of editing at five nucleotide positions, three of which are silent, while two lead to codon alterations specifying different amino acids. None of the cDNA clones analysed is maximally edited at all possible sites, suggesting slow processing or lowered stringency of editing at these nucleotides. Differentially edited transcripts could be editing intermediates or could code for differing polypeptides. Two edited nucleotides in an open reading frame located upstream of nad3 change two amino acids in the deduced polypeptide. Part of the well-conserved ribosomal protein gene rps12, also encoded downstream of nad3 in other plants, is lost in Oenothera mitochondria by recombination events. The functional rps12 protein must be imported from the cytoplasm since the deleted sequences of this gene are not found in the Oenothera mitochondrial genome. The pseudogene sequence is not edited at any nucleotide position. PMID:1688531
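    The distinction drawn above between silent and codon-altering editing sites can be illustrated by comparing a genomic coding sequence with the matching cDNA. The following is a minimal sketch, not the authors' procedure: the two short sequences are invented, `editing_sites` is a hypothetical helper, and the standard genetic code is assumed. A C-to-U edit in the RNA appears as C-to-T in the cDNA.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def editing_sites(genomic, cdna):
    """List C-to-T differences between aligned, in-frame genomic and cDNA sequences.

    Each entry is (0-based position, genomic amino acid, edited amino acid, silent).
    The edited amino acid reflects every edit present in that codon of the cDNA.
    """
    if len(genomic) != len(cdna) or len(genomic) % 3:
        raise ValueError("sequences must be aligned and a whole number of codons")
    sites = []
    for pos, (g, c) in enumerate(zip(genomic, cdna)):
        if g == "C" and c == "T":
            start = pos - pos % 3  # first base of the codon containing pos
            before = CODON_TABLE[genomic[start:start + 3]]
            after = CODON_TABLE[cdna[start:start + 3]]
            sites.append((pos, before, after, before == after))
    return sites


# Invented example: ATG CCA TTC TCA (genomic) vs ATG CTA TTT TCA (cDNA).
genomic = "ATGCCATTCTCA"
cdna = "ATGCTATTTTCA"

for pos, before, after, silent in editing_sites(genomic, cdna):
    kind = "silent" if silent else "codon-altering"
    print(f"position {pos}: {before} -> {after} ({kind})")
```

    Running the same comparison over several cDNA clones and tallying which positions are edited in each would give the kind of per-clone editing profile the abstract describes.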

  9. A method for the further assembly of targeted unigenes in a transcriptome after assembly by Trinity

    PubMed Central

    Xiao, Xinlong; Ma, Jinbiao; Sun, Yufang; Yao, Yinan

    2015-01-01

    RNA-sequencing has been widely used to obtain high throughput transcriptome sequences in various species, but the assembly of a full set of complete transcripts is still a significant challenge. Judging by the number of expected transcripts and assembled unigenes in a transcriptome library, we believe that some unigenes could be reassembled. In this study, using the nitrate transporter (NRT) gene family and phosphate transporter (PHT) gene family in Salicornia europaea as examples, we introduced an approach to further assemble unigenes found in transcriptome libraries which had been previously generated by Trinity. To find the unigenes of a particular transcript that contained gaps, we respectively selected 16 NRT candidate unigene pairs and 12 PHT candidate unigene pairs for which the two unigenes had the same annotations, the same expression patterns among various RNA-seq samples, and different positions of the proteins coded as mapped to a reference protein. To fill a gap between the two unigenes, PCR was performed using primers that mapped to the two unigenes and the PCR products were sequenced, which demonstrated that 5 unigene pairs of NRT and 3 unigene pairs of PHT could be reassembled when the gaps were filled using the corresponding PCR product sequences. This fast and simple method will reduce the redundancy of targeted unigenes and allow acquisition of complete coding sequences (CDS). PMID:26528307
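
    The three selection criteria named in the abstract above (same annotation, same expression pattern across RNA-seq samples, different positions on a reference protein) can be sketched as a simple filter. This is only an illustration under assumptions: the `Unigene` record, the Pearson-correlation threshold `min_r`, and the reading of "different positions" as non-overlapping alignment intervals are all hypothetical choices, not the authors' published procedure.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Unigene:
    """Hypothetical record for one assembled unigene."""
    name: str
    annotation: str        # e.g. best-hit gene annotation
    expression: tuple      # expression values across RNA-seq samples
    ref_start: int         # alignment start on the reference protein
    ref_end: int           # alignment end on the reference protein


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def candidate_pairs(unigenes, min_r=0.9):
    """Return name pairs that may be two fragments of one transcript."""
    pairs = []
    for a, b in combinations(unigenes, 2):
        if a.annotation != b.annotation:
            continue  # criterion 1: same annotation
        if pearson(a.expression, b.expression) < min_r:
            continue  # criterion 2: similar expression pattern (assumed metric)
        if a.ref_start <= b.ref_end and b.ref_start <= a.ref_end:
            continue  # criterion 3: intervals must not overlap (assumed reading)
        pairs.append((a.name, b.name))
    return pairs
```

    Pairs returned by such a filter would still need the PCR and sequencing step described in the abstract to confirm that the gap can actually be bridged.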

  10. Intergenic Transcriptional Interference Is Blocked by RNA Polymerase III Transcription Factor TFIIIB in Saccharomyces cerevisiae

    PubMed Central

    Korde, Asawari; Rosselot, Jessica M.; Donze, David

    2014-01-01

    The major function of eukaryotic RNA polymerase III is to transcribe transfer RNA, 5S ribosomal RNA, and other small non-protein-coding RNA molecules. Assembly of the RNA polymerase III complex on chromosomal DNA requires the sequential binding of transcription factor complexes TFIIIC and TFIIIB. Recent evidence has suggested that in addition to producing RNA transcripts, chromatin-assembled RNA polymerase III complexes may mediate additional nuclear functions that include chromatin boundary, nucleosome phasing, and general genome organization activities. This study provides evidence of another such “extratranscriptional” activity of assembled RNA polymerase III complexes, which is the ability to block progression of intergenic RNA polymerase II transcription. We demonstrate that the RNA polymerase III complex bound to the tRNA gene upstream of the Saccharomyces cerevisiae ATG31 gene protects the ATG31 promoter against readthrough transcriptional interference from the upstream noncoding intergenic SUT467 transcription unit. This protection is predominately mediated by binding of the TFIIIB complex. When TFIIIB binding to this tRNA gene is weakened, an extended SUT467–ATG31 readthrough transcript is produced, resulting in compromised ATG31 translation. Since the ATG31 gene product is required for autophagy, strains expressing the readthrough transcript exhibit defective autophagy induction and reduced fitness under autophagy-inducing nitrogen starvation conditions. Given the recent discovery of widespread pervasive transcription in all forms of life, protection of neighboring genes from intergenic transcriptional interference may be a key extratranscriptional function of assembled RNA polymerase III complexes and possibly other DNA binding proteins. PMID:24336746

  11. Regulation of neural macroRNAs by the transcriptional repressor REST

    PubMed Central

    Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J.; Stanton, Lawrence W.; Lipovich, Leonard

    2009-01-01

    The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs (“macroRNAs”), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer. PMID:19050060
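
    The proximity statistic quoted above (the share of REST binding sites lying within 10 kb of a macroRNA gene) amounts to an interval-distance check. The following is a minimal sketch, assuming binding sites are given as (chromosome, position) pairs and genes as (chromosome, start, end) triples; these data shapes and the function names are hypothetical and do not reproduce the annotation pipeline used in the study.

```python
def distance_to_gene(pos, gene_start, gene_end):
    """Distance in bases from a site to a gene interval (0 if inside it)."""
    if gene_start <= pos <= gene_end:
        return 0
    return min(abs(pos - gene_start), abs(pos - gene_end))


def fraction_near_genes(sites, genes, window=10_000):
    """Fraction of sites within `window` bases of any gene on the same chromosome."""
    if not sites:
        return 0.0
    near = 0
    for chrom, pos in sites:
        if any(
            g_chrom == chrom and distance_to_gene(pos, start, end) <= window
            for g_chrom, start, end in genes
        ):
            near += 1
    return near / len(sites)
```

    For genome-scale inputs, a sorted index or an interval tree would replace the nested scan; the quadratic loop is kept here only for readability.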

  12. Regulation of neural macroRNAs by the transcriptional repressor REST.

    PubMed

    Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J; Stanton, Lawrence W; Lipovich, Leonard

    2009-01-01

    The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs ("macroRNAs"), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer.

  13. Structural organization of poliovirus RNA replication is mediated by viral proteins of the P2 genomic region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bienz, K.; Egger, D.; Troxler, M.

    1990-03-01

    Transcriptionally active replication complexes bound to smooth membrane vesicles were isolated from poliovirus-infected cells. In electron microscopic, negatively stained preparations, the replication complex appeared as an irregularly shaped, oblong structure attached to several virus-induced vesicles of a rosettelike arrangement. Electron microscopic immunocytochemistry of such preparations demonstrated that the poliovirus replication complex contains the proteins coded by the P2 genomic region (P2 proteins) in a membrane-associated form. In addition, the P2 proteins are also associated with viral RNA, and they can be cross-linked to viral RNA by UV irradiation. Guanidine hydrochloride prevented the P2 proteins from becoming membrane bound but did not change their association with viral RNA. The findings allow the conclusion that the protein 2C or 2C-containing precursor(s) is responsible for the attachment of the viral RNA to the vesicular membrane and for the spatial organization of the replication complex necessary for its proper functioning in viral transcription. A model for the structure of the viral replication complex and for the function of the 2C-containing P2 protein(s) and the vesicular membranes is proposed.

  14. Identification of functional elements and regulatory circuits by Drosophila modENCODE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roy, Sushmita; Ernst, Jason; Kharchenko, Peter V.

    2010-12-22

    To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation. Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology. The functions of approximately 40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1). Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3' untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.

  15. RNAseq analysis of fast skeletal muscle in restriction-fed transgenic coho salmon (Oncorhynchus kisutch): an experimental model uncoupling the growth hormone and nutritional signals regulating growth.

    PubMed

    Garcia de la Serrana, Daniel; Devlin, Robert H; Johnston, Ian A

    2015-07-31

    Coho salmon (Oncorhynchus kisutch) transgenic for growth hormone (Gh) express Gh in multiple tissues which results in increased appetite and continuous high growth with satiation feeding. Restricting Gh-transgenics to the same lower ration (TR) as wild-type fish (WT) results in similar growth, but with the recruitment of fewer, larger diameter, muscle skeletal fibres to reach a given body size. In order to better understand the genetic mechanisms behind these different patterns of muscle growth and to investigate how the decoupling of Gh and nutritional signals affects gene regulation we used RNA-seq to compare the fast skeletal muscle transcriptome in TR and WT coho salmon. Illumina sequencing of individually barcoded libraries from 6 WT and 6 TR coho salmon yielded 704,550,985 paired end reads which were used to construct 323,115 contigs containing 19,093 unique genes of which >10,000 contained >90 % of the coding sequence. Transcripts coding for 31 genes required for myoblast fusion were identified with 22 significantly downregulated in TR relative to WT fish, including 10 (vaspa, cdh15, graf1, crk, crkl, dock1, trio, plekho1a, cdc42a and dock5) associated with signaling through the cell surface protein cadherin. Nineteen out of 44 (43 %) translation initiation factors and 14 of 47 (30 %) protein chaperones were upregulated in TR relative to WT fish. TR coho salmon showed increased growth hormone transcripts and gene expression associated with protein synthesis and folding than WT fish even though net rates of protein accretion were similar. The uncoupling of Gh and amino acid signals likely results in additional costs of transcription associated with protein turnover in TR fish. The predicted reduction in the ionic costs of homeostasis in TR fish associated with increased fibre size were shown to involve multiple pathways regulating myotube fusion, particularly cadherin signaling.

  16. Aquaporin 2 of Rhipicephalus (Boophilus) microplus as a potential target to control ticks and tick-borne parasites

    USDA-ARS's Scientific Manuscript database

    In a collaboration with Washington State University and ARS-Pullman, WA researchers, we identified and sequenced a 1,059 base pair Rhipicephalus microplus transcript that contained the coding region for a water channel protein, Aquaporin 2 (RmAQP2). The clone sequencing resulted in the production of...

  17. NRF2: Translating the Redox Code

    PubMed Central

    Tummala, Krishna S.; Kottakis, Filippos; Bardeesy, Nabeel

    2016-01-01

    Cancer requires mechanisms to mitigate reactive oxygen species (ROS) generated during rapid growth, such as induction of the antioxidant transcription factor, Nrf2. However, the targets of ROS-mediated cytotoxicity are unclear. Recent studies in pancreatic cancer show that redox control by Nrf2 prevents cysteine oxidation of the mRNA translational machinery, thereby supporting efficient protein synthesis. PMID:27555347

  18. Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

    PubMed Central

    Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. 
Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio

    2004-01-01

    The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394

  19. A global transcriptional analysis of Plasmodium falciparum malaria reveals a novel family of telomere-associated lncRNAs

    PubMed Central

    2011-01-01

    Background Mounting evidence suggests a major role for epigenetic feedback in Plasmodium falciparum transcriptional regulation. Long non-coding RNAs (lncRNAs) have recently emerged as a new paradigm in epigenetic remodeling. We therefore set out to investigate putative roles for lncRNAs in P. falciparum transcriptional regulation. Results We used a high-resolution DNA tiling microarray to survey transcriptional activity across 22.6% of the P. falciparum strain 3D7 genome. We identified 872 protein-coding genes and 60 putative P. falciparum lncRNAs under developmental regulation during the parasite's pathogenic human blood stage. Further characterization of lncRNA candidates led to the discovery of an intriguing family of lncRNA telomere-associated repetitive element transcripts, termed lncRNA-TARE. We have quantified lncRNA-TARE expression at 15 distinct chromosome ends and mapped putative transcriptional start and termination sites of lncRNA-TARE loci. Remarkably, we observed coordinated and stage-specific expression of lncRNA-TARE on all chromosome ends tested, and two dominant transcripts of approximately 1.5 kb and 3.1 kb transcribed towards the telomere. Conclusions We have characterized a family of 22 telomere-associated lncRNAs in P. falciparum. Homologous lncRNA-TARE loci are coordinately expressed after parasite DNA replication, and are poised to play an important role in P. falciparum telomere maintenance, virulence gene regulation, and potentially other processes of parasite chromosome end biology. Further study of lncRNA-TARE and other promising lncRNA candidates may provide mechanistic insight into P. falciparum transcriptional regulation. PMID:21689454

  20. Long Non-Coding RNAs: A Novel Paradigm for Toxicology.

    PubMed

    Dempsey, Joseph L; Cui, Julia Yue

    2017-01-01

    Long non-coding RNAs (lncRNAs) are over 200 nucleotides in length and are transcribed from the mammalian genome in a tissue-specific and developmentally regulated pattern. There is growing recognition that lncRNAs are novel biomarkers and/or key regulators of toxicological responses in humans and animal models. Lacking protein-coding capacity, the numerous types of lncRNAs possess a myriad of transcriptional regulatory functions that include cis and trans gene expression, transcription factor activity, chromatin remodeling, imprinting, and enhancer up-regulation. LncRNAs also influence mRNA processing, post-transcriptional regulation, and protein trafficking. Dysregulation of lncRNAs has been implicated in various human health outcomes such as various cancers, Alzheimer's disease, cardiovascular disease, autoimmune diseases, as well as intermediary metabolism such as glucose, lipid, and bile acid homeostasis. Interestingly, emerging evidence in the literature over the past five years has shown that lncRNA regulation is impacted by exposures to various chemicals such as polycyclic aromatic hydrocarbons, benzene, cadmium, chlorpyrifos-methyl, bisphenol A, phthalates, phenols, and bile acids. Recent technological advancements, including next-generation sequencing technologies and novel computational algorithms, have enabled the profiling and functional characterizations of lncRNAs on a genomic scale. In this review, we summarize the biogenesis and general biological functions of lncRNAs, highlight the important roles of lncRNAs in human diseases and especially during the toxicological responses to various xenobiotics, evaluate current methods for identifying aberrant lncRNA expression and molecular target interactions, and discuss the potential to implement these tools to address fundamental questions in toxicology. © The Author 2016. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved. 

  1. DNA microarray analysis of the cyanotroph Pseudomonas pseudoalcaligenes CECT5344 in response to nitrogen starvation, cyanide and a jewelry wastewater.

    PubMed

    Luque-Almagro, V M; Escribano, M P; Manso, I; Sáez, L P; Cabello, P; Moreno-Vivián, C; Roldán, M D

    2015-11-20

    Pseudomonas pseudoalcaligenes CECT5344 is an alkaliphilic bacterium that can use cyanide as nitrogen source for growth, becoming a suitable candidate to be applied in biological treatment of cyanide-containing wastewaters. The assessment of the whole genome sequence of the strain CECT5344 has allowed the generation of DNA microarrays to analyze the response to different nitrogen sources. The mRNA of P. pseudoalcaligenes CECT5344 cells grown under nitrogen limiting conditions showed considerable changes when compared against the transcripts from cells grown with ammonium; up-regulated genes were, among others, the glnK gene encoding the nitrogen regulatory protein PII, the two-component ntrBC system involved in global nitrogen regulation, and the ammonium transporter-encoding amtB gene. The protein coding transcripts of P. pseudoalcaligenes CECT5344 cells grown with sodium cyanide or an industrial jewelry wastewater that contains high concentration of cyanide and metals like iron, copper and zinc, were also compared against the transcripts of cells grown with ammonium as nitrogen source. This analysis revealed the induction by cyanide and the cyanide-rich wastewater of four nitrilase-encoding genes, including the nitC gene that is essential for cyanide assimilation, the cyanase cynS gene involved in cyanate assimilation, the cioAB genes required for the cyanide-insensitive respiration, and the ahpC gene coding for an alkyl-hydroperoxide reductase that could be related with iron homeostasis and oxidative stress. The nitC and cynS genes were also induced in cells grown under nitrogen starvation conditions. In cells grown with the jewelry wastewater, a malate quinone:oxidoreductase mqoB gene and several genes coding for metal extrusion systems were specifically induced. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.

  2. Exceptionally long 5' UTR short tandem repeats specifically linked to primates.

    PubMed

    Namdar-Aligoodarzi, P; Mohammadparast, S; Zaker-Kandjani, B; Talebi Kakroodi, S; Jafari Vesiehsari, M; Ohadi, M

    2015-09-10

    We have previously reported genome-scale short tandem repeats (STRs) in the core promoter interval (i.e. -120 to +1 to the transcription start site) of protein-coding genes that have evolved identically in primates vs. non-primates. Those STRs may function as evolutionary switch codes for primate speciation. In the current study, we used the Ensembl database to analyze the 5' untranslated region (5' UTR) between +1 and +60 of the transcription start site of the entire human protein-coding genes annotated in the GeneCards database, in order to identify "exceptionally long" STRs (≥5-repeats), which may be of selective/adaptive advantage. The importance of this critical interval is its function as core promoter, and its effect on transcription and translation. In order to minimize ascertainment bias, we analyzed the evolutionary status of the human 5' UTR STRs of ≥5-repeats in several species encompassing six major orders and superorders across mammals, including primates, rodents, Scandentia, Laurasiatheria, Afrotheria, and Xenarthra. We introduce primate-specific STRs, and STRs which have expanded from mouse to primates. Identical co-occurrence of the identified STRs of rare average frequency between 0.006 and 0.0001 in primates supports a role for those motifs in processes that diverged primates from other mammals, such as neuronal differentiation (e.g. APOD and FGF4), and craniofacial development (e.g. FILIP1L). A number of the identified STRs of ≥5-repeats may be human-specific (e.g. ZMYM3 and DAZAP1). Future work is warranted to examine the importance of the listed genes in primate/human evolution, development, and disease. Copyright © 2015 Elsevier B.V. All rights reserved.
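
    The screen described above looks for short tandem repeats of at least five units within the first 60 bases downstream of the transcription start site. A minimal regex-based sketch follows. It assumes the input string begins at position +1 and that repeat units are 1 to 6 bases long; the unit-length range and the function name are illustrative assumptions, since the abstract does not state how repeats were called.

```python
import re

# A unit of 1-6 bases (shortest unit preferred) followed by at least
# four further copies, i.e. five or more copies in total.
STR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{4,}")


def find_long_strs(sequence, interval=60):
    """Return (unit, repeat_count, start_offset) for each STR of >= 5 repeats.

    `sequence` is assumed to start at the transcription start site (+1);
    only the first `interval` bases are scanned. Offsets are 0-based.
    """
    window = sequence.upper()[:interval]
    hits = []
    for match in STR_PATTERN.finditer(window):
        unit = match.group(1)
        count = len(match.group(0)) // len(unit)
        hits.append((unit, count, match.start()))
    return hits
```

    Comparing the hits for orthologous 5' UTRs across species would be a separate step and is not covered by this sketch.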

  3. Molecular, Cellular, and Structural Mechanisms of Cocaine Addiction: A Key Role for MicroRNAs

    PubMed Central

    Jonkman, Sietse; Kenny, Paul J

    2013-01-01

    The rewarding properties of cocaine play a key role in establishing and maintaining the drug-taking habit. However, as exposure to cocaine increases, drug use can transition from controlled to compulsive. Importantly, very little is known about the neurobiological mechanisms that control this switch in drug use that defines addiction. MicroRNAs (miRNAs) are small non-protein coding RNA transcripts that can regulate the expression of messenger RNAs that code for proteins. Because of their highly pleiotropic nature, each miRNA has the potential to regulate hundreds or even thousands of protein-coding RNA transcripts. This property of miRNAs has generated considerable interest in their potential involvement in complex psychiatric disorders such as addiction, as each miRNA could potentially influence the many different molecular and cellular adaptations that arise in response to drug use that are hypothesized to drive the emergence of addiction. Here, we review recent evidence supporting a key role for miRNAs in the ventral striatum in regulating the rewarding and reinforcing properties of cocaine in animals with limited exposure to the drug. Moreover, we discuss evidence suggesting that miRNAs in the dorsal striatum control the escalation of drug intake in rats with extended cocaine access. These findings highlight the central role for miRNAs in drug-induced neuroplasticity in brain reward systems that drive the emergence of compulsive-like drug use in animals, and suggest that a better understanding of how miRNAs control drug intake will provide new insights into the neurobiology of drug addiction. PMID:22968819

  4. RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”

    PubMed Central

    Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu

    2012-01-01

    Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
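
    The claim that sRNAs are disproportionately located in the strain-specific region can be checked with a one-sided binomial calculation. The sketch below is a back-of-the-envelope illustration only: the observed count is reconstructed from the rounded percentages in the abstract (about 30% of 94 sRNAs), and the null model of uniform placement proportional to genome fraction (about 18%) is an assumption, not a test reported by the authors.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_srnas = 94                       # sRNAs reported in the abstract
genome_fraction = 0.18             # approximate share of genome unique to strain 2336
observed = round(0.30 * n_srnas)   # approximate count in the unique region (assumed)

expected = n_srnas * genome_fraction
p_value = binomial_tail(observed, n_srnas, genome_fraction)

print(f"expected {expected:.1f}, observed {observed}, one-sided p = {p_value:.4f}")
```

    Because both percentages are rounded in the source, the resulting probability should be read as an order-of-magnitude indication rather than a precise value.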

  5. RNA-seq based transcriptional map of bovine respiratory disease pathogen "Histophilus somni 2336".

    PubMed

    Kumar, Ranjit; Lawrence, Mark L; Watt, James; Cooksey, Amanda M; Burgess, Shane C; Nanduri, Bindu

    2012-01-01

    Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify "novel" genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations.

  6. Origin and evolution of the long non-coding genes in the X-inactivation center.

    PubMed

    Romito, Antonio; Rougeulle, Claire

    2011-11-01

    Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst it provided in eutherian lineage the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The rising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.

  7. Picornaviruses and nuclear functions: targeting a cellular compartment distinct from the replication site of a positive-strand RNA virus

    PubMed Central

    Flather, Dylan; Semler, Bert L.

    2015-01-01

    The compartmentalization of DNA replication and gene transcription in the nucleus and protein production in the cytoplasm is a defining feature of eukaryotic cells. The nucleus functions to maintain the integrity of the nuclear genome of the cell and to control gene expression based on intracellular and environmental signals received through the cytoplasm. The spatial separation of the major processes that lead to the expression of protein-coding genes establishes the necessity of a transport network to allow biomolecules to translocate between these two regions of the cell. The nucleocytoplasmic transport network is therefore essential for regulating normal cellular functioning. The Picornaviridae virus family is one of many viral families that disrupt the nucleocytoplasmic trafficking of cells to promote viral replication. Picornaviruses contain positive-sense, single-stranded RNA genomes and replicate in the cytoplasm of infected cells. As a result of the limited coding capacity of these viruses, cellular proteins are required by these intracellular parasites for both translation and genomic RNA replication. Being of messenger RNA polarity, a picornavirus genome can immediately be translated upon entering the cell cytoplasm. However, the replication of viral RNA requires the activity of RNA-binding proteins, many of which function in host gene expression, and are consequently localized to the nucleus. As a result, picornaviruses disrupt nucleocytoplasmic trafficking to exploit protein functions normally localized to a different cellular compartment from which they translate their genome to facilitate efficient replication. Furthermore, picornavirus proteins are also known to enter the nucleus of infected cells to limit host-cell transcription and down-regulate innate antiviral responses. The interactions of picornavirus proteins and host-cell nuclei are extensive, required for a productive infection, and are the focus of this review. PMID:26150805

  8. Search for protein partners of mitochondrial single-stranded DNA-binding protein Rim1p using a yeast two-hybrid system.

    PubMed

    Kucejová, B; Foury, F

    2003-01-01

    RIM1 is a nuclear gene of the yeast Saccharomyces cerevisiae coding for a protein with single-stranded DNA-binding activity that is essential for mitochondrial genome maintenance. No protein partners of Rim1p have been described so far in yeast. To better understand the role of this protein in mitochondrial DNA replication and recombination, a search for protein interactors by the yeast two-hybrid system was performed. This approach led to the identification of several candidates, including a putative transcription factor, Azf1p, and Mph1p, a protein with an RNA helicase domain which is known to influence the mutation rate of nuclear and mitochondrial genomes.

  9. Microprocessor-dependent processing of Splice site Overlapping microRNA exons does not result in changes in alternative splicing.

    PubMed

    Pianigiani, Giulia; Licastro, Danilo; Fortugno, Paola; Castiglia, Daniele; Petrovic, Ivana; Pagani, Franco

    2018-06-12

    MicroRNAs are found throughout the genome and are processed by the microprocessor complex (MPC) from longer precursors. Some precursor miRNAs overlap intron:exon junctions. These Splice site Overlapping microRNAs (SO-miRNAs) are mostly located in coding genes. It has been intimated, in the rarer examples of SO-miRNAs in non-coding RNAs, that the competition between the spliceosome and the MPC modulates alternative splicing. However, the effect of this overlap on coding transcripts is unknown. Unexpectedly, we show that neither Drosha silencing nor SF3b1 silencing changed the inclusion ratio of SO-miRNA exons. Two SO-miRNAs, located in genes that code for basal membrane proteins, are known to inhibit proliferation in primary keratinocytes. These SO-miRNAs were upregulated during differentiation and the host mRNAs were downregulated, but again there was no change in inclusion ratio of the SO-miRNA exons. Interestingly, Drosha silencing increased nascent RNA density, on chromatin, downstream of SO-miRNA exons. Overall our data suggest a novel mechanism for regulating gene expression in which MPC-dependent cleavage of SO-miRNA exons could cause premature transcriptional termination of coding genes rather than affecting alternative splicing. Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  10. Posttranslational Modifications of Baculovirus Protamine-Like Protein P6.9 and the Significance of Its Hyperphosphorylation for Viral Very Late Gene Hyperexpression

    PubMed Central

    Li, Ao; Zhao, Haizhou; Lai, Qingying; Huang, Zhihong; Yuan, Meijin

    2015-01-01

    ABSTRACT Many viruses utilize viral or cellular chromatin machinery for efficient infection. Baculoviruses encode a conserved protamine-like protein, P6.9. This protein plays essential roles in various viral physiological processes during infection. However, the mechanism by which P6.9 regulates transcription remains unknown. In this study, 7 phosphorylated species of P6.9 were resolved in Sf9 cells infected with the baculovirus type species Autographa californica multiple nucleopolyhedrovirus (AcMNPV). Mass spectrometry identified 22 phosphorylation and 10 methylation sites but no acetylation sites in P6.9. Immunofluorescence demonstrated that the P6.9 and virus-encoded serine/threonine kinase PK1 exhibited similar distribution patterns in infected cells, and coimmunoprecipitation confirmed the interaction between them. Upon pk1 deletion, nucleocapsid assembly and polyhedron formation were interrupted and the transcription of viral very late genes was downregulated. Interestingly, we found that the 3 most phosphorylated P6.9 species vanished from Sf9 cells transfected with the pk1 deletion mutant, suggesting that PK1 is involved in the hyperphosphorylation of P6.9. Mass spectrometry suggested that the phosphorylation of the 7 Ser/Thr and 5 Arg residues in P6.9 was PK1 dependent. Replacement of the 7 Ser/Thr residues with Ala resulted in a P6.9 phosphorylation pattern similar to that of the pk1 deletion mutant. Importantly, the decreases in the transcription level of viral very late genes and viral infectivity were consistent. Our findings reveal that P6.9 hyperphosphorylation is a precondition for the maximal hyperexpression of baculovirus very late genes and provide the first experimental insights into the function of the baculovirus protamine-like protein and the related protein kinase in epigenetics. 
    IMPORTANCE Diverse posttranslational modifications (PTMs) of histones constitute a code that creates binding platforms that recruit transcription factors to regulate gene expression. Many viruses also utilize host- or virus-induced chromatin machinery to promote efficient infections. Baculoviruses encode a protamine-like protein, P6.9, which is required for a variety of processes in the infection cycle. Currently, P6.9's PTM sites and its regulating factors remain unknown. Here, we found that P6.9 could be categorized as unphosphorylated, hypophosphorylated, and hyperphosphorylated species and that a virus-encoded serine/threonine kinase, PK1, was essential for P6.9 hyperphosphorylation. Abundant PTM sites on P6.9 were identified, among which 7 Ser/Thr phosphorylated sites were PK1 dependent. Mutation of these Ser/Thr sites reduced very late viral gene transcription and viral infectivity, indicating that the PK1-mediated P6.9 hyperphosphorylation contributes to viral proliferation. These data suggest that a code exists in the sophisticated PTM of viral protamine-like proteins and participates in viral gene transcription. PMID:25972542

  11. Differential 3’ processing of specific transcripts expands regulatory and protein diversity across neuronal cell types

    PubMed Central

    Jereb, Saša; Hwang, Hun-Way; Van Otterloo, Eric; Govek, Eve-Ellen; Fak, John J; Yuan, Yuan; Hatten, Mary E

    2018-01-01

    Alternative polyadenylation (APA) regulates mRNA translation, stability, and protein localization. However, it is unclear to what extent APA regulates these processes uniquely in specific cell types. Using a new technique, cTag-PAPERCLIP, we discovered significant differences in APA between the principal types of mouse cerebellar neurons, the Purkinje and granule cells, as well as between proliferating and differentiated granule cells. Transcripts that differed in APA in these comparisons were enriched in key neuronal functions and many differed in coding sequence in addition to 3’UTR length. We characterize Memo1, a transcript that shifted from expressing a short 3’UTR isoform to a longer one during granule cell differentiation. We show that Memo1 regulates granule cell precursor proliferation and that its long 3’UTR isoform is targeted by miR-124, contributing to its downregulation during development. Our findings provide insight into roles for APA in specific cell types and establish a platform for further functional studies. PMID:29578408

  12. Post-transcriptional trafficking and regulation of neuronal gene expression.

    PubMed

    Goldie, Belinda J; Cairns, Murray J

    2012-02-01

    Intracellular messenger RNA (mRNA) traffic and translation must be highly regulated, both temporally and spatially, within eukaryotic cells to support the complex functional partitioning. This capacity is essential in neurons because it provides a mechanism for rapid input-restricted activity-dependent protein synthesis in individual dendritic spines. While this feature is thought to be important for synaptic plasticity, the structures and mechanisms that support this capability are largely unknown. Certainly specialized RNA binding proteins and binding elements in the 3' untranslated region (UTR) of translationally regulated mRNA are important, but the subtlety and complexity of this system suggests that an intermediate "specificity" component is also involved. Small non-coding microRNA (miRNA) are essential for CNS development and may fulfill this role by acting as the guide strand for mediating complex patterns of post-transcriptional regulation. In this review we examine post-synaptic gene regulation, mRNA trafficking and the emerging role of post-transcriptional gene silencing in synaptic plasticity.

  13. Staufen1 senses overall transcript secondary structure to regulate translation

    PubMed Central

    Ricci, Emiliano P; Kucukural, Alper; Cenik, Can; Mercier, Blandine C; Singh, Guramrit; Heyer, Erin E; Ashar-Patel, Ami; Peng, Lingtao; Moore, Melissa J

    2015-01-01

    Human Staufen1 (Stau1) is a double-stranded RNA (dsRNA)-binding protein implicated in multiple post-transcriptional gene-regulatory processes. Here we combined RNA immunoprecipitation in tandem (RIPiT) with RNase footprinting, formaldehyde cross-linking, sonication-mediated RNA fragmentation and deep sequencing to map Staufen1-binding sites transcriptome wide. We find that Stau1 binds complex secondary structures containing multiple short helices, many of which are formed by inverted Alu elements in annotated 3′ untranslated regions (UTRs) or in ‘strongly distal’ 3′ UTRs. Stau1 also interacts with actively translating ribosomes and with mRNA coding sequences (CDSs) and 3′ UTRs in proportion to their GC content and propensity to form internal secondary structure. On mRNAs with high CDS GC content, higher Stau1 levels lead to greater ribosome densities, thus suggesting a general role for Stau1 in modulating translation elongation through structured CDS regions. Our results also indicate that Stau1 regulates translation of transcription-regulatory proteins. PMID:24336223

  14. Xuhuai goat H-FABP gene clone, subcellular localization of expression products and the preparation of transgenic mice.

    PubMed

    Yin, Yan-hui; Li, Bi-chun; Wei, Guang-hui; Zhu, Cai-ye; Li, Wei; Zhang, Ya-ni; Du, Li-xin; Cao, Wen-guang

    2012-05-01

    The aim of this study was to clone the heart-type fatty acid binding protein (H-FABP) gene of Xuhuai goat, to explore it bioinformatically, and to analyze its subcellular localization using enhanced green fluorescent protein (EGFP). The results showed that the coding sequence (CDS) length of Xuhuai goat H-FABP gene was 402 bp, encoding 133 amino acids (GenBank accession number AY466498.1). The H-FABP cDNA coding sequence was compared with the corresponding region of human, chicken, brown rat, cow, wild boar, donkey, and zebrafish. The similarities were 89%, 76%, 85%, 84%, 93%, 91%, and 70%, respectively. For the corresponding amino acid sequences, the similarities were 90%, 79%, 88%, 97%, 95%, 94%, and 72%, respectively. This study did not find a signal peptide region in the H-FABP protein, suggesting that H-FABP might be a nonsecreted protein. H-FABP expression was detected in vitro by reverse transcription-polymerase chain reaction (RT-PCR), and the EGFP-H-FABP fusion protein was localized to the cytoplasm. The gene could also be transiently and permanently expressed in mice.
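
    The relationship between the reported CDS length (402 bp) and protein length (133 amino acids) can be checked with simple codon arithmetic. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the reported CDS length includes the terminal stop codon, which the abstract does not state explicitly.

```python
def encoded_residues(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of the given length.

    Assumption (not from the abstract): the CDS length counts the stop codon
    unless includes_stop is set to False.
    """
    if cds_length_bp <= 0 or cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_length_bp // 3  # each codon is three nucleotides
    # The stop codon occupies one codon but adds no residue to the protein.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(encoded_residues(402))
```

    Under that assumption, 402 bp gives 134 codons, one of which is the stop codon, which is consistent with the 133 residues reported.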

  15. Histone Arginine Methylation

    PubMed Central

    Lorenzo, Alessandra Di; Bedford, Mark T.

    2012-01-01

    Arginine methylation is a common posttranslational modification (PTM). This type of PTM occurs on both nuclear and cytoplasmic proteins, and is particularly abundant on shuttling proteins. In this review, we will focus on one aspect of this PTM: the diverse roles that arginine methylation of the core histone tails play in regulating chromatin function. A family of nine protein arginine methyltransferases (PRMTs) catalyze methylation reactions, and a subset target histones. Importantly, arginine methylation of histone tails can promote or prevent the docking of key transcriptional effector molecules, thus playing a central role in the orchestration of the histone code. PMID:21074527

  16. Structure and transcription of the Helicoverpa armigera densovirus (HaDV2) genome and its expression strategy in LD652 cells.

    PubMed

    Xu, Pengjun; Graham, Robert I; Wilson, Kenneth; Wu, Kongming

    2017-02-07

    Densoviruses (DVs) are highly pathogenic to their hosts. However, we previously reported a mutualistic DV (HaDV2). Very little was known about the characteristics of this virus, so herein we undertook a series of experiments to explore the molecular biology of HaDV2 further. Phylogenetic analysis showed that HaDV2 was similar to members of the genus Iteradensovirus. However, compared to current members of the genus Iteradensovirus, the sequence identity of HaDV2 is less than 44% at the nucleotide-level, and lower than 36, 28 and 19% at the amino-acid-level of VP, NS1 and NS2 proteins, respectively. Moreover, NS1 and NS2 proteins from HaDV2 were smaller than those from other iteradensoviruses due to their shorter N-terminal sequences. Two transcripts of about 2.2 kb coding for the NS proteins and the VP proteins were identified by Northern Blot and RACE analysis. Using specific anti-NS1 and anti-NS2 antibodies, Western Blot analysis revealed a 78 kDa and a 48 kDa protein, respectively. Finally, the localization of both NS1 and NS2 proteins within the cell nucleus was determined by using Green Fluorescent Protein (GFP) labelling. The genome organization, terminal hairpin structure, transcription and expression strategies as well as the mutualistic relationship with its host, suggested that HaDV2 was a novel member of the genus Iteradensovirus within the subfamily Densovirinae.

  17. Spliced X-box Binding Protein 1 Couples the Unfolded Protein Response to Hexosamine Biosynthetic Pathway

    PubMed Central

    Wang, Zhao V.; Deng, Yingfeng; Gao, Ningguo; Pedrozo, Zully; Li, Dan L.; Morales, Cyndi R.; Criollo, Alfredo; Luo, Xiang; Tan, Wei; Jiang, Nan; Lehrman, Mark A.; Rothermel, Beverly A.; Lee, Ann-Hwee; Lavandero, Sergio; Mammen, Pradeep P.A.; Ferdous, Anwarul; Gillette, Thomas G.; Scherer, Philipp E.; Hill, Joseph A.

    2014-01-01

    SUMMARY The hexosamine biosynthetic pathway (HBP) generates UDP-GlcNAc (uridine diphosphate N-acetylglucosamine) for glycan synthesis and O-linked GlcNAc (O-GlcNAc) protein modifications. Despite the established role of the HBP in metabolism and multiple diseases, regulation of the HBP remains largely undefined. Here, we show that spliced X-box binding protein 1 (Xbp1s), the most conserved signal transducer of the unfolded protein response (UPR), is a direct transcriptional activator of the HBP. We demonstrate that the UPR triggers HBP activation via Xbp1s-dependent transcription of genes coding for key, rate-limiting enzymes. We further establish that this previously unrecognized UPR-HBP axis is triggered in a variety of stress conditions. Finally, we demonstrate a physiologic role for the UPR-HBP axis, by showing that acute stimulation of Xbp1s in heart by ischemia/reperfusion confers robust cardioprotection in part through induction of the HBP. Collectively, these studies reveal that Xbp1s couples the UPR to the HBP to protect cells under stress. PMID:24630721

  18. Spliced X-box binding protein 1 couples the unfolded protein response to hexosamine biosynthetic pathway.

    PubMed

    Wang, Zhao V; Deng, Yingfeng; Gao, Ningguo; Pedrozo, Zully; Li, Dan L; Morales, Cyndi R; Criollo, Alfredo; Luo, Xiang; Tan, Wei; Jiang, Nan; Lehrman, Mark A; Rothermel, Beverly A; Lee, Ann-Hwee; Lavandero, Sergio; Mammen, Pradeep P A; Ferdous, Anwarul; Gillette, Thomas G; Scherer, Philipp E; Hill, Joseph A

    2014-03-13

    The hexosamine biosynthetic pathway (HBP) generates uridine diphosphate N-acetylglucosamine (UDP-GlcNAc) for glycan synthesis and O-linked GlcNAc (O-GlcNAc) protein modifications. Despite the established role of the HBP in metabolism and multiple diseases, regulation of the HBP remains largely undefined. Here, we show that spliced X-box binding protein 1 (Xbp1s), the most conserved signal transducer of the unfolded protein response (UPR), is a direct transcriptional activator of the HBP. We demonstrate that the UPR triggers HBP activation via Xbp1s-dependent transcription of genes coding for key, rate-limiting enzymes. We further establish that this previously unrecognized UPR-HBP axis is triggered in a variety of stress conditions. Finally, we demonstrate a physiologic role for the UPR-HBP axis by showing that acute stimulation of Xbp1s in heart by ischemia/reperfusion confers robust cardioprotection in part through induction of the HBP. Collectively, these studies reveal that Xbp1s couples the UPR to the HBP to protect cells under stress. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. Ribosome profiling reveals changes in translational status of soybean transcripts during immature cotyledon development

    PubMed Central

    Shamimuzzaman, Md.

    2018-01-01

    To understand translational capacity on a genome-wide scale across three developmental stages of immature soybean seed cotyledons, ribosome profiling was performed in combination with RNA sequencing and cluster analysis. Transcripts representing 216 unique genes demonstrated a higher level of translational activity in at least one stage by exhibiting higher translational efficiencies (TEs) in which there were relatively more ribosome footprint sequence reads mapping to the transcript than were present in the control total RNA sample. The majority of these transcripts were more translationally active at the early stage of seed development and included 12 unique serine or cysteine proteases and 16 2S albumin and low molecular weight cysteine-rich proteins that may serve as substrates for turnover and mobilization early in seed development. It would appear that the serine proteases and 2S albumins play a vital role in the early stages. In contrast, our investigation of profiles of 19 genes encoding high abundance seed storage proteins, such as glycinins, beta-conglycinins, lectin, and Kunitz trypsin inhibitors, showed that they all had similar patterns in which the TE values started at low levels and increased approximately 2 to 6-fold during development. The highest levels of these seed protein transcripts were found at the mid-developmental stage, whereas the highest ribosome footprint levels of only up to 1.6 TE were found at the late developmental stage. These experimental findings suggest that the major seed storage protein coding genes are primarily regulated at the transcriptional level during normal soybean cotyledon development. Finally, our analyses also identified a total of 370 unique gene models that showed very low TE values including over 48 genes encoding ribosomal family proteins and 95 gene models that are related to energy and photosynthetic functions, many of which have homology to the chloroplast genome. 
    Additionally, we showed that genes of the chloroplast were relatively translationally inactive during seed development. PMID:29570733
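
    The abstract describes translational efficiency (TE) as the relative abundance of ribosome footprint reads on a transcript compared with its reads in the total RNA sample. A minimal Python sketch of one common way to express that ratio is given below. The normalization to reads per million, the function names, and the read counts are all assumptions made for illustration; the abstract does not specify the exact formula or pipeline the study used.

```python
def reads_per_million(count: int, library_size: int) -> float:
    """Scale a raw read count by sequencing depth (reads per million)."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return count * 1_000_000 / library_size


def translational_efficiency(ribo_count: int, ribo_library: int,
                             rna_count: int, rna_library: int) -> float:
    """Ratio of normalized footprint reads to normalized total-RNA reads.

    Values above 1 indicate relatively more ribosome footprints than
    expected from the transcript's abundance in the RNA-seq sample.
    """
    rna_rpm = reads_per_million(rna_count, rna_library)
    if rna_rpm == 0:
        raise ValueError("transcript has no RNA-seq reads; TE is undefined")
    return reads_per_million(ribo_count, ribo_library) / rna_rpm


if __name__ == "__main__":
    # Hypothetical counts for a single transcript, not data from the study.
    te = translational_efficiency(ribo_count=300, ribo_library=10_000_000,
                                  rna_count=500, rna_library=20_000_000)
    print(f"TE = {te:.2f}")
```

    Dividing by depth-normalized RNA abundance separates changes in translation from changes in transcript level, which is the distinction the abstract draws for the seed storage protein genes.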

  20. Development of a bioluminescence resonance energy transfer (BRET) for monitoring estrogen receptor alpha activation

    NASA Astrophysics Data System (ADS)

    Michelini, Elisa; Mirasoli, Mara; Karp, Matti; Virta, Marko; Roda, Aldo

    2004-06-01

    Estrogen receptor (ER) is a ligand-activated transcriptional factor, able to dimerize after activation and to bind specific DNA sequences (estrogen response elements), thus activating target gene transcription. Since ER homo- and hetero-dimerization (giving α-α and α-β isoforms) is a fundamental step for receptor activation, we developed an assay for detecting compounds that induce human ERα homo-dimerization based on bioluminescence resonance energy transfer (BRET). BRET is a non-radiative energy transfer, occurring between a luminescent donor and a fluorescent acceptor, that strictly depends on the closeness between the two proteins and can therefore be used for studying protein-protein interactions. We cloned the ERα coding sequence in frame with either a variant of the green fluorescent protein (enhanced yellow fluorescent protein, EYFP) or Renilla luciferase (RLuc). Upon ERα homo-dimerization, the BRET process takes place in the presence of the RLuc substrate coelenterazine, resulting in EYFP emission at its characteristic wavelength. The ERα-RLuc and ERα-EYFP fusion proteins were cloned, then the occurrence of BRET in the presence of ERα activators was assayed both in vivo, within cells, and in vitro, with purified fusion proteins.

  1. RNA-Binding Proteins in Trichomonas vaginalis: Atypical Multifunctional Proteins.

    PubMed

    Figueroa-Angulo, Elisa E; Calla-Choque, Jaeson S; Mancilla-Olea, Maria Inocente; Arroyo, Rossana

    2015-11-26

    Iron homeostasis is highly regulated in vertebrates through a regulatory system mediated by RNA-protein interactions between the iron regulatory proteins (IRPs) that interact with an iron responsive element (IRE) located in certain mRNAs, dubbed the IRE-IRP regulatory system. Trichomonas vaginalis, the causal agent of trichomoniasis, presents high iron dependency to regulate its growth, metabolism, and virulence properties. Although T. vaginalis lacks IRPs or proteins with aconitase activity, it possesses gene expression mechanisms of iron regulation at the transcriptional and posttranscriptional levels. However, only one gene with iron regulation at the transcriptional level has been described. Recently, our research group described an iron posttranscriptional regulatory mechanism in the T. vaginalis tvcp4 and tvcp12 cysteine proteinase mRNAs. The tvcp4 and tvcp12 mRNAs have a stem-loop structure in the 5'-coding region or in the 3'-UTR, respectively, that interacts with T. vaginalis multifunctional proteins HSP70, α-Actinin, and Actin under iron starvation conditions, causing translation inhibition or mRNA stabilization similar to the previously characterized IRE-IRP system in eukaryotes. Herein, we summarize recent progress and shed some light on atypical RNA-binding proteins that may participate in the iron posttranscriptional regulation in T. vaginalis.

  2. Genome-wide transcription start site profiling in biofilm-grown Burkholderia cenocepacia J2315.

    PubMed

    Sass, Andrea M; Van Acker, Heleen; Förstner, Konrad U; Van Nieuwerburgh, Filip; Deforce, Dieter; Vogel, Jörg; Coenye, Tom

    2015-10-13

    Burkholderia cenocepacia is a soil-dwelling Gram-negative Betaproteobacterium with an important role as an opportunistic pathogen in humans. Infections with B. cenocepacia are very difficult to treat due to their high intrinsic resistance to most antibiotics. Biofilm formation further adds to their antibiotic resistance. B. cenocepacia harbours a large, multi-replicon genome with a high GC-content; the reference genome of strain J2315 includes 7374 annotated genes. This study aims to annotate transcription start sites and identify novel transcripts on a whole genome scale. RNA extracted from B. cenocepacia J2315 biofilms was analysed by differential RNA-sequencing and the resulting dataset compared to data derived from conventional, global RNA-sequencing. Transcription start sites were annotated and further analysed according to their position relative to annotated genes. Four thousand ten transcription start sites were mapped over the whole B. cenocepacia genome and the primary transcription start sites of 2089 genes expressed in B. cenocepacia biofilms were defined. For 64 genes a start codon alternative to the annotated one was proposed. Substantial antisense transcription for 105 genes and two novel protein coding sequences were identified. The distribution of internal transcription start sites can be used to identify genomic islands in B. cenocepacia. A potassium pump strongly induced only under biofilm conditions was found and 15 non-coding small RNAs highly expressed in biofilms were discovered. Mapping transcription start sites across the B. cenocepacia genome added relevant information to the J2315 annotation. Genes and novel regulatory RNAs putatively involved in B. cenocepacia biofilm formation were identified. These findings will help in understanding regulation of B. cenocepacia biofilm formation.

  3. Four RNA families with functional transient structures

    PubMed Central

    Zhu, Jing Yun A; Meyer, Irmtraud M

    2015-01-01

    Protein-coding and non-coding RNA transcripts perform a wide variety of cellular functions in diverse organisms. Several of their functional roles are expressed and modulated via RNA structure. A given transcript, however, can have more than a single functional RNA structure throughout its life, a fact which has been previously overlooked. Transient RNA structures, for example, are only present during specific time intervals and cellular conditions. We here introduce four RNA families with transient RNA structures that play distinct and diverse functional roles. Moreover, we show that these transient RNA structures are structurally well-defined and evolutionarily conserved. Since Rfam annotates one structure for each family, there is either no annotation for these transient structures or no such family. Thus, our alignments either significantly update and extend the existing Rfam families or introduce a new RNA family to Rfam. For each of the four RNA families, we compile a multiple-sequence alignment based on experimentally verified transient and dominant (dominant in terms of either the thermodynamic stability and/or attention received so far) RNA secondary structures using a combination of automated search via covariance model and manual curation. The first alignment is the Trp operon leader which regulates the operon transcription in response to tryptophan abundance through alternative structures. The second alignment is the HDV ribozyme which we extend to the 5′ flanking sequence. This flanking sequence is involved in the regulation of the transcript's self-cleavage activity. The third alignment is the 5′ UTR of the maturation protein from Levivirus which contains a transient structure that temporarily postpones the formation of the final inhibitory structure to allow translation of maturation protein. The fourth and last alignment is the SAM riboswitch which regulates the downstream gene expression by assuming alternative structures upon binding of SAM. All transient and dominant structures are mapped to our new alignments introduced here. PMID:25751035

  4. Four RNA families with functional transient structures.

    PubMed

    Zhu, Jing Yun A; Meyer, Irmtraud M

    2015-01-01

    Protein-coding and non-coding RNA transcripts perform a wide variety of cellular functions in diverse organisms. Several of their functional roles are expressed and modulated via RNA structure. A given transcript, however, can have more than a single functional RNA structure throughout its life, a fact which has been previously overlooked. Transient RNA structures, for example, are only present during specific time intervals and cellular conditions. We here introduce four RNA families with transient RNA structures that play distinct and diverse functional roles. Moreover, we show that these transient RNA structures are structurally well-defined and evolutionarily conserved. Since Rfam annotates one structure for each family, there is either no annotation for these transient structures or no such family. Thus, our alignments either significantly update and extend the existing Rfam families or introduce a new RNA family to Rfam. For each of the four RNA families, we compile a multiple-sequence alignment based on experimentally verified transient and dominant (dominant in terms of either the thermodynamic stability and/or attention received so far) RNA secondary structures using a combination of automated search via covariance model and manual curation. The first alignment is the Trp operon leader which regulates the operon transcription in response to tryptophan abundance through alternative structures. The second alignment is the HDV ribozyme which we extend to the 5' flanking sequence. This flanking sequence is involved in the regulation of the transcript's self-cleavage activity. The third alignment is the 5' UTR of the maturation protein from Levivirus which contains a transient structure that temporarily postpones the formation of the final inhibitory structure to allow translation of maturation protein. The fourth and last alignment is the SAM riboswitch which regulates the downstream gene expression by assuming alternative structures upon binding of SAM. All transient and dominant structures are mapped to our new alignments introduced here.

  5. The point of no return: The poly(A)-associated elongation checkpoint.

    PubMed

    Tellier, Michael; Ferrer-Vicens, Ivan; Murphy, Shona

    2016-01-01

    Cyclin-dependent kinases play critical roles in transcription by RNA polymerase II (pol II) and processing of the transcripts. For example, CDK9 regulates transcription of protein-coding genes, splicing, and 3' end formation of the transcripts. Accordingly, CDK9 inhibitors have a drastic effect on the production of mRNA in human cells. Recent analyses indicate that CDK9 regulates transcription at the early-elongation checkpoint of the vast majority of pol II-transcribed genes. Our recent discovery of an additional CDK9-regulated elongation checkpoint close to poly(A) sites adds a new layer to the control of transcription by this critical cellular kinase. This novel poly(A)-associated checkpoint has the potential to powerfully regulate gene expression just before a functional polyadenylated mRNA is produced: the point of no return. However, many questions remain to be answered before the role of this checkpoint becomes clear. Here we speculate on the possible biological significance of this novel mechanism of gene regulation and the players that may be involved.

  6. MHC class I-associated peptides derive from selective regions of the human genome.

    PubMed

    Pearson, Hillary; Daouda, Tariq; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Mader, Sylvie; Lemieux, Sébastien; Thibault, Pierre; Perreault, Claude

    2016-12-01

    MHC class I-associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology.

  7. MHC class I–associated peptides derive from selective regions of the human genome

    PubMed Central

    Pearson, Hillary; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Thibault, Pierre

    2016-01-01

    MHC class I–associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology. PMID:27841757

  8. Pea Chaperones under Centrifugation

    NASA Astrophysics Data System (ADS)

    Talalaiev, Oleksandr

    2008-06-01

    Etiolated Pisum sativum seedlings were subjected to altered g-forces by centrifugation (3-14g). Using semiquantitative RT-PCR, we studied transcripts of pea genes coding for chaperones that represent the small heat shock protein (sHsp) family. Four members from different classes of sHsps were investigated: cytosolic Hsp17.7 and Hsp18.1 (class I and class II, respectively), chloroplast Hsp21 (class III) and endoplasmic reticulum Hsp22.7 (class IV). We conclude that exposure to 3, 7, 10 and 14g for 1 h did not affect the level of sHsp transcripts.

  9. Expression-Linked Patterns of Codon Usage, Amino Acid Frequency, and Protein Length in the Basally Branching Arthropod Parasteatoda tepidariorum

    PubMed Central

    Whittle, Carrie A.; Extavour, Cassandra G.

    2016-01-01

    Abstract Spiders belong to the Chelicerata, the most basally branching arthropod subphylum. The common house spider, Parasteatoda tepidariorum, is an emerging model and provides a valuable system to address key questions in molecular evolution in an arthropod system that is distinct from traditionally studied insects. Here, we provide evidence suggesting that codon usage, amino acid frequency, and protein lengths are each influenced by expression-mediated selection in P. tepidariorum. First, highly expressed genes exhibited preferential usage of T3 codons in this spider, suggestive of selection. Second, genes with elevated transcription favored amino acids with low or intermediate size/complexity (S/C) scores (glycine and alanine) and disfavored those with large S/C scores (such as cysteine), consistent with the minimization of biosynthesis costs of abundant proteins. Third, we observed a negative correlation between expression level and coding sequence length. Together, we conclude that protein-coding genes exhibit signals of expression-related selection in this emerging, noninsect, arthropod model. PMID:27017527

  10. Transcripts of sulphur metabolic genes are co-ordinately regulated in developing seeds of common bean lacking phaseolin and major lectins

    PubMed Central

    Marsolais, Frédéric

    2012-01-01

    The lack of phaseolin and phytohaemagglutinin in common bean (dry bean, Phaseolus vulgaris) is associated with an increase in total cysteine and methionine concentrations by 70% and 10%, respectively, mainly at the expense of an abundant non-protein amino acid, S-methyl-cysteine. Transcripts were profiled between two genetically related lines differing for this trait at four stages of seed development using a high density microarray designed for common bean. Transcripts of multiple sulphur-rich proteins were elevated, several previously identified by proteomics, including legumin, basic 7S globulin, albumin-2, defensin, albumin-1, the Bowman–Birk type proteinase inhibitor, the double-headed trypsin inhibitor, and the Kunitz trypsin inhibitor. A co-ordinated regulation of transcripts coding for sulphate transporters, sulphate assimilatory enzymes, serine acetyltransferases, cystathionine β-lyase, homocysteine S-methyltransferase and methionine gamma-lyase was associated with changes in cysteine and methionine concentrations. Differential gene expression of sulphur-rich proteins preceded that of sulphur metabolic enzymes, suggesting a regulation by demand from the protein sink. Up-regulation of SERAT1;1 and -1;2 expression revealed an activation of cytosolic O-acetylserine biosynthesis. Down-regulation of SERAT2;1 suggested that cysteine and S-methyl-cysteine biosynthesis may be spatially separated in different subcellular compartments. Analysis of free amino acid profiles indicated that enhanced cysteine biosynthesis was correlated with a depletion of O-acetylserine. These results contribute to our understanding of the regulation of sulphur metabolism in developing seed in response to a change in the composition of endogenous proteins. PMID:23066144

  11. Transcripts of sulphur metabolic genes are co-ordinately regulated in developing seeds of common bean lacking phaseolin and major lectins.

    PubMed

    Liao, Dengqun; Pajak, Agnieszka; Karcz, Steven R; Chapman, B Patrick; Sharpe, Andrew G; Austin, Ryan S; Datla, Raju; Dhaubhadel, Sangeeta; Marsolais, Frédéric

    2012-10-01

    The lack of phaseolin and phytohaemagglutinin in common bean (dry bean, Phaseolus vulgaris) is associated with an increase in total cysteine and methionine concentrations by 70% and 10%, respectively, mainly at the expense of an abundant non-protein amino acid, S-methyl-cysteine. Transcripts were profiled between two genetically related lines differing for this trait at four stages of seed development using a high density microarray designed for common bean. Transcripts of multiple sulphur-rich proteins were elevated, several previously identified by proteomics, including legumin, basic 7S globulin, albumin-2, defensin, albumin-1, the Bowman-Birk type proteinase inhibitor, the double-headed trypsin inhibitor, and the Kunitz trypsin inhibitor. A co-ordinated regulation of transcripts coding for sulphate transporters, sulphate assimilatory enzymes, serine acetyltransferases, cystathionine β-lyase, homocysteine S-methyltransferase and methionine gamma-lyase was associated with changes in cysteine and methionine concentrations. Differential gene expression of sulphur-rich proteins preceded that of sulphur metabolic enzymes, suggesting a regulation by demand from the protein sink. Up-regulation of SERAT1;1 and -1;2 expression revealed an activation of cytosolic O-acetylserine biosynthesis. Down-regulation of SERAT2;1 suggested that cysteine and S-methyl-cysteine biosynthesis may be spatially separated in different subcellular compartments. Analysis of free amino acid profiles indicated that enhanced cysteine biosynthesis was correlated with a depletion of O-acetylserine. These results contribute to our understanding of the regulation of sulphur metabolism in developing seed in response to a change in the composition of endogenous proteins.

  12. Current Insights into Long Non-Coding RNAs in Renal Cell Carcinoma

    PubMed Central

    Seles, Maximilian; Hutterer, Georg C.; Kiesslich, Tobias; Pummer, Karl; Berindan-Neagoe, Ioana; Perakis, Samantha; Schwarzenbacher, Daniela; Stotz, Michael; Gerger, Armin; Pichler, Martin

    2016-01-01

    Renal cell carcinoma (RCC) represents a deadly disease with rising mortality despite intensive therapeutic efforts. It comprises several subtypes in terms of distinct histopathological features and different clinical presentations. Long non-coding RNAs (lncRNAs) are non-protein-coding transcripts in the genome which vary in expression levels and length and perform diverse functions. They are involved in the initiation, evolution and progression of primary cancer, as well as in the development and spread of metastases. Recently, several lncRNAs were described in RCC. This review emphasises the rising importance of lncRNAs in RCC. Moreover, it provides an outlook on their therapeutic potential in the future. PMID:27092491

  13. Nucleic Acid Chaperone Activity of the ORF1 Protein from the Mouse LINE-1 Retrotransposon

    PubMed Central

    Martin, Sandra L.; Bushman, Frederic D.

    2001-01-01

    Non-LTR retrotransposons such as L1 elements are major components of the mammalian genome, but their mechanism of replication is incompletely understood. Like retroviruses and LTR-containing retrotransposons, non-LTR retrotransposons replicate by reverse transcription of an RNA intermediate. The details of cDNA priming and integration, however, differ between these two classes. In retroviruses, the nucleocapsid (NC) protein has been shown to assist reverse transcription by acting as a “nucleic acid chaperone,” promoting the formation of the most stable duplexes between nucleic acid molecules. A protein-coding region with an NC-like sequence is present in most non-LTR retrotransposons, but no such sequence is evident in mammalian L1 elements or other members of its class. Here we investigated the ORF1 protein from mouse L1 and found that it does in fact display nucleic acid chaperone activities in vitro. L1 ORF1p (i) promoted annealing of complementary DNA strands, (ii) facilitated strand exchange to form the most stable hybrids in competitive displacement assays, and (iii) facilitated melting of an imperfect duplex but stabilized perfect duplexes. These findings suggest a role for L1 ORF1p in mediating nucleic acid strand transfer steps during L1 reverse transcription. PMID:11134335

  14. The Non-Coding RNA Ncr0700/PmgR1 is Required for Photomixotrophic Growth and the Regulation of Glycogen Accumulation in the Cyanobacterium Synechocystis sp. PCC 6803.

    PubMed

    de Porcellinis, Alice J; Klähn, Stephan; Rosgaard, Lisa; Kirsch, Rebekka; Gutekunst, Kirstin; Georg, Jens; Hess, Wolfgang R; Sakuragi, Yumiko

    2016-10-01

    Carbohydrate metabolism is a tightly regulated process in photosynthetic organisms. In the cyanobacterium Synechocystis sp. PCC 6803, the photomixotrophic growth protein A (PmgA) is involved in the regulation of glucose and storage carbohydrate (i.e. glycogen) metabolism, while its biochemical activity and possible factors acting downstream of PmgA are unknown. Here, a genome-wide microarray analysis of a ΔpmgA strain identified the expression of 36 protein-coding genes and 42 non-coding transcripts as significantly altered. From these, the non-coding RNA Ncr0700 was identified as the transcript most strongly reduced in abundance. Ncr0700 is widely conserved among cyanobacteria. In Synechocystis its expression is inversely correlated with light intensity. Similarly to a ΔpmgA mutant, a Δncr0700 deletion strain showed an approximately 2-fold increase in glycogen content under photoautotrophic conditions and wild-type-like growth. Moreover, its growth was arrested by 38 h after a shift to photomixotrophic conditions. Ectopic expression of Ncr0700 in Δncr0700 and ΔpmgA restored the glycogen content and photomixotrophic growth to wild-type levels. These results indicate that Ncr0700 is required for photomixotrophic growth and the regulation of glycogen accumulation, and acts downstream of PmgA. Hence Ncr0700 is renamed here as PmgR1 for photomixotrophic growth RNA 1.

  15. A compendium of transcription factor and Transcriptionally active protein coding gene families in cowpea (Vigna unguiculata L.).

    PubMed

    Misra, Vikram A; Wang, Yu; Timko, Michael P

    2017-11-22

    Cowpea (Vigna unguiculata (L.) Walp.) is the most important food and forage legume in the semi-arid tropics of sub-Saharan Africa where approximately 80% of worldwide production takes place primarily on low-input, subsistence farm sites. Among the major goals of cowpea breeding and improvement programs are the rapid manipulation of agronomic traits for seed size and quality and improved resistance to abiotic and biotic stresses to enhance productivity. Knowing the suite of transcription factors (TFs) and transcriptionally active proteins (TAPs) that control various critical plant cellular processes would contribute tremendously to these improvement aims. We used a computational approach that employed three different predictive pipelines to data mine the cowpea genome and identified over 4400 genes representing 136 different TF and TAP families. We compare the information content of cowpea to two evolutionarily close species, common bean (Phaseolus vulgaris) and soybean (Glycine max), to gauge the relative informational content. Our data indicate that, correcting for genome size, cowpea has fewer TF and TAP genes than common bean (4408/5291) and soybean (4408/11,065). Members of the GROWTH-REGULATING FACTOR (GRF) and Auxin/indole-3-acetic acid (Aux/IAA) gene families appear to be over-represented in the genome relative to common bean and soybean, whereas members of the MADS (Minichromosome maintenance deficient 1 (MCM1), AGAMOUS, DEFICIENS, and serum response factor (SRF)) and C2C2-YABBY appear to be under-represented. Analysis of the APETALA2-Ethylene Responsive Element Binding Protein (AP2-EREBP), NAC (NAM (no apical meristem), ATAF1, 2 (Arabidopsis transcription activation factor), CUC (cup-shaped cotyledon)), and WRKY families, known to be important in defense signaling, revealed changes and phylogenetic rearrangements relative to common bean and soybean that suggest these groups may have evolved different functions. The availability of detailed information on the coding capacity of the cowpea genome and in particular the various TF and TAP gene families will facilitate future comparative analysis and development of strategies for controlling growth, differentiation, and abiotic and biotic stress resistances of cowpea.

  16. Nucleotide sequences of Dictyostelium discoideum developmentally regulated cDNAs rich in (AAC) imply proteins that contain clusters of asparagine, glutamine, or threonine.

    PubMed

    Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L

    1989-09-01

    A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different sized Dictyostelium DNA restriction fragments, indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts is reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes, suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
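
    The reading-frame observation in this abstract can be checked with a few lines of Python. The sketch below is illustrative only and is not taken from the paper: the repeat length and the function name are assumptions, and the codon table is cut down to the three codons that an (AAC) repeat can produce under the standard genetic code.

```python
# Standard genetic code, restricted to the codons an (AAC) repeat can yield.
CODON_TO_AA = {"AAC": "Asn", "ACA": "Thr", "CAA": "Gln"}


def translate_frame(seq: str, frame: int) -> list[str]:
    """Translate seq from offset `frame` (0, 1 or 2); a trailing partial codon is dropped."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [CODON_TO_AA[codon] for codon in codons]


# Hypothetical 15-nt stretch; real repeats in the cDNAs are much longer.
repeat = "AAC" * 5
frames = {frame: translate_frame(repeat, frame) for frame in range(3)}

for frame, residues in frames.items():
    print(frame, "-".join(residues))
```

    Each frame gives a run of a single amino acid (asparagine, threonine or glutamine), which matches the abstract's statement that a clone read in one frame carries one kind of cluster rather than a mixture.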

  17. Viral Ubiquitin Ligase Stimulates Selective Host MicroRNA Expression by Targeting ZEB Transcriptional Repressors

    PubMed Central

    Kim, Ju Youn; Leader, Andrew; Stoller, Michelle L.; Coen, Donald M.; Wilson, Angus C.

    2017-01-01

    Infection with herpes simplex virus-1 (HSV-1) brings numerous changes in cellular gene expression. Levels of most host mRNAs are reduced, limiting synthesis of host proteins, especially those involved in antiviral defenses. The impact of HSV-1 on host microRNAs (miRNAs), an extensive network of short non-coding RNAs that regulate mRNA stability/translation, remains largely unexplored. Here we show that transcription of the miR-183 cluster (miR-183, miR-96, and miR-182) is selectively induced by HSV-1 during productive infection of primary fibroblasts and neurons. ICP0, a viral E3 ubiquitin ligase expressed as an immediate-early protein, is both necessary and sufficient for this induction. Nuclear exclusion of ICP0 or removal of the RING (really interesting new gene) finger domain that is required for E3 ligase activity prevents induction. ICP0 promotes the degradation of numerous host proteins and for the most part, the downstream consequences are unknown. Induction of the miR-183 cluster can be mimicked by depletion of host transcriptional repressors zinc finger E-box binding homeobox 1 (ZEB1)/δ-crystallin enhancer binding factor 1 (δEF1) and zinc finger E-box binding homeobox 2 (ZEB2)/Smad-interacting protein 1 (SIP1), which we establish as new substrates for ICP0-mediated degradation. Thus, HSV-1 selectively stimulates expression of the miR-183 cluster by ICP0-mediated degradation of ZEB transcriptional repressors. PMID:28783105

  18. Insight into the Role of Long Non-coding RNAs During Osteogenesis in Mesenchymal Stem Cells.

    PubMed

    Huo, Sibei; Zhou, Yachuan; He, Xinyu; Wan, Mian; Du, Wei; Xu, Xin; Ye, Ling; Zhou, Xuedong; Zheng, Liwei

    2018-01-01

    Long non-coding RNAs (lncRNAs) are non-protein-coding transcripts longer than 200 nucleotides. Instead of being "transcriptional noise", lncRNAs are emerging as key modulators in various biological processes and disease development. Mesenchymal stem cells can be isolated from various adult tissues, such as bone marrow and dental tissues. The differentiation processes into multiple lineages, such as osteogenic differentiation, are precisely orchestrated by molecular signals in both genetic and epigenetic ways. Recently, several lines of evidence suggested the role of lncRNAs participating in cell differentiation through the regulation of gene transcription. The involvement of lncRNAs may also be associated with initiation and progression of mesenchymal stem cell-related diseases. We aimed at addressing the role of lncRNAs in the regulation of osteogenesis of mesenchymal stem cells derived from bone marrow and dental tissues, and discussing the potential utility of lncRNAs as biomarkers and therapeutic targets for mesenchymal stem cell-related diseases. Numerous lncRNAs were differentially expressed during osteogenesis or odontogenesis of mesenchymal stem cells, and some of them were confirmed to be able to regulate the differentiation processes through the modifications of chromatin, transcriptional and post-transcriptional processes. LncRNAs were also associated with some diseases related to pathologic differentiation of mesenchymal stem cells. LncRNAs are involved in the osteogenic differentiation of bone marrow- and dental tissue-derived mesenchymal stem cells, and they could become promising therapeutic targets and prognosis parameters. However, the mechanisms of the role of lncRNAs are still enigmatic and require further investigation.

  19. The Ftx Noncoding Locus Controls X Chromosome Inactivation Independently of Its RNA Products.

    PubMed

    Furlan, Giulia; Gutierrez Hernandez, Nancy; Huret, Christophe; Galupa, Rafael; van Bemmel, Joke Gerarda; Romito, Antonio; Heard, Edith; Morey, Céline; Rougeulle, Claire

    2018-05-03

    Accumulation of the Xist long noncoding RNA (lncRNA) on one X chromosome is the trigger for X chromosome inactivation (XCI) in female mammals. Xist expression, which needs to be tightly controlled, involves a cis-acting region, the X-inactivation center (Xic), containing many lncRNA genes that evolved concomitantly to Xist from protein-coding ancestors through pseudogeneization and loss of coding potential. Here, we uncover an essential role for the Xic-linked noncoding gene Ftx in the regulation of Xist expression. We show that Ftx is required in cis to promote Xist transcriptional activation and establishment of XCI. Importantly, we demonstrate that this function depends on Ftx transcription and not on the RNA products. Our findings illustrate the multiplicity of layers operating in the establishment of XCI and highlight the diversity in the modus operandi of the noncoding players.

  20. RNA editing in Drosophila melanogaster: new targets and functional consequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stapleton, Mark; Carlson, Joseph W.; Celniker, Susan E.

    2006-09-05

    Adenosine deaminases that act on RNA (ADARs) catalyze the site-specific conversion of adenosine to inosine in primary mRNA transcripts. These re-coding events affect coding potential, splice-sites, and stability of mature mRNAs. ADAR is an essential gene and studies in mouse, C. elegans, and Drosophila suggest its primary function is to modify adult behavior by altering signaling components in the nervous system. By comparing the sequence of isogenic cDNAs to genomic DNA, we have identified and experimentally verified 27 new targets of Drosophila ADAR. Our analyses lead us to identify new classes of genes whose transcripts are targets of ADAR, including components of the actin cytoskeleton, and genes involved in ion homeostasis and signal transduction. Our results indicate that editing in Drosophila increases the diversity of the proteome, and does so in a manner that has direct functional consequences on protein function.

  1. RAID: a comprehensive resource for human RNA-associated (RNA–RNA/RNA–protein) interaction

    PubMed Central

    Zhang, Xiaomeng; Wu, Deng; Chen, Liqun; Li, Xiang; Yang, Jinxurong; Fan, Dandan; Dong, Tingting; Liu, Mingyue; Tan, Puwen; Xu, Jintian; Yi, Ying; Wang, Yuting; Zou, Hua; Hu, Yongfei; Fan, Kaili; Kang, Juanjuan; Huang, Yan; Miao, Zhengqiang; Bi, Miaoman; Jin, Nana; Li, Kongning; Li, Xia; Xu, Jianzhen; Wang, Dong

    2014-01-01

    Transcriptomic analyses have revealed an unexpected complexity in the eukaryote transcriptome, which includes not only protein-coding transcripts but also an expanding catalog of noncoding RNAs (ncRNAs). Diverse coding and noncoding RNAs (ncRNAs) perform functions through interaction with each other in various cellular processes. In this project, we have developed RAID (http://www.rna-society.org/raid), an RNA-associated (RNA–RNA/RNA–protein) interaction database. RAID intends to provide the scientific community with all-in-one resources for efficient browsing and extraction of the RNA-associated interactions in human. This version of RAID contains more than 6100 RNA-associated interactions obtained by manually reviewing more than 2100 published papers, including 4493 RNA–RNA interactions and 1619 RNA–protein interactions. Each entry contains detailed information on an RNA-associated interaction, including RAID ID, RNA/protein symbol, RNA/protein categories, validated method, expressing tissue, literature references (Pubmed IDs), and detailed functional description. Users can query, browse, analyze, and manipulate RNA-associated (RNA–RNA/RNA–protein) interaction. RAID provides a comprehensive resource of human RNA-associated (RNA–RNA/RNA–protein) interaction network. Furthermore, this resource will help in uncovering the generic organizing principles of cellular function network. PMID:24803509

  2. Zipper plot: visualizing transcriptional activity of genomic regions.

    PubMed

    Avila Cobos, Francisco; Anckaert, Jasper; Volders, Pieter-Jan; Everaert, Celine; Rombaut, Dries; Vandesompele, Jo; De Preter, Katleen; Mestdagh, Pieter

    2017-05-02

    Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. To tackle this problem we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative of transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5'-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool.
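
    The abstract describes the input as three tab-separated fields: chromosome, TSS coordinate and strand. A minimal Python sketch of reading and validating such a file follows. It is not part of the Zipper plot tool, which is implemented in R. The names `Tss` and `parse_tss`, the "+"/"-" strand encoding and the example coordinates are all assumptions made for illustration.

```python
from typing import NamedTuple


class Tss(NamedTuple):
    """One putative transcription start site (hypothetical record type)."""
    chrom: str
    position: int
    strand: str


def parse_tss(text: str) -> list[Tss]:
    """Parse lines of 'chromosome<TAB>coordinate<TAB>strand'; blank lines are skipped."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        chrom, position, strand = fields
        # Assumed strand encoding; the abstract does not specify it.
        if strand not in {"+", "-"}:
            raise ValueError(f"line {line_no}: unexpected strand {strand!r}")
        records.append(Tss(chrom, int(position), strand))
    return records


# Made-up coordinates, for illustration only.
example = "chr1\t1000\t+\nchr2\t2500\t-\n"
tss = parse_tss(example)
print(tss)
```

    Validating the field count and strand before submission is a design choice of this sketch: it surfaces malformed rows early, since the webtool's own error handling is not described in the abstract.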

  3. Post-transcriptional regulatory network of epithelial-to-mesenchymal and mesenchymal-to-epithelial transitions

    PubMed Central

    2014-01-01

    Epithelial-to-mesenchymal transition (EMT) and its reverse process, mesenchymal-to-epithelial transition (MET), play important roles in embryogenesis, stem cell biology, and cancer progression. EMT can be regulated by many signaling pathways and regulatory transcriptional networks. Furthermore, post-transcriptional regulatory networks regulate EMT; these networks include the long non-coding RNA (lncRNA) and microRNA (miRNA) families. Specifically, the miR-200 family, miR-101, miR-506, and several lncRNAs have been found to regulate EMT. Recent studies have illustrated that several lncRNAs are overexpressed in various cancers and that they can promote tumor metastasis by inducing EMT. MiRNA controls EMT by regulating EMT transcription factors or other EMT regulators, suggesting that lncRNAs and miRNA are novel therapeutic targets for the treatment of cancer. Further efforts have shown that non-coding-mediated EMT regulation is closely associated with epigenetic regulation through promoter methylation (e.g., miR-200 or miR-506) and protein regulation (e.g., SET8 via miR-502). The formation of gene fusions has also been found to promote EMT in prostate cancer. In this review, we discuss the post-transcriptional regulatory network that is involved in EMT and MET and how targeting EMT and MET may provide effective therapeutics for human disease. PMID:24598126

  4. A Deeper Examination of Thorellius atrox Scorpion Venom Components with Omic Technologies

    PubMed Central

    Romero-Gutierrez, Teresa; Batista, Cesar V. F.

    2017-01-01

    This communication reports a further examination of venom gland transcripts and venom composition of the Mexican scorpion Thorellius atrox using RNA-seq and tandem mass spectrometry. The RNA-seq, which was performed with the Illumina protocol, yielded more than 20,000 assembled transcripts. Following a database search and annotation strategy, 160 transcripts were identified, potentially coding for venom components. A novel sequence was identified that potentially codes for a peptide with similarity to spider ω-agatoxins, which act on voltage-gated calcium channels, not known before to exist in scorpion venoms. Analogous transcripts were found in other scorpion species. They could represent members of a new scorpion toxin family, here named omegascorpins. The mass fingerprint by LC-MS identified 135 individual venom components, five of which matched with the theoretical masses of putative peptides translated from the transcriptome. The LC-MS/MS de novo sequencing allowed the reconstruction and identification of 42 proteins encoded by assembled transcripts, thus validating the transcriptome analysis. Earlier studies conducted with this scorpion venom permitted the identification of only twenty putative venom components. The present work performed with more powerful and modern omic technologies demonstrates the capacity of accomplishing a deeper characterization of scorpion venom components and the identification of novel molecules with potential applications in biomedicine and the study of ion channel physiology. PMID:29231872

  5. Genome-wide uniformity of human ‘open’ pre-initiation complexes

    PubMed Central

    Lai, William K.M.; Pugh, B. Franklin

    2017-01-01

    Transcription of protein-coding and noncoding DNA occurs pervasively throughout the mammalian genome. Their sites of initiation are generally inferred from transcript 5′ ends and are thought to be either locally dispersed or focused. How these two modes of initiation relate is unclear. Here, we apply permanganate treatment and chromatin immunoprecipitation (PIP-seq) of initiation factors to identify the precise location of melted DNA separately associated with the preinitiation complex (PIC) and the adjacent paused complex (PC). This approach revealed the two known modes of transcription initiation. However, in contrast to prevailing views, they co-occurred within the same promoter region: initiation originating from a focused PIC, and broad nucleosome-linked initiation. PIP-seq allowed transcriptional orientation of Pol II to be determined, which may be useful near promoters where sufficient sense/anti-sense transcript mapping information is lacking. PIP-seq detected divergently oriented Pol II at both coding and noncoding promoters, as well as at enhancers. Their occupancy levels were not necessarily coupled in the two orientations. DNA sequence and shape analysis of initiation complex sites suggest that both sequence and shape contribute to specificity, but in a context-restricted manner. That is, initiation sites have the locally “best” initiator (INR) sequence and/or shape. These findings reveal a common core to pervasive Pol II initiation throughout the human genome. PMID:27927716

  6. Exogean: a framework for annotating protein-coding genes in eukaryotic genomic DNA

    PubMed Central

    Djebali, Sarah; Delaplace, Franck; Crollius, Hugues Roest

    2006-01-01

    Background Accurate and automatic gene identification in eukaryotic genomic DNA is more than ever of crucial importance to efficiently exploit the large volume of assembled genome sequences available to the community. Automatic methods have always been considered less reliable than human expertise. This is illustrated in the EGASP project, where reference annotations against which all automatic methods are measured are generated by human annotators and experimentally verified. We hypothesized that replicating the accuracy of human annotators in an automatic method could be achieved by formalizing the rules and decisions that they use, in a mathematical formalism. Results We have developed Exogean, a flexible framework based on directed acyclic colored multigraphs (DACMs) that can represent biological objects (for example, mRNA, ESTs, protein alignments, exons) and relationships between them. Graphs are analyzed to process the information according to rules that replicate those used by human annotators. Simple individual starting objects given as input to Exogean are thus combined and synthesized into complex objects such as protein coding transcripts. Conclusion We show here, in the context of the EGASP project, that Exogean is currently the method that best reproduces protein coding gene annotations from human experts, in terms of identifying at least one exact coding sequence per gene. We discuss current limitations of the method and several avenues for improvement. PMID:16925841

  7. Human La binds mRNAs through contacts to the poly(A) tail.

    PubMed

    Vinayak, Jyotsna; Marrella, Stefano A; Hussain, Rawaa H; Rozenfeld, Leonid; Solomon, Karine; Bayfield, Mark A

    2018-05-04

    In addition to a role in the processing of nascent RNA polymerase III transcripts, La proteins are also associated with promoting cap-independent translation from the internal ribosome entry sites of numerous cellular and viral coding RNAs. La binding to RNA polymerase III transcripts via their common UUU-3'OH motif is well characterized, but the mechanism of La binding to coding RNAs is poorly understood. Using electromobility shift assays and cross-linking immunoprecipitation, we show that in addition to a sequence specific UUU-3'OH binding mode, human La exhibits a sequence specific and length dependent poly(A) binding mode. We demonstrate that this poly(A) binding mode uses the canonical nucleic acid interaction winged helix face of the eponymous La motif, previously shown to be vacant during uridylate binding. We also show that cytoplasmic, but not nuclear La, engages poly(A) RNA in human cells, that La entry into polysomes utilizes the poly(A) binding mode, and that La promotion of translation from the cyclin D1 internal ribosome entry site occurs in competition with cytoplasmic poly(A) binding protein (PABP). Our data are consistent with human La functioning in translation through contacts to the poly(A) tail.

  8. A genome-wide survey of maternal and embryonic transcripts during Xenopus tropicalis development.

    PubMed

    Paranjpe, Sarita S; Jacobi, Ulrike G; van Heeringen, Simon J; Veenstra, Gert Jan C

    2013-11-06

    Dynamics of polyadenylation vs. deadenylation determine the fate of several developmentally regulated genes. Decay of a subset of maternal mRNAs and new transcription define the maternal-to-zygotic transition, but the full complement of polyadenylated and deadenylated coding and non-coding transcripts has not yet been assessed in Xenopus embryos. To analyze the dynamics and diversity of coding and non-coding transcripts during development, both polyadenylated mRNA and ribosomal RNA-depleted total RNA were harvested across six developmental stages and subjected to high throughput sequencing. The maternally loaded transcriptome is highly diverse and consists of both polyadenylated and deadenylated transcripts. Many maternal genes show peak expression in the oocyte and include genes which are known to be the key regulators of events like oocyte maturation and fertilization. Of all the transcripts that increase in abundance between early blastula and larval stages, about 30% of the embryonic genes are induced by fourfold or more by the late blastula stage and another 35% by late gastrulation. Using a gene model validation and discovery pipeline, we identified novel transcripts and putative long non-coding RNAs (lncRNA). These lncRNA transcripts were stringently selected as spliced transcripts generated from independent promoters, with limited coding potential and a codon bias characteristic of noncoding sequences. Many lncRNAs are conserved and expressed in a developmental stage-specific fashion. These data reveal dynamics of transcriptome polyadenylation and abundance and provide a high-confidence catalogue of novel and long non-coding RNAs.

  9. HOTAIR: An Oncogenic Long Non-Coding RNA in Human Cancer.

    PubMed

    Tang, Qing; Hann, Swei Sunny

    2018-05-24

    Long non-coding RNAs (LncRNAs) represent a novel class of noncoding RNAs that are longer than 200 nucleotides without protein-coding potential and function as novel master regulators in various human diseases, including cancer. Accumulating evidence shows that lncRNAs are dysregulated and implicated in various aspects of cellular homeostasis, such as proliferation, apoptosis, mobility, invasion, metastasis, chromatin remodeling, gene transcription, and post-transcriptional processing. However, the mechanisms by which lncRNAs regulate various biological functions in human diseases have yet to be determined. HOX antisense intergenic RNA (HOTAIR) is a recently discovered lncRNA and plays a critical role in various areas of cancer, such as proliferation, survival, migration, drug resistance, and genomic stability. In this review, we briefly introduce the concept, identification, and biological functions of HOTAIR. We then describe the involvement of HOTAIR that has been associated with tumorigenesis, growth, invasion, cancer stem cell differentiation, metastasis, and drug resistance in cancer. We also discuss emerging insights into the role of HOTAIR as potential biomarkers and therapeutic targets for novel treatment paradigms in cancer. © 2018 The Author(s). Published by S. Karger AG, Basel.

  10. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    PubMed

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have brought novel insight into biological study; however, it is still not trivial to accurately distinguish the lncRNA transcripts (LNCTs) from the protein coding ones (PCTs). As various information and data about lncRNAs are preserved by previous studies, it is appealing to develop novel methods to identify the lncRNAs more accurately. Our method lncRScan-SVM aims at classifying PCTs and LNCTs using support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse, respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, as evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting the lncRNAs, and it is quite useful for current lncRNA study.
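
    The evaluation criteria named in this abstract (sensitivity, specificity, accuracy and MCC) all follow from the four counts of a binary confusion matrix. The Python sketch below computes them using the standard textbook definitions. It is not taken from lncRScan-SVM. Treating lncRNA transcripts as the positive class is an assumption, and binary_metrics is a hypothetical name. AUC is left out because it needs per-transcript scores rather than counts.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts.

    Assumption: the positive class is "lncRNA transcript" (LNCT) and
    the negative class is "protein-coding transcript" (PCT).
    """
    total = tp + fp + tn + fn
    # Guard each ratio so an empty class yields 0.0 instead of an error.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative counts only, not results from the paper.
    print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

    MCC is often reported alongside accuracy because it stays informative when the two classes differ greatly in size.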

  11. Long-Range Control of Gene Expression: Emerging Mechanisms and Disruption in Disease

    PubMed Central

    Kleinjan, Dirk A.; van Heyningen, Veronica

    2005-01-01

    Transcriptional control is a major mechanism for regulating gene expression. The complex machinery required to effect this control is still emerging from functional and evolutionary analysis of genomic architecture. In addition to the promoter, many other regulatory elements are required for spatiotemporally and quantitatively correct gene expression. Enhancer and repressor elements may reside in introns or up- and downstream of the transcription unit. For some genes with highly complex expression patterns—often those that function as key developmental control genes—the cis-regulatory domain can extend long distances outside the transcription unit. Some of the earliest hints of this came from disease-associated chromosomal breaks positioned well outside the relevant gene. With the availability of wide-ranging genome sequence comparisons, strong conservation of many noncoding regions became obvious. Functional studies have shown many of these conserved sites to be transcriptional regulatory elements that sometimes reside inside unrelated neighboring genes. Such sequence-conserved elements generally harbor sites for tissue-specific DNA-binding proteins. Developmentally variable chromatin conformation can control protein access to these sites and can regulate transcription. Disruption of these finely tuned mechanisms can cause disease. Some regulatory element mutations will be associated with phenotypes distinct from any identified for coding-region mutations. PMID:15549674

  12. CRISPR-Cas9-Mediated Genome Editing and Transcriptional Control in Yarrowia lipolytica.

    PubMed

    Schwartz, Cory; Wheeldon, Ian

    2018-01-01

    The discovery and adaptation of RNA-guided nucleases has resulted in the rapid development of efficient, scalable, and easily accessible synthetic biology tools for targeted genome editing and transcriptional control. In these systems, for example CRISPR-Cas9 from Streptococcus pyogenes, a protein with nuclease activity is targeted to a specific nucleotide sequence by a short RNA molecule, whereupon binding it cleaves the targeted nucleotide strand. To extend this genome-editing ability to the industrially important oleaginous yeast Yarrowia lipolytica, we developed a set of easily usable and effective CRISPR-Cas9 episomal vectors. In this protocols chapter, we first present a method by which arbitrary protein-coding genes can be disrupted via indel formation after CRISPR-Cas9 targeting. A second method demonstrates how the same CRISPR-Cas9 system can be used to induce markerless gene cassette integration into the genome by inducing homologous recombination after DNA cleavage by Cas9. Finally, we describe how a catalytically inactive form of Cas9 fused to a transcriptional repressor can be used to control transcription of native genes in Y. lipolytica. The CRISPR-Cas9 tools and strategies described here greatly increase the types of genome editing and transcriptional control that can be achieved in Y. lipolytica, and promise to facilitate more advanced engineering of this important oleaginous host.

  13. Global Regulatory Functions of the Staphylococcus aureus Endoribonuclease III in Gene Expression

    PubMed Central

    Lioliou, Efthimia; Sharma, Cynthia M.; Caldelari, Isabelle; Helfer, Anne-Catherine; Fechter, Pierre; Vandenesch, François; Vogel, Jörg; Romby, Pascale

    2012-01-01

    RNA turnover plays an important role in both virulence and adaptation to stress in the Gram-positive human pathogen Staphylococcus aureus. However, the molecular players and mechanisms involved in these processes are poorly understood. Here, we explored the functions of S. aureus endoribonuclease III (RNase III), a member of the ubiquitous family of double-strand-specific endoribonucleases. To define genomic transcripts that are bound and processed by RNase III, we performed deep sequencing on cDNA libraries generated from RNAs that were co-immunoprecipitated with wild-type RNase III or two different cleavage-defective mutant variants in vivo. Several newly identified RNase III targets were validated by independent experimental methods. We identified various classes of structured RNAs as RNase III substrates and demonstrated that this enzyme is involved in the maturation of rRNAs and tRNAs, regulates the turnover of mRNAs and non-coding RNAs, and autoregulates its synthesis by cleaving within the coding region of its own mRNA. Moreover, we identified a positive effect of RNase III on protein synthesis based on novel mechanisms. RNase III–mediated cleavage in the 5′ untranslated region (5′UTR) enhanced the stability and translation of cspA mRNA, which encodes the major cold-shock protein. Furthermore, RNase III cleaved overlapping 5′UTRs of divergently transcribed genes to generate leaderless mRNAs, which constitutes a novel way to co-regulate neighboring genes. In agreement with recent findings, low abundance antisense RNAs covering 44% of the annotated genes were captured by co-immunoprecipitation with RNase III mutant proteins. Thus, in addition to gene regulation, RNase III is associated with RNA quality control of pervasive transcription. Overall, this study illustrates the complexity of post-transcriptional regulation mediated by RNase III. PMID:22761586

  14. Computational Identification and Functional Predictions of Long Noncoding RNA in Zea mays

    PubMed Central

    Boerner, Susan; McGinnis, Karen M.

    2012-01-01

    Background Computational analysis of cDNA sequences from multiple organisms suggests that a large portion of transcribed DNA does not code for a functional protein. In mammals, noncoding transcription is abundant, and often results in functional RNA molecules that do not appear to encode proteins. Many long noncoding RNAs (lncRNAs) appear to have epigenetic regulatory function in humans, including HOTAIR and XIST. While epigenetic gene regulation is clearly an essential mechanism in plants, relatively little is known about the presence or function of lncRNAs in plants. Methodology/Principal Findings To explore the connection between lncRNA and epigenetic regulation of gene expression in plants, a computational pipeline using the programming language Python has been developed and applied to maize full length cDNA sequences to identify, classify, and localize potential lncRNAs. The pipeline was used in parallel with an SVM tool for identifying ncRNAs to identify the maximal number of ncRNAs in the dataset. Although the available library of sequences was small and potentially biased toward protein coding transcripts, 15% of the sequences were predicted to be noncoding. Approximately 60% of these sequences appear to act as precursors for small RNA molecules and may function to regulate gene expression via a small RNA dependent mechanism. ncRNAs were predicted to originate from both genic and intergenic loci. Of the lncRNAs that originated from genic loci, ∼20% were antisense to the host gene loci. Conclusions/Significance Consistent with similar studies in other organisms, noncoding transcription appears to be widespread in the maize genome. Computational predictions indicate that maize lncRNAs may function to regulate expression of other genes through multiple RNA mediated mechanisms. PMID:22916204

  15. mTORC1 activity as a determinant of cancer risk--rationalizing the cancer-preventive effects of adiponectin, metformin, rapamycin, and low-protein vegan diets.

    PubMed

    McCarty, Mark F

    2011-10-01

    Increased plasma levels of adiponectin, metformin therapy of diabetes, rapamycin administration in transplant patients, and lifelong consumption of low-protein plant-based diets have all been linked to decreased risk for various cancers. These benefits may be mediated, at least in part, by down-regulated activity of the mTORC1 complex, a key regulator of protein translation. By boosting the effective availability of the translation initiator eIF4E, mTORC1 activity promotes the translation of a number of "weak" mRNAs that code for proteins, often up-regulated in cancer, that promote cellular proliferation, invasiveness, and angiogenesis, and that abet cancer promotion and chemoresistance by opposing apoptosis. Measures which inhibit eIF4E activity, either directly or indirectly, may have utility not only for cancer prevention, but also for the treatment of many cancers in which eIF4E drives malignancy. Since eIF4E is overexpressed in many cancers, strategies which target eIF4E directly--some of which are now being assessed clinically--may have the broadest efficacy in this regard. Many of the "weak" mRNAs coding for proteins that promote malignant behavior or chemoresistance are regulated transcriptionally by NF-kappaB and/or Stat3, which are active in a high proportion of cancers; thus, regimens concurrently targeting eIF4E, NF-kappaB, and Stat3 may suppress these proteins at both the transcriptional and translational levels, potentially achieving a very marked reduction in their expression. Copyright © 2011 Elsevier Ltd. All rights reserved.

  16. Influence of the stringent control system on the transcription of ribosomal ribonucleic acid and ribosomal protein genes in Escherichia coli.

    PubMed Central

    Dennis, P P

    1977-01-01

    The fraction of the total ribonucleic acid (RNA) synthesis rate that is messenger RNA (mRNA) for ribosomal protein (r-protein) and ribosomal RNA (rRNA) has been estimated in valS(Ts) rel+ stringent and valS(Ts) relA1 relaxed strains of Escherichia coli during a partial inhibition of valyl-transfer RNA aminoacylation. The partial inhibition was accomplished by shifting the strains from the permissive growth temperature of 29.5 degrees C to the semipermissive temperature of 35.5 degrees C. The RNA synthesized at the elevated temperature was pulse labeled with [3H]uracil. The fraction of the total incorporated 3H radioactivity in r-protein mRNA or in rRNA was estimated by specific hybridization to the transducing phages gammaspc1, which carries about 15 r-protein genes, and lambdailv5, which carries an rRNA transcription unit. The results clearly demonstrate that the rel gene influences the fraction of the total RNA synthesis rate that is r-protein mRNA and rRNA; in the rel+ strain they are significantly increased relative to control cultures. This indicates that the expression of the genes coding for the RNA and protein components of the ribosome is most likely regulated at the level of transcription. Furthermore, it appears that the distribution of functioning RNA polymerase between rRNA genes, r-protein genes, and other types of genes is influenced by the rel gene control system; presumably this influence is mediated through the unusual nucleotide guanosine tetraphosphate. PMID:320185

  17. Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

    PubMed Central

    Itoh, Takeshi; Tanaka, Tsuyoshi; Barrero, Roberto A.; Yamasaki, Chisato; Fujii, Yasuyuki; Hilton, Phillip B.; Antonio, Baltazar A.; Aono, Hideo; Apweiler, Rolf; Bruskiewich, Richard; Bureau, Thomas; Burr, Frances; Costa de Oliveira, Antonio; Fuks, Galina; Habara, Takuya; Haberer, Georg; Han, Bin; Harada, Erimi; Hiraki, Aiko T.; Hirochika, Hirohiko; Hoen, Douglas; Hokari, Hiroki; Hosokawa, Satomi; Hsing, Yue; Ikawa, Hiroshi; Ikeo, Kazuho; Imanishi, Tadashi; Ito, Yukiyo; Jaiswal, Pankaj; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Kawashima, Hiroaki; Khurana, Jitendra P.; Kikuchi, Shoshi; Komatsu, Setsuko; Koyanagi, Kanako O.; Kubooka, Hiromi; Lieberherr, Damien; Lin, Yao-Cheng; Lonsdale, David; Matsumoto, Takashi; Matsuya, Akihiro; McCombie, W. Richard; Messing, Joachim; Miyao, Akio; Mulder, Nicola; Nagamura, Yoshiaki; Nam, Jongmin; Namiki, Nobukazu; Numa, Hisataka; Nurimoto, Shin; O’Donovan, Claire; Ohyanagi, Hajime; Okido, Toshihisa; OOta, Satoshi; Osato, Naoki; Palmer, Lance E.; Quetier, Francis; Raghuvanshi, Saurabh; Saichi, Naomi; Sakai, Hiroaki; Sakai, Yasumichi; Sakata, Katsumi; Sakurai, Tetsuya; Sato, Fumihiko; Sato, Yoshiharu; Schoof, Heiko; Seki, Motoaki; Shibata, Michie; Shimizu, Yuji; Shinozaki, Kazuo; Shinso, Yuji; Singh, Nagendra K.; Smith-White, Brian; Takeda, Jun-ichi; Tanino, Motohiko; Tatusova, Tatiana; Thongjuea, Supat; Todokoro, Fusano; Tsugane, Mika; Tyagi, Akhilesh K.; Vanavichit, Apichart; Wang, Aihui; Wing, Rod A.; Yamaguchi, Kaori; Yamamoto, Mayu; Yamamoto, Naoyuki; Yu, Yeisoo; Zhang, Hao; Zhao, Qiang; Higo, Kenichi; Burr, Benjamin; Gojobori, Takashi; Sasaki, Takuji

    2007-01-01

    We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. PMID:17210932

  18. Chimeric mitochondrial peptides from contiguous regular and swinger RNA.

    PubMed

    Seligmann, Hervé

    2016-01-01

    Previous mass spectrometry analyses described human mitochondrial peptides entirely translated from swinger RNAs, RNAs where polymerization systematically exchanged nucleotides. Exchanges follow one among 23 bijective transformation rules, nine symmetric exchanges (X ↔ Y, e.g. A ↔ C) and fourteen asymmetric exchanges (X → Y → Z → X, e.g. A → C → G → A), multiplying by 24 DNA's protein coding potential. Abrupt switches from regular to swinger polymerization produce chimeric RNAs. Here, human mitochondrial proteomic analyses assuming abrupt switches between regular and swinger transcriptions, detect chimeric peptides, encoded by part regular, part swinger RNA. Contiguous regular- and swinger-encoded residues within single peptides are stronger evidence for translation of swinger RNA than previously detected, entirely swinger-encoded peptides: regular parts are positive controls matched with contiguous swinger parts, increasing confidence in results. Chimeric peptides are 200 × rarer than swinger peptides (3/100,000 versus 6/1000). Among 186 peptides with > 8 residues for each regular and swinger parts, regular parts of eleven chimeric peptides correspond to six among the thirteen recognized, mitochondrial protein-coding genes. Chimeric peptides matching partly regular proteins are rarer and less expressed than chimeric peptides matching non-coding sequences, suggesting targeted degradation of misfolded proteins. Present results strengthen hypotheses that the short mitogenome encodes far more proteins than hitherto assumed. Entirely swinger-encoded proteins could exist.
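
    The counts quoted in this abstract (23 exchange rules, nine symmetric and fourteen asymmetric) can be checked by enumeration. The Python sketch below assumes that the 23 rules are the non-identity permutations of the four nucleotides, and that "symmetric" means a permutation that is its own inverse. Both are readings of the abstract, not definitions taken from the paper. classify_exchanges is a hypothetical name.

```python
from itertools import permutations
from typing import Dict, List, Tuple

NUCLEOTIDES = "ACGU"


def classify_exchanges() -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split non-identity nucleotide permutations into two groups.

    Assumption: "symmetric" exchanges are self-inverse mappings
    (applying the rule twice restores every nucleotide); all other
    non-identity permutations are counted as "asymmetric".
    """
    symmetric, asymmetric = [], []
    for image in permutations(NUCLEOTIDES):
        mapping = dict(zip(NUCLEOTIDES, image))
        if all(mapping[n] == n for n in NUCLEOTIDES):
            continue  # identity corresponds to regular transcription
        if all(mapping[mapping[n]] == n for n in NUCLEOTIDES):
            symmetric.append(mapping)
        else:
            asymmetric.append(mapping)
    return symmetric, asymmetric


if __name__ == "__main__":
    sym, asym = classify_exchanges()
    print(len(sym), len(asym), len(sym) + len(asym))
```

    Under these assumptions the enumeration gives 9 and 14, which together with regular transcription accounts for the 24-fold coding potential mentioned in the abstract.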

  19. The Big Entity of New RNA World: Long Non-Coding RNAs in Microvascular Complications of Diabetes.

    PubMed

    Raut, Satish K; Khullar, Madhu

    2018-01-01

    A major part of the genome is known to be transcribed into non-protein coding RNAs (ncRNAs), such as microRNA and long non-coding RNA (lncRNA). The importance of ncRNAs is being increasingly recognized in physiological and pathological processes. lncRNAs are a novel class of ncRNAs that do not code for proteins and are important regulators of gene expression. In the past, these molecules were thought to be transcriptional "noise" with low levels of evolutionary conservation. However, recent studies provide strong evidence indicating that lncRNAs are (i) regulated during various cellular processes, (ii) exhibit cell type-specific expression, (iii) localize to specific organelles, and (iv) associated with human diseases. Emerging evidence indicates an aberrant expression of lncRNAs in diabetes and diabetes-related microvascular complications. In the present review, we discuss the current state of knowledge of lncRNAs, their genesis from genome, and the mechanism of action of individual lncRNAs in the pathogenesis of microvascular complications of diabetes and therapeutic approaches.

  20. Voltage-Gated Na+ Channel Isoforms and Their mRNA Expression Levels and Protein Abundance in Three Electric Organs and the Skeletal Muscle of the Electric Eel Electrophorus electricus

    PubMed Central

    Hiong, Kum C.; Boo, Mel V.; Wong, Wai P.; Chew, Shit F.

    2016-01-01

    This study aimed to obtain the coding cDNA sequences of voltage-gated Na+ channel (scn) α-subunit (scna) and β-subunit (scnb) isoforms from, and to quantify their transcript levels in, the main electric organ (EO), Hunter’s EO, Sach’s EO and the skeletal muscle (SM) of the electric eel, Electrophorus electricus, which can generate both high and low voltage electric organ discharges (EODs). The full coding sequences of two scna (scn4aa and scn4ab) and three scnb (scn1b, scn2b and scn4b) were identified for the first time (except scn4aa) in E. electricus. In adult fish, the scn4aa transcript level was the highest in the main EO and the lowest in the Sach’s EO, indicating that it might play an important role in generating high voltage EODs. For scn4ab/Scn4ab, the transcript and protein levels were unexpectedly high in the EOs, with expression levels in the main EO and the Hunter’s EO comparable to those of scn4aa. As the key domains affecting the properties of the channel were mostly conserved between Scn4aa and Scn4ab, Scn4ab might play a role in electrogenesis. Concerning scnb, the transcript level of scn4b was much higher than those of scn1b and scn2b in the EOs and the SM. While the transcript level of scn4b was the highest in the main EO, protein abundance of Scn4b was the highest in the SM. Taken together, it is unlikely that Scna could function independently to generate EODs in the EOs as previously suggested. It is probable that different combinations of Scn4aa/Scn4ab and various Scnb isoforms in the three EOs account for the differences in EODs produced in E. electricus. In general, the transcript levels of various scn isoforms in the EOs and the SM were much higher in adult than in juvenile, and the three EOs of the juvenile fish could be functionally indistinct. PMID:27907137

  1. An operon from Lactobacillus helveticus composed of a proline iminopeptidase gene (pepI) and two genes coding for putative members of the ABC transporter family of proteins.

    PubMed

    Varmanen, P; Rantanen, T; Palva, A

    1996-12-01

    A proline iminopeptidase gene (pepI) of an industrial Lactobacillus helveticus strain was cloned and found to be organized in an operon-like structure of three open reading frames (ORF1, ORF2 and ORF3). ORF1 was preceded by a typical prokaryotic promoter region, and a putative transcription terminator was found downstream of ORF3, identified as the pepI gene. Using primer-extension analyses, only one transcription start site, upstream of ORF1, was identifiable in the predicted operon. Although the size of mRNA could not be judged by Northern analysis with either ORF1-, ORF2- or pepI-specific probes, reverse transcription-PCR analyses further supported the operon structure of the three genes. ORF1, ORF2 and ORF3 had coding capacities for 50.7, 24.5 and 33.8 kDa proteins, respectively. The ORF3-encoded PepI protein showed 65% identity with the PepI proteins from Lactobacillus delbrueckii subsp. bulgaricus and Lactobacillus delbrueckii subsp. lactis. The ORF1-encoded protein had significant homology with several members of the ABC transporter family but, with two distinct putative ATP-binding sites, it would represent an unusual type among the bacterial ABC transporters. ORF2 encoded a putative integral membrane protein also characteristic of the ABC transporter family. The pepI gene was overexpressed in Escherichia coli. Purified PepI hydrolysed only di- and tripeptides with proline in the first position. Optimum PepI activity was observed at pH 7.5 and 40 degrees C. A gel filtration analysis indicated that PepI is a dimer of M(r) 53,000. PepI was shown to be a metal-independent serine peptidase having thiol groups at or near the active site. Kinetic studies with proline-p-nitroanilide as substrate revealed Km and Vmax values of 0.8 mM and 350 mmol min-1 mg-1, respectively, and a very high turnover number of 135,000 s-1.
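
    The Km and Vmax reported above can be read through the standard Michaelis-Menten rate law, v = Vmax * [S] / (Km + [S]). The following is a minimal sketch, assuming simple Michaelis-Menten behaviour for PepI with proline-p-nitroanilide; the function name and the substrate concentrations are illustrative and are not taken from the study. Rates are returned in the same units as the Vmax quoted in the abstract.

```python
def michaelis_menten_rate(substrate_mM, km_mM=0.8, vmax=350.0):
    """Initial rate v = Vmax * [S] / (Km + [S]).

    Defaults are the Km (mM) and Vmax values quoted in the abstract;
    the result carries the same units as Vmax.
    """
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_mM / (km_mM + substrate_mM)


if __name__ == "__main__":
    # Illustrative substrate concentrations (hypothetical, not from the paper).
    for s in (0.1, 0.8, 8.0):
        print(f"[S] = {s} mM -> v = {michaelis_menten_rate(s):.1f}")
```

    At [S] equal to Km (0.8 mM) the rate is half of Vmax, which is the defining property of Km under this model.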

  2. Cross-site comparison of ribosomal depletion kits for Illumina RNAseq library construction.

    PubMed

    Herbert, Zachary T; Kershner, Jamie P; Butty, Vincent L; Thimmapuram, Jyothi; Choudhari, Sulbha; Alekseyev, Yuriy O; Fan, Jun; Podnar, Jessica W; Wilcox, Edward; Gipson, Jenny; Gillaspy, Allison; Jepsen, Kristen; BonDurant, Sandra Splinter; Morris, Krystalynne; Berkeley, Maura; LeClerc, Ashley; Simpson, Stephen D; Sommerville, Gary; Grimmett, Leslie; Adams, Marie; Levine, Stuart S

    2018-03-15

    Ribosomal RNA (rRNA) comprises at least 90% of total RNA extracted from mammalian tissue or cell line samples. Informative transcriptional profiling using massively parallel sequencing technologies requires either enrichment of mature poly-adenylated transcripts or targeted depletion of the rRNA fraction. The latter method is of particular interest because it is compatible with degraded samples such as those extracted from FFPE and also captures transcripts that are not poly-adenylated such as some non-coding RNAs. Here we provide a cross-site study that evaluates the performance of ribosomal RNA removal kits from Illumina, Takara/Clontech, Kapa Biosystems, Lexogen, New England Biolabs and Qiagen on intact and degraded RNA samples. We find that all of the kits are capable of performing significant ribosomal depletion, though there are differences in their ease of use. All kits were able to remove ribosomal RNA to below 20% with intact RNA and identify ~14,000 protein coding genes from the Universal Human Reference RNA sample at >1 FPKM. Analysis of differentially detected genes between kits suggests that transcript length may be a key factor in library production efficiency. These results provide a roadmap for labs on the strengths of each of these methods and how best to utilize them.
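
    The ">1 FPKM" detection threshold above refers to fragments per kilobase of transcript per million mapped fragments. The sketch below shows the usual definition and a simple detection filter. It assumes that the total mapped fragments equal the sum of the supplied counts, and the gene names, counts and lengths are hypothetical. It is not the pipeline used in the study.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


def detected_genes(counts, lengths, threshold=1.0):
    """Return the sorted gene names whose FPKM exceeds the threshold.

    Assumption: the library size is the sum of the supplied counts.
    """
    total = sum(counts.values())
    return sorted(
        gene for gene, n in counts.items()
        if fpkm(n, lengths[gene], total) > threshold
    )


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    counts = {"geneA": 500, "geneB": 1, "geneC": 499_499}
    lengths = {"geneA": 1_000, "geneB": 10_000, "geneC": 2_000}
    print(detected_genes(counts, lengths))
```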

  3. Staphylococcus aureus undergoes major transcriptional reorganization during growth with Enterococcus faecalis in milk.

    PubMed

    Viçosa, Gabriela Nogueira; Botta, Cristian; Ferrocino, Ilario; Bertolino, Marta; Ventura, Marco; Nero, Luís Augusto; Cocolin, Luca

    2018-08-01

    Previous studies have demonstrated the antagonistic potential of lactic acid bacteria (LAB) present in raw milk microbiota over Staphylococcus aureus, albeit the molecular mechanisms underlying this inhibitory effect are not fully understood. In this study, we compared the behavior of S. aureus ATCC 29213 alone and in the presence of a cheese-isolated LAB strain, Enterococcus faecalis 41FL1 in skimmed milk at 30 °C for 24 h using phenotypical and molecular approaches. Phenotypic analysis showed the absence of classical staphylococcal enterotoxins in co-culture with a 1.2-log decrease in S. aureus final population compared to single culture. Transcriptional activity of several exotoxins and global regulators, including agr, was negatively impacted in co-culture, contrasting with the accumulation of transcripts coding for surface proteins. After 24 h, the number of transcripts coding for several metabolite responsive elements, as well as enzymes involved in glycolysis and acetoin metabolism was increased in co-culture. The present study discusses the complexity of the transcriptomic mechanisms possibly leading to S. aureus attenuated virulence in the presence of E. faecalis and provides insights into this interspecies interaction in a simulated food context. Copyright © 2018 Elsevier Ltd. All rights reserved.

  4. Characterization and transcription of arsenic respiration and resistance genes during in situ uranium bioremediation

    PubMed Central

    Giloteaux, Ludovic; Holmes, Dawn E; Williams, Kenneth H; Wrighton, Kelly C; Wilkins, Michael J; Montgomery, Alison P; Smith, Jessica A; Orellana, Roberto; Thompson, Courtney A; Roper, Thomas J; Long, Philip E; Lovley, Derek R

    2013-01-01

    The possibility of arsenic release and the potential role of Geobacter in arsenic biogeochemistry during in situ uranium bioremediation was investigated because increased availability of organic matter has been associated with substantial releases of arsenic in other subsurface environments. In a field experiment conducted at the Rifle, CO study site, groundwater arsenic concentrations increased when acetate was added. The number of transcripts from arrA, which codes for the α-subunit of dissimilatory As(V) reductase, and acr3, which codes for the arsenic pump protein Acr3, were determined with quantitative reverse transcription-PCR. Most of the arrA (>60%) and acr3-1 (>90%) sequences that were recovered were most similar to Geobacter species, while the majority of acr3-2 (>50%) sequences were most closely related to Rhodoferax ferrireducens. Analysis of transcript abundance demonstrated that transcription of acr3-1 by the subsurface Geobacter community was correlated with arsenic concentrations in the groundwater. In contrast, Geobacter arrA transcript numbers lagged behind the major arsenic release and remained high even after arsenic concentrations declined. This suggested that factors other than As(V) availability regulated the transcription of arrA in situ, even though the presence of As(V) increased the transcription of arrA in cultures of Geobacter lovleyi, which was capable of As(V) reduction. These results demonstrate that subsurface Geobacter species can tightly regulate their physiological response to changes in groundwater arsenic concentrations. The transcriptomic approach developed here should be useful for the study of a diversity of other environments in which Geobacter species are considered to have an important influence on arsenic biogeochemistry. PMID:23038171

  5. Profilin is associated with transcriptionally active genes

    PubMed Central

    Söderberg, Emilia; Hessle, Viktoria; von Euler, Anne; Visa, Neus

    2012-01-01

    We have raised antibodies against the profilin of Chironomus tentans to study the location of profilin relative to chromatin and to active genes in salivary gland polytene chromosomes. We show that a fraction of profilin is located in the nucleus, where profilin is highly concentrated in the nucleoplasm and at the nuclear periphery. Moreover, profilin is associated with multiple bands in the polytene chromosomes. By staining salivary glands with propidium iodide, we show that profilin does not co-localize with dense chromatin. Profilin associates instead with protein-coding genes that are transcriptionally active, as revealed by co-localization with hnRNP and snRNP proteins. We have performed experiments of transcription inhibition with actinomycin D and we show that the association of profilin with the chromosomes requires ongoing transcription. However, the interaction of profilin with the gene loci does not depend on RNA. Our results are compatible with profilin regulating actin polymerization in the cell nucleus. However, the association of actin with the polytene chromosomes of C. tentans is sensitive to RNase, whereas the association of profilin is not, and we propose therefore that the chromosomal location of profilin is independent of actin. PMID:22572953

  6. Induction of multixenobiotic defense mechanisms in resistant Daphnia magna clones as a general cellular response to stress.

    PubMed

    Jordão, Rita; Campos, Bruno; Lemos, Marco F L; Soares, Amadeu M V M; Tauler, Romà; Barata, Carlos

    2016-06-01

    Multixenobiotic resistance mechanisms (MXR) were recently identified in Daphnia magna. Previous results characterized the gene transcripts and efflux activities of four putative ABCB1 and ABCC transporters that were chemically induced but showed low specificity against model transporter substrates and inhibitors, thus preventing us from distinguishing between activities of different efflux transporter types. In this study we report on the specificity of induction of ABC transporters and of the stress protein hsp70 in clones selected to be genetically resistant to ABCB1 chemical substrates. Clones resistant to mitoxantrone, ivermectin and pentachlorophenol showed distinctive transcriptional responses of transporter protein coding genes and of putative transporter dye activities. Expression of hsp70 proteins also varied across resistant clones. Clones resistant to mitoxantrone and pentachlorophenol showed high constitutive levels of hsp70. Transcriptional levels of the abcb1 gene transporter and of putative dye transporter activity were also induced to a greater extent in the pentachlorophenol resistant clone. The higher dye transporter activities observed in individuals from clones resistant to mitoxantrone and ivermectin were unrelated to the transcriptional levels of the four abcc and abcb1 transporter genes studied. These findings suggest that Abcb1 induction in D. magna may be a part of a general cellular stress response. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Uncovering the functional constraints underlying the genomic organization of the odorant-binding protein genes.

    PubMed

    Librado, Pablo; Rozas, Julio

    2013-01-01

    Animal olfactory systems have a critical role for the survival and reproduction of individuals. In insects, the odorant-binding proteins (OBPs) are encoded by a moderately sized gene family, and mediate the first steps of the olfactory processing. Most OBPs are organized in clusters of a few paralogs, which are conserved over time. Currently, the biological mechanism explaining the close physical proximity among OBPs is not yet established. Here, we conducted a comprehensive study aiming to gain insights into the mechanisms underlying the OBP genomic organization. We found that the OBP clusters are embedded within large conserved arrangements. These organizations also include other non-OBP genes, which often encode proteins integral to plasma membrane. Moreover, the conservation degree of such large clusters is related to the following: 1) the promoter architecture of the confined genes, 2) a characteristic transcriptional environment, and 3) the chromatin conformation of the chromosomal region. Our results suggest that chromatin domains may restrict the location of OBP genes to regions having the appropriate transcriptional environment, leading to the OBP cluster structure. However, the appropriate transcriptional environment for OBP and the other neighbor genes is not dominated by reduced levels of expression noise. Indeed, the stochastic fluctuations in the OBP transcript abundance may have a critical role in the combinatorial nature of the olfactory coding process.

  8. Bioinformatics of prokaryotic RNAs

    PubMed Central

    Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

    2014-01-01

    The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes, and the need to characterize their non-coding components as well, depend heavily on computational methods and workflows, many of which have been developed or at least adapted specifically for use with bacterial and archaeal data. This review provides an overview of the state of the art of RNA bioinformatics, focusing on applications to prokaryotes. PMID:24755880

  9. The Long Noncoding RNA Landscape of the Mouse Eye.

    PubMed

    Chen, Weiwei; Yang, Shuai; Zhou, Zhonglou; Zhao, Xiaoting; Zhong, Jiayun; Reinach, Peter S; Yan, Dongsheng

    2017-12-01

    Long noncoding RNAs (lncRNAs) are important regulators of diverse biological functions. However, an extensive in-depth analysis of their expression profile and function in mammalian eyes is still lacking. Here we describe comprehensive landscapes of stage-dependent and tissue-specific lncRNA expression in the mouse eye. Affymetrix transcriptome array profiled lncRNA signatures from six different ocular tissue subsets (i.e., cornea, lens, retina, RPE, choroid, and sclera) in newborn and 8-week-old mice. Quantitative RT-PCR analysis validated array findings. Cis analyses and Gene Ontology (GO) annotation of protein-coding genes adjacent to signature lncRNA loci clarified potential lncRNA roles in maintaining tissue identity and regulating eye maturation during the aforementioned phase. In newborn and 8-week-old mice, we identified 47,332 protein-coding and noncoding gene transcripts. LncRNAs comprise 19,313 of these transcripts annotated in public data banks. During this maturation phase of these six different tissue subsets, the expression levels of more than 1000 lncRNAs underwent ≥2-fold changes. qRT-PCR analysis confirmed part of the gene microarray analysis results. K-means clustering identified 910 lncRNAs in the P0 groups and 686 lncRNAs in the postnatal 8-week-old groups, suggesting distinct tissue-specific lncRNA clusters. GO analysis of protein-coding genes proximal to lncRNA signatures resolved close correlations with their tissue-specific functional maturation between P0 and 8 weeks of age in the 6 tissue subsets. Characterizing maturational changes in lncRNA expression patterns as well as tissue-specific lncRNA signatures in six ocular tissues suggests important contributions made by lncRNA to the control of developmental processes in the mouse eye.

  10. Bacteriophage 5' untranslated regions for control of plastid transgene expression.

    PubMed

    Yang, Huijun; Gray, Benjamin N; Ahner, Beth A; Hanson, Maureen R

    2013-02-01

    Expression of foreign proteins from transgenes incorporated into plastid genomes requires regulatory sequences that can be recognized by the plastid transcription and translation machinery. Translation signals harbored by the 5' untranslated region (UTR) of plastid transcripts can profoundly affect the level of accumulation of proteins expressed from chimeric transgenes. Both endogenous 5' UTRs and the bacteriophage T7 gene 10 (T7g10) 5' UTR have been found to be effective in combination with particular coding regions to mediate high-level expression of foreign proteins. We investigated whether two other bacteriophage 5' UTRs could be utilized in plastid transgenes by fusing them to the aadA (aminoglycoside-3'-adenyltransferase) coding region that is commonly used as a selectable marker in plastid transformation. Transplastomic plants containing either the T7g1.3 or T4g23 5' UTRs fused to Myc-epitope-tagged aadA were successfully obtained, demonstrating the ability of these 5' UTRs to regulate gene expression in plastids. Placing the Thermobifida fusca cel6A gene under the control of the T7g1.3 or T4g23 5' UTRs, along with a tetC downstream box, resulted in poor expression of the cellulase in contrast with high-level accumulation while using the T7g10 5' UTR. However, transplastomic plants with the bacteriophage 5' UTRs controlling the aadA coding region exhibited fewer undesired recombinant species than plants containing the same marker gene regulated by the Nicotiana tabacum psbA 5' UTR. Furthermore, expression of the T7g1.3 and T4g23 5' UTR::aadA fusions downstream of the cel6A gene provided sufficient spectinomycin resistance to allow selection of homoplasmic transgenic plants and had no effect on Cel6A accumulation.

  11. Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene.

    PubMed

    Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil

    2007-11-29

    Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single copy genes, with duplicate genes free from such constraints.

  12. Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene

    PubMed Central

    Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil

    2007-01-01

    Background Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Results Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. Conclusion The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single copy genes, with duplicate genes free from such constraints. PMID:18047649

  13. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Yongyan; Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi; Ai, Zhiying

    2013-10-15

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR.

  14. Complex Interplay among DNA Modification, Noncoding RNA Expression and Protein-Coding RNA Expression in Salvia miltiorrhiza Chloroplast Genome

    PubMed Central

    Chen, Haimei; Zhang, Jianhui; Yuan, George; Liu, Chang

    2014-01-01

    Salvia miltiorrhiza is one of the most widely used medicinal plants. As a first step to develop a chloroplast-based genetic engineering method for the over-production of active components from S. miltiorrhiza, we have analyzed the genome, transcriptome, and base modifications of the S. miltiorrhiza chloroplast. Total genomic DNA and RNA were extracted from fresh leaves and then subjected to strand-specific RNA-Seq and Single-Molecule Real-Time (SMRT) sequencing analyses. Mapping the RNA-Seq reads to the genome assembly allowed us to determine the relative expression levels of 80 protein-coding genes. In addition, we identified 19 polycistronic transcription units and 136 putative antisense and intergenic noncoding RNA (ncRNA) genes. Comparison of the abundance of protein-coding transcripts (cRNA) with and without overlapping antisense ncRNAs (asRNA) suggests that the presence of asRNA is associated with increased cRNA abundance (p<0.05). Using the SMRT Portal software (v1.3.2), 2687 potential DNA modification sites and two potential DNA modification motifs were predicted. The two motifs include a TATA box–like motif (CPGDMM1, “TATANNNATNA”), and an unknown motif (CPGDMM2 “WNYANTGAW”). Specifically, 35 of the 97 CPGDMM1 motifs (36.1%) and 91 of the 369 CPGDMM2 motifs (24.7%) were found to be significantly modified (p<0.01). Analysis of genes downstream of the CPGDMM1 motif revealed the significantly increased abundance of ncRNA genes that are less than 400 bp away from the significantly modified CPGDMM1 motif (p<0.01). Taken together, the present study revealed a complex interplay among DNA modifications, ncRNA and cRNA expression in the chloroplast genome. PMID:24914614
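
    The two motifs above are written in IUPAC degenerate nucleotide notation (N = any base, W = A or T, Y = C or T). The following is a minimal sketch of how such motifs can be located in a sequence by translating them into regular expressions. The function names and the test sequences are hypothetical, only the forward strand is scanned, and this is not the method used by the SMRT Portal software.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_pattern(motif):
    """Translate an IUPAC motif such as 'TATANNNATNA' into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) matches.

    Only the forward strand is scanned; a lookahead is used so that
    overlapping occurrences are not skipped.
    """
    pattern = re.compile("(?=(" + motif_to_pattern(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(find_motif("GGTATACCCATGAGG", "TATANNNATNA"))
    print(find_motif("ACTAGTGAT", "WNYANTGAW"))
```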

  15. Complex interplay among DNA modification, noncoding RNA expression and protein-coding RNA expression in Salvia miltiorrhiza chloroplast genome.

    PubMed

    Chen, Haimei; Zhang, Jianhui; Yuan, George; Liu, Chang

    2014-01-01

    Salvia miltiorrhiza is one of the most widely used medicinal plants. As a first step to develop a chloroplast-based genetic engineering method for the over-production of active components from S. miltiorrhiza, we have analyzed the genome, transcriptome, and base modifications of the S. miltiorrhiza chloroplast. Total genomic DNA and RNA were extracted from fresh leaves and then subjected to strand-specific RNA-Seq and Single-Molecule Real-Time (SMRT) sequencing analyses. Mapping the RNA-Seq reads to the genome assembly allowed us to determine the relative expression levels of 80 protein-coding genes. In addition, we identified 19 polycistronic transcription units and 136 putative antisense and intergenic noncoding RNA (ncRNA) genes. Comparison of the abundance of protein-coding transcripts (cRNA) with and without overlapping antisense ncRNAs (asRNA) suggests that the presence of asRNA is associated with increased cRNA abundance (p<0.05). Using the SMRT Portal software (v1.3.2), 2687 potential DNA modification sites and two potential DNA modification motifs were predicted. The two motifs include a TATA box-like motif (CPGDMM1, "TATANNNATNA"), and an unknown motif (CPGDMM2 "WNYANTGAW"). Specifically, 35 of the 97 CPGDMM1 motifs (36.1%) and 91 of the 369 CPGDMM2 motifs (24.7%) were found to be significantly modified (p<0.01). Analysis of genes downstream of the CPGDMM1 motif revealed the significantly increased abundance of ncRNA genes that are less than 400 bp away from the significantly modified CPGDMM1 motif (p<0.01). Taken together, the present study revealed a complex interplay among DNA modifications, ncRNA and cRNA expression in the chloroplast genome.

  16. A deep transcriptomic resource for the copepod crustacean Labidocera madurae: A potential indicator species for assessing near shore ecosystem health

    PubMed Central

    Christie, Andrew E.; Sommer, Stephanie A.; Cieslak, Matthew C.; Hartline, Daniel K.; Lenz, Petra H.

    2017-01-01

    Coral reef ecosystems of many sub-tropical and tropical marine coastal environments have suffered significant degradation from anthropogenic sources. Research to inform management strategies that mitigate stressors and promote a healthy ecosystem has focused on the ecology and physiology of coral reefs and associated organisms. Few studies focus on the surrounding pelagic communities, which are equally important to ecosystem function. Zooplankton, often dominated by small crustaceans such as copepods, is an important food source for invertebrates and fishes, especially larval fishes. The reef-associated zooplankton includes a sub-neustonic copepod family that could serve as an indicator species for the community. Here, we describe the generation of a de novo transcriptome for one such copepod, Labidocera madurae, a pontellid from an intensively-studied coral reef ecosystem, Kāne‘ohe Bay, Oahu, Hawai‘i. The transcriptome was assembled using high-throughput sequence data obtained from whole organisms. It comprised 211,002 unique transcripts, including 72,391 with coding regions. It was assessed for quality and completeness using multiple workflows. Bench-marking-universal-single-copy-orthologs (BUSCO) analysis identified transcripts for 88% of expected eukaryotic core proteins. Targeted gene-discovery analyses included searches for transcripts coding full-length “giant” proteins (>4,000 amino acids), proteins and splice variants of voltage-gated sodium channels, and proteins involved in the circadian signaling pathway. Four different reference transcriptomes were generated and compared for the detection of differential gene expression between copepodites and adult females; 6,229 genes were consistently identified as differentially expressed between the two regardless of reference. Automated bioinformatics analyses and targeted manual gene curation suggest that the de novo assembled L. madurae transcriptome is of high quality and completeness. 
    This transcriptome provides a new resource for assessing the global physiological status of a planktonic species inhabiting a coral reef ecosystem that is subjected to multiple anthropogenic stressors. The workflows provide a template for generating and assessing transcriptomes in other non-model species. PMID:29065152

  17. A deep transcriptomic resource for the copepod crustacean Labidocera madurae: A potential indicator species for assessing near shore ecosystem health.

    PubMed

    Roncalli, Vittoria; Christie, Andrew E; Sommer, Stephanie A; Cieslak, Matthew C; Hartline, Daniel K; Lenz, Petra H

    2017-01-01

    Coral reef ecosystems of many sub-tropical and tropical marine coastal environments have suffered significant degradation from anthropogenic sources. Research to inform management strategies that mitigate stressors and promote a healthy ecosystem has focused on the ecology and physiology of coral reefs and associated organisms. Few studies focus on the surrounding pelagic communities, which are equally important to ecosystem function. Zooplankton, often dominated by small crustaceans such as copepods, is an important food source for invertebrates and fishes, especially larval fishes. The reef-associated zooplankton includes a sub-neustonic copepod family that could serve as an indicator species for the community. Here, we describe the generation of a de novo transcriptome for one such copepod, Labidocera madurae, a pontellid from an intensively-studied coral reef ecosystem, Kāne'ohe Bay, Oahu, Hawai'i. The transcriptome was assembled using high-throughput sequence data obtained from whole organisms. It comprised 211,002 unique transcripts, including 72,391 with coding regions. It was assessed for quality and completeness using multiple workflows. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis identified transcripts for 88% of expected eukaryotic core proteins. Targeted gene-discovery analyses included searches for transcripts coding full-length "giant" proteins (>4,000 amino acids), proteins and splice variants of voltage-gated sodium channels, and proteins involved in the circadian signaling pathway. Four different reference transcriptomes were generated and compared for the detection of differential gene expression between copepodites and adult females; 6,229 genes were consistently identified as differentially expressed between the two regardless of reference. Automated bioinformatics analyses and targeted manual gene curation suggest that the de novo assembled L. madurae transcriptome is of high quality and completeness. This transcriptome provides a new resource for assessing the global physiological status of a planktonic species inhabiting a coral reef ecosystem that is subjected to multiple anthropogenic stressors. The workflows provide a template for generating and assessing transcriptomes in other non-model species.

  18. Differential Gene Expression at Coral Settlement and Metamorphosis - A Subtractive Hybridization Study

    PubMed Central

    Hayward, David C.; Hetherington, Suzannah; Behm, Carolyn A.; Grasso, Lauretta C.; Forêt, Sylvain; Miller, David J.; Ball, Eldon E.

    2011-01-01

    Background A successful metamorphosis from a planktonic larva to a settled polyp, which under favorable conditions will establish a future colony, is critical for the survival of corals. However, in contrast to the situation in other animals, e.g., frogs and insects, little is known about the molecular basis of coral metamorphosis. We have begun to redress this situation with previous microarray studies, but there is still a great deal to learn. In the present paper we have utilized a different technology, subtractive hybridization, to characterize genes differentially expressed across this developmental transition and to compare the success of this method to microarray. Methodology/Principal Findings Suppressive subtractive hybridization (SSH) was used to identify two pools of transcripts from the coral, Acropora millepora. One is enriched for transcripts expressed at higher levels at the pre-settlement stage, and the other for transcripts expressed at higher levels at the post-settlement stage. Virtual northern blots were used to demonstrate the efficacy of the subtractive hybridization technique. Both pools contain transcripts coding for proteins in various functional classes but transcriptional regulatory proteins were represented more frequently in the post-settlement pool. Approximately 18% of the transcripts showed no significant similarity to any other sequence on the public databases. Transcripts of particular interest were further characterized by in situ hybridization, which showed that many are regulated spatially as well as temporally. Notably, many transcripts exhibit axially restricted expression patterns that correlate with the pool from which they were isolated. Several transcripts are expressed in patterns consistent with a role in calcification. Conclusions We have characterized over 200 transcripts that are differentially expressed between the planula larva and post-settlement polyp of the coral, Acropora millepora. Sequence, putative function, and in some cases temporal and spatial expression are reported. PMID:22065994

  19. Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

    PubMed Central

    Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

    2011-01-01

    Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetra-nucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for the generation of functional variation of expressed proteins, which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
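
    As an illustration of the kind of SSR screen this abstract describes, the following is a minimal sketch and not the authors' pipeline. The function name `find_ssrs`, the thresholds (`min_repeats`, `max_unit`) and the toy sequence are all assumptions chosen for the example; the abstract does not state which tool or cut-offs were used. It scans a DNA string for perfect tandem repeats and tallies them by repeat-unit length, so that tri- and tetra-nucleotide counts can be compared.

```python
import re
from collections import Counter


def find_ssrs(seq, min_repeats=3, max_unit=6):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    min_repeats and max_unit are illustrative thresholds, not values
    taken from the study.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        # A unit of length k followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are repeats of a shorter unit (e.g. "ATAT").
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return hits


# Hypothetical toy sequence: one tri-nucleotide and one tetra-nucleotide SSR.
toy = "CGGATGATGATCTTAGTTAGTTAGTTAGCA"
ssrs = find_ssrs(toy)
by_unit_length = Counter(len(unit) for _, unit, _ in ssrs)
```

    Comparing `by_unit_length` against counts from shuffled sequences would be one way to approach the over- and under-representation question, although the abstract does not specify the random model it used.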

  20. A Comparative Encyclopedia of DNA Elements in the Mouse Genome

    PubMed Central

    Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing

    2014-01-01

    Summary As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824

  1. A comparative encyclopedia of DNA elements in the mouse genome.

    PubMed

    Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing

    2014-11-20

    The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.

  2. G = MAT: linking transcription factor expression and DNA binding data.

    PubMed

    Tretyakov, Konstantin; Laur, Sven; Vilo, Jaak

    2011-01-31

    Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/.
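
    The abstract only says that the method is "a linear model that combines sequence information with expression data". The sketch below reads the title literally as a matrix relation G ≈ M·A·T, where G is a genes × arrays expression matrix, M a genes × motifs matrix, T a transcription factors × arrays expression matrix, and A the motif-to-factor association matrix to be estimated. That reading, the matrix shapes, the synthetic data and the pseudoinverse estimator are all assumptions made for illustration; the authors describe "various methods for model parameter estimation" that are not reproduced here.

```python
import numpy as np


def estimate_association(G, M, T):
    """Least-squares estimate of A in G ≈ M @ A @ T via pseudoinverses.

    This is a generic estimator chosen for the sketch, not necessarily
    one of the estimators evaluated in the paper.
    """
    return np.linalg.pinv(M) @ G @ np.linalg.pinv(T)


# Synthetic data with hypothetical dimensions.
rng = np.random.default_rng(0)
n_genes, n_motifs, n_tfs, n_arrays = 50, 4, 3, 20
M = rng.integers(0, 2, size=(n_genes, n_motifs)).astype(float)  # motif presence
T = rng.normal(size=(n_tfs, n_arrays))                          # TF expression
A_true = rng.normal(size=(n_motifs, n_tfs))                     # hidden associations
G = M @ A_true @ T + 0.01 * rng.normal(size=(n_genes, n_arrays))

A_hat = estimate_association(G, M, T)
```

    On simulated data of this kind, large entries of `A_hat` would be read as putative factor-to-motif associations; real data would additionally need regularization or significance testing, which the sketch omits.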

  3. G = MAT: Linking Transcription Factor Expression and DNA Binding Data

    PubMed Central

    Tretyakov, Konstantin; Laur, Sven; Vilo, Jaak

    2011-01-01

    Transcription factors are proteins that bind to motifs on the DNA and thus affect gene expression regulation. The qualitative description of the corresponding processes is therefore important for a better understanding of essential biological mechanisms. However, wet lab experiments targeted at the discovery of the regulatory interplay between transcription factors and binding sites are expensive. We propose a new, purely computational method for finding putative associations between transcription factors and motifs. This method is based on a linear model that combines sequence information with expression data. We present various methods for model parameter estimation and show, via experiments on simulated data, that these methods are reliable. Finally, we examine the performance of this model on biological data and conclude that it can indeed be used to discover meaningful associations. The developed software is available as a web tool and Scilab source code at http://biit.cs.ut.ee/gmat/. PMID:21297945

  4. Genome-wide analysis of the DNA-binding with one zinc finger (Dof) transcription factor family in bananas.

    PubMed

    Dong, Chen; Hu, Huigang; Xie, Jianghui

    2016-12-01

    DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.

  5. A class of circadian long non-coding RNAs mark enhancers modulating long-range circadian gene regulation

    PubMed Central

    Fan, Zenghua; Zhao, Meng; Joshi, Parth D.; Li, Ping; Zhang, Yan; Guo, Weimin; Xu, Yichi; Wang, Haifang; Zhao, Zhihu

    2017-01-01

    Abstract Circadian rhythm exerts its influence on animal physiology and behavior by regulating gene expression at various levels. Here we systematically explored circadian long non-coding RNAs (lncRNAs) in mouse liver and examined their circadian regulation. We found that a significant proportion of circadian lncRNAs are expressed at enhancer regions, mostly bound by two key circadian transcription factors, BMAL1 and REV-ERBα. These circadian lncRNAs showed circadian phases similar to those of their nearby genes. The extent of their nuclear localization is higher than that of protein-coding genes but lower than that of enhancer RNAs. The association between enhancer and circadian lncRNAs is also observed in tissues other than liver. Comparative analysis between mouse and rat circadian liver transcriptomes showed that circadian transcription at lncRNA loci tends to be conserved despite low sequence conservation of lncRNAs. One such circadian lncRNA termed lnc-Crot led us to identify a super-enhancer region interacting with a cluster of genes involved in circadian regulation of metabolism through long-range interactions. Further experiments showed that the lnc-Crot locus has enhancer function independent of lnc-Crot's transcription. Our results suggest that the enhancer-associated circadian lncRNAs mark the genomic loci modulating long-range circadian gene regulation and shed new light on the evolutionary origin of lncRNAs. PMID:28335007

  6. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nylund, Stian; Karlsen, Marius; Nylund, Are

    2008-03-30

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae.

  7. P-class pentatricopeptide repeat proteins are required for efficient 5′ end formation of plant mitochondrial transcripts

    PubMed Central

    Binder, Stefan; Stoll, Katrin; Stoll, Birgit

    2013-01-01

    It is well recognized that flowering plants maintain a particularly broad spectrum of factors to support gene expression in mitochondria. Many of these factors are pentatricopeptide repeat (PPR) proteins that participate in virtually all processes dealing with RNA. One of these processes is the post-transcriptional generation of mature 5′ termini of RNA. Several PPR proteins are required for efficient 5′ maturation of mitochondrial mRNA and rRNA. These so-called RNA PROCESSING FACTORs (RPF) exclusively represent P-class PPR proteins, mainly composed of canonical PPR motifs without any extra domains. Applying the recent PPR-nucleotide recognition code, binding sites of RPF are predicted on the 5′ leader sequences. The sequence-specific interaction of an RPF with one or a few RNA substrates probably directly or indirectly recruits an as-yet-unidentified endonuclease to the processing site(s). The identification and characterization of RPF is a major step toward the understanding of the role of 5′ end maturation in flowering plant mitochondria. PMID:24184847

  8. Bijective transformation circular codes and nucleotide exchanging RNA transcription.

    PubMed

    Michel, Christian J; Seligmann, Hervé

    2014-04-01

    The C(3) self-complementary circular code X identified in genes of prokaryotes and eukaryotes is a set of 20 trinucleotides enabling reading frame retrieval and maintenance, i.e. a framing code (Arquès and Michel, 1996; Michel, 2012, 2013). Some mitochondrial RNAs correspond to DNA sequences when RNA transcription systematically exchanges between nucleotides (Seligmann, 2013a,b). We study here the 23 bijective transformation codes ΠX of X which may code nucleotide exchanging RNA transcription as suggested by this mitochondrial observation. The 23 bijective transformation codes ΠX are C(3) trinucleotide circular codes, seven of them are also self-complementary. Furthermore, several correlations are observed between the Reading Frame Retrieval (RFR) probability of bijective transformation codes ΠX and the different biological properties of ΠX related to their numbers of RNAs in GenBank's EST database, their polymerization rate, their number of amino acids and the chirality of amino acids they code. Results suggest that the circular code X with the functions of reading frame retrieval and maintenance in regular RNA transcription, may also have, through its bijective transformation codes ΠX, the same functions in nucleotide exchanging RNA transcription. Associations with properties such as amino acid chirality suggest that the RFR of X and its bijective transformations molded the origins of the genetic code's machinery. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
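
    Two of the counts in this abstract can be checked with a short enumeration, sketched below under stated assumptions. The 20 trinucleotides assigned to `X` are written from the commonly cited form of the code and should be verified against Arquès and Michel (1996); the helper names are hypothetical. The sketch shows that the four nucleotides admit 4! − 1 = 23 non-identity bijections, and that exactly seven of those commute with the complementarity map (A↔T, C↔G), which guarantees that they send a self-complementary code to a self-complementary code. The abstract does not state that its seven self-complementary codes arise this way, and the sketch does not test circularity or the C(3) property.

```python
from itertools import permutations

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumed listing of the circular code X (20 trinucleotides); verify before reuse.
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def reverse_complement(word):
    return "".join(COMPLEMENT[b] for b in reversed(word))


def is_self_complementary(code):
    """True if the code contains the reverse complement of each of its words."""
    return all(reverse_complement(w) in code for w in code)


def transform(code, mapping):
    """Apply a nucleotide bijection letter by letter to every trinucleotide."""
    return frozenset("".join(mapping[b] for b in w) for w in code)


def commutes_with_complement(mapping):
    return all(mapping[COMPLEMENT[b]] == COMPLEMENT[mapping[b]] for b in BASES)


# All bijections of {A, C, G, T} except the identity.
maps = [
    dict(zip(BASES, image))
    for image in permutations(BASES)
    if "".join(image) != BASES
]
commuting = [m for m in maps if commutes_with_complement(m)]
```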

  9. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences.

    PubMed

    Whitaker, Weston R; Lee, Hanson; Arkin, Adam P; Dueber, John E

    2015-03-20

    Genetic sequences ported into non-native hosts for synthetic biology applications can gain unexpected properties. In this study, we explored sequences functioning as ribosome binding sites (RBSs) within protein coding DNA sequences (CDSs) that cause internal translation, resulting in truncated proteins. Genome-wide prediction of bacterial RBSs, based on biophysical calculations employed by the RBS calculator, suggests a selection against internal RBSs within CDSs in Escherichia coli, but not those in Saccharomyces cerevisiae. Based on these calculations, silent mutations aimed at removing internal RBSs can effectively reduce truncation products from internal translation. However, a solution for complete elimination of internal translation initiation is not always feasible due to constraints of available coding sequences. Fluorescence assays and Western blot analysis showed that in genes with internal RBSs, increasing the strength of the intended upstream RBS had little influence on the internal translation strength. Another strategy to minimize truncated products from an internal RBS is to increase the relative strength of the upstream RBS with a concomitant reduction in promoter strength to achieve the same protein expression level. Unfortunately, lower transcription levels result in increased noise at the single cell level due to stochasticity in gene expression. At the low expression regimes desired for many synthetic biology applications, this problem becomes particularly pronounced. We found that balancing promoter strengths and upstream RBS strengths to intermediate levels can achieve the target protein concentration while avoiding both excessive noise and truncated protein.

  10. Transcriptional dissection of melanoma identifies a high-risk subtype underlying TP53 family genes and epigenome deregulation

    PubMed Central

    Badal, Brateil; Solovyov, Alexander; Di Cecilia, Serena; Chan, Joseph Minhow; Chang, Li-Wei; Iqbal, Ramiz; Aydin, Iraz T.; Rajan, Geena S.; Chen, Chen; Abbate, Franco; Arora, Kshitij S.; Tanne, Antoine; Gruber, Stephen B.; Johnson, Timothy M.; Fullen, Douglas R.; Phelps, Robert; Bhardwaj, Nina; Bernstein, Emily; Ting, David T.; Brunner, Georg; Schadt, Eric E.; Greenbaum, Benjamin D.; Celebi, Julide Tok

    2017-01-01

    BACKGROUND. Melanoma is a heterogeneous malignancy. We set out to identify the molecular underpinnings of high-risk melanomas, those that are likely to progress rapidly, metastasize, and result in poor outcomes. METHODS. We examined transcriptome changes from benign states to early-, intermediate-, and late-stage tumors using a set of 78 treatment-naive melanocytic tumors consisting of primary melanomas of the skin and benign melanocytic lesions. We utilized a next-generation sequencing platform that enabled a comprehensive analysis of protein-coding and -noncoding RNA transcripts. RESULTS. Gene expression changes unequivocally discriminated between benign and malignant states, and a dual epigenetic and immune signature emerged defining this transition. To our knowledge, we discovered previously unrecognized melanoma subtypes. A high-risk primary melanoma subset was distinguished by a 122-epigenetic gene signature (“epigenetic” cluster) and TP53 family gene deregulation (TP53, TP63, and TP73). This subtype associated with poor overall survival and showed enrichment of cell cycle genes. Noncoding repetitive element transcripts (LINEs, SINEs, and ERVs) that can result in immunostimulatory signals recapitulating a state of “viral mimicry” were significantly repressed. The high-risk subtype and its poor predictive characteristics were validated in several independent cohorts. Additionally, primary melanomas distinguished by specific immune signatures (“immune” clusters) were identified. CONCLUSION. The TP53 family of genes and genes regulating the epigenetic machinery demonstrate strong prognostic and biological relevance during progression of early disease. Gene expression profiling of protein-coding and -noncoding RNA transcripts may be a better predictor for disease course in melanoma. This study outlines the transcriptional interplay of the cancer cell’s epigenome with the immune milieu with potential for future therapeutic targeting. FUNDING. National Institutes of Health (CA154683, CA158557, CA177940, CA087497-13), Tisch Cancer Institute, Melanoma Research Foundation, the Dow Family Charitable Foundation, and the Icahn School of Medicine at Mount Sinai. PMID:28469092

  11. A Transcriptome Map of Actinobacillus pleuropneumoniae at Single-Nucleotide Resolution Using Deep RNA-Seq

    PubMed Central

    Su, Zhipeng; Zhu, Jiawen; Xu, Zhuofei; Xiao, Ran; Zhou, Rui; Li, Lu; Chen, Huanchun

    2016-01-01

    Actinobacillus pleuropneumoniae is the pathogen of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs were never reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures). The transcriptional units described in this study provide a foundation for future studies concerning the gene functions and the transcriptional regulatory architectures of this pathogen. PMID:27018591

  12. Molecular Phylogenetic and Expression Analysis of the Complete WRKY Transcription Factor Family in Maize

    PubMed Central

    Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

    2012-01-01

    The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance. PMID:22279089
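
    The stability criterion mentioned above (a coefficient of variation below 15%) can be illustrated with a small sketch. The transcript names and expression values are invented for the example, and the use of the sample standard deviation is an assumption, since the abstract does not say how the CV was computed or how a value of exactly 15% was treated.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def split_by_cv(expression, threshold=15.0):
    """Partition transcripts into stably and variably expressed sets."""
    stable, variable = [], []
    for name, values in expression.items():
        target = stable if coefficient_of_variation(values) < threshold else variable
        target.append(name)
    return stable, variable


# Hypothetical expression values across four tissues.
expression = {
    "tx_a": [10.0, 10.5, 9.8, 10.2],   # nearly constant
    "tx_b": [2.0, 8.0, 5.0, 12.0],     # strongly tissue-dependent
}
stable, variable = split_by_cv(expression)
```

    In the study's terms, transcripts landing in `variable` would be the candidates examined further for organ- or tissue-specific expression.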

  13. Molecular phylogenetic and expression analysis of the complete WRKY transcription factor family in maize.

    PubMed

    Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

    2012-04-01

    The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance.

  14. DNA sequence requirements for the accurate transcription of a protein-coding plastid gene in a plastid in vitro system from mustard (Sinapis alba L.)

    PubMed Central

    Link, Gerhard

    1984-01-01

    A nuclease-treated plastid extract from mustard (Sinapis alba L.) allows efficient transcription of cloned plastid DNA templates. In this in vitro system, the major runoff transcript of the truncated gene for the 32 000 mol. wt. photosystem II protein was accurately initiated from a site close to or identical with the in vivo start site. By using plasmids with deletions in the 5'-flanking region of this gene as templates, a DNA region required for efficient and selective initiation was detected ~28-35 nucleotides upstream of the transcription start site. This region contains the sequence element TTGACA, which matches the consensus sequence for prokaryotic '−35' promoter elements. In the absence of this region, a region ~13-27 nucleotides upstream of the start site still enables a basic level of specific transcription. This second region contains the sequence element TATATAA, which matches the consensus sequence for the 'TATA' box of genes transcribed by RNA polymerase II (or B). The region between the 'TATA'-like element and the transcription start site is not sufficient but may be required for specific transcription of the plastid gene. This latter region contains the sequence element TATACT, which resembles the prokaryotic '−10' (Pribnow) box. Based on the structural and transcriptional features of the 5' upstream region, a 'promoter switch' mechanism is proposed, which may account for the developmentally regulated expression of this plastid gene. PMID:16453540

  15. Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers

    PubMed Central

    Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Jerusalem, Guy

    2018-01-01

    Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers. PMID:29301303

  16. Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers.

    PubMed

    Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Josse, Claire; Jerusalem, Guy

    2018-01-02

    Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers.

  17. Genetic Variation Linked to Lung Cancer Survival in White Smokers | Center for Cancer Research

    Cancer.gov

    CCR investigators have discovered evidence that links lung cancer survival with genetic variations (called single nucleotide polymorphisms) in the MBL2 gene, a key player in innate immunity. The variations in the gene, which codes for a protein called the mannose-binding lectin, occur in its promoter region, where the RNA polymerase molecule binds to start transcription, and

  18. NRF2: Translating the Redox Code.

    PubMed

    Tummala, Krishna S; Kottakis, Filippos; Bardeesy, Nabeel

    2016-10-01

    Cancer requires mechanisms to mitigate reactive oxygen species (ROS) generated during rapid growth, such as induction of the antioxidant transcription factor, Nrf2. However, the targets of ROS-mediated cytotoxicity are unclear. Recent studies in pancreatic cancer show that redox control by Nrf2 prevents cysteine oxidation of the mRNA translational machinery, thereby supporting efficient protein synthesis.

  19. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity

    PubMed Central

    Yassour, Moran; Grabherr, Manfred; Blood, Philip D.; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D.; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N.; Henschel, Robert; LeDuc, Richard D.; Friedman, Nir; Regev, Aviv

    2013-01-01

    De novo assembly of RNA-Seq data allows us to study transcriptomes without the need for a genome sequence, such as in non-model organisms of ecological and evolutionary importance, cancer samples, or the microbiome. In this protocol, we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms. We also present Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples, and approaches to identify protein coding genes. In an included tutorial we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sf.net. PMID:23845962

  20. MetaPlotR: a Perl/R pipeline for plotting metagenes of nucleotide modifications and other transcriptomic sites.

    PubMed

    Olarerin-George, Anthony O; Jaffrey, Samie R

    2017-05-15

    An increasing number of studies are mapping protein binding and nucleotide modifications sites throughout the transcriptome. Often, these sites cluster in certain regions of the transcript, giving clues to their function. Hence, it is informative to summarize where in the transcript these sites occur. A metagene is a simple and effective tool for visualizing the distribution of sites along a simplified transcript model. In this work, we introduce MetaPlotR, a Perl/R pipeline for creating metagene plots. The code and associated tutorial are available at https://github.com/olarerin/metaPlotR.
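
    The metagene idea can be sketched in a few lines. The version below is written in Python, not MetaPlotR's Perl/R, and is not taken from that pipeline: the function names, the three-region transcript model (5' UTR, CDS, 3' UTR, each scaled to unit length), and the example coordinates are assumptions made for illustration.

```python
def metagene_coordinate(pos, utr5_len, cds_len, utr3_len):
    """Map a 0-based transcript position onto [0, 3).

    [0, 1) is the 5' UTR, [1, 2) the CDS, and [2, 3) the 3' UTR.
    """
    if pos < 0 or pos >= utr5_len + cds_len + utr3_len:
        raise ValueError("position lies outside the transcript")
    if pos < utr5_len:
        return pos / utr5_len
    pos -= utr5_len
    if pos < cds_len:
        return 1 + pos / cds_len
    pos -= cds_len
    return 2 + pos / utr3_len

def metagene_histogram(sites, bins_per_region=10):
    """Count sites per bin; each site is (pos, utr5_len, cds_len, utr3_len)."""
    counts = [0] * (3 * bins_per_region)
    for pos, utr5_len, cds_len, utr3_len in sites:
        coord = metagene_coordinate(pos, utr5_len, cds_len, utr3_len)
        index = min(int(coord * bins_per_region), len(counts) - 1)
        counts[index] += 1
    return counts

# Hypothetical sites on a transcript with a 100-nt 5' UTR, 300-nt CDS, 200-nt 3' UTR.
sites = [(50, 100, 300, 200), (100, 100, 300, 200), (399, 100, 300, 200)]
counts = metagene_histogram(sites)
```

    Scaling each region separately keeps transcripts with very different UTR and CDS lengths comparable; plotting `counts` as a bar chart or density gives the metagene profile.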

  1. Xenopus microRNA genes are predominantly located within introns and are differentially expressed in adult frog tissues via post-transcriptional regulation

    PubMed Central

    Tang, Guo-Qing; Maxwell, E. Stuart

    2008-01-01

    The amphibian Xenopus provides a model organism for investigating microRNA expression during vertebrate embryogenesis and development. Searching available Xenopus genome databases using known human pre-miRNAs as query sequences, more than 300 genes encoding 142 Xenopus tropicalis miRNAs were identified. Analysis of Xenopus tropicalis miRNA genes revealed a predominant positioning within introns of protein-coding and nonprotein-coding RNA Pol II-transcribed genes. MiRNA genes were also located in pre-mRNA exons and positioned intergenically between known protein-coding genes. Many miRNA species were found in multiple locations and in more than one genomic context. MiRNA genes were also clustered throughout the genome, indicating the potential for the cotranscription and coordinate expression of miRNAs located in a given cluster. Northern blot analysis confirmed the expression of many identified miRNAs in both X. tropicalis and X. laevis. Comparison of X. tropicalis and X. laevis blots revealed comparable expression profiles, although several miRNAs exhibited species-specific expression in different tissues. More detailed analysis revealed that for some miRNAs, the tissue-specific expression profile of the pri-miRNA precursor was distinctly different from that of the mature miRNA. Differential miRNA precursor processing in both the nucleus and cytoplasm was implicated in the observed tissue-specific differences. These observations indicated that post-transcriptional processing plays an important role in regulating miRNA expression in the amphibian Xenopus. PMID:18032731

  2. Novel features of a PIWI-like protein homolog in the parasitic protozoan Leishmania.

    PubMed

    Padmanabhan, Prasad K; Dumas, Carole; Samant, Mukesh; Rochette, Annie; Simard, Martin J; Papadopoulou, Barbara

    2012-01-01

    In contrast to nearly all eukaryotes, the Old World Leishmania species L. infantum and L. major lack the bona fide RNAi machinery genes. Interestingly, both Leishmania genomes code for an atypical Argonaute-like protein that possesses a PIWI domain but lacks the PAZ domain found in Argonautes from RNAi proficient organisms. Using sub-cellular fractionation and confocal fluorescence microscopy, we show that unlike other eukaryotes, the PIWI-like protein is mainly localized in the single mitochondrion in Leishmania. To predict PIWI function, we generated a knockout mutant for the PIWI gene in both L. infantum (Lin) and L. major species by double-targeted gene replacement. Depletion of PIWI has no effect on the viability of insect promastigote forms but leads to an important growth defect of the mammalian amastigote lifestage in vitro and significantly delays disease pathology in mice, consistent with a higher expression of the PIWI transcript in amastigotes. Moreover, amastigotes lacking PIWI display a higher sensitivity to apoptosis inducing agents than wild type parasites, suggesting that PIWI may be a sensor for apoptotic stimuli. Furthermore, a whole-genome DNA microarray analysis revealed that loss of LinPIWI in Leishmania amastigotes affects mostly the expression of specific subsets of developmentally regulated genes. Several transcripts encoding surface and membrane-bound proteins were found downregulated in the LinPIWI((-/-)) mutant whereas all histone transcripts were upregulated in the null mutant, supporting the possibility that PIWI plays a direct or indirect role in the stability of these transcripts. Although our data suggest that PIWI is not involved in the biogenesis or the stability of small noncoding RNAs, additional studies are required to gain further insights into the role of this protein on RNA regulation and amastigote development in Leishmania.

  3. Carbon source-dependent expansion of the genetic code in bacteria

    PubMed Central

    Prat, Laure; Heinemann, Ilka U.; Aerni, Hans R.; Rinehart, Jesse; O’Donoghue, Patrick; Söll, Dieter

    2012-01-01

    Despite the fact that the genetic code is known to vary between organisms in rare cases, it is believed that in the lifetime of a single cell the code is stable. We found Acetohalobium arabaticum cells grown on pyruvate genetically encode 20 amino acids, but in the presence of trimethylamine (TMA), A. arabaticum dynamically expands its genetic code to 21 amino acids including pyrrolysine (Pyl). A. arabaticum is the only known organism that modulates the size of its genetic code in response to its environment and energy source. The gene cassette pylTSBCD, required to biosynthesize and genetically encode UAG codons as Pyl, is present in the genomes of 24 anaerobic archaea and bacteria. Unlike archaeal Pyl-decoding organisms that constitutively encode Pyl, we observed that A. arabaticum controls Pyl encoding by down-regulating transcription of the entire Pyl operon under growth conditions lacking TMA, to the point where no detectable Pyl-tRNAPyl is made in vivo. Pyl-decoding archaea adapted to an expanded genetic code by minimizing TAG codon frequency to typically ∼5% of ORFs, whereas Pyl-decoding bacteria (∼20% of ORFs contain in-frame TAGs) regulate Pyl-tRNAPyl formation and translation of UAG by transcriptional deactivation of genes in the Pyl operon. We further demonstrate that Pyl encoding occurs in a bacterium that naturally encodes the Pyl operon, and identified Pyl residues by mass spectrometry in A. arabaticum proteins including two methylamine methyltransferases. PMID:23185002
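
    The "fraction of ORFs containing in-frame TAGs" statistic quoted above can be computed along the following lines. This is a minimal sketch with hypothetical function names and made-up ORFs; it assumes each ORF is given in frame from its start codon and treats a TAG in the final codon position as the ordinary stop, not as an internal codon.

```python
def has_inframe_tag(orf):
    """Return True if a TAG codon occurs in frame before the final codon."""
    orf = orf.upper()
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    return "TAG" in codons[:-1]

def fraction_with_inframe_tag(orfs):
    """Fraction of ORFs with at least one internal in-frame TAG codon."""
    if not orfs:
        return 0.0
    return sum(has_inframe_tag(orf) for orf in orfs) / len(orfs)

# Made-up ORFs (not from A. arabaticum):
orfs = [
    "ATGAAATAGGGCTAA",  # ATG AAA TAG GGC TAA -> internal in-frame TAG
    "ATGCTAGCCTAA",     # ATG CTA GCC TAA     -> TAG present but out of frame
    "ATGAAATAG",        # ATG AAA TAG         -> TAG only as the terminal codon
]
fraction = fraction_with_inframe_tag(orfs)
```

    Only the first ORF counts, so `fraction` is one third for this toy set.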

  4. Foxo3 activity promoted by non-coding effects of circular RNA and Foxo3 pseudogene in the inhibition of tumor growth and angiogenesis.

    PubMed

    Yang, W; Du, W W; Li, X; Yee, A J; Yang, B B

    2016-07-28

    It has recently been shown that the upregulation of a pseudogene specific to a protein-coding gene could function as a sponge to bind multiple potential targeting microRNAs (miRNAs), resulting in increased gene expression. Similarly, it was recently demonstrated that circular RNAs can function as sponges for miRNAs, and could upregulate expression of mRNAs containing an identical sequence. Furthermore, some mRNAs are now known to not only translate protein, but also function to sponge miRNA binding, facilitating gene expression. Collectively, these appear to be effective mechanisms to ensure gene expression and protein activity. Here we show that expression of a member of the forkhead family of transcription factors, Foxo3, is regulated by the Foxo3 pseudogene (Foxo3P), and Foxo3 circular RNA, both of which bind to eight miRNAs. We found that the ectopic expression of the Foxo3P, Foxo3 circular RNA and Foxo3 mRNA could all suppress tumor growth and cancer cell proliferation and survival. Our results showed that at least three mechanisms are used to ensure protein translation of Foxo3, which reflects an essential role of Foxo3 and its corresponding non-coding RNAs.

  5. Bioinformatic analysis of microRNA biogenesis and function related proteins in eleven animal genomes.

    PubMed

    Liu, Xiuying; Luo, GuanZheng; Bai, Xiujuan; Wang, Xiu-Jie

    2009-10-01

    MicroRNAs are approximately 22 nt long small non-coding RNAs that play important regulatory roles in eukaryotes. The biogenesis and functional processes of microRNAs require the participation of many proteins, of which the well-studied ones are Dicer, Drosha, Argonaute and Exportin 5. To systematically study these four protein families, we screened 11 animal genomes to search for genes encoding the above-mentioned proteins, and identified some new members for each family. Domain analysis results revealed that most proteins within the same family share identical or similar domains. Alternatively spliced transcript variants were found for some proteins. We also examined the expression patterns of these proteins in different human tissues and identified other proteins that could potentially interact with these proteins. These findings provided systematic information on the four key proteins involved in microRNA biogenesis and functional pathways in animals, and will shed light on further functional studies of these proteins.

  6. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
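
    To give a feel for how a new binding site can arise by local point mutations alone, here is a deliberately simplified toy simulation. It follows a single random sequence and applies one point mutation per step until a target motif appears; it does not model populations, fixation, or the authors' actual simulation. The function name, the motif, the sequence length, and the seed are all assumptions.

```python
import random

BASES = "ACGT"

def generations_until_site(motif, seq_len=200, max_generations=200_000, seed=0):
    """Mutate one random base per generation until `motif` appears.

    Returns (generation, sequence) when the motif is first present,
    or None if it never appears within `max_generations`.
    """
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(seq_len)]
    for generation in range(max_generations + 1):
        current = "".join(seq)
        if motif in current:
            return generation, current
        pos = rng.randrange(seq_len)
        # A point mutation always changes the base at the chosen position.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return None

result = generations_until_site("TGACTC", seed=1)
if result is not None:
    generation, sequence = result
    print(f"motif first present after {generation} point mutations")
```

    Repeating the run over many seeds and motif lengths would give a distribution of waiting times, which is the kind of quantity the abstract's "microevolutionary timescales" claim is about, although this toy ignores population size and drift.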

  7. Long non-coding RNAs in hepatocellular carcinoma: Potential roles and clinical implications

    PubMed Central

    Niu, Zhao-Shan; Niu, Xiao-Jun; Wang, Wen-Hong

    2017-01-01

    Long non-coding RNAs (lncRNAs) are a subgroup of non-coding RNA transcripts greater than 200 nucleotides in length with little or no protein-coding potential. Emerging evidence indicates that lncRNAs may play important regulatory roles in the pathogenesis and progression of human cancers, including hepatocellular carcinoma (HCC). Certain lncRNAs may be used as diagnostic or prognostic markers for HCC, a serious malignancy with increasing morbidity and high mortality rates worldwide. Therefore, elucidating the functional roles of lncRNAs in tumors can contribute to a better understanding of the molecular mechanisms of HCC and may help in developing novel therapeutic targets. In this review, we summarize the recent progress regarding the functional roles of lncRNAs in HCC and explore their clinical implications as diagnostic or prognostic biomarkers and molecular therapeutic targets for HCC. PMID:28932078

  8. Gene expression profiling via LongSAGE in a non-model plant species: a case study in seeds of Brassica napus

    PubMed Central

    Obermeier, Christian; Hosseini, Bashir; Friedt, Wolfgang; Snowdon, Rod

    2009-01-01

    Background Serial analysis of gene expression (LongSAGE) was applied for gene expression profiling in seeds of oilseed rape (Brassica napus ssp. napus). The usefulness of this technique for detailed expression profiling in a non-model organism was demonstrated for the highly complex, neither fully sequenced nor annotated genome of B. napus by applying a tag-to-gene matching strategy based on Brassica ESTs and the annotated proteome of the closely related model crucifer A. thaliana. Results Transcripts from 3,094 genes were detected at two time-points of seed development, 23 days and 35 days after pollination (DAP). Differential expression showed a shift from gene expression involved in diverse developmental processes including cell proliferation and seed coat formation at 23 DAP to more focussed metabolic processes including storage protein accumulation and lipid deposition at 35 DAP. The most abundant transcripts at 23 DAP were coding for diverse protease inhibitor proteins and proteases, including cysteine proteases involved in seed coat formation and a number of lipid transfer proteins involved in embryo pattern formation. At 35 DAP, transcripts encoding napin, cruciferin and oleosin storage proteins were most abundant. Over both time-points, 18.6% of the detected genes were matched by Brassica ESTs identified by LongSAGE tags in antisense orientation. This suggests a strong involvement of antisense transcript expression in regulatory processes during B. napus seed development. Conclusion This study underlines the potential of transcript tagging approaches for gene expression profiling in Brassica crop species via EST matching to annotated A. thaliana genes. Limits of tag detection for low-abundance transcripts can today be overcome by ultra-high throughput sequencing approaches, so that tag-based gene expression profiling may soon become the method of choice for global expression profiling in non-model species. PMID:19575793

  9. Transcriptome analysis of thermophilic methylotrophic Bacillus methanolicus MGA3 using RNA-sequencing provides detailed insights into its previously uncharted transcriptional landscape.

    PubMed

    Irla, Marta; Neshat, Armin; Brautaset, Trygve; Rückert, Christian; Kalinowski, Jörn; Wendisch, Volker F

    2015-02-14

    Bacillus methanolicus MGA3 is a thermophilic, facultative ribulose monophosphate (RuMP) cycle methylotroph. Together with its ability to produce high yields of amino acids, the relevance of this microorganism as a promising candidate for biotechnological applications is evident. The B. methanolicus MGA3 genome consists of a 3,337,035 nucleotides (nt) circular chromosome, the 19,174 nt plasmid pBM19 and the 68,999 nt plasmid pBM69. 3,218 protein-coding regions were annotated on the chromosome, 22 on pBM19 and 82 on pBM69. In the present study, the RNA-seq approach was used to comprehensively investigate the transcriptome of B. methanolicus MGA3 in order to improve the genome annotation, identify novel transcripts, analyze conserved sequence motifs involved in gene expression and reveal operon structures. For this aim, two different cDNA library preparation methods were applied: one which allows characterization of the whole transcriptome and another which includes enrichment of primary transcript 5'-ends. Analysis of the primary transcriptome data enabled the detection of 2,167 putative transcription start sites (TSSs) which were categorized into 1,642 TSSs located in the upstream region (5'-UTR) of known protein-coding genes and 525 TSSs of novel antisense, intragenic, or intergenic transcripts. Firstly, 14 wrongly annotated translation start sites (TLSs) were corrected based on primary transcriptome data. Further investigation of the identified 5'-UTRs resulted in the detailed characterization of their length distribution and the detection of 75 hitherto unknown cis-regulatory RNA elements. Moreover, the exact TSSs positions were utilized to define conserved sequence motifs for translation start sites, ribosome binding sites and promoters in B. methanolicus MGA3. Based on the whole transcriptome data set, novel transcripts, operon structures and mRNA abundances were determined. The analysis of the operon structures revealed that almost half of the genes are transcribed monocistronically (940), whereas 1,164 genes are organized in 381 operons. Several of the genes related to methylotrophy had highly abundant transcripts. The extensive insights into the transcriptional landscape of B. methanolicus MGA3, gained in this study, represent a valuable foundation for further comparative quantitative transcriptome analyses and possibly also for the development of molecular biology tools which at present are very limited for this organism.

  10. Physical structure and chromosomal localization of a gene encoding human p58[sup clk-1], a cell division control related protein kinase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eipers, P.G.

    1992-01-01

    The gene for the human p58[sup clk-1] protein kinase, a cell division control-related gene, has been mapped by somatic cell hybrid analyses, in situ localization with the chromosomal gene, and nested polymerase chain reaction amplification of microdissected chromosomes. These studies indicate that the expressed p58[sup clk-1] chromosomal gene maps to 1p36, while a highly related p58[sup clk-1] sequence of unknown nature maps to chromosome 15. Assignment of a p34[sup cdc2]-related gene to 1p36 region, including neuroblastoma, ductal carcinoma of the breast, malignant melanoma, Merkel cell carcinoma and endocrine neoplasia among others. Aberrant expression of this protein kinase negatively regulates normal cellular growth. The p58[sup clk-1] protein contains a central domain of 299 amino acids that is 46% identical to human p34[sup cdc2], the master mitotic protein kinase. This dissertation details the complete structure of the p58[sup clk-1] chromosomal gene, including its putative promoter region, transcriptional start sites, exonic sequences, and intron/exon boundary sequences. The gene is 10 kb in size and contains 12 exons and 11 introns. Interestingly, the rather large 2.0 kb 3' untranslated region is interrupted by an intron that separates a region containing numerous AUUUA destabilization motifs from the coding region. Furthermore, the expression of this gene in normal human tissues, as well as several human tumor cell samples and lines, is examined. The origin of multiple human transcripts from the same chromosomal gene, and the possible differential stability of these various transcripts, is discussed with regard to the transcriptional and post-transcriptional regulation of this gene. This is the first report of the chromosomal gene structure of a member of the p34[sup cdc2] supergene family.

  11. Characterization of regulatory elements within the coat protein (CP) coding region of Tobacco mosaic virus affecting subgenomic transcription and green fluorescent protein expression from the CP subgenomic RNA promoter.

    PubMed

    Man, Michal; Epel, Bernard L

    2004-06-01

    A replicon based on Tobacco mosaic virus that was engineered to express the open reading frame (ORF) of the green fluorescent protein (GFP) gene in place of the native coat protein (CP) gene from a minimal CP subgenomic (sg) RNA promoter was found to accumulate very low levels of GFP. Regulatory regions within the CP ORF were identified that, when presented as untranslated regions flanking the GFP ORF, enhanced or inhibited sg transcription and GFP expression. Full GFP expression from the CP sgRNA promoter required more than the first 20 nt of the CP ORF but not beyond the first 56 nt. Further analysis indicated the presence of an enhancer element between nt +25 and +55 with respect to the CP translation start site. The inclusion of this enhancer sequence upstream of the GFP ORF led to elevated sg transcription and to a 50-fold increase in GFP accumulation in comparison with a minimal CP promoter in which the entire CP ORF was displaced by the GFP ORF. Inclusion of the 3'-terminal 22 nt had a minor positive effect on GFP accumulation, but the addition of extended untranslated sequences from the 3' terminus of the CP ORF downstream of the GFP ORF was basically found to inhibit sg transcription. Secondary structure analysis programs predicted the CP sgRNA promoter to reside within two stable stem-loop structures, which are followed by an enhancer region.

  12. RNA-Binding Proteins in Trichomonas vaginalis: Atypical Multifunctional Proteins Involved in a Posttranscriptional Iron Regulatory Mechanism

    PubMed Central

    Figueroa-Angulo, Elisa E.; Calla-Choque, Jaeson S.; Mancilla-Olea, Maria Inocente; Arroyo, Rossana

    2015-01-01

    Iron homeostasis is highly regulated in vertebrates through a regulatory system mediated by RNA-protein interactions between the iron regulatory proteins (IRPs) that interact with an iron responsive element (IRE) located in certain mRNAs, dubbed the IRE-IRP regulatory system. Trichomonas vaginalis, the causal agent of trichomoniasis, presents high iron dependency to regulate its growth, metabolism, and virulence properties. Although T. vaginalis lacks IRPs or proteins with aconitase activity, it possesses gene expression mechanisms of iron regulation at the transcriptional and posttranscriptional levels. However, only one gene with iron regulation at the transcriptional level has been described. Recently, our research group described an iron posttranscriptional regulatory mechanism in the T. vaginalis tvcp4 and tvcp12 cysteine proteinase mRNAs. The tvcp4 and tvcp12 mRNAs have a stem-loop structure in the 5'-coding region or in the 3'-UTR, respectively, that interacts with T. vaginalis multifunctional proteins HSP70, α-Actinin, and Actin under iron starvation conditions, causing translation inhibition or mRNA stabilization similar to the previously characterized IRE-IRP system in eukaryotes. Herein, we summarize recent progress and shed some light on atypical RNA-binding proteins that may participate in the iron posttranscriptional regulation in T. vaginalis. PMID:26703754

  13. aPPRove: An HMM-Based Method for Accurate Prediction of RNA-Pentatricopeptide Repeat Protein Binding Events

    PubMed Central

    Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina

    2016-01-01

    Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805

  14. Identification of Abundantly Expressed Novel and Conserved Genes from the Infective Larval Stage of Toxocara canis by an Expressed Sequence Tag Strategy

    PubMed Central

    Tetteh, Kevin K. A.; Loukas, Alex; Tripp, Cindy; Maizels, Rick M.

    1999-01-01

    Larvae of Toxocara canis, a nematode parasite of dogs, infect humans, causing visceral and ocular larva migrans. In noncanid hosts, larvae neither grow nor differentiate but endure in a state of arrested development. Reasoning that parasite protein production is orientated to immune evasion, we undertook a random sequencing project from a larval cDNA library to characterize the most highly expressed transcripts. In all, 266 clones were sequenced, most from both 3′ and 5′ ends, and similarity searches against GenBank protein and dbEST nucleotide databases were conducted. Cluster analyses showed that 128 distinct gene products had been found, all but 3 of which represented newly identified genes. Ninety-five genes were represented by a single clone, but seven transcripts were present at high frequencies, each composing >2% of all clones sequenced. These high-abundance transcripts include a mucin and a C-type lectin, which are both major excretory-secretory antigens released by parasites. Four highly expressed novel gene transcripts, termed ant (abundant novel transcript) genes, were found. Together, these four genes comprised 18% of all cDNA clones isolated, but no similar sequences occur in the Caenorhabditis elegans genome. While the coding regions of the four genes are dissimilar, their 3′ untranslated tracts have significant homology in nucleotide sequence. The discovery of these abundant, parasite-specific genes of newly identified lectins and mucins, as well as a range of conserved and novel proteins, provides defined candidates for future analysis of the molecular basis of immune evasion by T. canis. PMID:10456930

  15. POM-ZP3, a bipartite transcript derived from human ZP3 and a POM121 homologue

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kipersztok, S.; Osawa, G.A.; Liang, L.F.

    1995-01-20

    Human POM-ZP3 is a novel bipartite RNA transcript that is derived from a gene homologous to rat POM121 (a nuclear pore membrane protein) and ZP3 (a sperm receptor ligand in the zona pellucida). The 5' region is 77% identical to the 5' end of the coding region of rat POM121 and appears to represent a partial duplication of a gene encoding a human homologue of this rodent gene. The 3' end of the POM-ZP3 transcript is 99% identical to ZP3 and appears to have arisen from a duplication of the last four exons (exons 5-8) of ZP3. Using Northern blots and RT-PCR, POM-ZP3 transcripts were detected in human ovaries, testes, spleen, thymus, lymphocytes, prostate, and intestines. The longest open reading frame encodes a conceptual protein of 210 amino acids, the first 76 of which are 83% identical to residues 241-315 of rat POM121. The next 125 amino acids are 98% identical to residues 239-363 of the 424-amino-acid human ZP3 protein. By fluorescence in situ hybridization, genomic fragments of ZP3 and a human homologue of POM121 were localized to chromosome 7q11.23. Taken together, these data suggest that partial duplications of human ZP3 and a POM121-like gene have resulted in a fusion transcript, POM-ZP3, that is expressed in multiple human tissues. 24 refs., 5 figs.

  16. Genome-Wide Spectra of Transcription Insertions and Deletions Reveal That Slippage Depends on RNA:DNA Hybrid Complementarity

    PubMed Central

    Traverse, Charles C.

    2017-01-01

    ABSTRACT Advances in sequencing technologies have enabled direct quantification of genome-wide errors that occur during RNA transcription. These errors occur at rates that are orders of magnitude higher than rates during DNA replication, but due to technical difficulties such measurements have been limited to single-base substitutions and have not yet quantified the scope of transcription insertions and deletions. Previous reporter gene assay findings suggested that transcription indels are produced exclusively by elongation complex slippage at homopolymeric runs, so we enumerated indels across the protein-coding transcriptomes of Escherichia coli and Buchnera aphidicola, which differ widely in their genomic base compositions and incidence of repeat regions. As anticipated from prior assays, transcription insertions prevailed in homopolymeric runs of A and T; however, transcription deletions arose in much more complex sequences and were rarely associated with homopolymeric runs. By reconstructing the relocated positions of the elongation complex as inferred from the sequences inserted or deleted during transcription, we show that continuation of transcription after slippage hinges on the degree of nucleotide complementarity within the RNA:DNA hybrid at the new DNA template location. PMID:28851848
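    The abstract above contrasts indels inside homopolymeric runs with indels in more complex sequence. It does not say how runs were located, so the following is only a minimal illustrative sketch: the function name `homopolymer_runs` and the `min_len` threshold are assumptions made for this example, not the authors' method.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for every run of a single repeated base
    that is at least min_len long. min_len is an illustrative threshold."""
    runs = []
    pos = 0
    # groupby yields consecutive blocks of identical characters
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    # Toy sequence, not taken from the study
    print(homopolymer_runs("GCAAAATTTCG"))
```

    A position reported by such a scan could then be compared with observed indel coordinates to ask whether an indel falls inside or outside a run.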

  17. Rye B chromosomes encode a functional Argonaute-like protein with in vitro slicer activities similar to its A chromosome paralog.

    PubMed

    Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas

    2017-01-01

    B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.

  18. Insight into the transcriptome of Arthrobotrys conoides using high throughput sequencing.

    PubMed

    Ramesh, Pandit; Reena, Patel; Amitbikram, Mohapatra; Chaitanya, Joshi; Anju, Kunjadia

    2015-12-01

    Arthrobotrys conoides is a nematode-trapping fungus belonging to the Orbiliales (Ascomycota) that traps prey nematodes by means of an adhesive network. The fungus has potential as a biocontrol agent against plant-parasitic nematodes. In the present study, we characterized the transcriptome of A. conoides using high-throughput sequencing technology and characterized its virulence unigenes. A total of 7,255 cDNA contigs with an average length of 425 bp were generated, and 6184 (61.81%) transcripts were functionally annotated and characterized. The majority of unigenes were found to be analogous to the genes of plant-pathogenic fungi. A total of 1749 transcripts were found to be orthologous with eukaryotic proteins of the KOG database. Several carbohydrate-active enzymes and peptidases were identified. We also analyzed classically and nonclassically secreted proteins and confirmed them by BLASTP against a fungal secretome database. A total of 916 contigs were analogous to 556 unique proteins of the Pathogen Host Interaction (PHI) database. Further, we identified 91 unigenes homologous to the database of fungal virulence factors (DFVF). A total of 104 putative protein kinase-coding transcripts, which are major players in signaling pathways, were identified by BLASTP against the KinBase database. This study provides a comprehensive look at the transcriptome of A. conoides, and the identified unigenes might have a role in catching and killing prey nematodes by A. conoides. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Diversity and Divergence of Dinoflagellate Histone Proteins

    PubMed Central

    Marinov, Georgi K.; Lynch, Michael

    2015-01-01

    Histone proteins and the nucleosomal organization of chromatin are near-universal eukaryotic features, with the exception of dinoflagellates. Previous studies have suggested that histones do not play a major role in the packaging of dinoflagellate genomes, although several genomic and transcriptomic surveys have detected a full set of core histone genes. Here, transcriptomic and genomic sequence data from multiple dinoflagellate lineages are analyzed, and the diversity of histone proteins and their variants characterized, with particular focus on their potential post-translational modifications and the conservation of the histone code. In addition, the set of putative epigenetic mark readers and writers, chromatin remodelers and histone chaperones are examined. Dinoflagellates clearly express the most derived set of histones among all autonomous eukaryote nuclei, consistent with a combination of relaxation of sequence constraints imposed by the histone code and the presence of numerous specialized histone variants. The histone code itself appears to have diverged significantly in some of its components, yet others are conserved, implying conservation of the associated biochemical processes. Specifically, and with major implications for the function of histones in dinoflagellates, the results presented here strongly suggest that transcription through nucleosomal arrays happens in dinoflagellates. Finally, the plausible roles of histones in dinoflagellate nuclei are discussed. PMID:26646152

  20. Systematic asymmetric nucleotide exchanges produce human mitochondrial RNAs cryptically encoding for overlapping protein coding genes.

    PubMed

    Seligmann, Hervé

    2013-05-07

    GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
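    The exchange rules quoted in the abstract above (for example A→G→C→U/T→A) can be read as cyclic substitutions: each base is replaced by the next one in the cycle, and bases not named in the rule stay unchanged. The sketch below only illustrates that reading on a toy DNA string; the function names and the compact rule notation (`"AGCT"` for A→G→C→T→A) are assumptions made for this example, not the author's software.

```python
def make_exchange(rule):
    """Build a substitution table from a cyclic rule.
    'AGCT' is read as A->G, G->C, C->T, T->A (hypothetical notation)."""
    return {src: rule[(i + 1) % len(rule)] for i, src in enumerate(rule)}


def exchange(seq, rule):
    """Apply the cyclic rule to every base; bases absent from the rule are kept."""
    table = make_exchange(rule)
    return "".join(table.get(base, base) for base in seq.upper())


if __name__ == "__main__":
    # Toy input, not a mitochondrial sequence
    print(exchange("ATGC", "AGCT"))   # four-base cycle A->G->C->T->A
    print(exchange("ACGT", "CGT"))    # three-base cycle C->G->T->C, A untouched
```

    Under this reading, an EST "matches" a rule when applying the rule to the genomic sequence reproduces the EST; the alignment and statistics used in the study are not reproduced here.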

  1. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP

    PubMed Central

    Hafner, Markus; Landthaler, Markus; Burger, Lukas; Khorshid, Mohsen; Hausser, Jean; Berninger, Philipp; Rothballer, Andrea; Ascano, Manuel; Jungkamp, Anna-Carina; Munschauer, Mathias; Ulrich, Alexander; Wardle, Greg S.; Dewell, Scott; Zavolan, Mihaela; Tuschl, Thomas

    2010-01-01

    Summary RNA transcripts are subject to post-transcriptional gene regulation involving hundreds of RNA-binding proteins (RBPs) and microRNA-containing ribonucleoprotein complexes (miRNPs) expressed in a cell-type dependent fashion. We developed a cell-based crosslinking approach to determine at high resolution and transcriptome-wide the binding sites of cellular RBPs and miRNPs. The crosslinked sites are revealed by thymidine to cytidine transitions in the cDNAs prepared from immunopurified RNPs of 4-thiouridine-treated cells. We determined the binding sites and regulatory consequences for several intensely studied RBPs and miRNPs, including PUM2, QKI, IGF2BP1-3, AGO/EIF2C1-4 and TNRC6A-C. Our study revealed that these factors bind thousands of sites containing defined sequence motifs and have distinct preferences for exonic versus intronic or coding versus untranslated transcript regions. The precise mapping of binding sites across the transcriptome will be critical to the interpretation of the rapidly emerging data on genetic variation between individuals and how these variations contribute to complex genetic diseases. PMID:20371350

  2. Cloning of the cDNA encoding Scg-SPRP, an unusual Ser-protease-related protein from vitellogenic female desert locusts (Schistocerca gregaria).

    PubMed

    Chiou, S J; Vanden Broeck, J; Janssen, I; Borovsky, D; Vandenbussche, F; Simonet, G; De Loof, A

    1998-10-01

    The cDNA coding for a Ser-protease-related protein (Scg-SPRP) was cloned from desert locust (Schistocerca gregaria) midgut. The derived amino acid sequence consists of 260 residues and shows strong sequence similarity to insect trypsin-like molecules. It is, however, likely that Scg-SPRP is not a proteolytically active enzyme and that it plays another physiologically relevant role, since two out of three residues which are indispensable for catalytic activity of Ser-proteases are replaced. Northern analysis revealed that the Scg-SPRP gene is expressed in midgut tissue and that this expression is strongly induced in adult female locusts. Moreover, the occurrence of the transcript (1.2 kb) fluctuates during the molting cycle and during the female reproductive cycle. Juvenile hormone (JH III) dependence of transcription was investigated by chemical allatectomy (precocene I) of adult females. This resulted in inhibition of vitellogenesis and in disappearance of the Scg-SPRP transcript. Expression of Scg-SPRP in precocene-treated locusts could be reinduced by additional treatment with JH III or with 20-OH-ecdysone.

  3. Characterization and Transcription of Arsenic Respiration and Resistance Genes During In Situ Uranium Bioremediation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Giloteaux, L.; Holmes, Dawn E.; Williams, Kenneth H.

    2013-02-04

    The possibility of arsenic release and the potential role of Geobacter in arsenic biogeochemistry during in situ uranium bioremediation was investigated because increased availability of organic matter has been associated with substantial releases of arsenic in other subsurface environments. In a field experiment conducted at the Rifle, CO study site, groundwater arsenic concentrations increased when acetate was added. The number of transcripts from arrA, which codes for the alpha subunit of dissimilatory As(V) reductase, and acr3, which codes for the arsenic pump protein Acr3, were determined with quantitative RT-PCR. Most of the arrA (> 60%) and acr3-1 (> 90%) sequences that were recovered were most similar to Geobacter species, while the majority of acr3-2 (>50%) sequences were most closely related to Rhodoferax ferrireducens. Analysis of transcript abundance demonstrated that transcription of acr3-1 by the subsurface Geobacter community was correlated with arsenic concentrations in the groundwater. In contrast, Geobacter arrA transcript numbers lagged behind the major arsenic release and remained high even after arsenic concentrations declined. This suggested that factors other than As(V) availability regulated transcription of arrA in situ even though the presence of As(V) increased transcription of arrA in cultures of G. lovleyi, which was capable of As(V) reduction. These results demonstrate that subsurface Geobacter species can tightly regulate their physiological response to changes in groundwater arsenic concentrations. The transcriptomic approach developed here should be useful for the study of a diversity of other environments in which Geobacter species are considered to have an important influence on arsenic biogeochemistry.

  4. The point of no return: The poly(A)-associated elongation checkpoint

    PubMed Central

    Tellier, Michael; Ferrer-Vicens, Ivan; Murphy, Shona

    2016-01-01

    abstract Cyclin-dependent kinases play critical roles in transcription by RNA polymerase II (pol II) and processing of the transcripts. For example, CDK9 regulates transcription of protein-coding genes, splicing, and 3′ end formation of the transcripts. Accordingly, CDK9 inhibitors have a drastic effect on the production of mRNA in human cells. Recent analyses indicate that CDK9 regulates transcription at the early-elongation checkpoint of the vast majority of pol II-transcribed genes. Our recent discovery of an additional CDK9-regulated elongation checkpoint close to poly(A) sites adds a new layer to the control of transcription by this critical cellular kinase. This novel poly(A)-associated checkpoint has the potential to powerfully regulate gene expression just before a functional polyadenylated mRNA is produced: the point of no return. However, many questions remain to be answered before the role of this checkpoint becomes clear. Here we speculate on the possible biological significance of this novel mechanism of gene regulation and the players that may be involved. PMID:26853452

  5. Protein and Genetic Composition of Four Chromatin Types in Drosophila melanogaster Cell Lines.

    PubMed

    Boldyreva, Lidiya V; Goncharov, Fyodor P; Demakova, Olga V; Zykova, Tatyana Yu; Levitsky, Victor G; Kolesnikov, Nikolay N; Pindyurin, Alexey V; Semeshin, Valeriy F; Zhimulev, Igor F

    2017-04-01

    Recently, we analyzed genome-wide protein binding data for the Drosophila cell lines S2, Kc, BG3 and Cl.8 (modENCODE Consortium) and identified a set of 12 proteins enriched in the regions corresponding to interbands of salivary gland polytene chromosomes. Using these data, we developed a bioinformatic pipeline that partitioned the Drosophila genome into four chromatin types that we hereby refer to as aquamarine, lazurite, malachite and ruby. Here, we describe the properties of these chromatin types across different cell lines. We show that aquamarine chromatin tends to harbor transcription start sites (TSSs) and 5' untranslated regions (5'UTRs) of the genes, is enriched in diverse "open" chromatin proteins, histone modifications, nucleosome remodeling complexes and transcription factors. It encompasses most of the tRNA genes and shows enrichment for non-coding RNAs and miRNA genes. Lazurite chromatin typically encompasses gene bodies. It is rich in proteins involved in transcription elongation. Frequency of both point mutations and natural deletion breakpoints is elevated within lazurite chromatin. Malachite chromatin shows higher frequency of insertions of natural transposons. Finally, ruby chromatin is enriched for proteins and histone modifications typical for the "closed" chromatin. Ruby chromatin has a relatively low frequency of point mutations and is essentially devoid of miRNA and tRNA genes. Aquamarine and ruby chromatin types are highly stable across cell lines and have contrasting properties. Lazurite and malachite chromatin types also display characteristic protein composition, as well as enrichment for specific genomic features. We found that two types of chromatin, aquamarine and ruby, retain their complementary protein patterns in four Drosophila cell lines.

  6. RAID: a comprehensive resource for human RNA-associated (RNA-RNA/RNA-protein) interaction.

    PubMed

    Zhang, Xiaomeng; Wu, Deng; Chen, Liqun; Li, Xiang; Yang, Jinxurong; Fan, Dandan; Dong, Tingting; Liu, Mingyue; Tan, Puwen; Xu, Jintian; Yi, Ying; Wang, Yuting; Zou, Hua; Hu, Yongfei; Fan, Kaili; Kang, Juanjuan; Huang, Yan; Miao, Zhengqiang; Bi, Miaoman; Jin, Nana; Li, Kongning; Li, Xia; Xu, Jianzhen; Wang, Dong

    2014-07-01

    Transcriptomic analyses have revealed an unexpected complexity in the eukaryote transcriptome, which includes not only protein-coding transcripts but also an expanding catalog of noncoding RNAs (ncRNAs). Diverse coding and noncoding RNAs (ncRNAs) perform functions through interaction with each other in various cellular processes. In this project, we have developed RAID (http://www.rna-society.org/raid), an RNA-associated (RNA-RNA/RNA-protein) interaction database. RAID intends to provide the scientific community with all-in-one resources for efficient browsing and extraction of the RNA-associated interactions in human. This version of RAID contains more than 6100 RNA-associated interactions obtained by manually reviewing more than 2100 published papers, including 4493 RNA-RNA interactions and 1619 RNA-protein interactions. Each entry contains detailed information on an RNA-associated interaction, including RAID ID, RNA/protein symbol, RNA/protein categories, validated method, expressing tissue, literature references (Pubmed IDs), and detailed functional description. Users can query, browse, analyze, and manipulate RNA-associated (RNA-RNA/RNA-protein) interaction. RAID provides a comprehensive resource of human RNA-associated (RNA-RNA/RNA-protein) interaction network. Furthermore, this resource will help in uncovering the generic organizing principles of cellular function network. © 2014 Zhang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  7. Transcriptional Basis of Drought-Induced Susceptibility to the Rice Blast Fungus Magnaporthe oryzae

    PubMed Central

    Bidzinski, Przemyslaw; Ballini, Elsa; Ducasse, Aurélie; Michel, Corinne; Zuluaga, Paola; Genga, Annamaria; Chiozzotto, Remo; Morel, Jean-Benoit

    2016-01-01

    Plants often face several stresses simultaneously, yet how they react and how pathogens adapt to such combined stresses is poorly documented. Here, we developed an experimental system mimicking field intermittent drought on rice followed by inoculation by the pathogenic fungus Magnaporthe oryzae. This experimental system triggers an enhancement of susceptibility that could be correlated with the dampening of several aspects of plant immunity, namely the oxidative burst and the transcription of several pathogenesis-related genes. Quite strikingly, the analysis of fungal transcription by RNASeq analysis under drought reveals that the fungus is greatly modifying its virulence program: genes coding for small secreted proteins were massively repressed in droughted plants compared to unstressed ones whereas genes coding for enzymes involved in degradation of cell-wall were induced. We also show that drought can lead to the partial breakdown of several major resistance genes by affecting R plant gene and/or pathogen effector expression. We propose a model where a yet unknown plant signal can trigger a change in the virulence program of the pathogen to adapt to a plant host that was affected by drought prior to infection. PMID:27833621

  8. Protein-DNA binding dynamics predict transcriptional response to nutrients in archaea.

    PubMed

    Todor, Horia; Sharma, Kriti; Pittman, Adrianne M C; Schmid, Amy K

    2013-10-01

    Organisms across all three domains of life use gene regulatory networks (GRNs) to integrate varied stimuli into coherent transcriptional responses to environmental pressures. However, inferring GRN topology and regulatory causality remains a central challenge in systems biology. Previous work characterized TrmB as a global metabolic transcription factor in archaeal extremophiles. However, it remains unclear how TrmB dynamically regulates its ∼100 metabolic enzyme-coding gene targets. Using a dynamic perturbation approach, we elucidate the topology of the TrmB metabolic GRN in the model archaeon Halobacterium salinarum. Clustering of dynamic gene expression patterns reveals that TrmB functions alone to regulate central metabolic enzyme-coding genes but cooperates with various regulators to control peripheral metabolic pathways. Using a dynamical model, we predict gene expression patterns for some TrmB-dependent promoters and infer secondary regulators for others. Our data suggest feed-forward gene regulatory topology for cobalamin biosynthesis. In contrast, purine biosynthesis appears to require TrmB-independent regulators. We conclude that TrmB is an important component for mediating metabolic modularity, integrating nutrient status and regulating gene expression dynamics alone and in concert with secondary regulators.

  9. The structure of transcription termination factor Nrd1 reveals an original mode for GUAA recognition

    PubMed Central

    Franco-Echevarría, Elsa; González-Polo, Noelia; Zorrilla, Silvia; Martínez-Lumbreras, Santiago; Santiveri, Clara M.; Campos-Olivas, Ramón; Sánchez, Mar; Calvo, Olga

    2017-01-01

    Abstract Transcription termination of non-coding RNAs is regulated in yeast by a complex of three RNA binding proteins: Nrd1, Nab3 and Sen1. Nrd1 is central in this process by interacting with Rpb1 of RNA polymerase II, Trf4 of TRAMP and GUAA/G terminator sequences. We lack structural data for the last of these binding events. We determined the structures of Nrd1 RNA binding domain and its complexes with three GUAA-containing RNAs, characterized RNA binding energetics and tested rationally designed mutants in vivo. The Nrd1 structure shows an RRM domain fused with a second α/β domain that we name split domain (SD), because it is formed by two non-consecutive segments at each side of the RRM. The GUAA interacts with both domains and with a pocket of water molecules, trapped between the two stacking adenines and the SD. Comprehensive binding studies demonstrate for the first time that Nrd1 has a slight preference for GUAA over GUAG and genetic and functional studies suggest that Nrd1 RNA binding domain might play further roles in non-coding RNAs transcription termination. PMID:28973465

  10. Translatomics combined with transcriptomics and proteomics reveals novel functional, recently evolved orphan genes in Escherichia coli O157:H7 (EHEC).

    PubMed

    Neuhaus, Klaus; Landstorfer, Richard; Fellner, Lea; Simon, Svenja; Schafferhans, Andrea; Goldberg, Tatyana; Marx, Harald; Ozoline, Olga N; Rost, Burkhard; Kuster, Bernhard; Keim, Daniel A; Scherer, Siegfried

    2016-02-24

    Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. These findings demonstrate that ribosomal footprinting can be used to detect novel protein coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.

  11. Transcription Factors Bind Thousands of Active and Inactive Regions in the Drosophila Blastoderm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Xiao-Yong; MacArthur, Stewart; Bourgon, Richard

    2008-01-10

    Identifying the genomic regions bound by sequence-specific regulatory factors is central both to deciphering the complex DNA cis-regulatory code that controls transcription in metazoans and to determining the range of genes that shape animal morphogenesis. Here, we use whole-genome tiling arrays to map sequences bound in Drosophila melanogaster embryos by the six maternal and gap transcription factors that initiate anterior-posterior patterning. We find that these sequence-specific DNA binding proteins bind with quantitatively different specificities to highly overlapping sets of several thousand genomic regions in blastoderm embryos. Specific high- and moderate-affinity in vitro recognition sequences for each factor are enriched in bound regions. This enrichment, however, is not sufficient to explain the pattern of binding in vivo and varies in a context-dependent manner, demonstrating that higher-order rules must govern targeting of transcription factors. The more highly bound regions include all of the over forty well-characterized enhancers known to respond to these factors as well as several hundred putative new cis-regulatory modules clustered near developmental regulators and other genes with patterned expression at this stage of embryogenesis. The new targets include most of the microRNAs (miRNAs) transcribed in the blastoderm, as well as all major zygotically transcribed dorsal-ventral patterning genes, whose expression we show to be quantitatively modulated by anterior-posterior factors. In addition to these highly bound regions, there are several thousand regions that are reproducibly bound at lower levels. However, these poorly bound regions are, collectively, far more distant from genes transcribed in the blastoderm than highly bound regions; are preferentially found in protein-coding sequences; and are less conserved than highly bound regions. Together these observations suggest that many of these poorly-bound regions are not involved in early-embryonic transcriptional regulation, and a significant proportion may be nonfunctional. Surprisingly, for five of the six factors, their recognition sites are not unambiguously more constrained evolutionarily than the immediate flanking DNA, even in more highly bound and presumably functional regions, indicating that comparative DNA sequence analysis is limited in its ability to identify functional transcription factor targets.

  12. Interplay between cardiac transcription factors and non-coding RNAs in predisposing to atrial fibrillation.

    PubMed

    Mikhailov, Alexander T; Torrado, Mario

    2018-05-12

    There is growing evidence that putative gene regulatory networks including cardio-enriched transcription factors, such as PITX2, TBX5, ZFHX3, and SHOX2, and their effector/target genes along with downstream non-coding RNAs can play a potentially important role in the process of adaptive and maladaptive atrial rhythm remodeling. In turn, expression of atrial fibrillation-associated transcription factors is under the control of upstream regulatory non-coding RNAs. This review broadly explores gene regulatory mechanisms associated with susceptibility to atrial fibrillation (with key examples from both animal models and patients) within the context of both cardiac transcription factors and non-coding RNAs. These two systems appear to have multiple levels of cross-regulation and act coordinately to achieve effective control of atrial rhythm effector gene expression. Perturbations of a dynamic expression balance between transcription factors and corresponding non-coding RNAs can provoke the development or promote the progression of atrial fibrillation. We also outline deficiencies in current models and discuss ongoing studies to clarify remaining mechanistic questions. An understanding of the function of transcription factors and non-coding RNAs in gene regulatory networks associated with atrial fibrillation risk will enable the development of innovative therapeutic strategies.

  13. An Autonomous BMP2 Regulatory Element in Mesenchymal Cells

    PubMed Central

    Kruithof, Boudewijn P.T.; Fritz, David T.; Liu, Yijun; Garsetti, Diane E.; Frank, David B.; Pregizer, Steven K.; Gaussin, Vinciane; Mortlock, Douglas P.; Rogers, Melissa B.

    2014-01-01

    BMP2 is a morphogen that controls mesenchymal cell differentiation and behavior. For example, BMP2 concentration controls the differentiation of mesenchymal precursors into myocytes, adipocytes, chondrocytes, and osteoblasts. Sequences within the 3′ untranslated region (UTR) of the Bmp2 mRNA mediate a post-transcriptional block of protein synthesis. Interaction of cell and developmental stage-specific trans-regulatory factors with the 3′UTR is a nimble and versatile mechanism for modulating this potent morphogen in different cell types. We show here that an ultra-conserved sequence in the 3′UTR functions independently of promoter, coding region, and 3′UTR context in primary and immortalized tissue culture cells and in transgenic mice. Our findings indicate that the ultra-conserved sequence is an autonomously functioning post-transcriptional element that may be used to modulate the level of BMP2 and other proteins while retaining tissue-specific regulatory elements. PMID:21268088

  14. The Cell Cycle Regulator CCDC6 Is a Key Target of RNA-Binding Protein EWS

    PubMed Central

    Duggimpudi, Sujitha; Larsson, Erik; Nabhani, Schafiq; Borkhardt, Arndt; Hoell, Jessica I

    2015-01-01

    Genetic translocation of EWSR1 to an ETS transcription factor coding region is considered the primary cause of Ewing sarcoma. Previous studies focused on the biology of chimeric transcription factors formed due to this translocation. However, the physiological consequences of heterozygous EWSR1 loss in these tumors have largely remained elusive. Previously, we have identified various mRNAs bound to EWS using PAR-CLIP. In this study, we demonstrate CCDC6, a known cell cycle regulator protein, as a novel target regulated by EWS. siRNA-mediated downregulation of EWS caused elevated apoptosis in cells in a CCDC6-dependent manner. This effect was rescued upon re-expression of CCDC6. This study provides evidence for a novel functional link through which wild-type EWS operates in a target-dependent manner in Ewing sarcoma. PMID:25751255

  15. A deep learning method for lincRNA detection using auto-encoder algorithm.

    PubMed

    Yu, Ning; Yu, Zeng; Pan, Yi

    2017-12-06

    RNA sequencing (RNA-seq) enables scientists to develop novel data-driven methods for discovering previously unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by new deep learning methods. By scanning newly found RNA-seq data sets, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two lines of evidence provide the rationale for adopting knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in biological processes as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with a support vector machine and a traditional neural network. In addition, the method is validated on the newly discovered lincRNA data set, and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data. Driven by those newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application of deep learning techniques to identifying lincRNA transcription sequences.
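
    The abstract above mentions "different encoding schemes" for intergenic DNA sequence but does not name them. As a rough illustration only, one widely used scheme is one-hot encoding, sketched below. The function name `one_hot_encode` and the A, C, G, T ordering are assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch: one-hot encoding is a common way to turn DNA into
# numeric network input. The abstract does not state which encoding schemes
# the authors actually compared, so treat this as an assumed example.
BASES = "ACGT"  # assumed column order


def one_hot_encode(sequence):
    """Return a flat list holding four 0/1 values per nucleotide."""
    encoded = []
    for base in sequence.upper():
        # Symbols outside A/C/G/T (for example N) become an all-zero block.
        encoded.extend(1 if base == known else 0 for known in BASES)
    return encoded
```

    A fixed-length window of sequence encoded this way yields a vector four times the window length, which is the kind of input an auto-encoder layer could consume.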

  16. The developmental transcriptome of Drosophila melanogaster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    University of Connecticut; Graveley, Brenton R.; Brooks, Angela N.

    Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development. Drosophila melanogaster is an important non-mammalian model system that has had a critical role in basic biological discoveries, such as identifying chromosomes as the carriers of genetic information and uncovering the role of genes in development. Because it shares a substantial genic content with humans, Drosophila is increasingly used as a translational model for human development, homeostasis and disease. High-quality maps are needed for all functional genomic elements. Previous studies demonstrated that a rich collection of genes is deployed during the life cycle of the fly. Although expression profiling using microarrays has revealed the expression of ~13,000 annotated genes, it is difficult to map splice junctions and individual base modifications generated by RNA editing using such approaches. Single-base resolution is essential to define precisely the elements that comprise the Drosophila transcriptome. Estimates of the number of transcript isoforms are less accurate than estimates of the number of genes. Whereas ~20% of Drosophila genes are annotated as encoding alternatively spliced pre-mRNAs, splice-junction microarray experiments indicate that this number is at least 40% (ref. 7). Determining the diversity of mRNAs generated by alternative promoters, alternative splicing and RNA editing will substantially increase the inferred protein repertoire. Non-coding RNA genes (ncRNAs) including short interfering RNAs (siRNAs) and microRNAs (miRNAs) (reviewed in ref. 10), and longer ncRNAs such as bxd (ref. 11) and rox (ref. 12), have important roles in gene regulation, whereas others such as small nucleolar RNAs (snoRNAs) and small nuclear RNAs (snRNAs) are important components of macromolecular machines such as the ribosome and spliceosome. The transcription and processing of these ncRNAs must also be fully documented and mapped. As part of the modENCODE project to annotate the functional elements of the D. melanogaster and Caenorhabditis elegans genomes, we used RNA-Seq and tiling microarrays to sample the Drosophila transcriptome at unprecedented depth throughout development from early embryo to ageing male and female adults. We report on a high-resolution view of the discovery, structure and dynamic expression of the D. melanogaster transcriptome.

  17. Increased levels of B1 and B2 SINE transcripts in mouse fibroblast cells due to minute virus of mice infection.

    PubMed

    Williams, Warren P; Tamburic, Lillian; Astell, Caroline R

    2004-10-01

    Minute virus of mice (MVM), an autonomous parvovirus, has served as a model for understanding parvovirus infection including host cell response to infection. In this paper, we report the effect of MVM infection on host cell gene expression in mouse fibroblast cells (LA9 cells), analyzed by differential display. Somewhat surprisingly, our data reveal that few cellular protein-coding genes appear to be up- or downregulated and identify the murine B1 and B2 short interspersed element (SINE) transcripts as being increased upon MVM infection. Primer extension assays confirm the effect of MVM infection on SINE expression and demonstrate that both SINEs are upregulated in a roughly linear fashion throughout MVM infection. They also demonstrate that the SINE response was due to RNA polymerase III transcription and not contaminating DNA or RNA polymerase II transcription. Furthermore, expression of MVM NS1, the major nonstructural protein, by transient transfection also leads to an increase in both murine SINEs. We believe this is the first time that the B1 and B2 SINEs have been shown to be altered by viral infection and the first time parvovirus infection has been shown to increase SINE expression. The increase in SINE transcripts caused by MVM infection does not appear to be due to an increase in either of the basal transcription factors TFIIIC110 or 220, in contrast to that which has been shown for other viruses.

  18. Utrophin Up-Regulation by an Artificial Transcription Factor in Transgenic Mice

    PubMed Central

    Mattei, Elisabetta; Corbi, Nicoletta; Di Certo, Maria Grazia; Strimpakos, Georgios; Severini, Cinzia; Onori, Annalisa; Desantis, Agata; Libri, Valentina; Buontempo, Serena; Floridi, Aristide; Fanciulli, Maurizio; Baban, Dilair; Davies, Kay E.; Passananti, Claudio

    2007-01-01

    Duchenne Muscular Dystrophy (DMD) is a severe muscle degenerative disease, due to absence of dystrophin. There is currently no effective treatment for DMD. Our aim is to up-regulate the expression level of the dystrophin related gene utrophin in DMD, complementing in this way the lack of dystrophin functions. To this end we designed and engineered several synthetic zinc finger based transcription factors. In particular, we have previously shown that the artificial three zinc finger protein named Jazz, fused with the appropriate effector domain, is able to drive the transcription of a test gene from the utrophin promoter “A”. Here we report on the characterization of Vp16-Jazz-transgenic mice that specifically over-express the utrophin gene at the muscular level. A Chromatin Immunoprecipitation assay (ChIP) demonstrated the effective access/binding of the Jazz protein to active chromatin in mouse muscle and Vp16-Jazz was shown to be able to up-regulate endogenous utrophin gene expression by immunohistochemistry, western blot analyses and real-time PCR. To our knowledge, this is the first example of a transgenic mouse expressing an artificial gene coding for a zinc finger based transcription factor. The achievement of Vp16-Jazz transgenic mice validates the strategy of transcriptional targeting of endogenous genes and could represent an exclusive animal model for use in drug discovery and therapeutics. PMID:17712422

  19. Caste- and development-associated gene expression in a lower termite

    PubMed Central

    Scharf, Michael E; Wu-Scharf, Dancia; Pittendrigh, Barry R; Bennett, Gary W

    2003-01-01

    Background Social insects such as termites express dramatic polyphenism (the occurrence of multiple forms in a species on the basis of differential gene expression) both in association with caste differentiation and between castes after differentiation. We have used cDNA macroarrays to compare gene expression between polyphenic castes and intermediary developmental stages of the termite Reticulitermes flavipes. Results We identified differentially expressed genes from nine ontogenic categories. Quantitative PCR was used to quantify precise differences in gene expression between castes and between intermediary developmental stages. We found worker and nymph-biased expression of transcripts encoding termite and endosymbiont cellulases; presoldier-biased expression of transcripts encoding the storage/hormone-binding protein vitellogenin; and soldier-biased expression of gene transcripts encoding two transcription/translation factors, two signal transduction factors and four cytoskeletal/muscle proteins. The two transcription/translation factors showed significant homology to the bicaudal and bric-a-brac developmental genes of Drosophila. Conclusions Our results show differential expression of regulatory, structural and enzyme-coding genes in association with termite castes and their developmental precursor stages. They also provide the first glimpse into how insect endosymbiont cellulase gene expression can vary in association with the caste of a host. These findings shed light on molecular processes associated with termite biology, polyphenism, caste differentiation and development and highlight potentially interesting variations in developmental themes between termites, other insects, and higher animals. PMID:14519197

  20. Alternative splicing produces transcripts coding for alpha and beta chains of a hetero-dimeric phosphagen kinase.

    PubMed

    Ellington, W Ross; Yamashita, Daisuke; Suzuki, Tomohiko

    2004-06-09

    Glycocyamine kinase (GK) catalyzes the reversible phosphorylation of glycocyamine (guanidinoacetate), a reaction central to cellular energy homeostasis in certain animals. GK is a member of the phosphagen kinase enzyme family and appears to have evolved from creatine kinase (CK) early in the evolution of multi-cellular animals. Prior work has shown that GK from the polychaete Neanthes (Nereis) diversicolor exists as a hetero-dimer in vivo and that the two polypeptide chains (termed alpha and beta) are coded for by unique transcripts. In the present study, we demonstrate that the GK from a congener, Nereis virens, is also hetero-dimeric and is coded for by alpha and beta transcripts, which are virtually identical to the corresponding forms in N. diversicolor. The GK gene from N. diversicolor was amplified by PCR. Sequencing of the PCR products showed that the alpha and beta chains are the result of alternative splicing of the GK primary mRNA transcript. These results also strongly suggest that this gene underwent an early tandem exon duplication event. Full-length cDNAs for N. virens GKalpha and GKbeta were individually ligated into expression vectors and the resulting constructs used to transform Escherichia coli expression hosts. Regardless of expression conditions, minimal GK activity was observed in both GKalpha and GKbeta constructs. Inclusion bodies for both were harvested and unfolded in urea, and alpha chains, beta chains and mixtures of alpha and beta chains were refolded by sequential dialysis. Only modest amounts of GK activity were observed when alpha and beta were refolded individually. In contrast, when refolded, the alpha and beta mixture yielded highly active hetero-dimers, as validated by size exclusion chromatography, electrophoresis and mass spectrometry, with a specific activity comparable to that of natural GK. The above evidence suggests that there is a preference for hetero-dimer formation in the GKs from these two polychaetes. 
The evolution of the alternate splicing and an additional exon in these GKs, producing alpha and beta transcripts, can be viewed as a possible compensation for a mutation(s) in the original gene, which most likely coded for a homo-dimeric protein.

  1. Prevalence of transcription promoters within archaeal operons and coding sequences

    PubMed Central

    Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S

    2009-01-01

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of ∼64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein–DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3′ ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes—events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements. PMID:19536208

  2. Characterization of Smoc-1 uncovers two transcript variants showing differential tissue and age specific expression in Bubalus bubalis

    PubMed Central

    Srivastava, Jyoti; Premi, Sanjay; Kumar, Sudhir; Parwez, Iqbal; Ali, Sher

    2007-01-01

    Background Secreted modular calcium binding protein-1 (Smoc-1) belongs to the BM-40 family, which has been implicated in tissue remodeling, angiogenesis and bone mineralization. Besides its anticipated role in embryogenesis, Smoc-1 has been characterized only in a few mammalian species. We made use of the consensus sequence (5' CACCTCTCCACCTGCC 3') of 33.15 repeat loci to explore the buffalo transcriptome and uncovered the Smoc-1 transcript tagged with this repeat. The main objective of this study was to gain an insight into the structural and functional organization and the expression status of Smoc-1 in water buffalo, Bubalus bubalis. Results We cloned and characterized the buffalo Smoc-1, including its copy number status, in vitro protein expression, tissue- and age-specific transcription/translation, chromosomal mapping and localization to the basement membrane zone. Buffalo Smoc-1 was found to encode a secreted matricellular glycoprotein containing two EF-hand calcium binding motifs homologous to those of the BM-40/SPARC family. In buffalo, this single copy gene consisted of 12 exons and was mapped onto the acrocentric chromosome 11. Though this gene was found to be evolutionarily conserved, the buffalo Smoc-1 showed conspicuous nucleotide/amino acid changes altering its secondary structure compared to that in other mammals. In silico analysis of Smoc-1 supported its glycoprotein nature with a calcium-dependent conformation. Further, we unveiled two transcript variants of this gene, varying in their 3'UTR lengths but both coding for identical protein(s). Smoc-1 showed the highest expression of both variants in liver and modest to negligible expression in other tissues. The relative expression of variant-02 was markedly higher compared to that of variant-01 in all the tissues examined. 
Moreover, expression of Smoc-1, though modest during the early ages, was conspicuously enhanced after 1 year and remained consistently higher during the entire life span of buffalo, with a gradual increase in expression of variant-02. Immunohistochemically, Smoc-1 was localized in the basement membrane zones and extracellular matrices of various tissues. Conclusion These data add to our understanding of the tissue-, age- and species-specific functions of Smoc-1. They also enabled us to demonstrate varying expression of the two transcript variants of Smoc-1 amongst different somatic tissues/gonads and ages, in spite of their identical coding frames. Pursuance of these variants for their roles in various disease phenotypes such as hepatocellular carcinoma and angiogenesis is envisaged to establish broader biological significance of this gene. PMID:18042303

  3. A circadian gene expression atlas in mammals: implications for biology and medicine.

    PubMed

    Zhang, Ray; Lahens, Nicholas F; Ballance, Heather I; Hughes, Michael E; Hogenesch, John B

    2014-11-11

    To characterize the role of the circadian clock in mouse physiology and behavior, we used RNA-seq and DNA arrays to quantify the transcriptomes of 12 mouse organs over time. We found 43% of all protein coding genes showed circadian rhythms in transcription somewhere in the body, largely in an organ-specific manner. In most organs, we noticed the expression of many oscillating genes peaked during transcriptional "rush hours" preceding dawn and dusk. Looking at the genomic landscape of rhythmic genes, we saw that they clustered together, were longer, and had more spliceforms than nonoscillating genes. Systems-level analysis revealed intricate rhythmic orchestration of gene pathways throughout the body. We also found oscillations in the expression of more than 1,000 known and novel noncoding RNAs (ncRNAs). Supporting their potential role in mediating clock function, ncRNAs conserved between mouse and human showed rhythmic expression in similar proportions as protein coding genes. Importantly, we also found that the majority of best-selling drugs and World Health Organization essential medicines directly target the products of rhythmic genes. Many of these drugs have short half-lives and may benefit from timed dosage. In sum, this study highlights critical, systemic, and surprising roles of the mammalian circadian clock and provides a blueprint for advancement in chronotherapy.

  4. AP-2α and AP-2β cooperatively orchestrate homeobox gene expression during branchial arch patterning.

    PubMed

    Van Otterloo, Eric; Li, Hong; Jones, Kenneth L; Williams, Trevor

    2018-01-25

    The evolution of a hinged moveable jaw with variable morphology is considered a major factor behind the successful expansion of the vertebrates. DLX homeobox transcription factors are crucial for establishing the positional code that patterns the mandible, maxilla and intervening hinge domain, but how the genes encoding these proteins are regulated remains unclear. Herein, we demonstrate that the concerted action of the AP-2α and AP-2β transcription factors within the mouse neural crest is essential for jaw patterning. In the absence of these two proteins, the hinge domain is lost and there are alterations in the size and patterning of the jaws correlating with dysregulation of homeobox gene expression, with reduced levels of Emx, Msx and Dlx paralogs accompanied by an expansion of Six1 expression. Moreover, detailed analysis of morphological features and gene expression changes indicates significant overlap with various compound Dlx gene mutants. Together, these findings reveal that the AP-2 genes have a major function in mammalian neural crest development, influencing patterning of the craniofacial skeleton via the DLX code, an effect that has implications for vertebrate facial evolution, as well as for human craniofacial disorders. © 2018. Published by The Company of Biologists Ltd.

  5. Conserved Curvature of RNA Polymerase I Core Promoter Beyond rRNA Genes: The Case of the Tritryps

    PubMed Central

    Smircich, Pablo; Duhagon, María Ana; Garat, Beatriz

    2015-01-01

    In trypanosomatids, the RNA polymerase I (RNAPI)-dependent promoters controlling the ribosomal RNA (rRNA) genes have been well identified. Although the RNAPI transcription machinery recognizes the DNA conformation instead of the DNA sequence of promoters, no conformational study has been reported for these promoters. Here we present the in silico analysis of the intrinsic DNA curvature of the rRNA gene core promoters in Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major. We found that, in spite of the absence of sequence conservation, these promoters hold conformational properties similar to other eukaryotic rRNA promoters. Our results also indicated that the intrinsic DNA curvature pattern is conserved within the Leishmania genus and also among strains of T. cruzi and T. brucei. Furthermore, we analyzed the impact of point mutations on the intrinsic curvature and their impact on the promoter activity. Furthermore, we found that the core promoters of protein-coding genes transcribed by RNAPI in T. brucei show the same conserved conformational characteristics. Overall, our results indicate that DNA intrinsic curvature of the rRNA gene core promoters is conserved in these ancient eukaryotes and such conserved curvature might be a requirement of RNAPI machinery for transcription of not only rRNA genes but also protein-coding genes. PMID:26718450

  6. Transcriptional profiling of murine osteoblast differentiation based on RNA-seq expression analyses.

    PubMed

    Khayal, Layal Abo; Grünhagen, Johannes; Provazník, Ivo; Mundlos, Stefan; Kornak, Uwe; Robinson, Peter N; Ott, Claus-Eric

    2018-04-11

    Osteoblastic differentiation is a multistep process characterized by osteogenic induction of mesenchymal stem cells, which then differentiate into proliferative pre-osteoblasts that produce copious amounts of extracellular matrix, followed by stiffening of the extracellular matrix, and matrix mineralization by hydroxylapatite deposition. Although these processes have been well characterized biologically, a detailed transcriptional analysis of murine primary calvaria osteoblast differentiation based on RNA sequencing (RNA-seq) analyses has not previously been reported. Here, we used RNA-seq to obtain expression values of 29,148 genes at four time points as murine primary calvaria osteoblasts differentiate in vitro until onset of mineralization was clearly detectable by microscopic inspection. Expression of marker genes confirmed osteogenic differentiation. We explored differential expression of 1386 protein-coding genes using unsupervised clustering and GO analyses. 100 differentially expressed lncRNAs were investigated by co-expression with protein-coding genes that are localized within the same topologically associated domain. Additionally, we monitored expression of 237 genes that are silent or active at distinct time points and compared differential exon usage. Our data represent an in-depth profiling of murine primary calvaria osteoblast differentiation by RNA-seq and contribute to our understanding of genetic regulation of this key process in osteoblast biology. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. Human La binds mRNAs through contacts to the poly(A) tail

    PubMed Central

    Vinayak, Jyotsna; Marrella, Stefano A; Hussain, Rawaa H; Rozenfeld, Leonid; Solomon, Karine; Bayfield, Mark A

    2018-01-01

    In addition to a role in the processing of nascent RNA polymerase III transcripts, La proteins are also associated with promoting cap-independent translation from the internal ribosome entry sites of numerous cellular and viral coding RNAs. La binding to RNA polymerase III transcripts via their common UUU-3’OH motif is well characterized, but the mechanism of La binding to coding RNAs is poorly understood. Using electromobility shift assays and cross-linking immunoprecipitation, we show that in addition to a sequence-specific UUU-3’OH binding mode, human La exhibits a sequence-specific and length-dependent poly(A) binding mode. We demonstrate that this poly(A) binding mode uses the canonical nucleic acid interaction winged helix face of the eponymous La motif, previously shown to be vacant during uridylate binding. We also show that cytoplasmic, but not nuclear La, engages poly(A) RNA in human cells, that La entry into polysomes utilizes the poly(A) binding mode, and that La promotion of translation from the cyclin D1 internal ribosome entry site occurs in competition with cytoplasmic poly(A) binding protein (PABP). Our data are consistent with human La functioning in translation through contacts to the poly(A) tail. PMID:29447394

  8. The Glucuronic Acid Utilization Gene Cluster from Bacillus stearothermophilus T-6

    PubMed Central

    Shulami, Smadar; Gat, Orit; Sonenshein, Abraham L.; Shoham, Yuval

    1999-01-01

    A λ-EMBL3 genomic library of Bacillus stearothermophilus T-6 was screened for hemicellulolytic activities, and five independent clones exhibiting β-xylosidase activity were isolated. The clones overlap each other and together represent a 23.5-kb chromosomal segment. The segment contains a cluster of xylan utilization genes, which are organized in at least three transcriptional units. These include the gene for the extracellular xylanase, xylanase T-6; part of an operon coding for an intracellular xylanase and a β-xylosidase; and a putative 15.5-kb-long transcriptional unit, consisting of 12 genes involved in the utilization of α-d-glucuronic acid (GlcUA). The first four genes in the potential GlcUA operon (orf1, -2, -3, and -4) code for a putative sugar transport system with characteristic components of the binding-protein-dependent transport systems. The most likely natural substrate for this transport system is aldotetraouronic acid [2-O-α-(4-O-methyl-α-d-glucuronosyl)-xylotriose] (MeGlcUAXyl3). The following two genes code for an intracellular α-glucuronidase (aguA) and a β-xylosidase (xynB). Five more genes (kdgK, kdgA, uxaC, uxuA, and uxuB) encode proteins that are homologous to enzymes involved in galacturonate and glucuronate catabolism. The gene cluster also includes a potential regulatory gene, uxuR, the product of which resembles repressors of the GntR family. The apparent transcriptional start point of the cluster was determined by primer extension analysis and is located 349 bp from the initial ATG codon. The potential operator site is a perfect 12-bp inverted repeat located downstream from the promoter between nucleotides +170 and +181. Gel retardation assays indicated that UxuR binds specifically to this sequence and that this binding is efficiently prevented in vitro by MeGlcUAXyl3, the most likely molecular inducer. PMID:10368143
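
    The operator site described above is a "perfect 12-bp inverted repeat" spanning nucleotides +170 to +181 (181 - 170 + 1 = 12 positions). For readers unfamiliar with the term, a spacer-free perfect inverted repeat reads the same as its own reverse complement. The sketch below checks that property. The function names and the example sequence are hypothetical and are not the actual UxuR operator sequence, which the abstract does not give.

```python
# Illustrative sketch: test whether a DNA string is a perfect inverted repeat,
# meaning it equals its own reverse complement. Names and the example
# sequence are assumptions for this example, not data from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_perfect_inverted_repeat(seq):
    """True when the sequence is identical to its reverse complement."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


# Hypothetical 12-bp site built from the half-site TTGACA and its
# reverse complement TGTCAA.
EXAMPLE_SITE = "TTGACATGTCAA"
```

    Calling `is_perfect_inverted_repeat(EXAMPLE_SITE)` returns True for this made-up site, while changing any single base on one half breaks the symmetry.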

  9. Regulation of host-pathogen interactions via the post-transcriptional Csr/Rsm system.

    PubMed

    Kusmierek, Maria; Dersch, Petra

    2018-02-01

    A successful colonization of specific hosts requires a rapid and efficient adaptation of the virulence-relevant gene expression program by bacterial pathogens. An important element in this endeavor is the Csr/Rsm system. This multi-component, post-transcriptional control system forms a central hub within complex regulatory networks and coordinately adjusts virulence properties with metabolic and physiological attributes of the pathogen. A key function is elicited by the RNA-binding protein CsrA/RsmA. CsrA/RsmA interacts with numerous target mRNAs, many of which encode crucial virulence factors, and alters their translation, stability or elongation of transcription. Recent studies highlighted that important colonization factors, toxins, and bacterial secretion systems are under CsrA/RsmA control. CsrA/RsmA deficiency impairs host colonization and attenuates virulence, making this post-transcriptional regulator a suitable drug target. The CsrA/RsmA protein can be inactivated through sequestration by non-coding RNAs, or via binding to specific highly abundant mRNAs and interacting proteins. The wide range of interaction partners and RNA targets, as well as the overarching, interlinked genetic control circuits illustrate the complexity of this regulatory system in the different pathogens. Future work addressing spatio-temporal changes of Csr/Rsm-mediated control during the course of an infection will help us to understand how bacteria reprogram their expression profile to cope with continuous changes experienced in colonized niches. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Mutations in the Promoter Region of the Aldolase B Gene that cause Hereditary Fructose Intolerance

    PubMed Central

    Coffee, Erin M.; Tolan, Dean R.

    2010-01-01

    SUMMARY Hereditary fructose intolerance (HFI) is a potentially fatal inherited metabolic disease caused by a deficiency of aldolase B activity in the liver and kidney. Over 40 disease-causing mutations are known in the protein-coding region of ALDOB. Mutations upstream of the protein-coding portion of ALDOB are reported here for the first time. DNA sequence analysis of 61 HFI patients revealed single base mutations in the promoter, intronic enhancer, and the first exon, which is entirely untranslated. One mutation, g.–132G>A, is located within the promoter at an evolutionarily conserved nucleotide within a transcription factor-binding site. A second mutation, IVS1+1G>C, is at the donor splice site of the first exon. In vitro electrophoretic mobility shift assays show a decrease in nuclear extract-protein binding at the g.–132G>A mutant site. The promoter mutation results in decreased transcription using luciferase reporter plasmids. Analysis of cDNA from cells transfected with plasmids harboring the IVS1+1G>C mutation results in aberrant splicing leading to complete retention of the first intron (~ 5 kb). The IVS1+1G>C splicing mutation results in loss of luciferase activity from a reporter plasmid. These novel mutations in ALDOB represent 2% of alleles in American HFI patients, with IVS1+1G>C representing a significantly higher allele frequency (6%) among HFI patients of Hispanic and African-American ethnicity. PMID:20882353

  11. Modeling of DNA local parameters predicts encrypted architectural motifs in Xenopus laevis ribosomal gene promoter.

    PubMed

    Roux-Rouquie, M; Marilley, M

    2000-09-15

    We have modeled local DNA sequence parameters to search for DNA architectural motifs involved in transcription regulation and promotion within the Xenopus laevis ribosomal gene promoter and the intergenic spacer (IGS) sequences. The IGS was found to be shaped into distinct topological domains. First, intrinsic bends split the IGS into domains of common but different helical features. Local parameters at inter-domain junctions exhibit a high variability with respect to intrinsic curvature, bendability and thermal stability. Secondly, the repeated sequence blocks of the IGS exhibit right-handed supercoiled structures which could be related to their enhancer properties. Thirdly, the gene promoter presents both inherent curvature and minor groove narrowing which may be viewed as motifs of a structural code for protein recognition and binding. Such pre-existing deformations could simply be remodeled during the binding of the transcription complex. Alternatively, these deformations could pre-shape the promoter in such a way that further remodeling is facilitated. Mutations shown to abolish promoter curvature as well as intrinsic minor groove narrowing, in a variant which maintained full transcriptional activity, bring circumstantial evidence for structurally-preorganized motifs in relation to transcription regulation and promotion. Using well documented X. laevis rDNA regulatory sequences we showed that computer modeling may be of invaluable assistance in assessing encrypted architectural motifs. The evidence of these DNA topological motifs with respect to the concept of structural code is discussed.

  12. Long noncoding RNAs and their proposed functions in fibre development of cotton (Gossypium spp.).

    PubMed

    Wang, Maojun; Yuan, Daojun; Tu, Lili; Gao, Wenhui; He, Yonghui; Hu, Haiyan; Wang, Pengcheng; Liu, Nian; Lindsey, Keith; Zhang, Xianlong

    2015-09-01

    Long noncoding RNAs (lncRNAs) are transcripts of at least 200 bp in length, possess no apparent coding capacity and are involved in various biological regulatory processes. Until now, no systematic identification of lncRNAs has been reported in cotton (Gossypium spp.). Here, we describe the identification of 30 550 long intergenic noncoding RNA (lincRNA) loci (50 566 transcripts) and 4718 long noncoding natural antisense transcript (lncNAT) loci (5826 transcripts). LncRNAs are rich in repetitive sequences and preferentially expressed in a tissue-specific manner. The detection of abundant genome-specific and/or lineage-specific lncRNAs indicated their weak evolutionary conservation. Approximately 76% of homoeologous lncRNAs exhibit biased expression patterns towards the At or Dt subgenomes. Compared with protein-coding genes, lncRNAs showed overall higher methylation levels and their expression was less affected by gene body methylation. Expression validation in different cotton accessions and coexpression network construction helped to identify several functional lncRNA candidates involved in cotton fibre initiation and elongation. Analysis of integrated expression from the subgenomes of lncRNAs generating miR397 and its targets as a result of genome polyploidization indicated their pivotal functions in regulating lignin metabolism in domesticated tetraploid cotton fibres. This study provides the first comprehensive identification of lncRNAs in Gossypium. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
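The working definition in the abstract above (at least 200 bp long, no apparent coding capacity) can be illustrated with a minimal sketch. This is not the pipeline used by the authors: the function names are hypothetical, only the three forward reading frames are scanned, and the 100-codon ORF cutoff is an assumed heuristic that the abstract does not state.

```python
import re

# Hypothetical helper: length (in codons, stop excluded) of the longest
# ATG...stop open reading frame on the forward strand.
def longest_orf_codons(seq):
    seq = seq.upper().replace("U", "T")
    best = 0
    # The lookahead lets ORFs starting at overlapping ATGs all be examined;
    # the lazy codon group stops at the first in-frame stop codon.
    for m in re.finditer(r"(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))", seq):
        best = max(best, (len(m.group(1)) - 3) // 3)
    return best

# Hypothetical filter: long enough and without a long ORF.
# max_orf_codons=100 is an assumption, not a value from the abstract.
def looks_like_lncrna(seq, min_len=200, max_orf_codons=100):
    return len(seq) >= min_len and longest_orf_codons(seq) < max_orf_codons
```

A real annotation workflow would typically add reverse-strand ORFs, homology searches and a dedicated coding-potential tool; the sketch only makes the two stated criteria concrete.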

  13. lncRNA requirements for mouse acute myeloid leukemia and normal differentiation

    PubMed Central

    Knott, Simon RV; Munera Maravilla, Ester; Jackson, Benjamin T; Wild, Sophia A; Kovacevic, Tatjana; Stork, Eva Maria; Zhou, Meng; Erard, Nicolas; Lee, Emily; Kelley, David R; Roth, Mareike; Barbosa, Inês AM; Zuber, Johannes; Rinn, John L

    2017-01-01

    A substantial fraction of the genome is transcribed in a cell-type-specific manner, producing long non-coding RNAs (lncRNAs), rather than protein-coding transcripts. Here, we systematically characterize transcriptional dynamics during hematopoiesis and in hematological malignancies. Our analysis of annotated and de novo assembled lncRNAs showed many are regulated during differentiation and mis-regulated in disease. We assessed lncRNA function via an in vivo RNAi screen in a model of acute myeloid leukemia. This identified several lncRNAs essential for leukemia maintenance, and found that a number act by promoting leukemia stem cell signatures. Leukemia blasts showed a myeloid differentiation phenotype when these lncRNAs were depleted, and our data indicate that this effect is mediated via effects on the MYC oncogene. Bone marrow reconstitutions showed that a lncRNA expressed across all progenitors was required for the myeloid lineage, whereas the other leukemia-induced lncRNAs were dispensable in the normal setting. PMID:28875933

  14. lncRNA requirements for mouse acute myeloid leukemia and normal differentiation.

    PubMed

    Delás, M Joaquina; Sabin, Leah R; Dolzhenko, Egor; Knott, Simon Rv; Munera Maravilla, Ester; Jackson, Benjamin T; Wild, Sophia A; Kovacevic, Tatjana; Stork, Eva Maria; Zhou, Meng; Erard, Nicolas; Lee, Emily; Kelley, David R; Roth, Mareike; Barbosa, Inês Am; Zuber, Johannes; Rinn, John L; Smith, Andrew D; Hannon, Gregory J

    2017-09-06

    A substantial fraction of the genome is transcribed in a cell-type-specific manner, producing long non-coding RNAs (lncRNAs), rather than protein-coding transcripts. Here, we systematically characterize transcriptional dynamics during hematopoiesis and in hematological malignancies. Our analysis of annotated and de novo assembled lncRNAs showed many are regulated during differentiation and mis-regulated in disease. We assessed lncRNA function via an in vivo RNAi screen in a model of acute myeloid leukemia. This identified several lncRNAs essential for leukemia maintenance, and found that a number act by promoting leukemia stem cell signatures. Leukemia blasts showed a myeloid differentiation phenotype when these lncRNAs were depleted, and our data indicate that this effect is mediated via effects on the MYC oncogene. Bone marrow reconstitutions showed that a lncRNA expressed across all progenitors was required for the myeloid lineage, whereas the other leukemia-induced lncRNAs were dispensable in the normal setting.

  15. Small non coding RNAs in adipocyte biology and obesity.

    PubMed

    Amri, Ez-Zoubir; Scheideler, Marcel

    2017-11-15

    Obesity has reached epidemic proportions world-wide and constitutes a substantial risk factor for hypertension, type 2 diabetes, cardiovascular diseases and certain cancers. So far, regulation of energy intake by dietary and pharmacological treatments has met limited success. The main interest of current research is focused on understanding the role of different pathways involved in adipose tissue function and modulation of its mass. Whole-genome sequencing studies revealed that the majority of the human genome is transcribed, with thousands of non-protein-coding RNAs (ncRNA), which comprise small and long ncRNAs. ncRNAs regulate gene expression at the transcriptional and post-transcriptional level. Numerous studies described the involvement of ncRNAs in the pathogenesis of many diseases including obesity and associated metabolic disorders. ncRNAs represent potential diagnostic biomarkers and promising therapeutic targets. In this review, we focused on small ncRNAs involved in the formation and function of adipocytes and obesity. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Dissecting the genetics of the human transcriptome identifies novel trait-related trans-eQTLs and corroborates the regulatory relevance of non-protein coding loci†

    PubMed Central

    Kirsten, Holger; Al-Hasani, Hoor; Holdt, Lesca; Gross, Arnd; Beutner, Frank; Krohn, Knut; Horn, Katrin; Ahnert, Peter; Burkhardt, Ralph; Reiche, Kristin; Hackermüller, Jörg; Löffler, Markus; Teupser, Daniel; Thiery, Joachim; Scholz, Markus

    2015-01-01

    Genetics of gene expression (eQTLs or expression QTLs) has proved an indispensable tool for understanding biological pathways and pathomechanisms of trait-associated SNPs. However, power of most genome-wide eQTL studies is still limited. We performed a large eQTL study in peripheral blood mononuclear cells of 2112 individuals increasing the power to detect trans-effects genome-wide. Going beyond univariate SNP-transcript associations, we analyse relations of eQTLs to biological pathways, polygenetic effects of expression regulation, trans-clusters and enrichment of co-localized functional elements. We found eQTLs for about 85% of analysed genes, and 18% of genes were trans-regulated. Local eSNPs were enriched up to a distance of 5 Mb to the transcript challenging typically implemented ranges of cis-regulations. Pathway enrichment within regulated genes of GWAS-related eSNPs supported functional relevance of identified eQTLs. We demonstrate that nearest genes of GWAS-SNPs might frequently be misleading functional candidates. We identified novel trans-clusters of potential functional relevance for GWAS-SNPs of several phenotypes including obesity-related traits, HDL-cholesterol levels and haematological phenotypes. We used chromatin immunoprecipitation data for demonstrating biological effects. Yet, we show for strongly heritable transcripts that still little trans-chromosomal heritability is explained by all identified trans-eSNPs; however, our data suggest that most cis-heritability of these transcripts seems explained. Dissection of co-localized functional elements indicated a prominent role of SNPs in loci of pseudogenes and non-coding RNAs for the regulation of coding genes. In summary, our study substantially increases the catalogue of human eQTLs and improves our understanding of the complex genetic regulation of gene expression, pathways and disease-related processes. PMID:26019233

  17. Analysis of human ES cell differentiation establishes that the dominant isoforms of the lncRNAs RMST and FIRRE are circular.

    PubMed

    Izuogu, Osagie G; Alhasan, Abd A; Mellough, Carla; Collin, Joseph; Gallon, Richard; Hyslop, Jonathon; Mastrorosa, Francesco K; Ehrmann, Ingrid; Lako, Majlinda; Elliott, David J; Santibanez-Koref, Mauro; Jackson, Michael S

    2018-04-20

    Circular RNAs (circRNAs) are predominantly derived from protein coding genes, and some can act as microRNA sponges or transcriptional regulators. Changes in circRNA levels have been identified during human development which may be functionally important, but lineage-specific analyses are currently lacking. To address this, we performed RNAseq analysis of human embryonic stem (ES) cells differentiated for 90 days towards 3D laminated retina. A transcriptome-wide increase in circRNA expression, size, and exon count was observed, with circRNA levels reaching a plateau by day 45. Parallel statistical analyses, controlling for sample and locus specific effects, identified 239 circRNAs with expression changes distinct from the transcriptome-wide pattern, but these all also increased in abundance over time. Surprisingly, circRNAs derived from long non-coding RNAs (lncRNAs) were found to account for a significantly larger proportion of transcripts from their loci of origin than circRNAs from coding genes. The most abundant, circRMST:E12-E6, showed a > 100X increase during differentiation accompanied by an isoform switch, and accounts for > 99% of RMST transcripts in many adult tissues. The second most abundant, circFIRRE:E10-E5, accounts for > 98% of FIRRE transcripts in differentiating human ES cells, and is one of 39 FIRRE circRNAs, many of which include multiple unannotated exons. Our results suggest that during human ES cell differentiation, changes in circRNA levels are primarily globally controlled. They also suggest that RMST and FIRRE, genes with established roles in neurogenesis and topological organisation of chromosomal domains respectively, are processed as circular lncRNAs with only minor linear species.

  18. Decoding the non-coding genome: elucidating genetic risk outside the coding genome.

    PubMed

    Barr, C L; Misener, V L

    2016-01-01

    Current evidence emerging from genome-wide association studies indicates that the genetic underpinnings of complex traits are likely attributable to genetic variation that changes gene expression, rather than (or in combination with) variation that changes protein-coding sequences. This is particularly compelling with respect to psychiatric disorders, as genetic changes in regulatory regions may result in differential transcriptional responses to developmental cues and environmental/psychosocial stressors. Until recently, however, the link between transcriptional regulation and psychiatric genetic risk has been understudied. Multiple obstacles have contributed to the paucity of research in this area, including challenges in identifying the positions of remote (distal from the promoter) regulatory elements (e.g. enhancers) and their target genes and the underrepresentation of neural cell types and brain tissues in epigenome projects - the availability of high-quality brain tissues for epigenetic and transcriptome profiling, particularly for the adolescent and developing brain, has been limited. Further challenges have arisen in the prediction and testing of the functional impact of DNA variation with respect to multiple aspects of transcriptional control, including regulatory-element interaction (e.g. between enhancers and promoters), transcription factor binding and DNA methylation. Further, the brain has uncommon DNA-methylation marks with unique genomic distributions not found in other tissues - current evidence suggests the involvement of non-CG methylation and 5-hydroxymethylation in neurodevelopmental processes but much remains unknown. We review here knowledge gaps as well as both technological and resource obstacles that will need to be overcome in order to elucidate the involvement of brain-relevant gene-regulatory variants in genetic risk for psychiatric disorders. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.

  19. Junk DNA and the long non-coding RNA twist in cancer genetics

    PubMed Central

    Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

    2015-01-01

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839

  20. Kangaroo – A pattern-matching program for biological sequences

    PubMed Central

    2002-01-01

    Background Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718
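The kind of query described in the abstract above, a regular expression that finds short mononucleotide repeats in a coding sequence, can be sketched as follows. This is an illustration only and not Kangaroo's own code or query syntax: the function name, the default minimum run length of 6 and the example sequence are all assumptions.

```python
import re

# Hypothetical helper, not part of Kangaroo: report every run of a single
# nucleotide that is at least `min_len` bases long.
def find_mononucleotide_repeats(seq, min_len=6):
    # ([ACGT]) captures one base; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]

# Made-up sequence containing an A8 run and a G6 run.
example = "ATGGCC" + "A" * 8 + "GTTC" + "G" * 6 + "TAA"
for position, run in find_mononucleotide_repeats(example):
    print(position, run)
```

Positions are 0-based offsets into the sequence. The backreference keeps the pattern independent of which base is repeated, so one expression covers A, C, G and T runs.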

  1. Differential Responses to Wnt and PCP Disruption Predict Expression and Developmental Function of Conserved and Novel Genes in a Cnidarian

    PubMed Central

    Lapébie, Pascal; Ruggiero, Antonella; Barreau, Carine; Chevalier, Sandra; Chang, Patrick; Dru, Philippe; Houliston, Evelyn; Momose, Tsuyoshi

    2014-01-01

    We have used Digital Gene Expression analysis to identify, without bilaterian bias, regulators of cnidarian embryonic patterning. Transcriptome comparison between un-manipulated Clytia early gastrula embryos and ones in which the key polarity regulator Wnt3 was inhibited using morpholino antisense oligonucleotides (Wnt3-MO) identified a set of significantly over and under-expressed transcripts. These code for candidate Wnt signaling modulators, orthologs of other transcription factors, secreted and transmembrane proteins known as developmental regulators in bilaterian models or previously uncharacterized, and also many cnidarian-restricted proteins. Comparisons between embryos injected with morpholinos targeting Wnt3 and its receptor Fz1 defined four transcript classes showing remarkable correlation with spatiotemporal expression profiles. Class 1 and 3 transcripts tended to show sustained expression at “oral” and “aboral” poles respectively of the developing planula larva, class 2 transcripts in cells ingressing into the endodermal region during gastrulation, while class 4 gene expression was repressed at the early gastrula stage. The preferential effect of Fz1-MO on expression of class 2 and 4 transcripts can be attributed to Planar Cell Polarity (PCP) disruption, since it was closely matched by morpholino knockdown of the specific PCP protein Strabismus. We conclude that endoderm and post gastrula-specific gene expression is particularly sensitive to PCP disruption while Wnt-/β-catenin signaling dominates gene regulation along the oral-aboral axis. Phenotype analysis using morpholinos targeting a subset of transcripts indicated developmental roles consistent with expression profiles for both conserved and cnidarian-restricted genes. 
    Overall our unbiased screen allowed systematic identification of regionally expressed genes and provided functional support for a shared eumetazoan developmental regulatory gene set with both predicted and previously unexplored members, but also demonstrated that fundamental developmental processes including axial patterning and endoderm formation in cnidarians can involve newly evolved (or highly diverged) genes. PMID:25233086

  2. Differential responses to Wnt and PCP disruption predict expression and developmental function of conserved and novel genes in a cnidarian.

    PubMed

    Lapébie, Pascal; Ruggiero, Antonella; Barreau, Carine; Chevalier, Sandra; Chang, Patrick; Dru, Philippe; Houliston, Evelyn; Momose, Tsuyoshi

    2014-09-01

    We have used Digital Gene Expression analysis to identify, without bilaterian bias, regulators of cnidarian embryonic patterning. Transcriptome comparison between un-manipulated Clytia early gastrula embryos and ones in which the key polarity regulator Wnt3 was inhibited using morpholino antisense oligonucleotides (Wnt3-MO) identified a set of significantly over and under-expressed transcripts. These code for candidate Wnt signaling modulators, orthologs of other transcription factors, secreted and transmembrane proteins known as developmental regulators in bilaterian models or previously uncharacterized, and also many cnidarian-restricted proteins. Comparisons between embryos injected with morpholinos targeting Wnt3 and its receptor Fz1 defined four transcript classes showing remarkable correlation with spatiotemporal expression profiles. Class 1 and 3 transcripts tended to show sustained expression at "oral" and "aboral" poles respectively of the developing planula larva, class 2 transcripts in cells ingressing into the endodermal region during gastrulation, while class 4 gene expression was repressed at the early gastrula stage. The preferential effect of Fz1-MO on expression of class 2 and 4 transcripts can be attributed to Planar Cell Polarity (PCP) disruption, since it was closely matched by morpholino knockdown of the specific PCP protein Strabismus. We conclude that endoderm and post gastrula-specific gene expression is particularly sensitive to PCP disruption while Wnt-/β-catenin signaling dominates gene regulation along the oral-aboral axis. Phenotype analysis using morpholinos targeting a subset of transcripts indicated developmental roles consistent with expression profiles for both conserved and cnidarian-restricted genes. 
    Overall our unbiased screen allowed systematic identification of regionally expressed genes and provided functional support for a shared eumetazoan developmental regulatory gene set with both predicted and previously unexplored members, but also demonstrated that fundamental developmental processes including axial patterning and endoderm formation in cnidarians can involve newly evolved (or highly diverged) genes.

  3. MD simulations of papillomavirus DNA-E2 protein complexes hints at a protein structural code for DNA deformation.

    PubMed

    Falconi, M; Oteri, F; Eliseo, T; Cicero, D O; Desideri, A

    2008-08-01

    The structural dynamics of the DNA binding domains of the human papillomavirus strain 16 and the bovine papillomavirus strain 1, complexed with their DNA targets, has been investigated by modeling, molecular dynamics simulations, and nuclear magnetic resonance analysis. The simulations underline different dynamical features of the protein scaffolds and a different mechanical interaction of the two proteins with DNA. The two protein structures, although very similar, show differences in the relative mobility of secondary structure elements. Protein structural analyses, principal component analysis, and geometrical and energetic DNA analyses indicate that the two transcription factors utilize a different strategy in DNA recognition and deformation. Results show that the protein indirect DNA readout is not only addressable to the DNA molecule flexibility but it is finely tuned by the mechanical and dynamical properties of the protein scaffold involved in the interaction.

  4. ExDom: an integrated database for comparative analysis of the exon–intron structures of protein domains in eukaryotes

    PubMed Central

    Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan

    2009-01-01

    We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Preti, Milena; Guffanti, Elisa; Valitutto, Eleonora

    The SNR52 gene, coding for a box C/D snoRNA, is the only snoRNA gene transcribed by RNA polymerase (Pol) III in Saccharomyces cerevisiae. Pol III transcription generates a precisely terminated primary transcript that undergoes extensive 5'-end processing. Here, we show that mutations of the box C/D core motif required for snoRNP assembly compromise 5'-end maturation of the SNR52 snoRNA. Upstream processing was also impaired by specific depletion of either Nop1p or Nop58p snoRNP proteins. We further show that the nuclear exosome is required for 3'-end maturation of SNR52 snoRNA, at variance with all the other known Pol III transcripts. Our data suggest a functional coupling between snoRNP assembly and 5'-end maturation of independently transcribed box C/D snoRNAs.

  6. BioCichlid: central dogma-based 3D visualization system of time-course microarray data on a hierarchical biological network.

    PubMed

    Ishiwata, Ryosuke R; Morioka, Masaki S; Ogishima, Soichi; Tanaka, Hiroshi

    2009-02-15

    BioCichlid is a 3D visualization system of time-course microarray data on molecular networks, aiming at interpretation of gene expression data by transcriptional relationships based on the central dogma with physical and genetic interactions. BioCichlid visualizes both physical (protein) and genetic (regulatory) network layers, and provides animation of time-course gene expression data on the genetic network layer. Transcriptional regulations are represented to bridge the physical network (transcription factors) and genetic network (regulated genes) layers, thus integrating promoter analysis into the pathway mapping. BioCichlid enhances the interpretation of microarray data and allows for revealing the underlying mechanisms causing differential gene expressions. BioCichlid is freely available and can be accessed at http://newton.tmd.ac.jp/. Source codes for both biocichlid server and client are also available.

  7. Mediator complex dependent regulation of cardiac development and disease.

    PubMed

    Grueter, Chad E

    2013-06-01

    Cardiovascular disease (CVD) is a leading cause of morbidity and mortality. The risk factors for CVD include environmental and genetic components. Human mutations in genes involved in most aspects of cardiovascular function have been identified, many of which are involved in transcriptional regulation. The Mediator complex serves as a pivotal transcriptional regulator that functions to integrate diverse cellular signals by multiple mechanisms including recruiting RNA polymerase II, chromatin modifying proteins and non-coding RNAs to promoters in a context dependent manner. This review discusses components of the Mediator complex and the contribution of the Mediator complex to normal and pathological cardiac development and function. Enhanced understanding of the role of this core transcriptional regulatory complex in the heart will help us gain further insights into CVD. Copyright © 2013. Production and hosting by Elsevier Ltd.

  8. Cloning, characterisation, and comparative quantitative expression analyses of receptor for advanced glycation end products (RAGE) transcript forms.

    PubMed

    Sterenczak, Katharina A; Willenbrock, Saskia; Barann, Matthias; Klemke, Markus; Soller, Jan T; Eberle, Nina; Nolte, Ingo; Bullerdiek, Jörn; Murua Escobar, Hugo

    2009-04-01

    RAGE is a member of the immunoglobulin superfamily of cell surface molecules playing key roles in pathophysiological processes, e.g. immune/inflammatory disorders, Alzheimer's disease, diabetic arteriosclerosis and tumourigenesis. In humans, 19 naturally occurring RAGE splicing variants resulting in either N-terminally or C-terminally truncated proteins have been identified and have lately been discussed as mechanisms for receptor regulation. Accordingly, deregulation of sRAGE levels has been associated with several diseases, e.g. Alzheimer's disease, Type 1 diabetes, and rheumatoid arthritis. Administration of recombinant sRAGE to animal models of cancer blocked tumour growth successfully. In spite of its obvious relationship to cancer and metastasis, data focusing on sRAGE deregulation in tumours are rare. In this study we screened a set of tumours, healthy tissues and various cancer cell lines for RAGE splicing variants and analysed their structure. Additionally, we analysed the ratio of the most frequently found transcript variants using quantitative Real-Time PCR. In total we characterised 24 previously undescribed canine and 4 human RAGE splicing variants, analysed their structure, classified their characteristics, and derived their respective protein forms. Interestingly, the healthy and the neoplastic tissue samples mostly showed RAGE transcripts coding for the complete receptor and transcripts showing insertions of intron 1.

  9. Reverse engineering a mouse embryonic stem cell-specific transcriptional network reveals a new modulator of neuronal differentiation.

    PubMed

    De Cegli, Rossella; Iacobacci, Simona; Flore, Gemma; Gambardella, Gennaro; Mao, Lei; Cutillo, Luisa; Lauria, Mario; Klose, Joachim; Illingworth, Elizabeth; Banfi, Sandro; di Bernardo, Diego

    2013-01-01

    Gene expression profiles can be used to infer previously unknown transcriptional regulatory interaction among thousands of genes, via systems biology 'reverse engineering' approaches. We 'reverse engineered' an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression ('hubs'). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central 'hub' of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neurons marker Vglut2 and a strong down-regulation of the GABAergic neurons marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells possibly by epigenetic-mediated transcriptional regulation.

  10. The bromodomain protein BRD4 regulates splicing during heat shock

    PubMed Central

    Hussong, Michelle; Kaehler, Christian; Kerick, Martin; Grimm, Christina; Franz, Alexandra; Timmermann, Bernd; Welzel, Franziska; Isensee, Jörg; Hucho, Tim; Krobitsch, Sylvia; Schweiger, Michal R.

    2017-01-01

    The cellular response to heat stress is an ancient and evolutionarily highly conserved defence mechanism characterised by the transcriptional up-regulation of cyto-protective genes and a partial inhibition of splicing. These features closely resemble the proteotoxic stress response during tumor development. The bromodomain protein BRD4 has been identified as an integral member of the oxidative stress as well as of the inflammatory response, mainly due to its role in the transcriptional regulation process. In addition, there are also several lines of evidence implicating BRD4 in the splicing process. Using RNA-sequencing we found a significant increase in splicing inhibition, in particular intron retentions (IR), following heat treatment in BRD4-depleted cells. This leads to a decrease in mRNA abundance of the affected transcripts, most likely due to premature termination codons. Subsequent experiments revealed that BRD4 interacts with the heat shock factor 1 (HSF1) such that under heat stress BRD4 is recruited to nuclear stress bodies and non-coding SatIII RNA transcripts are up-regulated. These findings implicate BRD4 as an important regulator of splicing during heat stress. Our data, which link BRD4 to the stress-induced splicing process, may provide novel mechanisms of BRD4 inhibitors in regard to anti-cancer therapies. PMID:27536004

  11. Drosophila PAF1 Modulates PIWI/piRNA Silencing Capacity.

    PubMed

    Clark, Josef P; Rahman, Reazur; Yang, Nachen; Yang, Linda H; Lau, Nelson C

    2017-09-11

    To test the directness of factors in initiating PIWI-directed gene silencing, we employed a Piwi-interacting RNA (piRNA)-targeted reporter assay in Drosophila ovary somatic sheet (OSS) cells [1]. This assay confirmed direct silencing roles for piRNA biogenesis factors and PIWI-associated factors [2-12] but suggested that chromatin-modifying proteins may act downstream of the initial silencing event. Our data also revealed that RNA-polymerase-II-associated proteins like PAF1 and RTF1 antagonize PIWI-directed silencing. PAF1 knockdown enhances PIWI silencing of reporters when piRNAs target the transcript region proximal to the promoter. Loss of PAF1 suppresses endogenous transposable element (TE) transcript maturation, whereas a subset of gene transcripts and long-non-coding RNAs adjacent to TE insertions are affected by PAF1 knockdown in a similar fashion to piRNA-targeted reporters. Additionally, transcription activation at specific TEs and TE-adjacent loci during PIWI knockdown is suppressed when PIWI and PAF1 levels are both reduced. Our study suggests a mechanistic conservation between fission yeast PAF1 repressing AGO1/small interfering RNA (siRNA)-directed silencing [13, 14] and Drosophila PAF1 opposing PIWI/piRNA-directed silencing. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Paraspeckles: nuclear bodies built on long noncoding RNA

    PubMed Central

    Bond, Charles S.

    2009-01-01

    Paraspeckles are ribonucleoprotein bodies found in the interchromatin space of mammalian cell nuclei. These structures play a role in regulating the expression of certain genes in differentiated cells by nuclear retention of RNA. The core paraspeckle proteins (PSF/SFPQ, P54NRB/NONO, and PSPC1 [paraspeckle protein 1]) are members of the DBHS (Drosophila melanogaster behavior, human splicing) family. These proteins, together with the long nonprotein-coding RNA NEAT1 (MEN-ϵ/β), associate to form paraspeckles and maintain their integrity. Given the large numbers of long noncoding transcripts currently being discovered through whole transcriptome analysis, paraspeckles may be a paradigm for a class of subnuclear bodies formed around long noncoding RNA. PMID:19720872

  13. Altruistic functions for selfish DNA.

    PubMed

    Faulkner, Geoffrey J; Carninci, Piero

    2009-09-15

    Mammalian genomes are comprised of 30-50% transposed elements (TEs). The vast majority of these TEs are truncated and mutated fragments of retrotransposons that are no longer capable of transposition. Although initially regarded as important factors in the evolution of gene regulatory networks, TEs are now commonly perceived as neutrally evolving and non-functional genomic elements. In a major development, recent works have strongly contradicted this "selfish DNA" or "junk DNA" dogma by demonstrating that TEs use a host of novel promoters to generate RNA on a massive scale across most eukaryotic cells. This transcription frequently functions to control the expression of protein-coding genes via alternative promoters, cis regulatory non protein-coding RNAs and the formation of double stranded short RNAs. If considered in sum, these findings challenge the designation of TEs as selfish and neutrally evolving genomic elements. Here, we will expand upon these themes and discuss challenges in establishing novel TE functions in vivo.

  14. Transcriptional Repression of the Dspp Gene Leads to Dentinogenesis Imperfecta Phenotype in Col1a1-Trps1 Transgenic Mice

    PubMed Central

    Napierala, Dobrawa; Sun, Yao; Maciejewska, Izabela; Bertin, Terry K; Dawson, Brian; D'Souza, Rena; Qin, Chunlin; Lee, Brendan

    2012-01-01

    Dentinogenesis imperfecta (DGI) is a hereditary defect of dentin, a calcified tissue that is the most abundant component of teeth. Most commonly, DGI is manifested as a part of osteogenesis imperfecta (OI) or the phenotype is restricted to dental findings only. In the latter case, DGI is caused by mutations in the DSPP gene, which codes for dentin sialoprotein (DSP) and dentin phosphoprotein (DPP). Although these two proteins together constitute the majority of noncollagenous proteins of the dentin, little is known about their transcriptional regulation. Here we demonstrate that mice overexpressing the Trps1 transcription factor (Col1a1-Trps1 mice) in dentin-producing cells, odontoblasts, present with severe defects of dentin formation that resemble DGI. Combined micro–computed tomography (µCT) and histological analyses revealed tooth fragility due to severe hypomineralization of dentin and a diminished dentin layer with irregular mineralization in Col1a1-Trps1 mice. Biochemical analyses of noncollagenous dentin matrix proteins demonstrated decreased levels of both DSP and DPP proteins in Col1a1-Trps1 mice. On the molecular level, we demonstrated that sustained high levels of Trps1 in odontoblasts lead to dramatic decrease of Dspp expression as a result of direct inhibition of the Dspp promoter by Trps1. During tooth development Trps1 is highly expressed in preodontoblasts, but in mature odontoblasts secreting matrix its expression significantly decreases, which suggests a Trps1 role in odontoblast development. In these studies we identified Trps1 as a potent inhibitor of Dspp expression and the subsequent mineralization of dentin. Thus, we provide novel insights into mechanisms of transcriptional dysregulation that leads to DGI. © 2012 American Society for Bone and Mineral Research. PMID:22508542

  15. FoxK1 splice variants show developmental stage-specific plasticity of expression with temperature in the tiger pufferfish.

    PubMed

    Fernandes, Jorge M O; MacKenzie, Matthew G; Kinghorn, James R; Johnston, Ian A

    2007-10-01

    FoxK1 is a member of the highly conserved forkhead/winged helix (Fox) family of transcription factors and it is known to play a key role in mammalian muscle development and myogenic stem cell function. The tiger pufferfish (Takifugu rubripes) orthologue of mammalian FoxK1 (TFoxK1) has seven exons and is located in a region of conserved synteny between pufferfish and mouse. TFoxK1 is expressed as three alternative transcripts: TFoxK1-alpha, TFoxK1-gamma and TFoxK1-delta. TFoxK1-alpha is the orthologue of mouse FoxK1-alpha, coding for a putative protein of 558 residues that contains the forkhead and forkhead-associated domains typical of Fox proteins and shares 53% global identity with its mammalian homologue. TFoxK1-gamma and TFoxK1-delta arise from intron retention events and these transcripts translate into the same 344-amino acid protein with a truncated forkhead domain. Neither are orthologues of mouse FoxK1-beta. In adult fish, the TFoxK1 splice variants were differentially expressed between fast and slow myotomal muscle, as well as other tissues, and the FoxK1-alpha protein was expressed in myogenic progenitor cells of fast myotomal muscle. During embryonic development, TFoxK1 was transiently expressed in the developing somites, heart, brain and eye. The relative expression of TFoxK1-alpha and the other two alternative transcripts varied with the incubation temperature regime for equivalent embryonic stages and the differences were particularly marked at later developmental stages. The developmental expression pattern of TFoxK1 and its localisation to mononuclear myogenic progenitor cells in adult fast muscle indicate that it may play an essential role in myogenesis in T. rubripes.

  16. Identification of the SRC pyrimidine-binding protein (SPy) as hnRNP K: implications in the regulation of SRC1A transcription

    PubMed Central

    Ritchie, Shawn A.; Pasha, Mohammed K.; Batten, Danielle J. P.; Sharma, Rajendra K.; Olson, Douglas J. H.; Ross, Andrew R. S.; Bonham, Keith

    2003-01-01

    The human SRC gene encodes pp60c–src, a non-receptor tyrosine kinase involved in numerous signaling pathways. Activation or overexpression of c-Src has also been linked to a number of important human cancers. Transcription of the SRC gene is complex and regulated by two closely linked but highly dissimilar promoters, each associated with its own distinct non-coding exon. In many tissues SRC expression is regulated by the housekeeping-like SRC1A promoter. In addition to other regulatory elements, three substantial polypurine:polypyrimidine (TC) tracts within this promoter are required for full transcriptional activity. Previously, we described an unusual factor called SRC pyrimidine-binding protein (SPy) that could bind to two of these TC tracts in their double-stranded form, but was also capable of interacting with higher affinity to all three pyrimidine tracts in their single-stranded form. Mutations in the TC tracts, which abolished the ability of SPy to interact with its double-stranded DNA target, significantly reduced SRC1A promoter activity, especially in concert with mutations in critical Sp1 binding sites. Here we expand upon our characterization of this interesting factor and describe the purification of SPy from human SW620 colon cancer cells using a DNA affinity-based approach. Subsequent in-gel tryptic digestion of purified SPy followed by MALDI-TOF mass spectrometric analysis identified SPy as heterogeneous nuclear ribonucleoprotein K (hnRNP K), a known nucleic-acid binding protein implicated in various aspects of gene expression including transcription. These data provide new insights into the double- and single-stranded DNA-binding specificity, as well as functional properties of hnRNP K, and suggest that hnRNP K is a critical component of SRC1A transcriptional processes. PMID:12595559

  17. The electrostatic role of the Zn-Cys2His2 complex in binding of operator DNA with transcription factors: mouse EGR-1 from the Cys2His2 family.

    PubMed

    Chirgadze, Y N; Boshkova, E A; Polozov, R V; Sivozhelezov, V S; Dzyabchenko, A V; Kuzminsky, M B; Stepanenko, V A; Ivanov, V V

    2018-01-07

    The mouse factor Zif268, also known as early growth response protein EGR-1, is a classical representative of the Cys2His2 transcription factor family. It is required for binding the RNA polymerase with operator dsDNA to initiate the transcription process. We have shown that, of the six Zn-finger protein families in total, only in this family does the Zn complex play a significant role in protein-DNA binding. The electrostatic features of this complex in the binding of factor Zif268 from Mus musculus with operator DNA have been considered. The factor consists of three similar Zn-finger units which bind with triplets of coding DNA. Essential contacts of the factor with the DNA phosphates are formed by three conserved His residues, one in each finger. We describe here the results of calculations of the electrostatic potentials for the Zn-Cys2His2 complex, Zn-finger unit 1, and the whole transcription factor. The potential of Zif268 has a positive area on the factor surface, and it corresponds exactly to the binding sites of each of the Zn-finger units. The main part of these areas is determined by the conserved His residues, which form contacts with the DNA phosphate groups. Our result shows that the electrostatic positive potential of this histidine residue is enhanced due to the Zn complex. The other contacts of the Zn-finger with DNA are related to nucleotide bases, and they are responsible for the sequence-specific binding with DNA. This result may be extended to all other members of the Cys2His2 transcription factor family.

  18. Bombyx mori nucleopolyhedrovirus orf25 encodes a 30kDa late protein in the infection cycle.

    PubMed

    Wang, Haiyan; Chen, Keping; Guo, Zhongjian; Yao, Qin

    2008-02-01

    Bombyx mori nucleopolyhedrovirus (BmNPV) orf25 gene was characterized for the first time. The coding sequence of Bm25 was amplified and subcloned into the prokaryotic expression vector pGEX-4T-2 to produce glutathione S-transferase-tagged fusion protein in the BL21 (DE3) cells. The GST-Bm25 fusion protein was expressed efficiently after induction with IPTG. The purified fusion protein was used to immunize New Zealand white rabbits to prepare polyclonal antibody. Temporal expression analysis revealed a 30-kDa protein, which was detected beginning 24 hours post-infection using a polyclonal antibody against GST-Bm25 fusion protein. The transcript of Bm25 was detected by RT-PCR at 18-72 h p.i. In conclusion, the available data suggest that Bm25 encodes a 30kDa protein expressed in the late stage of infection cycle.

  19. Molecular mechanisms of long noncoding RNAs on gastric cancer

    PubMed Central

    Li, Tianwen; Mo, Xiaoyan; Fu, Liyun; Xiao, Bingxiu; Guo, Junming

    2016-01-01

    Long noncoding RNAs (lncRNAs) are non-protein coding transcripts longer than 200 nucleotides. Aberrant expression of lncRNAs has been found to be associated with gastric cancer, one of the most malignant tumors. By complementary base pairing with mRNAs or forming complexes with RNA binding proteins (RBPs), some lncRNAs including GHET1, MALAT1, and TINCR may mediate mRNA stability and splicing. Other lncRNAs, such as BC032469, GAPLINC, and HOTAIR, participate in the competing endogenous RNA (ceRNA) network. Under certain circumstances, ANRIL, GACAT3, H19, MEG3, and TUSC7 exhibit their biological roles by associating with microRNAs (miRNAs). By recruiting histone-modifying complexes, ANRIL, FENDRR, H19, HOTAIR, MALAT1, and PVT1 may inhibit the transcription of target genes in cis or trans. Through these mechanisms, lncRNAs form RNA-dsDNA triplexes. CCAT1, GAPLINC, GAS5, H19, MEG3, and TUSC7 play oncogenic or tumor suppressor roles by correlating with tumor suppressor P53 or onco-protein c-Myc, respectively. In conclusion, interaction with DNA, RNA and proteins is involved in lncRNAs’ participation in gastric tumorigenesis and development. PMID:26788991

  20. Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs

    PubMed Central

    Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv

    2010-01-01

    RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462

  1. The resurrection genome of Boea hygrometrica: A blueprint for survival of dehydration.

    PubMed

    Xiao, Lihong; Yang, Ge; Zhang, Liechi; Yang, Xinhua; Zhao, Shuang; Ji, Zhongzhong; Zhou, Qing; Hu, Min; Wang, Yu; Chen, Ming; Xu, Yu; Jin, Haijing; Xiao, Xuan; Hu, Guipeng; Bao, Fang; Hu, Yong; Wan, Ping; Li, Legong; Deng, Xin; Kuang, Tingyun; Xiang, Chengbin; Zhu, Jian-Kang; Oliver, Melvin J; He, Yikun

    2015-05-05

    "Drying without dying" is an essential trait in land plant evolution. Unraveling how a unique group of angiosperms, the Resurrection Plants, survive desiccation of their leaves and roots has been hampered by the lack of a foundational genome perspective. Here we report the ∼1,691-Mb sequenced genome of Boea hygrometrica, an important resurrection plant model. The sequence revealed evidence for two historical genome-wide duplication events, a complement of 49,374 protein-coding genes, 29.15% of which are unique (orphan) to Boea and 20% of which (9,888) significantly respond to desiccation at the transcript level. Expansion of early light-inducible protein (ELIP) and 5S rRNA genes highlights the importance of the protection of the photosynthetic apparatus during drying and the rapid resumption of protein synthesis in the resurrection capability of Boea. Transcriptome analysis reveals extensive alternative splicing of transcripts and a focus on cellular protection strategies. The lack of desiccation tolerance-specific genome organizational features suggests the resurrection phenotype evolved mainly by an alteration in the control of dehydration response genes.

  2. Epigenomics of macrophages

    PubMed Central

    Gosselin, David; Glass, Christopher K

    2014-01-01

    Summary Macrophages play essential roles in tissue homeostasis, pathogen elimination, and tissue repair. A defining characteristic of these cells is their ability to efficiently adapt to a variety of abruptly changing and complex environments. This ability is intrinsically linked to a capacity to quickly alter their transcriptome, and this is tightly associated with the epigenomic organization of these cells and, in particular, their enhancer repertoire. Indeed, enhancers are genomic sites that serve as platforms for the integration of signaling pathways with the mechanisms that regulate mRNA transcription. Notably, transcription is pervasive at active enhancers and enhancer RNAs (eRNAs) are tightly coupled to regulated transcription of protein-coding genes. Furthermore, given that each cell type possesses a defining enhancer repertoire, studies on enhancers provide a powerful method to study how specialization of functions among the diverse macrophage subtypes may arise. Here, we review recent studies providing insights into the distinct mechanisms that contribute to the establishment of enhancers and their role in the regulation of transcription in macrophages. PMID:25319330

  3. High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.

    PubMed

    Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory

    2017-12-01

    Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete-many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.

  4. The pineapple AcMADS1 promoter confers high level expression in tomato and arabidopsis flowering and fruiting tissues, but AcMADS1 does not complement the tomato LeMADS-RIN (rin) mutant

    USDA-ARS's Scientific Manuscript database

    A previous EST study identified a MADS box transcription factor coding sequence, AcMADS1, that is strongly induced during non-climacteric pineapple fruit ripening. Phylogenetic analyses place the AcMADS1 protein in the same superclade as LeMADS-RIN, a master regulator of fruit ripening upstream of e...

  5. Long non-coding RNA expression profile in cervical cancer tissues

    PubMed Central

    Zhu, Hua; Chen, Xiangjian; Hu, Yan; Shi, Zhengzheng; Zhou, Qing; Zheng, Jingjie; Wang, Yifeng

    2017-01-01

    Cervical cancer (CC), one of the most common types of cancer of the female population, presents an enormous challenge in diagnosis and treatment. Long non-coding (lnc)RNAs, non-coding (nc)RNAs with length >200 nucleotides, have been identified to be associated with multiple types of cancer, including CC. This class of nc transcripts serves an important role in tumor suppression and oncogenic signaling pathways. In the present study, the microarray method was used to obtain the expression profile of lncRNAs and protein-coding mRNAs and to compare the expression of lncRNAs between CC tissues and corresponding adjacent non-cancerous tissues in order to screen potential lncRNAs for associations with CC. Overall, 3,356 lncRNAs with significantly different expression patterns in CC tissues compared with adjacent non-cancerous tissues were identified, of which 1,857 were upregulated. These differentially expressed lncRNAs were additionally classified into 5 subgroups. Reverse transcription quantitative polymerase chain reactions were performed to validate the expression pattern of 5 randomly selected lncRNAs, and 2 lncRNAs were identified to have significantly different expression in CC samples compared with adjacent non-cancerous tissues. This finding suggests that those lncRNAs with different expression may serve important roles in the development of CC, and the expression data may provide information for additional study on the involvement of lncRNAs in CC. PMID:28789353

  6. In silico screening of the chicken genome for overlaps between genomic regions: microRNA genes, coding and non-coding transcriptional units, QTL, and genetic variations.

    PubMed

    Zorc, Minja; Kunej, Tanja

    2016-05-01

    MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5%) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in the chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region. Identified miRNA-related genomic hotspots in chicken can serve researchers as a starting point for further functional studies and association studies with poultry production and health traits and the basis for systematic screening of exonic miRNAs and missense/miRNA seed polymorphisms in other genomes.

  7. Expression of hemagglutinin protein of Rinderpest virus in transgenic pigeon pea [Cajanus cajan (L.) Millsp.] plants.

    PubMed

    Satyavathi, V V; Prasad, V; Khandelwal, Abha; Shaila, M S; Sita, G Lakshmi

    2003-03-01

    Rinderpest virus is the causative agent of a devastating, often fatal disease in wild and domestic bovids that is endemic in Africa, the Middle East and South Asia. The existing live attenuated vaccine is heat-labile, and thus there is a need for the development of new strategies for vaccination. This paper reports the development of transgenic pigeon pea [Cajanus cajan (L.) Millsp.] expressing one of the protective antigens, the hemagglutinin (H) protein of Rinderpest virus. A 2-kb fragment containing the coding region of the H protein was cloned into pBI121 and mobilized into Agrobacterium tumefaciens strain EHA105. Embryonic axes and cotyledonary nodes from germinated seeds of pigeon pea were used for transformation. The presence of the transgene in transgenic plants was confirmed by Southern blots, and the specific transcription of the marker gene in the plants was demonstrated by reverse transcription-polymerase chain reaction. Integration of the H gene into the pigeon pea genome was confirmed by Southern hybridization. The expression of the H protein in the transgenic lines was confirmed by Western blot analysis using a polyclonal monospecific antibody to the H protein. The highest level of expression of the hemagglutinin protein in leaves of pigeon pea was 0.49% of the total soluble protein. The transgenic plants were fertile and the transgene was expressed in the progeny.

  8. The fission yeast CENP-B protein Abp1 prevents pervasive transcription of repetitive DNA elements.

    PubMed

    Daulny, Anne; Mejía-Ramírez, Eva; Reina, Oscar; Rosado-Lugo, Jesus; Aguilar-Arnal, Lorena; Auer, Herbert; Zaratiegui, Mikel; Azorin, Fernando

    2016-10-01

    It is well established that eukaryotic genomes are pervasively transcribed producing cryptic unstable transcripts (CUTs). However, the mechanisms regulating pervasive transcription are not well understood. Here, we report that the fission yeast CENP-B homolog Abp1 plays an important role in preventing pervasive transcription. We show that loss of abp1 results in the accumulation of CUTs, which are targeted for degradation by the exosome pathway. These CUTs originate from different types of genomic features, but the highest increase corresponds to Tf2 retrotransposons and rDNA repeats, where they map along the entire elements. In the absence of abp1, increased RNAPII-Ser5P occupancy is observed throughout the Tf2 coding region and, unexpectedly, RNAPII-Ser5P is enriched at rDNA repeats. Loss of abp1 also results in Tf2 derepression and increased nucleolus size. Altogether these results suggest that Abp1 prevents pervasive RNAPII transcription of repetitive DNA elements (i.e., Tf2 and rDNA repeats) from internal cryptic sites. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Post-transcriptional control by bacteriophage T4: mRNA decay and inhibition of translation initiation

    PubMed Central

    2010-01-01

    Over 50 years of biological research with bacteriophage T4 includes notable discoveries in post-transcriptional control, including the genetic code, mRNA, and tRNA; the very foundations of molecular biology. In this review we compile the past 10 - 15 year literature on RNA-protein interactions with T4 and some of its related phages, with particular focus on advances in mRNA decay and processing, and on translational repression. Binding of T4 proteins RegB, RegA, gp32 and gp43 to their cognate target RNAs has been characterized. For several of these, further study is needed for an atomic-level perspective, where resolved structures of RNA-protein complexes are awaiting investigation. Other features of post-transcriptional control are also summarized. These include: RNA structure at translation initiation regions that either inhibit or promote translation initiation; programmed translational bypassing, where T4 orchestrates ribosome bypass of a 50 nucleotide mRNA sequence; phage exclusion systems that involve T4-mediated activation of a latent endoribonuclease (PrrC) and cofactor-assisted activation of EF-Tu proteolysis (Gol-Lit); and potentially important findings on ADP-ribosylation (by Alt and Mod enzymes) of ribosome-associated proteins that might broadly impact protein synthesis in the infected cell. Many of these problems can continue to be addressed with T4, whereas the growing database of T4-related phage genome sequences provides new resources and potentially new phage-host systems to extend the work into a broader biological, evolutionary context. PMID:21129205

  10. Gene regulation network behind drought escape, avoidance and tolerance strategies in black poplar (Populus nigra L.).

    PubMed

    Yıldırım, Kubilay; Kaya, Zeki

    2017-06-01

    Drought is the major environmental problem limiting the productivity and survival of plant species. Here, three previously identified black poplar genotypes with contrasting responses to drought were subjected to gradual soil water depletion in a pot trial to identify their physiological, morphological and antioxidation-related adaptations. We also performed a microarray-based transcriptome analysis on the leaves of the genotypes using the Affymetrix poplar Genome Array containing 56,000 transcripts. Phenotypic analyses of each genotype confirmed their differential adaptations to drought, which could be classified as drought escape, avoidance and tolerance. Comparative transcriptomic analysis indicated highly divergent gene expression patterns among the genotypes in response to drought and post drought re-watering (PDR). We identified 10641, 3824 and 9411 transcripts exclusively regulated in drought escape, avoidance and tolerant genotypes, respectively. The key genes involved in metabolic pathways, such as carbohydrate metabolism, photosynthesis, lipid metabolism, generation of precursor metabolites/energy, protein folding, redox homeostasis, secondary metabolic process and cell wall component biogenesis, were affected by drought stresses in the leaves of these genotypes. Transcript isoforms showed increased expression specificity in the genes coding for bark storage proteins and small heat shock proteins in the drought tolerant genotype. On the other hand, the drought-avoiding genotype specifically induced the transcripts annotated to the genes functional in secondary metabolite production that were linked to enhanced leaf water content and growth performance under drought stress. Transcriptome profiling of the drought escape genotype indicated specific regulation of the genes functional in programmed cell death and leaf senescence. Specific upregulation of GTP cyclohydrolase II and transcription factors (WRKY and ERFs) in only this genotype was associated with ROS-dependent signalling pathways and the gene regulation network responsible for induction of many degrading enzymes acting on cell wall carbohydrates, fatty acids and proteins under drought stress. Our findings provide new insights into the transcriptome dynamics and components of the regulatory network associated with drought adaptation strategies. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  11. Analysis of cellulose synthase genes from domesticated apple identifies collinear genes WDR53 and CesA8A: partial co-expression, bicistronic mRNA, and alternative splicing of CESA8A

    PubMed Central

    Guerriero, Gea; Spadiut, Oliver; Kerschbamer, Christine; Giorno, Filomena; Baric, Sanja; Ezcurra, Inés

    2016-01-01

    Cellulose synthase (CesA) genes constitute a complex multigene family with six major phylogenetic clades in angiosperms. The recently sequenced genome of domestic apple, Malus×domestica, was mined for CesA genes, by blasting full-length cellulose synthase protein (CESA) sequences annotated in the apple genome against protein databases from the plant models Arabidopsis thaliana and Populus trichocarpa. Thirteen genes belonging to the six angiosperm CesA clades and coding for proteins with conserved residues typical of processive glycosyltransferases from family 2 were detected. Based on their phylogenetic relationship to Arabidopsis CESAs, as well as expression patterns, a nomenclature is proposed to facilitate further studies. Examination of their genomic organization revealed that MdCesA8-A is closely linked and co-oriented with WDR53, a gene coding for a WD40 repeat protein. The WDR53 and CesA8 genes display conserved collinearity in dicots and are partially co-expressed in the apple xylem. Interestingly, the presence of a bicistronic WDR53–CesA8A transcript was detected in phytoplasma-infected phloem tissues of apple. The bicistronic transcript contains a spliced intergenic sequence that is predicted to fold into hairpin structures typical of internal ribosome entry sites, suggesting its potential cap-independent translation. Surprisingly, the CesA8A cistron is alternatively spliced and lacks the zinc-binding domain. The possible roles of WDR53 and the alternatively spliced CESA8 variant during cellulose biosynthesis in M.×domestica are discussed. PMID:23048131

  12. Expanded subgenomic mRNA transcriptome and coding capacity of a nidovirus

    PubMed Central

    Di, Han; Madden, Joseph C.; Morantz, Esther K.; Tang, Hsin-Yao; Graham, Rachel L.; Baric, Ralph S.

    2017-01-01

    Members of the order Nidovirales express their structural protein ORFs from a nested set of 3′ subgenomic mRNAs (sg mRNAs), and for most of these ORFs, a single genomic transcription regulatory sequence (TRS) was identified. Nine TRSs were previously reported for the arterivirus Simian hemorrhagic fever virus (SHFV). In the present study, which was facilitated by next-generation sequencing, 96 SHFV body TRSs were identified that were functional in both infected MA104 cells and macaque macrophages. The abundance of sg mRNAs produced from individual TRSs was consistent over time in the two different cell types. Most of the TRSs are located in the genomic 3′ region, but some are in the 5′ ORF1a/1b region and provide alternative sources of nonstructural proteins. Multiple functional TRSs were identified for the majority of the SHFV 3′ ORFs, and four previously identified TRSs were found not to be the predominant ones used. A third of the TRSs generated sg mRNAs with variant leader–body junction sequences. Sg mRNAs encoding E′, GP2, or ORF5a as their 5′ ORF as well as sg mRNAs encoding six previously unreported alternative frame ORFs or 14 previously unreported C-terminal ORFs of known proteins were also identified. Mutation of the start codon of two C-terminal ORFs in an infectious clone reduced virus yield. Mass spectrometry detected one previously unreported protein and suggested translation of some of the C-terminal ORFs. The results reveal the complexity of the transcriptional regulatory mechanism and expanded coding capacity for SHFV, which may also be characteristic of other nidoviruses. PMID:29073030

  13. Can the 'neuron theory' be complemented by a universal mechanism for generic neuronal differentiation.

    PubMed

    Ernsberger, Uwe

    2015-01-01

    With the establishment of the 'neuron theory' at the turn of the twentieth century, this remarkably powerful term was introduced to name a breathtaking diversity of cells unified by a characteristic structural compartmentalization and unique information processing and propagating features. At the beginning of the twenty-first century, developmental, stem cell and reprogramming studies converged to suggest a common mechanism involved in the generation of possibly all vertebrate, and at least a significant number of invertebrate, neurons. Sox and, in particular, SoxB and SoxC proteins as well as basic helix-loop-helix proteins play major roles, even though their precise contributions to progenitor programming, proliferation and differentiation are not fully resolved. In addition to neuronal development, these transcription factors also regulate sensory receptor and endocrine cell development, thus specifying a range of cells with regulatory and communicative functions. To what extent microRNAs contribute to the diversification of these cell types is an upcoming question. Understanding the transcriptional and post-transcriptional regulation of genes coding for cell type-specific cytoskeletal and motor proteins as well as synaptic and ion channel proteins, which mark differences but also similarities between the three communicator cell types, will provide a key to the comprehension of their diversification and the signature of 'generic neuronal' differentiation. Apart from the general scientific significance of a putative universal core instruction for neuronal development, the impact of this line of research for cell replacement therapy and brain tumor treatment will be of considerable interest.

  14. Evaluation of 10 genes encoding cardiac proteins in Doberman Pinschers with dilated cardiomyopathy.

    PubMed

    O'Sullivan, M Lynne; O'Grady, Michael R; Pyle, W Glen; Dawson, John F

    2011-07-01

    To identify a causative mutation for dilated cardiomyopathy (DCM) in Doberman Pinschers by sequencing the coding regions of 10 cardiac genes known to be associated with familial DCM in humans. 5 Doberman Pinschers with DCM and congestive heart failure and 5 control mixed-breed dogs that were euthanized or died. RNA was extracted from frozen ventricular myocardial samples from each dog, and first-strand cDNA was synthesized via reverse transcription, followed by PCR amplification with gene-specific primers. Ten cardiac genes were analyzed: cardiac actin, α-actinin, α-tropomyosin, β-myosin heavy chain, metavinculin, muscle LIM protein, myosin-binding protein C, tafazzin, titin-cap (telethonin), and troponin T. Sequences for DCM-affected and control dogs and the published canine genome were compared. None of the coding sequences yielded a common causative mutation among all Doberman Pinscher samples. However, 3 variants were identified in the α-actinin gene in the DCM-affected Doberman Pinschers. One of these variants, identified in 2 of the 5 Doberman Pinschers, resulted in an amino acid change in the rod-forming triple coiled-coil domain. Mutations in the coding regions of several genes associated with DCM in humans did not appear to consistently account for DCM in Doberman Pinschers. However, an α-actinin variant was detected in some Doberman Pinschers that may contribute to the development of DCM given its potential effect on the structure of this protein. Investigation of additional candidate gene coding and noncoding regions and further evaluation of the role of α-actinin in development of DCM in Doberman Pinschers are warranted.

  15. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

    PubMed

    Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

    2015-04-23

    With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.

  16. Transcription profiling suggests that mitochondrial topoisomerase IB acts as a topological barrier and regulator of mitochondrial DNA transcription.

    PubMed

    Dalla Rosa, Ilaria; Zhang, Hongliang; Khiati, Salim; Wu, Xiaolin; Pommier, Yves

    2017-12-08

    Mitochondrial DNA (mtDNA) is essential for cell viability because it encodes subunits of the respiratory chain complexes. Mitochondrial topoisomerase IB (TOP1MT) facilitates mtDNA replication by removing DNA topological tensions produced during mtDNA transcription, but it appears to be dispensable. To test whether cells lacking TOP1MT have aberrant mtDNA transcription, we performed mitochondrial transcriptome profiling. To that end, we designed and implemented a customized tiling array, which enabled genome-wide, strand-specific, and simultaneous detection of all mitochondrial transcripts. Our technique revealed that Top1mt KO mouse cells process the mitochondrial transcripts normally but that protein-coding mitochondrial transcripts are elevated. Moreover, we found discrete long noncoding RNAs produced by H-strand transcription and encompassing the noncoding regulatory region of mtDNA in human and murine cells and tissues. Of note, these noncoding RNAs were strongly up-regulated in the absence of TOP1MT. In contrast, 7S DNA, produced by mtDNA replication, was reduced in the Top1mt KO cells. We propose that the long noncoding RNA species in the D-loop region are generated by the extension of H-strand transcripts beyond their canonical stop site and that TOP1MT acts as a topological barrier and regulator for mtDNA transcription and D-loop formation.

  17. Large-scale identification and characterization of alternative splicing variants of human gene transcripts using 56 419 completely sequenced and manually annotated full-length cDNAs

    PubMed Central

    Takeda, Jun-ichi; Suzuki, Yutaka; Nakao, Mitsuteru; Barrero, Roberto A.; Koyanagi, Kanako O.; Jin, Lihua; Motono, Chie; Hata, Hiroko; Isogai, Takao; Nagai, Keiichi; Otsuki, Tetsuji; Kuryshev, Vladimir; Shionyu, Masafumi; Yura, Kei; Go, Mitiko; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Wiemann, Stefan; Nomura, Nobuo; Sugano, Sumio; Gojobori, Takashi; Imanishi, Tadashi

    2006-01-01

    We report the first genome-wide identification and characterization of alternative splicing in human gene transcripts based on analysis of the full-length cDNAs. Applying both manual and computational analyses for 56 419 completely sequenced and precisely annotated full-length cDNAs selected for the H-Invitational human transcriptome annotation meetings, we identified 6877 alternative splicing genes with 18 297 different alternative splicing variants. A total of 37 670 exons were involved in these alternative splicing events. The encoded protein sequences were affected in 6005 of the 6877 genes. Notably, alternative splicing affected protein motifs in 3015 genes, subcellular localizations in 2982 genes and transmembrane domains in 1348 genes. We also identified interesting patterns of alternative splicing, in which two distinct genes seemed to be bridged, nested or having overlapping protein coding sequences (CDSs) of different reading frames (multiple CDS). In these cases, completely unrelated proteins are encoded by a single locus. Genome-wide annotations of alternative splicing, relying on full-length cDNAs, should lay firm groundwork for exploring in detail the diversification of protein function, which is mediated by the fast expanding universe of alternative splicing variants. PMID:16914452

  18. Gene expression profile of the plant pathogen Fusarium graminearum under the antagonistic effect of Pantoea agglomerans.

    PubMed

    Pandolfi, V; Jorge, E C; Melo, C M R; Albuquerque, A C S; Carrer, H

    2010-07-06

    The pathogenic fungus Fusarium graminearum is an ongoing threat to agriculture, causing losses in grain yield and quality in diverse crops. Substantial progress has been made in the identification of genes involved in the suppression of phytopathogens by antagonistic microorganisms; however, limited information regarding responses of plant pathogens to these biocontrol agents is available. Gene expression analysis was used to identify differentially expressed transcripts of the fungal plant pathogen F. graminearum under antagonistic effect of the bacterium Pantoea agglomerans. A macroarray was constructed, using 1014 transcripts from an F. graminearum cDNA library. Probes consisted of the cDNA of F. graminearum grown in the presence and in the absence of P. agglomerans. Twenty-nine genes were either up (19) or down (10) regulated during interaction with the antagonist bacterium. Genes encoding proteins associated with fungal defense and/or virulence or with nutritional and oxidative stress responses were induced. The repressed genes coded for a zinc finger protein associated with cell division, proteins containing cellular signaling domains, respiratory chain proteins, and chaperone-type proteins. These data give molecular and biochemical evidence of response of F. graminearum to an antagonist and could help develop effective biocontrol procedures for pathogenic plant fungi.

  19. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    USGS Publications Warehouse

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  20. Infection of capilloviruses requires subgenomic RNAs whose transcription is controlled by promoter-like sequences conserved among flexiviruses.

    PubMed

    Komatsu, Ken; Hirata, Hisae; Fukagawa, Takako; Yamaji, Yasuyuki; Okano, Yukari; Ishikawa, Kazuya; Adachi, Tatsushi; Maejima, Kensaku; Hashimoto, Masayoshi; Namba, Shigetou

    2012-07-01

    The first open-reading frame (ORF) of apple stem grooving virus (ASGV), of the genus Capillovirus, encodes an apparently chimeric polyprotein containing conserved regions for replicase (Rep) and coat protein (CP). However, our previous study revealed that ASGV mutants with distinct and discontinuous Rep- and CP-coding regions successfully infect plants, indicating that CP expressed via a subgenomic RNA (sgRNA) is sufficient for viability of the virus. Here we identified a transcription start site of the CP sgRNA and revealed that CP translated from the sgRNA is essential for ASGV infection. We mapped the transcription start sites of both the CP and the movement protein (MP) sgRNAs of ASGV and found a hexanucleotide motif, UUAGGU, conserved upstream from both sgRNA transcription start sites. Mutational analysis of the putative CP initiation codon and of the UUAGGU sequence upstream from the transcription start site of CP sgRNA demonstrated their importance for ASGV accumulation. Our results also demonstrated that potato virus T (PVT), an unassigned species closely related to ASGV, produces two sgRNAs putatively deployed for the CP and MP expression and that the same hexanucleotide motif as found in ASGV is located upstream from the transcription start sites of both sgRNAs. This motif, which constituted putative core elements of the sgRNA promoter, is broadly conserved among viruses in the families Alphaflexiviridae and Betaflexiviridae, suggesting that the gene expression strategy of the viruses in both families has been conserved throughout evolution. Copyright © 2012 Elsevier B.V. All rights reserved.
