human intronic noncoding: Topics by Science.gov

Sample records for human intronic noncoding

Intriguing Balancing Selection on the Intron 5 Region of LMBR1 in Human Population

PubMed Central

He, Fang; Wu, Dong-Dong; Kong, Qing-Peng; Zhang, Ya-Ping

2008-01-01

Background: Intron 5 of the LMBR1 gene is the cis-acting regulatory module for the sonic hedgehog (SHH) gene. Mutation in this non-coding region is associated with preaxial polydactyly and may play crucial roles in the evolution of the limb and skeletal system. Methodology/Principal Findings: We sequenced a region of LMBR1 intron 5 in an East Asian human population and found a significant deviation of Tajima's D statistic from neutrality, taking human population growth into account. Data from HapMap also demonstrated extended linkage disequilibrium in the region in East Asian and European populations, and a significantly low degree of genetic differentiation among human populations. Conclusion/Significance: We propose that intron 5 of LMBR1 was presumably subject to balancing selection during the evolution of modern humans. PMID:18698406
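The Tajima's D statistic named in the abstract above compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a harmonic number. The following is a minimal Python sketch of the standard formula, assuming gap-free aligned sequences. The function name `tajimas_d` and the toy sequences are illustrative and do not come from the study, and the sketch does not reproduce the correction for population growth that the authors describe.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences (hypothetical helper)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns carrying more than one allele.
    seg_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data: two haplotypes at equal frequency (many intermediate-frequency sites).
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
# Toy data: the same sites, but the variant haplotype is seen only once.
singleton = ["AAAA", "AAAA", "AAAA", "TTTT"]
print(tajimas_d(balanced), tajimas_d(singleton))
```

A positive value reflects an excess of intermediate-frequency variants, the direction usually read as compatible with balancing selection; judging significance requires a null model, which this sketch omits.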
Two distinct promoters drive transcription of the human D1A dopamine receptor gene.

PubMed

Lee, S H; Minowa, M T; Mouradian, M M

1996-10-11

The human D1A dopamine receptor gene has a GC-rich, TATA-less promoter located upstream of a small, noncoding exon 1, which is separated from the coding exon 2 by a 116-base pair (bp)-long intron. Serial 3'-deletions of the 5'-noncoding region of this gene, including the intron and 5'-end of exon 2, resulted in 80 and 40% decrease in transcriptional activity of the upstream promoter in two D1A-expressing neuroblastoma cell lines, SK-N-MC and NS20Y, respectively. To investigate the function of this region, the intron and 245 bp at the 5'-end of exon 2 were investigated. Transient expression analyses using various chloramphenicol acetyltransferase constructs showed that the transcriptional activity of the intron is higher than that of the upstream promoter by 12-fold in SK-N-MC cells and by 5.5-fold in NS20Y cells in an orientation-dependent manner, indicating that the D1A intron is a strong promoter. Primer extension and ribonuclease protection assays revealed that transcription driven by the intron promoter is initiated at the junction of intron and exon 2 and at a cluster of nucleotides located 50 bp downstream from this junction. The same transcription start sites are utilized by the chloramphenicol acetyltransferase constructs employed in transfections as well as by the D1A gene expressed within the human caudate. The relative abundance of D1A transcripts originating from the upstream promoter compared with those transcribed from the intron promoter is 1.5-2.9 times in SK-N-MC cells and 2 times in the human caudate. Transcript stability studies in SK-N-MC cells revealed that longer D1A mRNA molecules containing exon 1 are degraded 1.8 times faster than shorter transcripts lacking exon 1. Although gel mobility shift assay could not detect DNA-protein interaction at the D1A intron, competitive co-transfection using the intron as competitor confirmed the presence of trans-acting factors at the intron. 
These data taken together indicate that the human D1A gene has two functional TATA-less promoters, both in D1A expressing cultured neuroblastoma cells and in the human striatum.
A 5′ Noncoding Exon Containing Engineered Intron Enhances Transgene Expression from Recombinant AAV Vectors in vivo

PubMed Central

Lu, Jiamiao; Williams, James A.; Luke, Jeremy; Zhang, Feijie; Chu, Kirk; Kay, Mark A.

2017-01-01

We previously developed a mini-intronic plasmid (MIP) expression system in which the essential bacterial elements for plasmid replication and selection are placed within an engineered intron contained within a universal 5′ UTR noncoding exon. Like minicircle DNA plasmids (devoid of bacterial backbone sequences), MIP plasmids overcome transcriptional silencing of the transgene. In addition, MIP plasmids increase transgene expression 2-fold, and often more than 10-fold, over minicircle vectors in vivo and in vitro. Based on these findings, we examined the effects of the MIP intronic sequences in a recombinant adeno-associated virus (AAV) vector system. Recombinant AAV vectors containing an intron with a bacterial replication origin and bacterial selectable marker increased transgene expression 40- to 100-fold in vivo when compared with conventional AAV vectors. Therefore, inclusion of this noncoding exon/intron sequence upstream of the coding region can substantially enhance AAV-mediated gene expression in vivo. PMID:27903072
Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences.

PubMed

Bergman, C M; Kreitman, M

2001-08-01

Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
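The transition bias mentioned in the abstract above refers to an excess of purine-to-purine (A/G) and pyrimidine-to-pyrimidine (C/T) changes over transversions. A minimal Python sketch of how such counts can be taken from a pairwise alignment is given here; the function name `count_substitutions` and the example sequences are hypothetical, and the study's own alignment and block-definition procedure is not reproduced.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences.

    Columns containing a gap or a non-ACGT symbol are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical column, gap, or ambiguous base
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


# Illustrative alignment: one A/G transition, one G/T transversion, one gap column.
print(count_substitutions("ACGT-A", "GCTTAA"))
```

Under equal rates for every base change there are twice as many possible transversions as transitions, so a transition-to-transversion ratio above 0.5 already points toward transition bias.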
Analysis and recognition of 5′ UTR intron splice sites in human pre-mRNA

PubMed Central

Eden, E.; Brunak, S.

2004-01-01

Prediction of splice sites in non-coding regions of genes is one of the most challenging aspects of gene structure recognition. We perform a rigorous analysis of such splice sites embedded in human 5′ untranslated regions (UTRs), and investigate correlations between this class of splice sites and other features found in the adjacent exons and introns. By restricting the training of neural network algorithms to ‘pure’ UTRs (not extending partially into protein coding regions), we for the first time investigate the predictive power of the splicing signal proper, in contrast to conventional splice site prediction, which typically relies on the change in sequence at the transition from protein coding to non-coding. By doing so, the algorithms were able to pick up subtler splicing signals that were otherwise masked by ‘coding’ noise, thus enhancing significantly the prediction of 5′ UTR splice sites. For example, the non-coding splice site predicting networks pick up compositional and positional bias in the 3′ ends of non-coding exons and 5′ non-coding intron ends, where cytosine and guanine are over-represented. This compositional bias at the true UTR donor sites is also visible in the synaptic weights of the neural networks trained to identify UTR donor sites. Conventional splice site prediction methods perform poorly in UTRs because the reading frame pattern is absent. The NetUTR method presented here performs 2–3-fold better compared with NetGene2 and GenScan in 5′ UTRs. We also tested the 5′ UTR trained method on protein coding regions, and discovered, surprisingly, that it works quite well (although it cannot compete with NetGene2). This indicates that the local splicing pattern in UTRs and coding regions is largely the same. The NetUTR method is made publicly available at www.cbs.dtu.dk/services/NetUTR. PMID:14960723
Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

PubMed

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

2017-03-27

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin

ERIC Educational Resources Information Center

Offner, Susan

2010-01-01

The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.
Evaluation of non-coding variation in GLUT1 deficiency.

PubMed

Liu, Yu-Chi; Lee, Jia Wei Audrey; Bellows, Susannah T; Damiano, John A; Mullen, Saul A; Berkovic, Samuel F; Bahlo, Melanie; Scheffer, Ingrid E; Hildebrand, Michael S

2016-12-01

Loss-of-function mutations in SLC2A1, encoding glucose transporter-1 (GLUT-1), lead to dysfunction of glucose transport across the blood-brain barrier. Ten percent of cases with hypoglycorrhachia (fasting cerebrospinal fluid [CSF] glucose <2.2mmol/L) do not have mutations. We hypothesized that GLUT1 deficiency could be due to non-coding SLC2A1 variants. We performed whole exome sequencing of one proband with a GLUT1 phenotype and hypoglycorrhachia negative for SLC2A1 sequencing and copy number variants. We studied a further 55 patients with different epilepsies and low CSF glucose who did not have exonic mutations or copy number variants. We sequenced non-coding promoter and intronic regions. We performed mRNA studies for the recurrent intronic variant. The proband had a de novo splice site mutation five base pairs from the intron-exon boundary. Three of 55 patients had deep intronic SLC2A1 variants, including a recurrent variant in two. The recurrent variant produced less SLC2A1 mRNA transcript. Fasting CSF glucose levels show an age-dependent correlation, which makes the definition of hypoglycorrhachia challenging. Low CSF glucose levels may be associated with pathogenic SLC2A1 mutations including deep intronic SLC2A1 variants. Extending genetic screening to non-coding regions will enable diagnosis of more patients with GLUT1 deficiency, allowing implementation of the ketogenic diet to improve outcomes. © 2016 Mac Keith Press.
Identification of the human homolog of the imprinted mouse Air non-coding RNA

PubMed Central

Yotova, Iveta Y.; Vlatkovic, Irena M.; Pauler, Florian M.; Warczok, Katarzyna E.; Ambros, Peter F.; Oshimura, Mitsuo; Theussl, Hans-Christian; Gessler, Manfred; Wagner, Erwin F.; Barlow, Denise P.

2010-01-01

Genomic imprinting is widely conserved amongst placental mammals. Imprinted expression of IGF2R, however, differs between mice and humans. In mice, Igf2r imprinted expression is seen in all fetal and adult tissues. In humans, adult tissues lack IGF2R imprinted expression, but it is found in fetal tissues and Wilms' tumors where it is polymorphic and only seen in a small proportion of tested samples. Mouse Igf2r imprinted expression is controlled by the Air (Airn) ncRNA whose promoter lies in an intronic maternally-methylated CpG island. The human IGF2R gene carries a homologous intronic maternally-methylated CpG island of unknown function. Here, we use transfection and transgenic studies to show that the human IGF2R intronic CpG island is a ncRNA promoter. We also identify the same ncRNA at the endogenous human locus in 16–40% of Wilms' tumors. Thus, the human IGF2R gene shows evolutionary conservation of key features that control imprinted expression in the mouse. PMID:18789384
Expression analysis and in silico characterization of intronic long noncoding RNAs in renal cell carcinoma: emerging functional associations

PubMed Central

2013-01-01

Background Intronic and intergenic long noncoding RNAs (lncRNAs) are emerging gene expression regulators. The molecular pathogenesis of renal cell carcinoma (RCC) is still poorly understood, and in particular, limited studies are available for intronic lncRNAs expressed in RCC. Methods Microarray experiments were performed with custom-designed arrays enriched with probes for lncRNAs mapping to intronic genomic regions. Samples from 18 primary RCC tumors and 11 nontumor adjacent matched tissues were analyzed. Meta-analyses were performed with microarray expression data from three additional human tissues (normal liver, prostate tumor and kidney nontumor samples), and with large-scale public data for epigenetic regulatory marks and for evolutionarily conserved sequences. Results A signature of 29 intronic lncRNAs differentially expressed between RCC and nontumor samples was obtained (false discovery rate (FDR) <5%). A signature of 26 intronic lncRNAs significantly correlated with the RCC five-year patient survival outcome was identified (FDR <5%, p-value ≤0.01). We identified 4303 intronic antisense lncRNAs expressed in RCC, of which 22% were significantly (p <0.05) cis correlated with the expression of the mRNA in the same locus across RCC and three other human tissues. Gene Ontology (GO) analysis of those loci pointed to 'regulation of biological processes’ as the main enriched category. A module map analysis of the protein-coding genes significantly (p <0.05) trans correlated with the 20% most abundant lncRNAs, identified 51 enriched GO terms (p <0.05). We determined that 60% of the expressed lncRNAs are evolutionarily conserved. At the genomic loci containing the intronic RCC-expressed lncRNAs, a strong association (p <0.001) was found between their transcription start sites and genomic marks such as CpG islands, RNA Pol II binding and histones methylation and acetylation. Conclusion Intronic antisense lncRNAs are widely expressed in RCC tumors. 
Some of them are significantly altered in RCC in comparison with nontumor samples. The majority of these lncRNAs is evolutionarily conserved and possibly modulated by epigenetic modifications. Our data suggest that these RCC lncRNAs may contribute to the complex network of regulatory RNAs playing a role in renal cell malignant transformation. PMID:24238219
Pre-mRNA Introns as a Model for Cryptographic Algorithm: Theory and Experiments

NASA Astrophysics Data System (ADS)

Regoli, Massimo

2010-01-01

The RNA-Crypto System (shortly RCS) is a symmetric key algorithm to cipher data. The idea for this new algorithm starts from the observation of nature. In particular from the observation of RNA behavior and some of its properties. In particular the RNA sequences have some sections called Introns. Introns, derived from the term "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs, that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and the role of Introns in the pre-mRNA is not clear and it is under ponderous researches by Biologists but, in our case, we will use the presence of Introns in the RNA-Crypto System output as a strong method to add chaotic non coding information and an unnecessary behaviour in the access to the secret key to code the messages. In the RNA-Crypto System algorithm the introns are sections of the ciphered message with non-coding information as well as in the precursor mRNA.
A Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences

NASA Astrophysics Data System (ADS)

Regoli, Massimo

2009-02-01

The RNA-Crypto System (shortly RCS) is a symmetric key algorithm to cipher data. The idea for this new algorithm starts from the observation of nature. In particular from the observation of RNA behavior and some of its properties. The RNA sequences have some sections called Introns. Introns, derived from the term "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs, that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and the role of Introns in the pre-mRNA is not clear and it is under ponderous researches by Biologists but, in our case, we will use the presence of Introns in the RNA-Crypto System output as a strong method to add chaotic non coding information and an unnecessary behaviour in the access to the secret key to code the messages. In the RNA-Crypto System algorithm the introns are sections of the ciphered message with non-coding information as well as in the precursor mRNA.
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Bio—Cryptography: A Possible Coding Role for RNA Redundancy

NASA Astrophysics Data System (ADS)

Regoli, M.

2009-03-01

The RNA-Crypto System (shortly RCS) is a symmetric key algorithm to cipher data. The idea for this new algorithm starts from the observation of nature. In particular from the observation of RNA behavior and some of its properties. The RNA sequences have some sections called Introns. Introns, derived from the term "intragenic regions," are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs, that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and the role of Introns in the pre-mRNA is not clear and it is under ponderous researches by biologists but, in our case, we will use the presence of Introns in the RNA-Crypto System output as a strong method to add chaotic non coding information and an unnecessary behavior in the access to the secret key to code the messages. In the RNA-Crypto System algorithm the introns are sections of the ciphered message with non-coding information as well as in the precursor mRNA.
A cautionary tale: the non-causal association between type 2 diabetes risk SNP, rs7756992, and levels of non-coding RNA, CDKAL1-v1.

PubMed

Locke, Jonathan M; Wei, Fan-Yan; Tomizawa, Kazuhito; Weedon, Michael N; Harries, Lorna W

2015-04-01

Intronic single nucleotide polymorphisms (SNPs) in the CDKAL1 gene are associated with risk of developing type 2 diabetes. A strong correlation between risk alleles and lower levels of the non-coding RNA, CDKAL1-v1, has recently been reported in whole blood extracted from Japanese individuals. We sought to replicate this association in two independent cohorts: one using whole blood from white UK-resident individuals, and one using a collection of human pancreatic islets, a more relevant tissue type to study with respect to the aetiology of diabetes. Levels of CDKAL1-v1 were measured by real-time PCR using RNA extracted from human whole blood (n = 70) and human pancreatic islets (n = 48). Expression with respect to genotype was then determined. In a simple linear regression model, expression of CDKAL1-v1 was associated with the lead type 2 diabetes-associated SNP, rs7756992, in whole blood and islets. However, these associations were abolished or substantially reduced in multiple regression models taking into account rs9366357 genotype: a moderately linked SNP explaining a much larger amount of the variation in CDKAL1-v1 levels, but not strongly associated with risk of type 2 diabetes. Contrary to previous findings, we provide evidence against a role for dysregulated expression of CDKAL1-v1 in mediating the association between intronic SNPs in CDKAL1 and susceptibility to type 2 diabetes. The results of this study illustrate how caution should be exercised when inferring causality from an association between disease-risk genotype and non-coding RNA expression.
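The pattern described in the abstract above, where a simple regression shows an association that vanishes once a linked SNP is included, can be illustrated with a small worked example. The Python sketch below uses invented genotype vectors and a noise-free expression value that depends only on the second SNP; the names `simple_slope` and `partial_slopes` are hypothetical, and none of the numbers come from the study.

```python
def _centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simple_slope(x, y):
    """Slope of y on x from a one-predictor least-squares fit."""
    xc, yc = _centered(x), _centered(y)
    return _dot(xc, yc) / _dot(xc, xc)


def partial_slopes(x1, x2, y):
    """Slopes of y on x1 and x2 fitted jointly (two-predictor normal equations)."""
    c1, c2, cy = _centered(x1), _centered(x2), _centered(y)
    s11, s22, s12 = _dot(c1, c1), _dot(c2, c2), _dot(c1, c2)
    s1y, s2y = _dot(c1, cy), _dot(c2, cy)
    det = s11 * s22 - s12**2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Invented genotypes (0/1/2 allele counts) for two moderately linked SNPs.
risk_snp = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
linked_snp = [0, 0, 1, 1, 2, 2, 1, 1, 1, 2]
# Assumed toy model: expression depends on the linked SNP only.
expression = [1 + 2 * g for g in linked_snp]

print(simple_slope(risk_snp, expression))                 # non-zero marginal slope
print(partial_slopes(risk_snp, linked_snp, expression))   # risk-SNP slope near zero
```

Because the two genotype vectors are correlated, the risk SNP picks up a marginal slope even though it has no effect in the toy model; fitting both SNPs together assigns the effect to the linked SNP.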
Intergenic disease-associated regions are abundant in novel transcripts.

PubMed

Bartonicek, N; Clark, M B; Quek, X C; Torpy, J R; Pritchard, A L; Maag, J L V; Gloss, B S; Crawford, J; Taft, R J; Hayward, N K; Montgomery, G W; Mattick, J S; Mercer, T R; Dinger, M E

2017-12-28

Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics.

PubMed

Edwards, Scott V; Cloutier, Alison; Baker, Allan J

2017-11-01

Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600-∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biologists.
Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics

PubMed Central

Edwards, Scott V.; Cloutier, Alison; Baker, Allan J.

2017-01-01

Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600–∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. PMID:28637293
Phylogenomic Resolution of the Phylogeny of Laurasiatherian Mammals: Exploring Phylogenetic Signals within Coding and Noncoding Sequences.

PubMed

Chen, Meng-Yun; Liang, Dan; Zhang, Peng

2017-08-01

The interordinal relationships of Laurasiatherian mammals are currently one of the most controversial questions in mammalian phylogenetics. Previous studies mainly relied on coding sequences (CDS) and seldom used noncoding sequences. Here, by data mining public genome data, we compiled an intron data set of 3,638 genes (all introns from a protein-coding gene are considered as a gene) (19,055,073 bp) and a CDS data set of 10,259 genes (20,994,285 bp), covering all major lineages of Laurasiatheria (except Pholidota). We found that the intron data contained stronger and more congruent phylogenetic signals than the CDS data. In agreement with this observation, concatenation and species-tree analyses of the intron data set yielded well-resolved and identical phylogenies, whereas the CDS data set produced weakly supported and incongruent results. Further analyses showed that the phylogeny inferred from the intron data is highly robust to data subsampling and change in outgroup, but the CDS data produced unstable results under the same conditions. Interestingly, gene tree statistical results showed that the most frequently observed gene tree topologies for the CDS and intron data are identical, suggesting that the major phylogenetic signal within the CDS data is actually congruent with that within the intron data. Our final result of Laurasiatheria phylogeny is (Eulipotyphla,((Chiroptera, Perissodactyla),(Carnivora, Cetartiodactyla))), favoring a close relationship between Chiroptera and Perissodactyla. Our study 1) provides a well-supported phylogenetic framework for Laurasiatheria, representing a step towards ending the long-standing "hard" polytomy and 2) argues that intron within genome data is a promising data resource for resolving rapid radiation events across the tree of life. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
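The bracketed topology reported in this abstract is a Newick-style string, so sister-group relationships can be read off it mechanically. The sketch below is only an illustration: the helper names `parse_newick`, `leaves` and `sister_of` are hypothetical, and the parser assumes a label-only tree without branch lengths or support values.

```python
def parse_newick(text):
    """Parse a label-only Newick string into nested tuples of taxon names."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]

    return node()


def leaves(tree):
    """Return the set of taxon names below a node."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def sister_of(tree, taxon):
    """Return the taxa in the sister group of `taxon`, or None if absent."""
    if isinstance(tree, str):
        return None
    for i, child in enumerate(tree):
        if child == taxon:
            others = [c for j, c in enumerate(tree) if j != i]
            return set().union(*(leaves(c) for c in others))
        found = sister_of(child, taxon)
        if found is not None:
            return found
    return None


# Topology as quoted in the abstract, with spaces removed before parsing.
TOPOLOGY = "(Eulipotyphla,((Chiroptera, Perissodactyla),(Carnivora, Cetartiodactyla)))"
tree = parse_newick(TOPOLOGY.replace(" ", ""))
print(sister_of(tree, "Chiroptera"))
```

Under this topology the sister of Chiroptera is Perissodactyla, which is the close relationship the abstract favors.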

The paradox of MHC-DRB exon/intron evolution: alpha-helix and beta-sheet encoding regions diverge while hypervariable intronic simple repeats coevolve with beta-sheet codons.

PubMed

Schwaiger, F W; Weyers, E; Epplen, C; Brün, J; Ruff, G; Crawford, A; Epplen, J T

1993-09-01

Twenty-one different caprine and 13 ovine MHC-DRB exon 2 sequences were determined including part of the adjacent introns containing simple repetitive (gt)n(ga)m elements. The positions for highly polymorphic DRB amino acids vary slightly among ungulates and other mammals. From man and mouse to ungulates the basic (gt)n(ga)m structure is fixed in evolution for 7 × 10^7 years whereas ample variations exist in the tandem (gt)n and (ga)m dinucleotides and especially their "degenerated" derivatives. Phylogenetic trees for the alpha-helices and beta-pleated sheets of the ungulate DRB sequences suggest different evolutionary histories. In hoofed animals as well as in humans DRB beta-sheet encoding sequences and adjacent intronic repeats can be assembled into virtually identical groups suggesting coevolution of noncoding as well as coding DNA. In contrast alpha-helices and C-terminal parts of the first DRB domain evolve distinctly. In the absence of a defined mechanism causing specific, site-directed mutations, double-recombination or gene-conversion-like events would readily explain this fact. The role of the intronic simple (gt)n(ga)m repeat is discussed with respect to these genetic exchange mechanisms during evolution.
Short intronic repeat sequences facilitate circular RNA production

PubMed Central

Liang, Dongming

2014-01-01

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery “backsplices” and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3′ end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. PMID:25281217
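The requirement that the flanking intronic repeats base-pair with one another can be illustrated with a small complementarity check. This is a rough sketch, not the authors' method: the function names are hypothetical, only Watson-Crick pairs are considered (no G-U wobble), and, as the abstract stresses, complementarity alone does not predict whether an exon circularizes.

```python
# Watson-Crick complement table for RNA (G-U wobble pairs are ignored here).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_inverted_repeat(upstream, downstream):
    """Length of the longest stretch of `upstream` that can pair perfectly
    with a stretch of `downstream`.

    A stretch pairs perfectly when it equals the reverse complement of the
    partner stretch, so this reduces to the longest common substring of
    `upstream` and reverse_complement(downstream).
    """
    target = reverse_complement(downstream)
    best = 0
    prev = [0] * (len(target) + 1)
    for a in upstream:
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Toy flanking introns sharing a 10-nt inverted repeat (made-up sequences).
repeat = "GAUCCGUAAC"
upstream_intron = "AAAA" + repeat + "AAAA"
downstream_intron = "CCCC" + reverse_complement(repeat) + "CCCC"
print(longest_inverted_repeat(upstream_intron, downstream_intron))
```

With real flanking introns, the returned length could be compared against the roughly 30- to 40-nucleotide repeats that the abstract reports as sufficient in the tested constructs.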
RRE: a tool for the extraction of non-coding regions surrounding annotated genes from genomic datasets.

PubMed

Lazzarato, F; Franceschinis, G; Botta, M; Cordero, F; Calogero, R A

2004-11-01

RRE allows the extraction of non-coding regions surrounding a coding sequence [i.e. gene upstream region, 5'-untranslated region (5'-UTR), introns, 3'-UTR, downstream region] from annotated genomic datasets available at NCBI. RRE parser and web-based interface are accessible at http://www.bioinformatica.unito.it/bioinformatics/rre/rre.html
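The categories of region listed above can be derived from exon and coding-sequence coordinates alone. The following is a minimal sketch of that coordinate arithmetic and is not RRE's actual implementation; `Region`, `noncoding_regions`, the half-open 0-based coordinates and the plus-strand assumption are all choices made for this example.

```python
from typing import NamedTuple


class Region(NamedTuple):
    kind: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def noncoding_regions(exons, cds_start, cds_end, flank, seq_len):
    """List non-coding regions around a plus-strand gene.

    exons: sorted, non-overlapping (start, end) half-open intervals.
    cds_start, cds_end: half-open genomic bounds of the coding sequence.
    flank: number of bases to report upstream and downstream.
    seq_len: length of the genomic sequence, used to clip the flanks.
    """
    tx_start, tx_end = exons[0][0], exons[-1][1]
    regions = [Region("upstream", max(0, tx_start - flank), tx_start)]
    # 5'-UTR: exonic bases that lie before the start of the CDS.
    for start, end in exons:
        if start < cds_start:
            regions.append(Region("5'-UTR", start, min(end, cds_start)))
    # Introns: gaps between consecutive exons.
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        regions.append(Region("intron", prev_end, next_start))
    # 3'-UTR: exonic bases that lie after the end of the CDS.
    for start, end in exons:
        if end > cds_end:
            regions.append(Region("3'-UTR", max(start, cds_end), end))
    regions.append(Region("downstream", tx_end, min(seq_len, tx_end + flank)))
    return regions


# Made-up three-exon gene on a 1,000-bp sequence.
example = noncoding_regions(
    exons=[(100, 200), (300, 400), (500, 650)],
    cds_start=150, cds_end=600, flank=50, seq_len=1000,
)
for region in example:
    print(region)
```

A minus-strand gene would need the UTR labels and flanks swapped, which this sketch deliberately leaves out.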
Circular RNA - New member of noncoding RNA with novel functions.

PubMed

Hsiao, Kuei-Yang; Sun, H Sunny; Tsai, Shaw-Jenq

2017-06-01

A growing body of evidence indicates that circular RNAs are not simply a side product of splicing but a new class of noncoding RNAs in higher eukaryotes. Progress in the study of circular RNAs has been accelerated by the combination of several advanced technologies such as next generation sequencing, gene silencing (small interfering RNAs) and editing (CRISPR/Cas9). More and more studies have shown that dysregulated expression of circular RNAs plays critical roles during the development of several human diseases. Herein, we review current advances in circular RNA research, covering their biosynthesis, molecular functions, and implications in human diseases. Impact statement: Accumulating evidence indicates that circular RNA (circRNA) is a novel class of noncoding RNA with diverse molecular functions. Our review summarizes the current hypotheses for the models of circRNA biosynthesis, including the direct interaction between upstream and downstream introns and lariat-driven circularization. In addition, molecular functions such as acting as a decoy of microRNA (miRNA), termed a miRNA sponge, as a transcriptional regulator, and as a protein-like modulator are also discussed. Finally, we review the potential roles of circRNAs in the neural system, the cardiovascular system, and cancers. These should provide insightful information for studying the regulation and functions of circRNA in other models of human disease.
Regulatory consequences of neuronal ELAV-like protein binding to coding and non-coding RNAs in human brain

PubMed Central

Scheckel, Claudia; Drapeau, Elodie; Frias, Maria A; Park, Christopher Y; Fak, John; Zucker-Scharff, Ilana; Kou, Yan; Haroutunian, Vahram; Ma'ayan, Avi

2016-01-01

Neuronal ELAV-like (nELAVL) RNA binding proteins have been linked to numerous neurological disorders. We performed crosslinking-immunoprecipitation and RNAseq on human brain, and identified nELAVL binding sites on 8681 transcripts. Using knockout mice and RNAi in human neuroblastoma cells, we showed that nELAVL intronic and 3' UTR binding regulates human RNA splicing and abundance. We validated hundreds of nELAVL targets among which were important neuronal and disease-associated transcripts, including Alzheimer's disease (AD) transcripts. We therefore investigated RNA regulation in AD brain, and observed differential splicing of 150 transcripts, which in some cases correlated with differential nELAVL binding. Unexpectedly, the most significant change of nELAVL binding was evident on non-coding Y RNAs. nELAVL/Y RNA complexes were specifically remodeled in AD and after acute UV stress in neuroblastoma cells. We propose that the increased nELAVL/Y RNA association during stress may lead to nELAVL sequestration, redistribution of nELAVL target binding, and altered neuronal RNA splicing. DOI: http://dx.doi.org/10.7554/eLife.10421.001 PMID:26894958
Intronic L1 Retrotransposons and Nested Genes Cause Transcriptional Interference by Inducing Intron Retention, Exonization and Cryptic Polyadenylation

PubMed Central

Kaer, Kristel; Branovets, Jelena; Hallikma, Anni; Nigumann, Pilvi; Speek, Mart

2011-01-01

Background Transcriptional interference has been recently recognized as an unexpectedly complex and mostly negative regulation of genes. Despite the relatively few studies that have emerged in recent years, it has been demonstrated that readthrough transcription derived from one gene can influence the transcription of another overlapping or nested gene. However, the molecular effects resulting from this interaction are largely unknown. Methodology/Principal Findings Using in silico chromosome walking, we searched for prematurely terminated transcripts bearing signatures of intron retention or exonization of intronic sequence at their 3′ ends upstream of human L1 retrotransposons, protein-coding and noncoding nested genes. We demonstrate that transcriptional interference induced by intronic L1s (or other repeated DNAs) and nested genes could be characterized by intron retention, forced exonization and cryptic polyadenylation. These molecular effects were revealed from the analysis of endogenous transcripts derived from different cell lines and tissues and confirmed by the expression of three minigenes in cell culture. While intron retention and exonization were comparably observed in introns upstream of L1s, forced exonization was preferentially detected in nested genes. Transcriptional interference induced by L1 or nested genes was dependent on the presence or absence of cryptic splice sites, and affected the inclusion or exclusion of the upstream exon and the use of cryptic polyadenylation signals. Conclusions/Significance Our results suggest that transcriptional interference induced by intronic L1s and nested genes could influence the transcription of a large number of genes in normal as well as in tumor tissues. Therefore, this type of interference could have a major impact on the regulation of host gene expression. PMID:22022525
Insertion of a self-splicing intron into the mtDNA of a triploblastic animal

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valles, Y.; Halanych, K.; Boore, J.L.

2006-04-14

Nephtys longosetosa is a carnivorous polychaete worm that lives in the intertidal and subtidal zones with worldwide distribution (pleijel&rouse2001). Its mitochondrial genome has the characteristics typical of most metazoans: 37 genes; circular molecule; almost no intergenic sequence; and no significant gene rearrangements when compared to other annelid mtDNAs (booremoritz19981995). Ubiquitous features such as small intergenic regions and lack of introns suggested that metazoan mtDNAs are under strong selective pressures to reduce their genome size, allowing for faster replication (booremoritz19981995Lynch2005). Yet, in 1996 two group I introns were found in the mtDNA of the basal metazoan Metridium senile (FigureX). Breaking a long-standing rule (absence of introns in metazoan mtDNA), this finding was later supported by the further presence of group I introns in other cnidarians. Interestingly, only the class Anthozoa within cnidarians seems to harbor such introns. Although several hundred triploblastic metazoan mtDNAs have been sequenced, this study is the first evidence of mitochondrial introns in triploblastic metazoans. The cox1 gene of N. longosetosa has an intron of almost 2 kb in length. This finding also represents the first instance of a group II intron (anthozoans harbor group I introns) in any metazoan lineage. Opposite trends are observed within plant, fungal and protist mtDNAs, where introns (both group I and II) and other non-coding sequences are widespread. Plant, fungal and protist mtDNA structure and organization differ enormously from that of metazoan mtDNA. Both plant and fungal mtDNAs are dynamic molecules that undergo high rates of recombination, contain long intergenic spacer regions and harbor both group I and group II introns. However, like metazoans, they have a conserved gene content. Protists, on the other hand, have a striking variation of gene content and introns that account for the genome size variation.
In contrast to this diversity of mtDNA structure and organization, current genome-level studies point to a monophyletic origin of the mitochondria (REFS), raising questions such as: What are the pressures at work shaping the evolution of the mitochondrial genome at 'higher' levels? What drives the absence of introns and other non-coding spacers in metazoan mtDNA? What characteristics must an intron have to be maintained in an environment where 'extra chromosomes' are usually selected against?
Metazoan tRNA introns generate stable circular RNAs in vivo

PubMed Central

Lu, Zhipeng; Filonov, Grigory S.; Noto, John J.; Schmidt, Casey A.; Hatkevich, Talia L.; Wen, Ying; Jaffrey, Samie R.; Matera, A. Gregory

2015-01-01

We report the discovery of a class of abundant circular noncoding RNAs that are produced during metazoan tRNA splicing. These transcripts, termed tRNA intronic circular (tric)RNAs, are conserved features of animal transcriptomes. Biogenesis of tricRNAs requires anciently conserved tRNA sequence motifs and processing enzymes, and their expression is regulated in an age-dependent and tissue-specific manner. Furthermore, we exploited this biogenesis pathway to develop an in vivo expression system for generating “designer” circular RNAs in human cells. Reporter constructs expressing RNA aptamers such as Spinach and Broccoli can be used to follow the transcription and subcellular localization of tricRNAs in living cells. Owing to the superior stability of circular vs. linear RNA isoforms, this expression system has a wide range of potential applications, from basic research to pharmaceutical science. PMID:26194134
Choosing and Using Introns in Molecular Phylogenetics

PubMed Central

Creer, Simon

2007-01-01

Introns are now commonly used in molecular phylogenetics in an attempt to recover gene trees that are concordant with species trees, but there are a range of genomic, logistical and analytical considerations that are infrequently discussed in empirical studies that utilize intron data. This review outlines expedient approaches for locus selection, overcoming paralogy problems, recombination detection methods and the identification and incorporation of LVHs in molecular systematics. A range of parsimony and Bayesian analytical approaches are also described in order to highlight the methods that can currently be employed to align sequences and treat indels in subsequent analyses. By covering the main points associated with the generation and analysis of intron data, this review aims to provide a comprehensive introduction to using introns (or any non-coding nuclear data partition) in contemporary phylogenetics. PMID:19461984
Short intronic repeat sequences facilitate circular RNA production.

PubMed

Liang, Dongming; Wilusz, Jeremy E

2014-10-15

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼ 30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3' end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. © 2014 Liang and Wilusz; Published by Cold Spring Harbor Laboratory Press.
Intron-exon organization of the active human protein S gene PSα and its pseudogene PSβ: Duplication and silencing during primate evolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ploos van Amstel, H.; Reitsma, P.H.; van der Logt, C.P.

The human protein S locus on chromosome 3 consists of two protein S genes, PSα and PSβ. Here the authors report the cloning and characterization of both genes. Fifteen exons of the PSα gene were identified that together code for protein S mRNA as derived from the reported protein S cDNAs. Analysis by primer extension of liver protein S mRNA, however, reveals the presence of two mRNA forms that differ in the length of their 5′-noncoding region. Both transcripts contain a 5′-noncoding region longer than that found in the protein S cDNAs. The two products may arise from alternative splicing of an additional intron in this region or from the usage of two start sites for transcription. The intron-exon organization of the PSα gene fully supports the hypothesis that the protein S gene is the product of an evolutionary assembly process in which gene modules coding for structural/functional protein units also found in other coagulation proteins have been put upstream of the ancestral gene of a steroid hormone binding protein. The PSβ gene is identified as a pseudogene. It contains a large variety of detrimental aberrations, viz., the absence of exon I, a splice site mutation, three stop codons, and a frame shift mutation. Overall, the exonic sequences of the two genes PSα and PSβ show 96.5% homology. Southern analysis of primate DNA showed that the duplication of the ancestral protein S gene occurred after the branching of the orangutan from the African apes. A nonsense mutation that is present in the human pseudogene could also be identified in one of the two protein S genes of both chimpanzee and gorilla. This implies that silencing of one of the two protein S genes must have taken place before the divergence of the three African apes.
Biotechnological applications of mobile group II introns and their reverse transcriptases: gene targeting, RNA-seq, and non-coding RNA analysis.

PubMed

Enyeart, Peter J; Mohr, Georg; Ellington, Andrew D; Lambowitz, Alan M

2014-01-13

Mobile group II introns are bacterial retrotransposons that combine the activities of an autocatalytic intron RNA (a ribozyme) and an intron-encoded reverse transcriptase to insert site-specifically into DNA. They recognize DNA target sites largely by base pairing of sequences within the intron RNA and achieve high DNA target specificity by using the ribozyme active site to couple correct base pairing to RNA-catalyzed intron integration. Algorithms have been developed to program the DNA target site specificity of several mobile group II introns, allowing them to be made into 'targetrons.' Targetrons function for gene targeting in a wide variety of bacteria and typically integrate at efficiencies high enough to be screened easily by colony PCR, without the need for selectable markers. Targetrons have found wide application in microbiological research, enabling gene targeting and genetic engineering of bacteria that had been intractable to other methods. Recently, a thermostable targetron has been developed for use in bacterial thermophiles, and new methods have been developed for using targetrons to position recombinase recognition sites, enabling large-scale genome-editing operations, such as deletions, inversions, insertions, and 'cut-and-pastes' (that is, translocation of large DNA segments), in a wide range of bacteria at high efficiency. Using targetrons in eukaryotes presents challenges due to the difficulties of nuclear localization and sub-optimal magnesium concentrations, although supplementation with magnesium can increase integration efficiency, and directed evolution is being employed to overcome these barriers. Finally, spurred by new methods for expressing group II intron reverse transcriptases that yield large amounts of highly active protein, thermostable group II intron reverse transcriptases from bacterial thermophiles are being used as research tools for a variety of applications, including qRT-PCR and next-generation RNA sequencing (RNA-seq). 
The high processivity and fidelity of group II intron reverse transcriptases along with their novel template-switching activity, which can directly link RNA-seq adaptor sequences to cDNAs during reverse transcription, open new approaches for RNA-seq and the identification and profiling of non-coding RNAs, with potentially wide applications in research and biotechnology.
A novel non-coding RNA within an intron of CDH2 and association of its SNP with non-syndromic cleft lip and palate.

PubMed

Kumari, Priyanka; Singh, Subodh Kumar; Raman, Rajiva

2018-06-05

Genome-wide linkage analysis and whole genome sequencing in a Van der Woude syndrome (VWS) family revealed that the SNP rs539075, within intron 2 of the cadherin 2 gene (CDH2), co-segregated with the disease phenotype. A study with nonsyndromic cleft lip with or without cleft palate (NSCL ± P) cases (N = 292) and controls (N = 287) established association of this SNP with NSCL ± P as a risk factor. RT-PCR based expression analysis of the SNP-harbouring region of intron 2 of CDH2 in the clefted lip and/or palate tissues of 16 patients revealed that the mutant allele was expressed in all individuals carrying it (hetero-/homozygous), whereas the wild type allele was expressed in <50% of the samples in which it was present. The intronic transcript was also present in the prospective lip and palate region of 13.5 dpc mouse embryo, detected by RNA in situ hybridization and RT-PCR. These results, including the in silico characterization of the ~200 nt intronic transcript, showed that conformationally it fits best with a noncoding small RNA, possibly a precursor of miRNA. Its function in orofacial organogenesis remains to be elucidated; this will enable us to define the role of this mutant ncRNA in the clefting of lip and palate. Copyright © 2018 Elsevier B.V. All rights reserved.
Metazoan tRNA introns generate stable circular RNAs in vivo.

PubMed

Lu, Zhipeng; Filonov, Grigory S; Noto, John J; Schmidt, Casey A; Hatkevich, Talia L; Wen, Ying; Jaffrey, Samie R; Matera, A Gregory

2015-09-01

We report the discovery of a class of abundant circular noncoding RNAs that are produced during metazoan tRNA splicing. These transcripts, termed tRNA intronic circular (tric)RNAs, are conserved features of animal transcriptomes. Biogenesis of tricRNAs requires anciently conserved tRNA sequence motifs and processing enzymes, and their expression is regulated in an age-dependent and tissue-specific manner. Furthermore, we exploited this biogenesis pathway to develop an in vivo expression system for generating "designer" circular RNAs in human cells. Reporter constructs expressing RNA aptamers such as Spinach and Broccoli can be used to follow the transcription and subcellular localization of tricRNAs in living cells. Owing to the superior stability of circular vs. linear RNA isoforms, this expression system has a wide range of potential applications, from basic research to pharmaceutical science. © 2015 Lu et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Microprocessor mediates transcriptional termination of long noncoding RNA transcripts hosting microRNAs.

PubMed

Dhir, Ashish; Dhir, Somdutta; Proudfoot, Nick J; Jopling, Catherine L

2015-04-01

MicroRNAs (miRNAs) play a major part in the post-transcriptional regulation of gene expression. Mammalian miRNA biogenesis begins with cotranscriptional cleavage of RNA polymerase II (Pol II) transcripts by the Microprocessor complex. Although most miRNAs are located within introns of protein-coding transcripts, a substantial minority of miRNAs originate from long noncoding (lnc) RNAs, for which transcript processing is largely uncharacterized. We show, by detailed characterization of liver-specific lnc-pri-miR-122 and genome-wide analysis in human cell lines, that most lncRNA transcripts containing miRNAs (lnc-pri-miRNAs) do not use the canonical cleavage-and-polyadenylation pathway but instead use Microprocessor cleavage to terminate transcription. Microprocessor inactivation leads to extensive transcriptional readthrough of lnc-pri-miRNA and transcriptional interference with downstream genes. Consequently we define a new RNase III-mediated, polyadenylation-independent mechanism of Pol II transcription termination in mammalian cells.
Cis-regulatory underpinnings of human GLI3 expression in embryonic craniofacial structures and internal organs.

PubMed

Abbasi, Amir A; Minhas, Rashid; Schmidt, Ansgar; Koch, Sabine; Grzeschik, Karl-Heinz

2013-10-01

The zinc finger transcription factor Gli3 is an important mediator of Sonic hedgehog (Shh) signaling. During early embryonic development Gli3 participates in patterning and growth of the central nervous system, face, skeleton, limb, tooth and gut. Precise regulation of the temporal and spatial expression of Gli3 is crucial for the proper specification of these structures in mammals and other vertebrates. Previously we reported a set of human intronic cis-regulators controlling almost the entire known repertoire of endogenous Gli3 expression in mouse neural tube and limbs. However, the genetic underpinning of GLI3 expression in other embryonic domains such as craniofacial structures and internal organs remains elusive. Here we demonstrate in a transgenic mouse assay the potential of a subset of human/fish conserved non-coding sequences (CNEs) residing within GLI3 intronic intervals to induce reporter gene expression at known regions of endogenous Gli3 transcription in embryonic domains other than the central nervous system (CNS) and limbs. Highly specific reporter expression was observed in craniofacial structures, eye, gut, and genitourinary system. Moreover, the comparison of expression patterns directed by these intronic cis-acting regulatory elements in mouse and zebrafish embryos suggests that, in accordance with sequence conservation, the target site specificity of a subset of these elements remains preserved between these two lineages. Taken together with our recent investigations, it is proposed here that during vertebrate evolution the Gli3 expression control acquired multiple, independently acting, intronic enhancers for spatiotemporal patterning of CNS, limbs, craniofacial structures and internal organs. © 2013 The Authors Development, Growth & Differentiation © 2013 Japanese Society of Developmental Biologists.
Regulation of expression of two LY-6 family genes by intron retention and transcription induced chimerism

PubMed Central

Calvanese, Vincenzo; Mallya, Meera; Campbell, R Duncan; Aguado, Begoña

2008-01-01

Background Regulation of the expression of particular genes can rely on mechanisms that are different from classical transcriptional and translational control. The LY6G5B and LY6G6D genes encode LY-6 domain proteins, whose expression seems to be regulated in an original fashion, consisting of an intron retention event which generates, through an early premature stop codon, a non-coding transcript, preventing expression in most cell lines and tissues. Results The MHC LY-6 non-coding transcripts have been shown to be stable and very abundant in the cell, and not subject to Nonsense Mediated Decay (NMD). This retention event appears not to be solely dependent on intron features, because in the case of LY6G5B, when the intron is inserted in the artificial context of a luciferase expression plasmid, it is fully spliced but strongly stabilises the resulting luciferase transcript. In addition, by quantitative PCR we found that the retained and spliced forms are differentially expressed in tissues, indicating an active regulation of the non-coding transcript. EST database analysis revealed that these genes have an alternative expression pathway with the formation of Transcription Induced Chimeras (TIC). This finding was confirmed by RT-PCR, revealing the presence of different transcripts that would encode the chimeric proteins CSNKβ-LY6G5B and G6F-LY6G6D, in which the LY-6 domain would join to a kinase domain and an Ig-like domain, respectively. Conclusion In conclusion, the LY6G5B and LY6G6D intron-retained transcripts are not subjected to NMD and are more abundant than the properly spliced forms. In addition, these genes form chimeric transcripts with their neighbouring 5' genes in the same orientation. Of interest is the fact that the 5' genes (CSNKβ or G6F) undergo differential splicing only in the context of the chimera (CSNKβ-LY6G5B or G6F-LY6G6C) and not on their own. PMID:18817541
An intronic microRNA silences genes that are functionally antagonistic to its host gene.

PubMed

Barik, Sailen

2008-09-01

MicroRNAs (miRNAs) are short noncoding RNAs that down-regulate gene expression by silencing specific target mRNAs. While many miRNAs are transcribed from their own genes, nearly half map within introns of 'host' genes, the significance of which remains unclear. We report that transcriptional activation of apoptosis-associated tyrosine kinase (AATK), essential for neuronal differentiation, also generates miR-338 from an AATK gene intron that silences a family of mRNAs whose protein products are negative regulators of neuronal differentiation. We conclude that an intronic miRNA, transcribed together with the host gene mRNA, may serve the interest of its host gene by silencing a cohort of genes that are functionally antagonistic to the host gene itself.
The Rise and Fall of the Gene.

ERIC Educational Resources Information Center

Mahadeva, Madhu; Randerson, Sherman

1985-01-01

Summarizes the current state of genetics, highlighting major historical events in the development of the field and discussing topics related to introns ("silent" or noncoding base sequences in eucaryotic genes) and exons (the coding parts of DNA). (JN)
A noncoding RNA transcribed from the AGAMOUS (AG) second intron binds to CURLY LEAF and represses AG expression in leaves.

PubMed

Wu, Hui-Wen; Deng, Shulin; Xu, Haiying; Mao, Hui-Zhu; Liu, Jun; Niu, Qi-Wen; Wang, Huan; Chua, Nam-Hai

2018-06-04

Dispersed H3K27 trimethylation (H3K27me3) of the AGAMOUS (AG) genomic locus is mediated by CURLY LEAF (CLF), a component of the Polycomb Repressive Complex (PRC) 2. Previous reports have shown that the AG second intron, which confers AG tissue-specific expression, harbors sequences targeted by several positive and negative regulators. Using RACE reverse transcription polymerase chain reaction, we found that the AG intron 2 encodes several noncoding RNAs. RNAi experiments showed that incRNA4 is needed for CLF repressive activity. AG-incRNA4RNAi lines showed increased leaf AG mRNA levels associated with a decrease of H3K27me3 levels; these plants displayed AG overexpression phenotypes. Genetic and biochemical analyses demonstrated that AG-incRNA4 can associate with CLF to repress AG expression in leaf tissues through H3K27me3-mediated repression and to autoregulate its own expression level. The mechanism of AG-incRNA4-mediated repression may be relevant to investigations on tissue-specific expression of Arabidopsis MADS-box genes. © 2018 The Authors New Phytologist © 2018 New Phytologist Trust.

The 253-kb inversion and deep intronic mutations in UNC13D are present in North American patients with familial hemophagocytic lymphohistiocytosis 3.

PubMed

Qian, Yaping; Johnson, Judith A; Connor, Jessica A; Valencia, C Alexander; Barasa, Nathaniel; Schubert, Jeffery; Husami, Ammar; Kissell, Diane; Zhang, Ge; Weirauch, Matthew T; Filipovich, Alexandra H; Zhang, Kejian

2014-06-01

The mutations in UNC13D are responsible for familial hemophagocytic lymphohistiocytosis (FHL) type 3. A 253-kb inversion and two deep intronic mutations, c.118-308C > T and c.118-307G > A, in UNC13D were recently reported in European and Asian FHL3 patients. We sought to determine the prevalence of these three non-coding mutations in North American FHL patients and evaluate the significance of examining these new mutations in genetic testing. We performed DNA sequencing of UNC13D and targeted analysis of these three mutations in 1,709 North American patients with a suspected clinical diagnosis of hemophagocytic lymphohistiocytosis (HLH). The 253-kb inversion, intronic mutations c.118-308C > T and c.118-307G > A were found in 11, 15, and 4 patients, respectively, in which the genetic basis (bi-allelic mutations) explained 25 additional patients. Taken together with previously diagnosed FHL3 patients in our HLH patient registry, these three non-coding mutations were found in 31.6% (25/79) of the FHL3 patients. The 253-kb inversion, c.118-308C > T and c.118-307G > A accounted for 7.0%, 8.9%, and 1.3% of mutant alleles, respectively. Significantly, eight novel mutations in UNC13D are being reported in this study. To further evaluate the expression level of the newly reported intronic mutation c.118-307G > A, reverse transcription PCR and Western blot analysis revealed a significant reduction of both RNA and protein levels suggesting that the c.118-307G > A mutation affects transcription. These specified non-coding mutations were found in a significant number of North American patients and inclusion of them in mutation analysis will improve the molecular diagnosis of FHL3. © 2014 Wiley Periodicals, Inc.
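The intronic variant names in this abstract follow the HGVS coding-DNA convention: c.118-308C>T denotes a C-to-T substitution located 308 nucleotides upstream of coding position 118, that is, inside the preceding intron. The Python sketch below shows one way such names could be decomposed. The function name `parse_intronic_variant` and the regular expression are illustrative assumptions, not part of the study, and the pattern covers only simple single-base substitutions.

```python
import re

# Simple intronic substitutions in HGVS-style coding-DNA notation,
# e.g. "c.118-308C>T". Whitespace around ">" is tolerated because the
# abstract prints the variants as "C > T".
_PATTERN = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])\s*>\s*([ACGT])$")


def parse_intronic_variant(name):
    """Split an intronic substitution name into its parts (hypothetical helper).

    Returns a dict with the anchoring coding position, the signed intronic
    offset (negative = upstream of the anchor, positive = downstream),
    and the reference and alternate bases.
    """
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {name!r}")
    anchor, sign, offset, ref, alt = match.groups()
    signed_offset = int(offset) if sign == "+" else -int(offset)
    return {"anchor": int(anchor), "offset": signed_offset, "ref": ref, "alt": alt}


if __name__ == "__main__":
    for variant in ("c.118-308C > T", "c.118-307G > A"):
        print(variant, parse_intronic_variant(variant))
```

Keeping the offset signed makes it easy to see that the two deep intronic mutations reported here sit at adjacent positions in the same intron.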
Genome-wide identification and functional prediction of nitrogen-responsive intergenic and intronic long non-coding RNAs in maize (Zea mays L.).

PubMed

Lv, Yuanda; Liang, Zhikai; Ge, Min; Qi, Weicong; Zhang, Tifu; Lin, Feng; Peng, Zhaohua; Zhao, Han

2016-05-11

Nitrogen (N) is an essential and often limiting nutrient to plant growth and development. Previous studies have shown that the mRNA expressions of numerous genes are regulated by nitrogen supplies; however, little is known about the expressed non-coding elements, for example long non-coding RNAs (lncRNAs) that control the response of maize (Zea mays L.) to nitrogen. LncRNAs are a class of non-coding RNAs larger than 200 bp, which have emerged as key regulators in gene expression. In this study, we surveyed the intergenic/intronic lncRNAs in maize B73 leaves at the V7 stage under conditions of N-deficiency and N-sufficiency using ribosomal RNA depletion and ultra-deep total RNA sequencing approaches. By integration with mRNA expression profiles and physiological evaluations, 7245 lncRNAs and 637 nitrogen-responsive lncRNAs were identified that exhibited unique expression patterns. Co-expression network analysis showed that the nitrogen-responsive lncRNAs were enriched mainly in one of the three co-expressed modules. The genes in the enriched module are mainly involved in NADH dehydrogenase activity, oxidative phosphorylation and the nitrogen compounds metabolic process. We identified a large number of lncRNAs in maize and illustrated their potential regulatory roles in response to N stress. The results lay the foundation for further in-depth understanding of the molecular mechanisms of lncRNAs' role in response to nitrogen stresses.
An intronic ncRNA-dependent regulation of SORL1 expression affecting Aβ formation is upregulated in post-mortem Alzheimer's disease brain samples.

PubMed

Ciarlo, Eleonora; Massone, Sara; Penna, Ilaria; Nizzari, Mario; Gigoni, Arianna; Dieci, Giorgio; Russo, Claudio; Florio, Tullio; Cancedda, Ranieri; Pagano, Aldo

2013-03-01

Recent studies indicated that sortilin-related receptor 1 (SORL1) is a risk gene for late-onset Alzheimer's disease (AD), although its role in the aetiology and/or progression of this disorder is not fully understood. Here, we report the finding of a non-coding (nc) RNA (hereafter referred to as 51A) that maps in antisense configuration to intron 1 of the SORL1 gene. 51A expression drives a splicing shift of SORL1 from the synthesis of the canonical long protein variant A to an alternatively spliced protein form. This process, resulting in a decreased synthesis of SORL1 variant A, is associated with impaired processing of amyloid precursor protein (APP), leading to increased Aβ formation. Interestingly, we found that 51A is expressed in human brains, being frequently upregulated in cerebral cortices from individuals with Alzheimer's disease. Altogether, these findings document a novel ncRNA-dependent regulatory pathway that might have relevant implications in neurodegeneration.
Quantitative Profiling of Peptides from RNAs classified as non-coding

PubMed Central

Prabakaran, Sudhakaran; Hemberg, Martin; Chauhan, Ruchi; Winter, Dominic; Tweedie-Cullen, Ry Y.; Dittrich, Christian; Hong, Elizabeth; Gunawardena, Jeremy; Steen, Hanno; Kreiman, Gabriel; Steen, Judith A.

2014-01-01

Only a small fraction of the mammalian genome codes for messenger RNAs destined to be translated into proteins, and it is generally assumed that a large portion of transcribed sequences, including introns and several classes of non-coding RNAs (ncRNAs), do not give rise to peptide products. A systematic examination of translation and physiological regulation of ncRNAs has not been conducted. Here, we use computational methods to identify the products of non-canonical translation in mouse neurons by analyzing unannotated transcripts in combination with proteomic data. This study supports the existence of non-canonical translation products from both intragenic and extragenic genomic regions, including peptides derived from anti-sense transcripts and introns. Moreover, the studied novel translation products exhibit temporal regulation similar to that of proteins known to be involved in neuronal activity processes. These observations highlight a potentially large and complex set of biologically regulated translational events from transcripts formerly thought to lack coding potential. PMID:25403355
Arabidopsis Chloroplast Mini-Ribonuclease III Participates in rRNA Maturation and Intron Recycling

PubMed Central

Hotto, Amber M.; Castandet, Benoît; Gilet, Laetitia; Higdon, Andrea; Condon, Ciarán; Stern, David B.

2015-01-01

RNase III proteins recognize double-stranded RNA structures and catalyze endoribonucleolytic cleavages that often regulate gene expression. Here, we characterize the functions of RNC3 and RNC4, two Arabidopsis thaliana chloroplast Mini-RNase III-like enzymes sharing 75% amino acid sequence identity. Whereas rnc3 and rnc4 null mutants have no visible phenotype, rnc3/rnc4 (rnc3/4) double mutants are slightly smaller and chlorotic compared with the wild type. In Bacillus subtilis, the RNase Mini-III is integral to 23S rRNA maturation. In Arabidopsis, we observed imprecise maturation of 23S rRNA in the rnc3/4 double mutant, suggesting that exoribonucleases generated staggered ends in the absence of specific Mini-III-catalyzed cleavages. A similar phenotype was found at the 3′ end of the 16S rRNA, and the primary 4.5S rRNA transcript contained 3′ extensions, suggesting that Mini-III catalyzes several processing events of the polycistronic rRNA precursor. The rnc3/4 mutant showed overaccumulation of a noncoding RNA complementary to the 4.5S-5S rRNA intergenic region, and its presence correlated with that of the extended 4.5S rRNA precursor. Finally, we found rnc3/4-specific intron degradation intermediates that are probable substrates for Mini-III and show that B. subtilis Mini-III is also involved in intron regulation. Overall, this study extends our knowledge of the key role of Mini-III in intron and noncoding RNA regulation and provides important insight into plastid rRNA maturation. PMID:25724636
Structure and genomic organization of the human B1 receptor gene for kinins (BDKRB1).

PubMed

Bachvarov, D R; Hess, J F; Menke, J G; Larrivée, J F; Marceau, F

1996-05-01

Two subtypes of mammalian bradykinin receptors, B1 and B2 (BDKRB1 and BDKRB2), have been defined based on their pharmacological properties. The B1 type kinin receptors have weak affinity for intact BK or Lys-BK but strong affinity for kinin metabolites without the C-terminal arginine (e.g., des-Arg9-BK and Lys-des-Arg9-BK, also called des-Arg10-kallidin), which are generated by kininase I. The B1 receptor expression is up-regulated following tissue injury and inflammation (hyperemia, exudation, hyperalgesia, etc.). In the present study, we have cloned and sequenced the gene encoding human B1 receptor from a human genomic library. The human B1 receptor gene contains three exons separated by two introns. The first and the second exon are noncoding, while the coding region and the 3'-flanking region are located entirely on the third exon. The exon-intron arrangement of the human B1 receptor gene shows significant similarity with the genes encoding the B2 receptor subtype in human, mouse, and rat. Sequence analysis of the 5'-flanking region revealed the presence of a consensus TATA box and of numerous candidate transcription factor binding sequences. Primer extension experiments have shown the existence of multiple transcription initiation sites situated downstream and upstream from the consensus TATA box. Genomic Southern blot analysis indicated that the human B1 receptor is encoded by a single-copy gene.
Extremely hypomorphic and severe deep intronic variants in the ABCA4 locus result in varying Stargardt disease phenotypes.

PubMed

Zernant, Jana; Lee, Winston; Nagasaki, Takayuki; Collison, Frederick T; Fishman, Gerald A; Bertelsen, Mette; Rosenberg, Thomas; Gouras, Peter; Tsang, Stephen H; Allikmets, Rando

2018-05-30

Autosomal recessive Stargardt disease (STGD1, MIM 248200) is caused by mutations in the ABCA4 gene. Complete sequencing of the ABCA4 locus in STGD1 patients identifies two expected disease-causing alleles in ~75% of patients and only one mutation in ~15% of patients. Recently, many possibly pathogenic variants in deep intronic sequences of ABCA4 have been identified in the latter group. We extended our analyses of deep intronic ABCA4 variants and determined that one of these, c.4253+43G>A (rs61754045), is present in 29/1155 (2.6%) of STGD1 patients. The variant is found at statistically significantly higher frequency in patients with only one pathogenic ABCA4 allele, 23/160 (14.38%), MAF=0.072, compared to MAF=0.013 in all STGD1 cases and MAF=0.006 in the matching general population (P < 1 × 10⁻⁷). The variant, which is not predicted to have any effect on splicing, is the first reported intronic "extremely hypomorphic allele" in the ABCA4 locus; i.e., it is pathogenic only when in trans with a loss-of-function ABCA4 allele. It results in a distinct clinical phenotype characterized by late-onset of symptoms and foveal sparing. In ~70% of cases the variant was allelic with the c.6006-609T>A (rs575968112) variant, which was deemed non-pathogenic. Another rare deep intronic variant, c.5196+1056A>G (rs886044749), found in 5/834 (0.6%) of STGD1 cases is, conversely, a severe allele. This study determines pathogenicity for three non-coding variants in STGD1 patients of European descent accounting for ~3% of the disease. Defining disease-associated alleles in the non-coding sequences of the ABCA4 locus can be accomplished by integrated clinical and genetic analyses. Cold Spring Harbor Laboratory Press.
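The carrier fraction and the minor allele frequency (MAF) quoted for the single-allele group are linked by simple arithmetic: every patient contributes two alleles, so if each carrier is heterozygous for the variant, the MAF is the carrier count divided by twice the number of patients. The sketch below works through the 23/160 figures. The heterozygosity assumption and both function names are illustrative and are not stated in the abstract.

```python
def carrier_fraction(carriers, patients):
    """Fraction of patients who carry the variant at least once."""
    return carriers / patients


def allele_frequency(carriers, patients):
    """Allele frequency, ASSUMING every carrier is heterozygous.

    Each patient contributes two alleles, so the denominator is
    2 * patients. Homozygous carriers would raise the estimate.
    """
    return carriers / (2 * patients)


if __name__ == "__main__":
    # Patients with only one known pathogenic ABCA4 allele (from the abstract).
    print(carrier_fraction(23, 160))   # 0.14375, reported as 14.38%
    print(allele_frequency(23, 160))   # 0.071875, reported as MAF = 0.072
```

Under the same assumption, the carrier fraction is always twice the allele frequency, which is why 14.38% of patients corresponds to a MAF of about 0.072.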
Simultaneous sequencing of coding and noncoding RNA reveals a human transcriptome dominated by a small number of highly expressed noncoding genes.

PubMed

Boivin, Vincent; Deschamps-Francoeur, Gabrielle; Couture, Sonia; Nottingham, Ryan M; Bouchard-Bourelle, Philia; Lambowitz, Alan M; Scott, Michelle S; Abou-Elela, Sherif

2018-07-01

Comparing the abundance of one RNA molecule to another is crucial for understanding cellular functions but most sequencing techniques can target only specific subsets of RNA. In this study, we used a new fragmented ribodepleted TGIRT sequencing method that uses a thermostable group II intron reverse transcriptase (TGIRT) to generate a portrait of the human transcriptome depicting the quantitative relationship of all classes of nonribosomal RNA longer than 60 nt. Comparison between different sequencing methods indicated that FRT is more accurate in ranking both mRNA and noncoding RNA than viral reverse transcriptase-based sequencing methods, even those that specifically target these species. Measurements of RNA abundance in different cell lines using this method correlate with biochemical estimates, confirming tRNA as the most abundant nonribosomal RNA biotype. However, the single most abundant transcript is 7SL RNA, a component of the signal recognition particle. Structured noncoding RNAs (sncRNAs) associated with the same biological process are expressed at similar levels, with the exception of RNAs with multiple functions like U1 snRNA. In general, sncRNAs forming RNPs are hundreds to thousands of times more abundant than their mRNA counterparts. Surprisingly, only 50 sncRNA genes produce half of the non-rRNA transcripts detected in two different cell lines. Together the results indicate that the human transcriptome is dominated by a small number of highly expressed sncRNAs specializing in functions related to translation and splicing. © 2018 Boivin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Evolution of the unspliced transcriptome.

PubMed

Engelhardt, Jan; Stadler, Peter F

2015-08-20

Despite their abundance, unspliced EST data have received little attention as a source of information on non-coding RNAs. Very little is known, therefore, about the genomic distribution of unspliced non-coding transcripts and their relationship with the much better studied regularly spliced products. In particular, their evolution has remained virtually unstudied. We systematically study the evidence on unspliced transcripts available in EST annotation tracks for human and mouse, comprising 104,980 and 66,109 unspliced EST clusters, respectively. Roughly one third of these are located totally inside introns of known genes (TINs) and another third overlaps exonic regions (PINs). Eleven percent are "intergenic", far away from any annotated gene. Direct evidence for the independent transcription of many PINs and TINs is obtained from CAGE tag and chromatin data. We predict more than 2000 3'UTR-associated RNA candidates each for human and mouse. Fifteen to twenty percent of the unspliced EST clusters are conserved between human and mouse. With the exception of TINs, the sequences of unspliced EST clusters evolve significantly slower than genomic background. Furthermore, like spliced lincRNAs, they show highly tissue-specific expression patterns. Unspliced long non-coding RNAs are an important, rapidly evolving, component of mammalian transcriptomes. Their analysis is complicated by their preferential association with complex transcribed loci that usually also harbor a plethora of spliced transcripts. Unspliced EST data, although typically disregarded in transcriptome analysis, can be used to gain insights into this rarely investigated transcriptome component. The frequently postulated connection between lack of splicing and nuclear retention and the surprising overlap of chromatin-associated transcripts suggests that this class of transcripts might be involved in chromatin organization and possibly other mechanisms of epigenetic control.
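The three categories named in this abstract (clusters totally inside introns, clusters overlapping exons, and intergenic clusters) amount to an interval comparison against a gene annotation. The sketch below illustrates that idea only and is not the authors' pipeline. It assumes half-open coordinates on a single chromosome, ignores strand, assumes each gene span begins and ends with an exon, and treats any cluster that touches no gene as intergenic, with no distance threshold. The function names and the dictionary layout are hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b share any base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_cluster(cluster, genes):
    """Classify an unspliced EST cluster relative to annotated genes.

    cluster: (start, end) half-open interval.
    genes:   list of dicts with "span": (start, end) and
             "exons": list of (start, end) intervals (hypothetical layout).
    Returns "PIN", "TIN", "intergenic", or "other".
    """
    # Any exon overlap places the cluster in the exon-overlapping class.
    for gene in genes:
        if any(overlaps(cluster, exon) for exon in gene["exons"]):
            return "PIN"
    # No exon overlap: fully inside a gene span means inside one intron.
    for gene in genes:
        start, end = gene["span"]
        if start <= cluster[0] and cluster[1] <= end:
            return "TIN"
    # No contact with any gene span at all.
    if not any(overlaps(cluster, gene["span"]) for gene in genes):
        return "intergenic"
    return "other"


if __name__ == "__main__":
    genes = [{"span": (100, 1000),
              "exons": [(100, 200), (500, 600), (900, 1000)]}]
    for cluster in [(250, 400), (150, 300), (2000, 2100)]:
        print(cluster, classify_cluster(cluster, genes))
```

Checking exon overlap across all genes before testing for containment matters at complex loci, where a cluster can sit in the intron of one gene while overlapping an exon of another.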
Crystal structure of group II intron domain 1 reveals a template for RNA assembly

DOE PAGES

Zhao, Chen; Rajashankar, Kanagalaghatta R.; Marcia, Marco; ...

2015-10-26

Although the importance of large noncoding RNAs is increasingly appreciated, our understanding of their structures and architectural dynamics remains limited. In particular, we know little about RNA folding intermediates and how they facilitate the productive assembly of RNA tertiary structures. In this paper, we report the crystal structure of an obligate intermediate that is required during the earliest stages of group II intron folding. Composed of domain 1 from the Oceanobacillus iheyensis group II intron (266 nucleotides), this intermediate retains native-like features but adopts a compact conformation in which the active site cleft is closed. Transition between this closed and the open (native) conformation is achieved through discrete rotations of hinge motifs in two regions of the molecule. Finally, the open state is then stabilized by sequential docking of downstream intron domains, suggesting a 'first come, first folded' strategy that may represent a generalizable pathway for assembly of large RNA and ribonucleoprotein structures.
A fast-evolving human NPAS3 enhancer gained reporter expression in the developing forebrain of transgenic mice

PubMed Central

Kamm, Gretel B.; López-Leal, Rodrigo; Lorenzo, Juan R.; Franchini, Lucía F.

2013-01-01

The developmental brain gene NPAS3 stands out as a hot spot in human evolution because it contains the largest number of human-specific, fast-evolving, conserved, non-coding elements. In this paper we studied 2xHAR142, one of these elements that is located in the fifth intron of NPAS3. Using transgenic mice, we show that the mouse and chimp 2xHAR142 orthologues behave as transcriptional enhancers driving expression of the reporter gene lacZ to a similar NPAS3 expression subdomain in the mouse central nervous system. Interestingly, the human 2xHAR142 orthologue drives lacZ expression to an extended expression pattern in the nervous system. Thus, molecular evolution of 2xHAR142 provides the first documented example of human-specific heterotopy in the forebrain promoted by a transcriptional enhancer and suggests that it may have contributed to assemble the unique properties of the human brain. PMID:24218632
Alternative Splicing as a Target for Cancer Treatment.

PubMed

Martinez-Montiel, Nancy; Rosas-Murrieta, Nora Hilda; Anaya Ruiz, Maricruz; Monjaraz-Guzman, Eduardo; Martinez-Contreras, Rebeca

2018-02-11

Alternative splicing is a key determinant of gene expression in metazoans. During alternative splicing, non-coding sequences are removed to generate different mature messenger RNAs due to a combination of sequence elements and cellular factors that contribute to splicing regulation. A different combination of splicing sites, exonic or intronic sequences, mutually exclusive exons or retained introns could be selected during alternative splicing to generate different mature mRNAs that could in turn produce distinct protein products. Alternative splicing is the main source of protein diversity responsible for 90% of human gene expression, and it has recently become a hallmark for cancer with a full potential as a prognostic and therapeutic tool. Currently, more than 15,000 alternative splicing events have been associated with different aspects of cancer biology, including cell proliferation and invasion, apoptosis resistance and susceptibility to different chemotherapeutic drugs. Here, we present well established and newly discovered splicing events that occur in different cancer-related genes, their modification by several approaches and the current status of key tools developed to target alternative splicing for diagnostic and therapeutic purposes.
Identification of a possible susceptibility locus for UVB-induced skin tanning phenotype in Korean females using genomewide association study.

PubMed

Kwak, Taek-Jong; Chang, Yun-Hee; Shin, Young-Ah; Shin, Jung-Min; Kim, Ji-Hye; Lim, Seul-Ki; Lee, Sang-Hwha; Lee, Min-Geol; Yoon, Tae-Jin; Kim, Chang-Deok; Lee, Jeung-Hoon; Koh, Jae Sook; Seo, Young Kyoung; Chang, Min-Youl; Lee, Young

2015-12-01

A two-stage genomewide association (GWA) analysis was conducted to investigate the genetic factors influencing ultraviolet (UV)-induced skin pigmentation in Korean females after UV exposure. Previously, a GWA study evaluating ~500 000 single nucleotide polymorphisms (SNPs) in 99 Korean females identified eight SNPs that were highly associated with tanning ability. To confirm these associations, we genotyped the SNPs in an independent replication study (112 Korean females). We found that a novel SNP in the intron of the WW domain-containing oxidoreductase (WWOX) gene yielded significant replicated associations with skin tanning ability (P-value = 1.16 × 10⁻⁴). To understand the functional consequences of this locus located in the non-coding region, we investigated the role of WWOX in human melanocytes using a recombinant adenovirus expressing a microRNA specific for WWOX. Inhibition of WWOX expression significantly increased the expression and activity of tyrosinase in human melanocytes. Taken together, our results suggest that genetic variants in the intronic region of WWOX could be determinants in the UV-induced tanning ability of Korean females. WWOX represents a new candidate gene to evaluate the molecular basis of the UV-induced tanning ability in individuals. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript

PubMed Central

Rose, Dominic; Stadler, Peter F.

2011-01-01

Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny, analyze patterns of sequence conservation, and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionarily conserved and thermodynamically stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364
The genomic structure: proof of the role of non-coding DNA.

PubMed

Bouaynaya, Nidhal; Schonfeld, Dan

2006-01-01

We prove that the introns play the role of a decoy in absorbing mutations in the same way hollow uninhabited structures are used by the military to protect important installations. Our approach is based on a probability of error analysis, where errors are mutations which occur in the exon sequences. We derive the optimal exon length distribution, which minimizes the probability of error in the genome. Furthermore, to understand how Nature can generate the optimal distribution, we propose a diffusive random walk model for exon generation throughout evolution. This model results in an alpha-stable exon length distribution, which is asymptotically equivalent to the optimal distribution. Experimental results show that both distributions accurately fit the real data. Given that introns also drive biological evolution by increasing the rate of unequal crossover between genes, we conclude that the role of introns is to maintain a genius balance between stability and adaptability in eukaryotic genomes.
Complex Tissue-Specific Patterns and Distribution of Multiple RAGE Splice Variants in Different Mammals

PubMed Central

López-Díez, Raquel; Rastrojo, Alberto; Villate, Olatz; Aguado, Begoña

2013-01-01

The receptor for advanced glycosylation end products (RAGE) is a multiligand receptor involved in diverse cell signaling pathways. Previous studies show that this gene expresses several splice variants in human, mouse, and dog. Alternative splicing (AS) plays an important role in expanding transcriptomic and proteomic diversity, and it has been related to disease. AS is also one of the main evolutionary mechanisms in mammalian genomes. However, limited information is available regarding the AS of RAGE in a wide context of mammalian tissues. In this study, we examined in detail the different RAGE mRNAs generated by AS from six mammals, including two primates (human and monkey), two artiodactyla (cow and pig), and two rodentia (mouse and rat) in 6–18 different tissues including fetal, adult, and tumor. By nested reverse transcription-polymerase chain reaction (RT-PCR) we identified a high number of splice variants including noncoding transcripts and predicted coding ones with different potential protein modifications affecting mainly the transmembrane and ligand-binding domains that could influence their biological function. However, analysis of RNA-seq data enabled detecting only the most abundant splice variants. More than 80% of the detected RT-PCR variants (87 of 101 transcripts) are novel (different exon/intron structure to the previously described ones), and interestingly, 20–60% of the total transcripts (depending on the species) are noncoding ones that present tissue specificity. Our results suggest that RAGE undergoes extensive AS in mammals, with different expression patterns among adult, fetal, and tumor tissues. Moreover, most splice variants seem to be species specific, especially the noncoding variants, with only two (canonical human Tv1-RAGE, and human N-truncated or Tv10-RAGE) conserved among the six different species. This could indicate a special evolution pattern of this gene at mRNA level. PMID:24273313
Transcription regulation by distal enhancers

PubMed Central

Stadhouders, Ralph; van den Heuvel, Anita; Kolovos, Petros; Jorna, Ruud; Leslie, Kris; Grosveld, Frank; Soler, Eric

2012-01-01

Genome-wide chromatin profiling efforts have shown that enhancers are often located at large distances from gene promoters within the noncoding genome. Whereas enhancers can stimulate transcription initiation by communicating with promoters via chromatin looping mechanisms, we propose that enhancers may also stimulate transcription elongation by physical interactions with intronic elements. We review here recent findings derived from the study of the hematopoietic system. PMID:22771987
Multi-step splicing of sphingomyelin synthase linear and circular RNAs.

PubMed

Filippenkov, Ivan B; Sudarkina, Olga Yu; Limborska, Svetlana A; Dergunova, Lyudmila V

2018-05-15

The SGMS1 gene encodes the enzyme sphingomyelin synthase 1 (SMS1), which is involved in the regulation of lipid metabolism, apoptosis, intracellular vesicular transport and other significant processes. The SGMS1 gene is located on chromosome 10 and has a size of 320 kb. Previously, we showed that dozens of alternative transcripts of the SGMS1 gene are present in various human tissues. In addition to mRNAs that provide synthesis of the SMS1 protein, this gene participates in the synthesis of non-coding transcripts, including circular RNAs (circRNAs), which include exons of the 5'-untranslated region (5'-UTR) and are highly represented in the brain. In this study, using the high-throughput technology RNA-CaptureSeq, many new SGMS1 transcripts were identified, including both intronic unspliced RNAs (premature RNAs) and RNAs formed via alternative splicing. Recursive exons (RS-exons) that can participate in the multi-step splicing of long introns of the gene were also identified. These exons participate in the formation of circRNAs. Thus, multi-step splicing may provide a variety of linear and circular RNAs of eukaryotic genes in tissues. Copyright © 2018 Elsevier B.V. All rights reserved.
SinEx DB: a database for single exon coding sequences in mammalian genomes.

PubMed

Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S

2016-01-01

Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches to the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
Comparative evolutionary genomics of the HADH2 gene encoding Aβ-binding alcohol dehydrogenase/17β-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10)

PubMed Central

Marques, Alexandra T; Antunes, Agostinho; Fernandes, Pedro A; Ramos, Maria J

2006-01-01

Background The Aβ-binding alcohol dehydrogenase/17β-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10) is an enzyme involved in pivotal metabolic processes and in the mitochondrial dysfunction seen in Alzheimer's disease. Here we use comparative genomic analyses to study the evolution of the HADH2 gene encoding ABAD/HSD10 across several eukaryotic species. Results Both vertebrate and nematode HADH2 genes showed a six-exon/five-intron organization while those of the insects had a reduced and varied number of exons (two to three). Eutherian mammal HADH2 genes revealed some highly conserved noncoding regions, which may indicate the presence of functional elements, namely in the upstream region about 1 kb of the transcription start site and in the first part of intron 1. These regions were also conserved between Tetraodon and Fugu fishes. We identified a conserved alternative splicing event between human and dog, which have a nine amino acid deletion, causing the removal of the strand βF. This strand is one of the seven strands that compose the core β-sheet of the Rossmann fold dinucleotide-binding motif characteristic of the short chain dehydrogenase/reductase (SDR) family members. However, the fact that the substrate binding cleft residues are retained and the existence of a shared variant between human and dog suggest that it might be functional. Molecular adaptation analyses across eutherian mammal orthologues revealed the existence of sites under positive selection, some of which are localized in the substrate-binding cleft and in the insertion 1 region on loop D (an important region for the Aβ-binding to the enzyme). Interestingly, a higher than expected number of nonsynonymous substitutions were observed between human/chimpanzee and orangutan, with six out of the seven amino acid replacements being under molecular adaptation (including three in loop D and one in the substrate binding loop).
Conclusion Our study revealed that HADH2 genes maintained a reasonably conserved organization across a large evolutionary distance. The conserved noncoding regions identified among mammals and between pufferfishes, the evidence of an alternative splicing variant conserved between human and dog, and the detection of positive selection across eutherian mammals, may be of importance for further research on ABAD/HSD10 function and its implication in Alzheimer's disease. PMID:16899120

Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine.

PubMed

Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

2002-06-01

We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5' of Xist that was recently shown to attract histone modification early after the onset of X inactivation.
Advanced Design of Dumbbell-shaped Genetic Minimal Vectors Improves Non-coding and Coding RNA Expression.

PubMed

Jiang, Xiaoou; Yu, Han; Teo, Cui Rong; Tan, Genim Siu Xian; Goh, Sok Chin; Patel, Parasvi; Chua, Yiqiang Kevin; Hameed, Nasirah Banu Sahul; Bertoletti, Antonio; Patzel, Volker

2016-09-01

Dumbbell-shaped DNA minimal vectors lacking nontherapeutic genes and bacterial sequences are considered a stable, safe alternative to viral, nonviral, and naked plasmid-based gene-transfer systems. We investigated novel molecular features of dumbbell vectors aiming to reduce vector size and to improve the expression of noncoding or coding RNA. We minimized small hairpin RNA (shRNA) or microRNA (miRNA) expressing dumbbell vectors in size down to 130 bp generating the smallest genetic expression vectors reported. This was achieved by using a minimal H1 promoter with integrated transcriptional terminator transcribing the RNA hairpin structure around the dumbbell loop. Such vectors were generated with high conversion yields using a novel protocol. Minimized shRNA-expressing dumbbells showed accelerated kinetics of delivery and transcription leading to enhanced gene silencing in human tissue culture cells. In primary human T cells, minimized miRNA-expressing dumbbells revealed higher stability and triggered stronger target gene suppression as compared with plasmids and miRNA mimics. Dumbbell-driven gene expression was enhanced up to 56- or 160-fold by implementation of an intron and the SV40 enhancer compared with control dumbbells or plasmids. Advanced dumbbell vectors may represent one option to close the gap between durable expression that is achievable with integrating viral vectors and short-term effects triggered by naked RNA.
Transcription regulation by distal enhancers: who's in the loop?

PubMed

Stadhouders, Ralph; van den Heuvel, Anita; Kolovos, Petros; Jorna, Ruud; Leslie, Kris; Grosveld, Frank; Soler, Eric

2012-01-01

Genome-wide chromatin profiling efforts have shown that enhancers are often located at large distances from gene promoters within the noncoding genome. Whereas enhancers can stimulate transcription initiation by communicating with promoters via chromatin looping mechanisms, we propose that enhancers may also stimulate transcription elongation by physical interactions with intronic elements. We review here recent findings derived from the study of the hematopoietic system.
Elevated Rate of Fixation of Endogenous Retroviral Elements in Haplorhini TRIM5 and TRIM22 Genomic Sequences: Impact on Transcriptional Regulation

PubMed Central

Diehl, William E.; Johnson, Welkin E.; Hunter, Eric

2013-01-01

All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. 
This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions. PMID:23516500
On splice site prediction using weight array models: a comparison of smoothing techniques

NASA Astrophysics Data System (ADS)

Taher, Leila; Meinicke, Peter; Morgenstern, Burkhard

2007-11-01

In most eukaryotic genes, protein-coding exons are separated by non-coding introns which are removed from the primary transcript by a process called "splicing". The positions where introns are cut and exons are spliced together are called "splice sites". Thus, computational prediction of splice sites is crucial for gene finding in eukaryotes. Weight array models are a powerful probabilistic approach to splice site detection. Parameters for these models are usually derived from m-tuple frequencies in trusted training data and subsequently smoothed to avoid zero probabilities. In this study we compare three different ways of parameter estimation for m-tuple frequencies, namely (a) non-smoothed probability estimation, (b) standard pseudo counts and (c) a Gaussian smoothing procedure that we recently developed.
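To make the idea of a weight array model with pseudo counts concrete, the following is a minimal sketch, not the authors' implementation. It assumes a first-order model (each position conditioned on the preceding base), a uniform pseudo count added to every dinucleotide count, and a hypothetical toy set of aligned donor-site strings. The function names `train_wam` and `log_prob` are illustrative only, and the Gaussian smoothing procedure mentioned in the abstract is not reproduced here.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_wam(seqs, pseudocount=1.0):
    """Estimate a first-order weight array model from aligned, equal-length sites.

    Returns (first, cond) where first[a] = P(x_0 = a) and
    cond[i][(a, b)] = P(x_i = b | x_{i-1} = a) for i >= 1.
    A positive pseudocount keeps unseen tuples from getting probability zero.
    """
    if pseudocount <= 0:
        raise ValueError("this sketch requires a positive pseudocount")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all training sites must have the same length")
    k, n = len(ALPHABET), len(seqs)

    # Position 0 has no predecessor, so it uses plain smoothed base frequencies.
    first = {a: (sum(s[0] == a for s in seqs) + pseudocount) / (n + k * pseudocount)
             for a in ALPHABET}

    cond = [None]  # index 0 is unused; keeps cond[i] aligned with position i
    for i in range(1, length):
        pair_counts = defaultdict(float)
        prev_counts = defaultdict(float)
        for s in seqs:
            pair_counts[(s[i - 1], s[i])] += 1
            prev_counts[s[i - 1]] += 1
        cond.append({
            (a, b): (pair_counts[(a, b)] + pseudocount) / (prev_counts[a] + k * pseudocount)
            for a in ALPHABET for b in ALPHABET
        })
    return first, cond


def log_prob(seq, model):
    """Log-probability of a candidate site under the trained model."""
    first, cond = model
    lp = math.log(first[seq[0]])
    for i in range(1, len(seq)):
        lp += math.log(cond[i][(seq[i - 1], seq[i])])
    return lp


# Hypothetical toy training data (not taken from the study).
toy_sites = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGT"]
model = train_wam(toy_sites, pseudocount=1.0)
```

In practice a candidate site would be scored against both a true-site model and a background model, and the log-odds used for classification; that step is omitted to keep the sketch short.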
New encoded single-indicator sequences based on physico-chemical parameters for efficient exon identification.

PubMed

Meher, J K; Meher, P K; Dash, G N; Raval, M K

2012-01-01

The first step in the gene identification problem based on genomic signal processing is to convert character strings into numerical sequences. These numerical sequences are then analysed spectrally or using digital filtering techniques for the period-3 peaks, which are present in exons (coding areas) and absent in introns (non-coding areas). In this paper, we have shown that single-indicator sequences can be generated by encoding schemes based on physico-chemical properties. Two new methods are proposed for generating single-indicator sequences based on hydration energy and dipole moments. The proposed methods produce high peaks at exon locations and effectively suppress false exons (intron regions having greater peaks than exon regions), resulting in a high discriminating factor, sensitivity and specificity.
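The period-3 measurement described above can be sketched as follows. This is an illustration under stated assumptions, not the published method: the numeric code in `PLACEHOLDER_CODE` is an arbitrary stand-in, not the hydration-energy or dipole-moment values used by the authors, and `period3_power` is a hypothetical helper that evaluates the discrete Fourier transform of one window at the frequency corresponding to a period of three bases.

```python
import cmath

# Arbitrary placeholder encoding; NOT the physico-chemical values from the paper.
PLACEHOLDER_CODE = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def period3_power(seq, code=PLACEHOLDER_CODE):
    """Spectral power of a single-indicator sequence at period 3.

    The sequence is mapped to numbers, mean-centred, and projected onto
    the complex exponential with frequency 1/3 cycles per base.
    """
    x = [code[base] for base in seq]
    n = len(x)
    mean = sum(x) / n
    projection = sum((value - mean) * cmath.exp(-2j * cmath.pi * i / 3)
                     for i, value in enumerate(x))
    return abs(projection) ** 2 / n


# A window with strict codon-like periodicity versus one with period 2.
codon_like = "ATG" * 20
period_two = "AC" * 30
```

Sliding this function along a genomic sequence in fixed-size windows would give a power profile in which candidate coding regions appear as peaks; the window length and any threshold are choices left to the reader.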
X-linked hypophosphatemia attributable to pseudoexons of the PHEX gene.

PubMed

Christie, P T; Harding, B; Nesbit, M A; Whyte, M P; Thakker, R V

2001-08-01

X-linked hypophosphatemia is commonly caused by mutations of the coding region of PHEX (phosphate-regulating gene with homologies to endopeptidases on the X chromosome). However, such PHEX mutations are not detected in approximately one third of X-linked hypophosphatemia patients who may harbor defects in the noncoding or intronic regions. We have therefore investigated 11 unrelated X-linked hypophosphatemia patients in whom coding region mutations had been excluded, for intronic mutations that may lead to mRNA splicing abnormalities, by the use of lymphoblastoid RNA and RT-PCRs. One X-linked hypophosphatemia patient was found to have 3 abnormally large transcripts, resulting from 51-bp, 100-bp, and 170-bp insertions, all of which would lead to missense peptides and premature termination codons. The origin of these transcripts was a mutation (g to t) at position +1268 of intron 7, which resulted in the occurrence of a high quality novel donor splice site (ggaagg to gtaagg). Splicing between this novel donor splice site and 3 preexisting, but normally silent, acceptor splice sites within intron 7 resulted in the occurrences of the 3 pseudoexons. This represents the first report of PHEX pseudoexons and reveals further the diversity of genetic abnormalities causing X-linked hypophosphatemia.
[Exon-intron structure of the fet5+ gene of Schizosaccharomyces pombe and physical mapping of genome encompassing regions].

PubMed

Shpakovskiĭ, G V; Lebedenko, E N

1998-01-01

Plasmid pYUK3 bearing the fet5+ gene of Schizosaccharomyces pombe was isolated from a genomic library of the fission yeast, and a detailed physical map of the whole genomic insert (ca. 9.6 kbp) was constructed. The primary structure of the fet5+ gene and its flanking regions is established. The gene contains a single 45-bp intron in its distal part. A typical TATA-box (TATAAG) was found in the 5'-noncoding region ca. 50 bp upstream of the putative start of transcription, and the 3'-noncoding region contains AT-rich palindromes, which are probably involved in termination of the fet5+ transcription. A previously unidentified gene of Sz. pombe encoding a protein with some similarity to one of the transcriptional activators from the TBP (TATA-binding protein) group of SPT factors of transcription was found in the vicinity of the fet5+ gene. Taking into account that cDNA of the fet5+ gene was isolated as a suppressor of the genetic defect of nuclear RNA polymerases I-III (Bioorg. Khim., 1997, vol. 23, No 3, pp. 234-237), this vicinity may be the first evidence of possible clustering, in the genome of the fission yeast, of genes participating in transcription regulation.
Common Variants in Cardiac Ion Channel Genes are Associated with Sudden Cardiac Death

PubMed Central

Albert, Christine M.; MacRae, Calum A.; Chasman, Daniel I.; VanDenburgh, Martin; Buring, Julie E; Manson, JoAnn E; Cook, Nancy R; Newton-Cheh, Christopher

2010-01-01

Background Rare variants in cardiac ion channel genes are associated with sudden cardiac death (SCD) in rare primary arrhythmic syndromes; however, it is unknown whether common variation in these same genes may contribute to SCD risk at the population level. Methods and Results We examined the association between 147 single nucleotide polymorphisms (SNPs) (137 tag, 5 non-coding SNPs associated with QT interval duration and 5 nonsynonymous SNPs) in 5 cardiac ion channel genes, KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2 and sudden and/or arrhythmic death in a combined nested case-control analysis among 516 cases and 1522 matched controls of European ancestry enrolled in six prospective cohort studies. After accounting for multiple testing, two SNPs (rs2283222 located in intron 11 in KCNQ1 and rs11720524 located in intron 1 in SCN5A) remained significantly associated with sudden/arrhythmic death (FDR = 0.01 and 0.03 respectively). Each increasing copy of the major T allele of rs2283222 or the major C allele of rs11720524 was associated with an OR = 1.36 (95% CI 1.16-1.60, P=0.0002) and 1.30 (95% CI 1.12-1.51, P=0.0005) respectively. Control for cardiovascular risk factors and/or limiting the analysis to definite SCDs did not significantly alter these relationships. Conclusion In this combined analysis of 6 prospective cohort studies, two common intronic variants in KCNQ1 and SCN5A were associated with SCD in individuals of European ancestry. Further study in other populations and investigation into the functional abnormalities associated with non-coding variation in these genes may lead to important insights into predisposition to lethal arrhythmias. PMID:20400777
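As a rough consistency check on the reported odds ratios, confidence intervals and P values, one can back-calculate a two-sided Wald P value from each interval. The sketch below assumes the intervals are symmetric on the log-odds scale with a normal approximation; it does not reproduce the study's actual regression model, and `wald_p_from_or` is a hypothetical helper name.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_p_from_or(odds_ratio, ci_low, ci_high):
    """Back-calculate the z statistic and two-sided P value from an OR and its 95% CI.

    Assumes the interval was built as exp(log(OR) +/- Z_975 * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = math.log(odds_ratio) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Values as reported in the abstract for the KCNQ1 and SCN5A intronic SNPs.
z_kcnq1, p_kcnq1 = wald_p_from_or(1.36, 1.16, 1.60)
z_scn5a, p_scn5a = wald_p_from_or(1.30, 1.12, 1.51)
```

Under these assumptions both back-calculated P values land in the same order of magnitude as the reported ones; small differences are expected because the published OR and interval bounds are rounded to two decimals.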
Endogenous siRNAs and noncoding RNA-derived small RNAs are expressed in adult mouse hippocampus and are up-regulated in olfactory discrimination training.

PubMed

Smalheiser, Neil R; Lugli, Giovanni; Thimmapuram, Jyothi; Cook, Edwin H; Larson, John

2011-01-01

We previously proposed that endogenous siRNAs may regulate synaptic plasticity and long-term gene expression in the mammalian brain. Here, a hippocampal-dependent task was employed in which adult mice were trained to execute a nose-poke in a port containing one of two simultaneously present odors in order to obtain a reward. Mice demonstrating olfactory discrimination training were compared to pseudo-training and nose-poke control groups; size-selected hippocampal RNA was subjected to Illumina deep sequencing. Sequences that aligned uniquely and exactly to the genome without uncertain nucleotide assignments, within exons or introns of MGI annotated genes, were examined further. The data confirm that small RNAs having features of endogenous siRNAs are expressed in brain; that many of them derive from genes that regulate synaptic plasticity (and have been implicated in neuropsychiatric diseases); and that hairpin-derived endo-siRNAs and the 20- to 23-nt size class of small RNAs show a significant increase during an early stage of training. The most abundant putative siRNAs arose from an intronic inverted repeat within the SynGAP1 locus; this inverted repeat was a substrate for dicer in vitro, and SynGAP1 siRNA was specifically associated with Argonaute proteins in vivo. Unexpectedly, a dramatic increase with training (more than 100-fold) was observed for a class of 25- to 30-nt small RNAs derived from specific sites within snoRNAs and abundant noncoding RNAs (Y1 RNA, RNA component of mitochondrial RNAse P, 28S rRNA, and 18S rRNA). Further studies are warranted to characterize the role(s) played by endogenous siRNAs and noncoding RNA-derived small RNAs in learning and memory.
Nucleotide sequence of the COX1 gene in Kluyveromyces lactis mitochondrial DNA: evidence for recent horizontal transfer of a group II intron.

PubMed

Hardy, C M; Clark-Walker, G D

1991-07-01

The cytochrome oxidase subunit 1 gene (COX1) in K. lactis K8 mtDNA spans 8,826 bp and contains five exons (termed E1-E5) totalling 1,602 bp that show 88% nucleotide base matching and 91% amino acid homology to the equivalent gene in S. cerevisiae. The four introns (termed K1 cox1.1-1.4) contain open reading frames encoding proteins of 786, 333, 319 and 395 amino acids respectively that potentially encode maturase enzymes. The first intron belongs to group II whereas the remaining three are group I type B. Introns K1 cox1.1, 1.3, and 1.4 are found at identical locations to introns Sc cox1.2, 1.5 a, and 1.5 b respectively from S. cerevisiae. Horizontal transfer of an intron between recent progenitors of K. lactis and S. cerevisiae is suggested by the observation that K1 cox1.1 and Sc cox1.2 show 96% base matching. Sequence comparisons between K1 cox1.3/Sc cox1.5 a and K1 cox1.4/Sc cox1.5 b suggest that these introns are likely to have been present in the ancestral COX1 gene of these yeasts. Intron K1 cox1.2 is not found in S. cerevisiae and appears at a unique location in K. lactis. A feature of the DNA sequences of the group I introns K1 cox1.2, 1.3, and 1.4 is the presence of 11 GC-rich clusters inserted into both coding and noncoding regions. Immediately downstream of the COX1 gene is the ATPase subunit 8 gene (A8) that shows 82.6% base matching to its counterpart in S. cerevisiae mtDNA.
Upregulated long non-coding RNA SPRY4-IT1 predicts dismal prognosis for pancreatic ductal adenocarcinoma and regulates cell proliferation and apoptosis.

PubMed

Yao, Yue; Gao, Ping; Chen, Lili; Wang, Wei; Zhang, Jinchao; Li, Qiang; Xu, Yi

2018-06-15

Recently, long noncoding RNAs (lncRNAs) have emerged as pivotal regulators in various human cancers, including pancreatic ductal adenocarcinoma (PDAC). SPRY4-intronic transcript 1 (SPRY4-IT1) was reported to be upregulated in some kinds of human cancer. Here, we elucidated the biological functions and possible clinical value of SPRY4-IT1 in PDAC. In the present study, expression of SPRY4-IT1 in PDAC tissues and corresponding normal tissues was explored by qRT-PCR experiments. The link between SPRY4-IT1 expression levels and clinicopathological significance was further analyzed. In addition, the oncogenic role of SPRY4-IT1 was examined both in vitro and in vivo. The results demonstrated that SPRY4-IT1 was abnormally upregulated in PDAC tissues and cell lines. Tumor stage and differentiation grade were closely correlated with SPRY4-IT1 expression. Additionally, decreased SPRY4-IT1 contributed to a tumor-suppressive effect by attenuating cell growth and clonogenic ability and by facilitating apoptosis via the Bcl-2/caspase-3 pathway in PANC1 and Capan-2 cells. Furthermore, the xenograft study confirmed the tumor proliferation-promoting role of SPRY4-IT1 in PANC1 cells. Taken together, these findings indicated that SPRY4-IT1 is a potential therapeutic target and prognostic biomarker for patients with PDAC. Copyright © 2018. Published by Elsevier B.V.
Identification of human short introns

PubMed Central

Abebrese, Emmanuel L.; Arnold, Zachary R.; Armstrong, Katharine; Burns, Lindsay; Day, R. Thomas; Hsu, Daniel G.; Jarrell, Katherine; Luo, Yi; Mugayo, Daphine

2017-01-01

Canonical pre-mRNA splicing requires snRNPs and associated splicing factors to excise conserved intronic sequences, with a minimum intron length required for efficient splicing. Non-canonical splicing (intron excision without the spliceosome) has been documented; most notably, some tRNAs and the XBP1 mRNA contain short introns that are not removed by the spliceosome. There have been some efforts to identify additional short introns, but little is known about how many short introns are processed from mRNAs. Here, we report an approach to identify RNA short introns from RNA-Seq data, discriminating against small genomic deletions. We identify hundreds of short introns conserved among multiple human cell lines. These short introns are often alternatively spliced and are found in a variety of RNAs, both mRNAs and lncRNAs. Short intron splicing efficiency is increased by secondary structure, and we detect both canonical and non-canonical short introns. In many cases, splicing of these short introns from mRNAs is predicted to alter the reading frame and change protein output. Our findings imply that standard gene prediction models which often assume a lower limit for intron size fail to predict short introns effectively. We conclude that short introns are abundant in the human transcriptome, and short intron splicing represents an added layer to mRNA regulation. PMID:28520720
Mitochondrial genomes of the green macroalga Ulva pertusa (Ulvophyceae, Chlorophyta): novel insights into the evolution of mitogenomes in the Ulvophyceae.

PubMed

Liu, Feng; Melton, James T; Bi, Yuping

2017-10-01

To further understand the trends in the evolution of mitochondrial genomes (mitogenomes or mtDNAs) in the Ulvophyceae, the mitogenomes of two separate thalli of Ulva pertusa were sequenced. Two U. pertusa mitogenomes (Up1 and Up2) were 69,333 bp and 64,602 bp in length. These mitogenomes shared two ribosomal RNAs (rRNAs), 28 transfer RNAs (tRNAs), 29 protein-coding genes, and 12 open reading frames. The 4.7 kb difference in size was attributed to variation in intron content and tandem repeat regions. A total of six introns were present in the smaller U. pertusa mtDNA (Up2), while the larger mtDNA (Up1) had eight. The larger mtDNA had two additional group II introns in two genes (cox1 and cox2) and tandem duplication mutations in noncoding regions. Our results showed the first case of intraspecific variation in chlorophytan mitogenomes and provided further genomic data for the undersampled Ulvophyceae. © 2017 Phycological Society of America.
The non-coding RNA landscape of human hematopoiesis and leukemia.

PubMed

Schwarzer, Adrian; Emmrich, Stephan; Schmidt, Franziska; Beck, Dominik; Ng, Michelle; Reimer, Christina; Adams, Felix Ferdinand; Grasedieck, Sarah; Witte, Damian; Käbler, Sebastian; Wong, Jason W H; Shah, Anushi; Huang, Yizhou; Jammal, Razan; Maroz, Aliaksandra; Jongen-Lavrencic, Mojca; Schambach, Axel; Kuchenbauer, Florian; Pimanda, John E; Reinhardt, Dirk; Heckl, Dirk; Klusmann, Jan-Henning

2017-08-09

Non-coding RNAs have emerged as crucial regulators of gene expression and cell fate decisions. However, their expression patterns and regulatory functions during normal and malignant human hematopoiesis are incompletely understood. Here we present a comprehensive resource defining the non-coding RNA landscape of the human hematopoietic system. Based on highly specific non-coding RNA expression portraits per blood cell population, we identify unique fingerprint non-coding RNAs, such as LINC00173 in granulocytes, and assign these to critical regulatory circuits involved in blood homeostasis. Following the incorporation of acute myeloid leukemia samples into the landscape, we further uncover prognostically relevant non-coding RNA stem cell signatures shared between acute myeloid leukemia blasts and healthy hematopoietic stem cells. Our findings highlight the importance of the non-coding transcriptome in the formation and maintenance of the human blood hierarchy. While micro-RNAs are known regulators of haematopoiesis and leukemogenesis, the role of long non-coding RNAs is less clear. Here the authors provide a non-coding RNA expression landscape of the human hematopoietic system, highlighting their role in the formation and maintenance of the human blood hierarchy.
Parallel computation of genome-scale RNA secondary structure to detect structural constraints on human genome.

PubMed

Kawaguchi, Risa; Kiryu, Hisanori

2016-05-06

RNA secondary structure around splice sites is known to assist normal splicing by promoting spliceosome recognition. However, analyzing the structural properties of entire intronic regions or pre-mRNA sequences has been difficult hitherto, owing to serious experimental and computational limitations, such as low read coverage and numerical problems. Our novel software, "ParasoR", is designed to run on a computer cluster and enables the exact computation of various structural features of long RNA sequences under the constraint of maximal base-pairing distance. ParasoR divides dynamic programming (DP) matrices into smaller pieces, such that each piece can be computed by a separate computer node without losing the connectivity information between the pieces. ParasoR directly computes the ratios of DP variables to avoid the reduction of numerical precision caused by the cancellation of a large number of Boltzmann factors. The structural preferences of mRNAs computed by ParasoR show a high concordance with those determined by high-throughput sequencing analyses. Using ParasoR, we investigated the global structural preferences of transcribed regions in the human genome. A genome-wide folding simulation indicated that transcribed regions are significantly more structured than intergenic regions after removing repeat sequences and k-mer frequency bias. In particular, we observed a highly significant preference for base pairing over entire intronic regions as compared to their antisense sequences, as well as to intergenic regions. A comparison between pre-mRNAs and mRNAs showed that coding regions become more accessible after splicing, indicating constraints for translational efficiency. Such changes are correlated with gene expression levels, as well as GC content, and are enriched among genes associated with cytoskeleton and kinase functions.
We have shown that ParasoR is very useful for analyzing the structural properties of long RNA sequences such as mRNAs, pre-mRNAs, and long non-coding RNAs whose lengths can be more than a million bases in the human genome. In our analyses, transcribed regions including introns are indicated to be subject to various types of structural constraints that cannot be explained from simple sequence composition biases. ParasoR is freely available at https://github.com/carushi/ParasoR .
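The effect of a maximal base-pairing distance on a folding recursion can be illustrated with a much simpler algorithm than the one ParasoR uses. The sketch below is a Nussinov-style dynamic program that maximizes the number of nested base pairs; it is a deliberately swapped-in simplification, since ParasoR works with Boltzmann-weighted (partition function) quantities and distributes the matrices across cluster nodes, neither of which is shown here. The function name `max_pairs` and the parameters `max_span` and `min_loop` are hypothetical.

```python
# Watson-Crick and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, max_span=100, min_loop=3):
    """Maximum number of nested base pairs under a maximal pairing distance.

    max_span: largest allowed j - i for a pair (i, j).
    min_loop: minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            # Only partners k within max_span of j are considered, so the
            # work per cell is bounded by max_span rather than by n.
            for k in range(max(i, j - max_span), j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


hairpin = "GGGAAAUCC"
unconstrained = max_pairs(hairpin, max_span=100)
constrained = max_pairs(hairpin, max_span=5)
```

Because every pair is limited to `max_span`, only a band of width `max_span` around the diagonal of the matrix carries pairing information. This sketch still allocates the full matrix for clarity; a banded layout would be the natural next step for long sequences.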
Microprocessor mediates transcriptional termination in long noncoding microRNA genes

PubMed Central

Dhir, Ashish; Dhir, Somdutta; Proudfoot, Nick J.; Jopling, Catherine L.

2015-01-01

MicroRNA (miRNA) play a major role in the post-transcriptional regulation of gene expression. Mammalian miRNA biogenesis begins with co-transcriptional cleavage of RNA polymerase II (Pol II) transcripts by the Microprocessor complex. While most miRNA are located within introns of protein coding genes, a substantial minority of miRNA originate from long noncoding (lnc) RNA where transcript processing is largely uncharacterized. We show, by detailed characterization of liver-specific lnc-pri-miR-122 and genome-wide analysis in human cell lines, that most lnc-pri-miRNA do not use the canonical cleavage and polyadenylation (CPA) pathway, but instead use Microprocessor cleavage to terminate transcription. Microprocessor inactivation leads to extensive transcriptional readthrough of lnc-pri-miRNA and transcriptional interference with downstream genes. Consequently we define a novel RNase III-mediated, polyadenylation-independent mechanism of Pol II transcription termination in mammalian cells. PMID:25730776
New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

PubMed

Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

2011-11-01

Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.
New genetic variants of LATS1 detected in urinary bladder and colon cancer.

PubMed

Saadeldin, Mona K; Shawer, Heba; Mostafa, Ahmed; Kassem, Neemat M; Amleh, Asma; Siam, Rania

2014-01-01

LATS1, the large tumor suppressor 1 gene, encodes a serine/threonine kinase protein and is implicated in cell cycle progression. LATS1 is down-regulated in various human cancers, such as breast cancer and astrocytoma. Point mutations in LATS1 were reported in human sarcomas. Additionally, loss of heterozygosity of the LATS1 chromosomal region predisposes to breast, ovarian, and cervical tumors. In the current study, we investigated LATS1 genetic variations, including single nucleotide polymorphisms (SNPs), in 28 Egyptian patients with either urinary bladder or colon cancers. The LATS1 gene was amplified and sequenced and the expression of LATS1 at the RNA level was assessed in 12 urinary bladder cancer samples. We report the identification of a total of 29 variants, including previously identified SNPs, within LATS1 coding and non-coding sequences. A total of 18 variants were novel. The majority of the novel variants, 13, were mapped to intronic sequences and untranslated regions of the gene. Four of the five novel variants located in the coding region of the gene represented missense mutations within the serine/threonine kinase catalytic domain. Interestingly, LATS1 RNA steady-state levels were lost in urinary bladder cancerous tissue harboring four specific SNPs (16045 + 41736 + 34614 + 56177) positioned in the 5'UTR, intron 6, and two silent mutations within exon 4 and exon 8, respectively. This study identifies novel single-base-sequence alterations in the LATS1 gene. These newly identified variants could potentially be used as novel diagnostic or prognostic tools in cancer.
Obesity-associated variants within FTO form long-range functional connections with IRX3.

PubMed

Smemo, Scott; Tena, Juan J; Kim, Kyoung-Han; Gamazon, Eric R; Sakabe, Noboru J; Gómez-Marín, Carlos; Aneas, Ivy; Credidio, Flavia L; Sobreira, Débora R; Wasserman, Nora F; Lee, Ju Hee; Puviindran, Vijitha; Tam, Davis; Shen, Michael; Son, Joe Eun; Vakili, Niki Alizadeh; Sung, Hoon-Ki; Naranjo, Silvia; Acemel, Rafael D; Manzanares, Miguel; Nagy, Andras; Cox, Nancy J; Hui, Chi-Chung; Gomez-Skarmeta, Jose Luis; Nóbrega, Marcelo A

2014-03-20

Genome-wide association studies (GWAS) have reproducibly associated variants within introns of FTO with increased risk for obesity and type 2 diabetes (T2D). Although the molecular mechanisms linking these noncoding variants with obesity are not immediately obvious, subsequent studies in mice demonstrated that FTO expression levels influence body mass and composition phenotypes. However, no direct connection between the obesity-associated variants and FTO expression or function has been made. Here we show that the obesity-associated noncoding sequences within FTO are functionally connected, at megabase distances, with the homeobox gene IRX3. The obesity-associated FTO region directly interacts with the promoters of IRX3 as well as FTO in the human, mouse and zebrafish genomes. Furthermore, long-range enhancers within this region recapitulate aspects of IRX3 expression, suggesting that the obesity-associated interval belongs to the regulatory landscape of IRX3. Consistent with this, obesity-associated single nucleotide polymorphisms are associated with expression of IRX3, but not FTO, in human brains. A direct link between IRX3 expression and regulation of body mass and composition is demonstrated by a reduction in body weight of 25 to 30% in Irx3-deficient mice, primarily through the loss of fat mass and increase in basal metabolic rate with browning of white adipose tissue. Finally, hypothalamic expression of a dominant-negative form of Irx3 reproduces the metabolic phenotypes of Irx3-deficient mice. Our data suggest that IRX3 is a functional long-range target of obesity-associated variants within FTO and represents a novel determinant of body mass and composition.

Non-Coding Keratin Variants Associate with Liver Fibrosis Progression in Patients with Hemochromatosis

PubMed Central

Lunova, Mariia; Guldiken, Nurdan; Lienau, Tim C.; Stickel, Felix; Omary, M. Bishr

2012-01-01

Background Keratins 8 and 18 (K8/K18) are intermediate filament proteins that protect the liver from various forms of injury. Exonic K8/K18 variants associate with adverse outcome in acute liver failure and with liver fibrosis progression in patients with chronic hepatitis C infection or primary biliary cirrhosis. Given the association of K8/K18 variants with end-stage liver disease and progression in several chronic liver disorders, we studied the importance of keratin variants in patients with hemochromatosis. Methods The entire K8/K18 exonic regions were analyzed in 162 hemochromatosis patients carrying homozygous C282Y HFE (hemochromatosis gene) mutations. 234 liver-healthy subjects were used as controls. Exonic regions were PCR-amplified and analyzed using denaturing high-performance liquid chromatography and DNA sequencing. Previously-generated transgenic mice overexpressing K8 G62C were studied for their susceptibility to iron overload. Susceptibility to iron toxicity of primary hepatocytes that express K8 wild-type and G62C was also assessed. Results We identified amino-acid-altering keratin heterozygous variants in 10 of 162 hemochromatosis patients (6.2%) and non-coding heterozygous variants in 6 additional patients (3.7%). Two novel K8 variants (Q169E/R275W) were found. K8 R341H was the most common amino-acid altering variant (4 patients), and exclusively associated with an intronic KRT8 IVS7+10delC deletion. Intronic, but not amino-acid-altering variants associated with the development of liver fibrosis. In mice, or ex vivo, the K8 G62C variant did not affect iron-accumulation in response to iron-rich diet or the extent of iron-induced hepatocellular injury. Conclusion In patients with hemochromatosis, intronic but not exonic K8/K18 variants associate with liver fibrosis development. PMID:22412904
High-throughput sequencing of the entire genomic regions of CCM1/KRIT1, CCM2 and CCM3/PDCD10 to search for pathogenic deep-intronic splice mutations in cerebral cavernous malformations.

PubMed

Rath, Matthias; Jenssen, Sönke E; Schwefel, Konrad; Spiegler, Stefanie; Kleimeier, Dana; Sperling, Christian; Kaderali, Lars; Felbor, Ute

2017-09-01

Cerebral cavernous malformations (CCM) are vascular lesions of the central nervous system that can cause headaches, seizures and hemorrhagic stroke. Disease-associated mutations have been identified in three genes: CCM1/KRIT1, CCM2 and CCM3/PDCD10. The precise proportion of deep-intronic variants in these genes and their clinical relevance is yet unknown. Here, a long-range PCR (LR-PCR) approach for target enrichment of the entire genomic regions of the three genes was combined with next generation sequencing (NGS) to screen for coding and non-coding variants. NGS detected all six CCM1/KRIT1, two CCM2 and four CCM3/PDCD10 mutations that had previously been identified by Sanger sequencing. Two of the pathogenic variants presented here are novel. Additionally, 20 stringently selected CCM index cases that had remained mutation-negative after conventional sequencing and exclusion of copy number variations were screened for deep-intronic mutations. The combination of bioinformatics filtering and transcript analyses did not reveal any deep-intronic splice mutations in these cases. Our results demonstrate that target enrichment by LR-PCR combined with NGS can be used for a comprehensive analysis of the entire genomic regions of the CCM genes in a research context. However, its clinical utility is limited as deep-intronic splice mutations in CCM1/KRIT1, CCM2 and CCM3/PDCD10 seem to be rather rare. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Ectopic expression of miR-126*, an intronic product of the vascular endothelial EGF-like 7 gene, regulates prostein translation and invasiveness of prostate cancer LNCaP cells.

PubMed

Musiyenko, Alla; Bitko, Vira; Barik, Sailen

2008-03-01

MicroRNAs (miRNAs) are endogenous noncoding RNAs that down-regulate gene expression by promoting cleavage or translational arrest of target mRNAs. While most miRNAs are transcribed from their own dedicated genes, some map to introns of 'host' transcripts, the biological significance of which remains unknown. Here, we show that prostate cells are naturally devoid of EGF-like domain 7 (Egfl7) transcripts and hence also deficient in a miRNA, miR-126*, generated from splicing and processing of its ninth intron. Use of recombinant and synthetic miRNAs or a specific antagomir established a role of miR-126* in silencing prostein in non-endothelial cells. We mapped two miR-126*-binding sites in the 3'UTR of the prostein mRNA required for translational repression. Transfection of synthetic miR-126* into prostate cancer LNCaP cells strongly reduced the translation of prostein. Interestingly, loss of prostein correlated with reduction of LNCaP cell migration and invasion. Thus, the robust expression of prostein protein in the prostate cells results from a combination of transcriptional activation of the prostein gene and absence of intronic miRNA-126* due to the prostate-specific repression of the Egfl7 gene. We conclude that intronic miRNAs from tissue-specific transcripts, or their natural absence, make cardinal contributions to cellular gene expression and phenotype. These findings also open the door to tissue-specific miRNA therapy.
Determinism and randomness in the evolution of introns and sine inserts in mouse and human mitochondrial solute carrier and cytokine receptor genes.

PubMed

Cianciulli, Antonia; Calvello, Rosa; Panaro, Maria A

2015-04-01

In the homologous genes studied, the exons and introns alternated in the same order in mouse and human. We studied, in both species: corresponding short segments of introns, whole corresponding introns and complete homologous genes. We considered the total number of nucleotides and the number and orientation of the SINE inserts. Comparisons of mouse and human data series showed that at the level of individual relatively short segments of intronic sequences the stochastic variability prevails in the local structuring, but at higher levels of organization a deterministic component emerges, conserved in mouse and human during the divergent evolution, despite the ample re-editing of the intronic sequences and the fact that processes such as SINE spread had taken place in an independent way in the two species. Intron conservation is negatively correlated with the SINE occupancy, suggesting that virus inserts interfere with the conservation of the sequences inherited from the common ancestor. Copyright © 2015 Elsevier Ltd. All rights reserved.
Perspectives of Long Non-Coding RNAs in Cancer Diagnostics

PubMed Central

Reis, Eduardo M.; Verjovski-Almeida, Sergio

2012-01-01

Long non-coding RNAs (lncRNAs) transcribed from intergenic and intronic regions of the human genome constitute a broad class of cellular transcripts that are under intensive investigation. While only a handful of lncRNAs have been characterized, their involvement in fundamental cellular processes that control gene expression highlights a central role in cell homeostasis. Not surprisingly, aberrant expression of regulatory lncRNAs has been increasingly documented in different types of cancer, where they can mediate either oncogenic or tumor suppressor effects. Interaction with chromatin remodeling complexes that promote silencing of specific genes and modulation of splicing factor proteins seem to be two general modes of lncRNA regulation, but it is conceivable that additional mechanisms of action are yet to be unveiled. LncRNAs show greater tissue specificity than protein-coding mRNAs, making them attractive in the search for novel diagnostic/prognostic cancer biomarkers in body fluid samples. In fact, lncRNA prostate cancer antigen 3 can be detected in urine samples and has been shown to improve diagnosis of prostate cancer. We suggest that an unbiased screening of the presence of RNAs in easily accessible body fluids such as serum and urine might reveal novel circulating lncRNAs as potential biomarkers in many types of cancer. Annotation and functional characterization of the lncRNA complement of the cancer transcriptome will conceivably provide new avenues for early diagnosis and treatment of the disease. PMID:22408643
Functional Studies and In Silico Analyses to Evaluate Non-Coding Variants in Inherited Cardiomyopathies.

PubMed

Frisso, Giulia; Detta, Nicola; Coppola, Pamela; Mazzaccara, Cristina; Pricolo, Maria Rosaria; D'Onofrio, Antonio; Limongelli, Giuseppe; Calabrò, Raffaele; Salvatore, Francesco

2016-11-10

Point mutations are the most common cause of inherited diseases. Bioinformatics tools can help to predict the pathogenicity of mutations found during genetic screening, but they may work less well in determining the effect of point mutations in non-coding regions. In silico analysis of intronic variants can reveal their impact on the splicing process, but the consequence of a given substitution is generally not predictable. The aim of this study was to functionally test five intronic variants ( MYBPC3 -c.506-2A>C, MYBPC3 -c.906-7G>T, MYBPC3 -c.2308+3G>C, SCN5A -c.393-5C>A, and ACTC1 -c.617-7T>C) found in five patients affected by inherited cardiomyopathies in the attempt to verify their pathogenic role. Analysis of the MYBPC3 -c.506-2A>C mutation in mRNA from the peripheral blood of one of the patients affected by hypertrophic cardiac myopathy revealed the loss of the canonical splice site and the use of an alternative splicing site, which caused the loss of the first seven nucleotides of exon 5 ( MYBPC3 -G169AfsX14). In the other four patients, we generated minigene constructs and transfected them in HEK-293 cells. This minigene approach showed that MYBPC3 -c.2308+3G>C and SCN5A -c.393-5C>A altered pre-mRNA processing, thus resulting in the skipping of one exon. No alterations were found in either MYBPC3 -c.906-7G>T or ACTC1 -c.617-7T>C. In conclusion, functional in vitro analysis of the effects of potential splicing mutations can confirm or otherwise the putative pathogenicity of non-coding mutations, and thus help to guide the patient's clinical management and improve genetic counseling in affected families.
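The intronic variants in the record above are written in HGVS coding-DNA notation, where "c.506-2" means two nucleotides upstream of exon position 506 (acceptor side) and "c.2308+3" means three nucleotides downstream of exon position 2308 (donor side). A minimal Python sketch of how such strings could be parsed is given below. The function name, the returned keys, and the "canonical" flag (offset of 1 or 2, i.e. the GT/AG dinucleotides) are illustrative assumptions, not part of the study, and the gene prefix (e.g. "MYBPC3 -") is assumed to have been removed beforehand.

```python
import re

# Simple intronic substitutions in HGVS coding-DNA notation,
# e.g. "c.506-2A>C" or "c.2308+3G>C". Other variant types are not handled.
HGVS_INTRONIC = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def parse_intronic(variant):
    """Parse an intronic substitution (hypothetical helper for illustration)."""
    match = HGVS_INTRONIC.match(variant)
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {variant!r}")
    exon_pos, sign, offset, ref, alt = match.groups()
    offset = int(offset)
    return {
        "exon_pos": int(exon_pos),  # nearest exonic nucleotide
        # "+" counts into the intron from the end of an exon (donor side);
        # "-" counts back from the start of the next exon (acceptor side).
        "side": "donor" if sign == "+" else "acceptor",
        "offset": offset,           # distance into the intron
        "ref": ref,
        "alt": alt,
        # Assumption: only positions 1 and 2 count as the canonical dinucleotide.
        "canonical": offset <= 2,
    }


if __name__ == "__main__":
    for v in ["c.506-2A>C", "c.906-7G>T", "c.2308+3G>C", "c.393-5C>A", "c.617-7T>C"]:
        print(v, parse_intronic(v))
```

Position alone is only a rough guide: in the record above, two variants outside the canonical dinucleotides (c.2308+3G>C and c.393-5C>A) altered splicing in the minigene assay, while two others at -7 did not.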
Authentication of Botanical Origin in Herbal Teas by Plastid Noncoding DNA Length Polymorphisms.

PubMed

Uncu, Ali Tevfik; Uncu, Ayse Ozgur; Frary, Anne; Doganlar, Sami

2015-07-01

The aim of this study was to develop a DNA barcode assay to authenticate the botanical origin of herbal teas. To reach this aim, we tested the efficiency of a PCR-capillary electrophoresis (PCR-CE) approach on commercial herbal tea samples using two noncoding plastid barcodes, the trnL intron and the intergenic spacer between trnL and trnF. Barcode DNA length polymorphisms proved successful in authenticating the species origin of herbal teas. We verified the validity of our approach by sequencing species-specific barcode amplicons from herbal tea samples. Moreover, we displayed the utility of PCR-CE assays coupled with sequencing to identify the origin of undeclared plant material in herbal tea samples. The PCR-CE assays proposed in this work can be applied as routine tests for the verification of botanical origin in herbal teas and can be extended to authenticate all types of herbal foodstuffs.
Noncoding sequence classification based on wavelet transform analysis: part I

NASA Astrophysics Data System (ADS)

Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

2017-09-01

DNA sequences in the human genome can be divided into coding and noncoding sequences. Coding sequences are those that are read during transcription. The identification of coding sequences has been widely reported in the literature because of their much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and in differentiation among cells. However, noncoding sequences do not exhibit periodicities that correlate with their functions. The ENCODE (Encyclopedia of DNA Elements) and Epigenomic Roadmap projects have cataloged human noncoding sequences by specific function. We study the characteristics of noncoding sequences with wavelet analysis of genomic signals.
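The record above does not say how the sequence is converted to a signal or which wavelet is used. As a rough illustration of the general idea only, the Python sketch below assumes a GC/AT indicator mapping and a single level of the Haar wavelet transform; both choices, and the function names, are assumptions made for this example.

```python
import math


def gc_signal(seq):
    """Map a DNA string to +1.0 (G or C) and -1.0 (any other base).

    This numeric mapping is an assumption; many other encodings exist.
    """
    return [1.0 if base in "GC" else -1.0 for base in seq.upper()]


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficients, each half the input length.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


if __name__ == "__main__":
    approx, detail = haar_step(gc_signal("GGATCGTA"))
    print("approximation:", approx)
    print("detail:", detail)
```

The detail coefficients respond to changes between neighboring bases, while the approximation coefficients summarize local composition; applying `haar_step` repeatedly to the approximation gives coarser scales.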
Extreme heterogeneity of polyadenylation sites in mRNAs encoding chloroplast RNA-binding proteins in Nicotiana plumbaginifolia.

PubMed

Klahre, U; Hemmings-Mieszczak, M; Filipowicz, W

1995-06-01

We have previously characterized nuclear cDNA clones encoding two RNA binding proteins, CP-RBP30 and CP-RBP31, which are targeted to chloroplasts in Nicotiana plumbaginifolia. In this report we describe the analysis of the 3'-untranslated regions (3'-UTRs) in 22 CP-RBP30 and 8 CP-RBP31 clones, which reveals that mRNAs encoding both proteins have a very complex polyadenylation pattern. Fourteen distinct poly(A) sites were identified among the CP-RBP30 clones and four sites among the CP-RBP31 clones. The authenticity of the sites was confirmed by RNase A/T1 mapping of N. plumbaginifolia RNA. CP-RBP30 provides an extreme example of the heterogeneity known to be a feature of mRNA polyadenylation in higher plants. Using PCR we have demonstrated that CP-RBP genes in N. plumbaginifolia and N. sylvestris, in addition to the previously described introns interrupting the coding region, contain an intron located in the 3' non-coding part of the gene. In the case of CP-RBP31, we have identified one polyadenylation event occurring in this intron.
Long Non-Coding RNAs Differentially Expressed between Normal versus Primary Breast Tumor Tissues Disclose Converse Changes to Breast Cancer-Related Protein-Coding Genes

PubMed Central

Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.

2014-01-01

Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue, half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. 
Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628
Comparative Analysis of Vertebrate Dystrophin Loci Indicate Intron Gigantism as a Common Feature

PubMed Central

Pozzoli, Uberto; Elgar, Greg; Cagliani, Rachele; Riva, Laura; Comi, Giacomo P.; Bresolin, Nereo; Bardoni, Alessandra; Sironi, Manuela

2003-01-01

The human DMD gene is the largest known to date, spanning > 2000 kb on the X chromosome. The gene size is mainly accounted for by huge intronic regions. We sequenced 190 kb of Fugu rubripes (pufferfish) genomic DNA corresponding to the complete dystrophin gene (FrDMD) and provide the first report of gene structure and sequence comparison among dystrophin genomic sequences from different vertebrate organisms. Almost all intron positions and phases are conserved between FrDMD and its mammalian counterparts, and the predicted protein product of the Fugu gene displays 55% identity and 71% similarity to human dystrophin. In analogy to the human gene, FrDMD presents several-fold longer than average intronic regions. Analysis of intron sequences of the human and murine genes revealed that they are extremely conserved in size and that a similar fraction of total intron length is represented by repetitive elements; moreover, our data indicate that intron expansion through repeat accumulation in the two orthologs is the result of independent insertional events. The hypothesis that intron length might be functionally relevant to the DMD gene regulation is proposed and substantiated by the finding that dystrophin intron gigantism is common to the three vertebrate genes. [Supplemental material is available online at www.genome.org.] PMID:12727896
Rare Noncoding Mutations Extend the Mutational Spectrum in the PGAP3 Subtype of Hyperphosphatasia with Mental Retardation Syndrome

PubMed Central

Knaus, Alexej; Awaya, Tomonari; Helbig, Ingo; Afawi, Zaid; Pendziwiat, Manuela; Abu‐Rachma, Jubran; Thompson, Miles D.; Cole, David E.; Skinner, Steve; Annese, Fran; Canham, Natalie; Schweiger, Michal R.; Robinson, Peter N.; Mundlos, Stefan; Kinoshita, Taroh; Munnich, Arnold

2016-01-01

HPMRS or Mabry syndrome is a heterogeneous glycosylphosphatidylinositol (GPI) anchor deficiency that is caused by an impairment of synthesis or maturation of the GPI‐anchor. The expressivity of the clinical features in HPMRS varies from severe syndromic forms with multiple organ malformations to mild nonsyndromic intellectual disability. In about half of the patients with the clinical diagnosis of HPMRS, pathogenic mutations can be identified in the coding region of one of six genes, one of which is PGAP3. In this work, we describe a screening approach with sequence-specific baits for transcripts of genes of the GPI pathway that allows the detection of functionally relevant mutations, including those in introns and the 5′ and 3′ UTRs. By this means, we also identified pathogenic noncoding mutations, which increases the diagnostic yield for HPMRS on the basis of intellectual disability and elevated serum alkaline phosphatase. In eight affected individuals from different ethnicities, we found seven novel pathogenic mutations in PGAP3. Besides five missense mutations, we identified an intronic mutation, c.558‐10G>A, that causes an aberrant splice product and a mutation in the 3′UTR, c.*559C>T, that is associated with substantially lower mRNA levels. We show that our novel screening approach is a useful rapid detection tool for alterations in genes coding for key components of the GPI pathway. PMID:27120253
Ferritin gene organization: differences between plants and animals suggest possible kingdom-specific selective constraints.

PubMed

Proudhon, D; Wei, J; Briat, J; Theil, E C

1996-03-01

Ferritin, a protein widespread in nature, concentrates iron approximately 10^11- to 10^12-fold above the solubility within a spherical shell of 24 subunits; it derives in plants and animals from a common ancestor (based on sequence) but displays a cytoplasmic location in animals compared to the plastid in contemporary plants. Ferritin gene regulation in plants and animals is altered by development, hormones, and excess iron; iron signals target DNA in plants but mRNA in animals. Evolution has thus conserved the two end points of ferritin gene expression, the physiological signals and the protein structure, while allowing some divergence of the genetic mechanisms. Comparison of ferritin gene organization in plants and animals, made possible by the cloning of a dicot (soybean) ferritin gene presented here and the recent cloning of two monocot (maize) ferritin genes, shows evolutionary divergence in ferritin gene organization between plants and animals but conservation among plants or among animals; divergence in the genetic mechanism for iron regulation is reflected by the absence in all three plant genes of the IRE, a highly conserved, noncoding sequence in vertebrate animal ferritin mRNA. In plant ferritin genes, the number of introns (n = 7) is higher than in animals (n = 3). Second, no intron positions are conserved when ferritin genes of plants and animals are compared, although all ferritin gene introns are in the coding region; within kingdoms, the intron positions in ferritin genes are conserved. Finally, secondary protein structure has no apparent relationship to intron/exon boundaries in plant ferritin genes, whereas in animal ferritin genes the correspondence is high. 
The structural differences in introns/exons among phylogenetically related ferritin coding sequences and the high conservation of the gene structure within plant or animal kingdoms suggest that kingdom-specific functional constraints may exist to maintain a particular intron/exon pattern within ferritin genes. In the case of plants, where ferritin gene intron placement is unrelated to triplet codons or protein structure, and where ferritin is targeted to the plastid, the selection pressure on gene organization may relate to RNA function and plastid/nuclear signaling.
The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101

NASA Astrophysics Data System (ADS)

Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.

2014-08-01

Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.
Differential expression of non-coding RNAs and continuous evolution of the X chromosome in testicular transcriptome of two mouse species.

PubMed

Homolka, David; Ivanek, Robert; Forejt, Jiri; Jansa, Petr

2011-02-14

Tight regulation of testicular gene expression is a prerequisite for male reproductive success, while differentiation of gene activity in spermatogenesis is important during speciation. Thus, comparison of testicular transcriptomes between closely related species can reveal unique regulatory patterns and shed light on evolutionary constraints separating the species. Here, we compared testicular transcriptomes of two closely related mouse species, Mus musculus and Mus spretus, which diverged more than one million years ago. We analyzed testicular expression using tiling arrays overlapping Chromosomes 2, X, Y and the mitochondrial genome. An excess of differentially regulated non-coding RNAs was found on Chromosome 2, including intronic antisense RNAs, intergenic RNAs and premature forms of Piwi-interacting RNAs (piRNAs). Moreover, a striking difference was found in the expression of the X-linked G6pdx gene, the parental gene of the autosomal retrogene G6pd2. The prevalence of non-coding RNAs among differentially expressed transcripts indicates their role in species-specific regulation of spermatogenesis. The postmeiotic expression of G6pdx in Mus spretus points towards the continuous evolution of X-chromosome silencing and provides an example of expression change accompanying retroposition out of the X chromosome.
Molecular breakpoint cloning and gene expression studies of a novel translocation t(4;15)(q27;q11.2) associated with Prader-Willi syndrome

PubMed Central

Schüle, Birgitt; Albalwi, Mohammed; Northrop, Emma; Francis, David I; Rowell, Margaret; Slater, Howard R; Gardner, RJ McKinlay; Francke, Uta

2005-01-01

Background Prader-Willi syndrome (MIM #176270; PWS) is caused by lack of the paternally-derived copies, or their expression, of multiple genes in a 4 Mb region on chromosome 15q11.2. Known mechanisms include large deletions, maternal uniparental disomy or mutations involving the imprinting center. De novo balanced reciprocal translocations in 5 reported individuals had breakpoints clustering in SNRPN intron 2 or exon 20/intron 20. To further dissect the PWS phenotype and define the minimal critical region for PWS features, we have studied a 22-year-old male with a milder PWS phenotype and a de novo translocation t(4;15)(q27;q11.2). Methods We used metaphase FISH to narrow the breakpoint region and molecular analyses to map the breakpoints on both chromosomes at the nucleotide level. The expression of genes on chromosome 15 on both sides of the breakpoint was determined by RT-PCR analyses. Results Pertinent clinical features include neonatal hypotonia with feeding difficulties, hypogonadism, short stature, late-onset obesity, learning difficulties, abnormal social behavior and marked tolerance to pain, as well as sticky saliva and narcolepsy. Relative macrocephaly and facial features are not typical for PWS. The translocation breakpoints were identified within SNRPN intron 17 and intron 10 of a spliced non-coding transcript in band 4q27. LINE and SINE sequences at the exchange points may have contributed to the translocation event. By RT-PCR of lymphoblasts and fibroblasts, we find that upstream SNURF/SNRPN exons and snoRNAs HBII-437 and HBII-13 are expressed, but the downstream snoRNAs PWCR1/HBII-85 and HBII-438A/B are not. Conclusion As part of the PWCR1/HBII-85 snoRNA cluster is highly conserved between human and mice, while no copy of HBII-438 has been found in mouse, we conclude that PWCR1/HBII-85 snoRNAs are likely to play a major role in the PWS phenotype. PMID:15877813
Molecular breakpoint cloning and gene expression studies of a novel translocation t(4;15)(q27;q11.2) associated with Prader-Willi syndrome.

PubMed

Schüle, Birgitt; Albalwi, Mohammed; Northrop, Emma; Francis, David I; Rowell, Margaret; Slater, Howard R; Gardner, R J McKinlay; Francke, Uta

2005-05-06

Prader-Willi syndrome (MIM #176270; PWS) is caused by lack of the paternally-derived copies, or their expression, of multiple genes in a 4 Mb region on chromosome 15q11.2. Known mechanisms include large deletions, maternal uniparental disomy or mutations involving the imprinting center. De novo balanced reciprocal translocations in 5 reported individuals had breakpoints clustering in SNRPN intron 2 or exon 20/intron 20. To further dissect the PWS phenotype and define the minimal critical region for PWS features, we have studied a 22-year-old male with a milder PWS phenotype and a de novo translocation t(4;15)(q27;q11.2). We used metaphase FISH to narrow the breakpoint region and molecular analyses to map the breakpoints on both chromosomes at the nucleotide level. The expression of genes on chromosome 15 on both sides of the breakpoint was determined by RT-PCR analyses. Pertinent clinical features include neonatal hypotonia with feeding difficulties, hypogonadism, short stature, late-onset obesity, learning difficulties, abnormal social behavior and marked tolerance to pain, as well as sticky saliva and narcolepsy. Relative macrocephaly and facial features are not typical for PWS. The translocation breakpoints were identified within SNRPN intron 17 and intron 10 of a spliced non-coding transcript in band 4q27. LINE and SINE sequences at the exchange points may have contributed to the translocation event. By RT-PCR of lymphoblasts and fibroblasts, we find that upstream SNURF/SNRPN exons and snoRNAs HBII-437 and HBII-13 are expressed, but the downstream snoRNAs PWCR1/HBII-85 and HBII-438A/B are not. As part of the PWCR1/HBII-85 snoRNA cluster is highly conserved between human and mice, while no copy of HBII-438 has been found in mouse, we conclude that PWCR1/HBII-85 snoRNAs are likely to play a major role in the PWS phenotype.
Transposable Elements Are Major Contributors to the Origin, Diversification, and Regulation of Vertebrate Long Noncoding RNAs

PubMed Central

Kapusta, Aurélie; Zhuo, Xiaoyu; Ramsay, LeeAnn; Bourque, Guillaume; Yandell, Mark; Feschotte, Cédric

2013-01-01

Advances in vertebrate genomics have uncovered thousands of loci encoding long noncoding RNAs (lncRNAs). While progress has been made in elucidating the regulatory functions of lncRNAs, little is known about their origins and evolution. Here we explore the contribution of transposable elements (TEs) to the makeup and regulation of lncRNAs in human, mouse, and zebrafish. Surprisingly, TEs occur in more than two thirds of mature lncRNA transcripts and account for a substantial portion of total lncRNA sequence (∼30% in human), whereas they seldom occur in protein-coding transcripts. While TEs contribute less to lncRNA exons than expected, several TE families are strongly enriched in lncRNAs. There is also substantial interspecific variation in the coverage and types of TEs embedded in lncRNAs, partially reflecting differences in the TE landscapes of the genomes surveyed. In human, TE sequences in lncRNAs evolve under greater evolutionary constraint than their non–TE sequences, than their intronic TEs, or than random DNA. Consistent with functional constraint, we found that TEs contribute signals essential for the biogenesis of many lncRNAs, including ∼30,000 unique sites for transcription initiation, splicing, or polyadenylation in human. In addition, we identified ∼35,000 TEs marked as open chromatin located within 10 kb upstream of lncRNA genes. The density of these marks in one cell type correlates with elevated expression of the downstream lncRNA in the same cell type, suggesting that these TEs contribute to cis-regulation. These global trends are recapitulated in several lncRNAs with established functions. Finally, a subset of TEs embedded in lncRNAs are subject to RNA editing and predicted to form secondary structures likely important for function. In conclusion, TEs are nearly ubiquitous in lncRNAs and have played an important role in the lineage-specific diversification of vertebrate lncRNA repertoires. PMID:23637635
Alternative splicing of a viral mirtron differentially affects the expression of other microRNAs from its cluster and of the host transcript

PubMed Central

Rasschaert, Perrine; Dambrine, Ginette; Rasschaert, Denis; Laurent, Sylvie

2016-01-01

ABSTRACT Interplay between alternative splicing and the Microprocessor may have differential effects on the expression of intronic miRNAs organized into clusters. We used a viral model — the LAT long non-coding RNA (LAT lncRNA) of Marek's disease oncogenic herpesvirus (MDV-1), which has the mdv1-miR-M8-M6-M7-M10 cluster embedded in its first intron — to assess the impact of splicing modifications on the biogenesis of each of the miRNAs from the cluster. Drosha silencing and alternative splicing of an extended exon 2 of the LAT lncRNA from a newly identified 3′ splice site (SS) at the end of the second miRNA of the cluster showed that mdv1-miR-M6 was a 5′-tailed mirtron. We have thus identified the first 5′-tailed mirtron within a cluster of miRNAs for which alternative splicing is directly associated with differential expression of the other miRNAs of the cluster, with an increase in intronic mdv1-miR-M8 expression and a decrease in expression of the exonic mdv1-miR-M7, and indirectly associated with regulation of the host transcript. According to the alternative 3SS used for the host intron splicing, the mdv1-miR-M6 is processed as a mirtron by the spliceosome, dispatching the other miRNAs of the cluster into intron and exon, or as a canonical miRNA by the Microprocessor complex. The viral mdv1-miR-M6 mirtron is the first mirtron described that can also follow the canonical pathway. PMID:27715458

Identification of a deep intronic mutation in the COL6A2 gene by a novel custom oligonucleotide CGH array designed to explore allelic and genetic heterogeneity in collagen VI-related myopathies

PubMed Central

2010-01-01

Background Molecular characterization of collagen-VI related myopathies currently relies on standard sequencing, which yields a detection rate approximating 75-79% in Ullrich congenital muscular dystrophy (UCMD) and 60-65% in Bethlem myopathy (BM) patients as PCR-based techniques tend to miss gross genomic rearrangements as well as copy number variations (CNVs) in both the coding sequence and intronic regions. Methods We have designed a custom oligonucleotide CGH array in order to investigate the presence of CNVs in the coding and non-coding regions of COL6A1, A2, A3, A5 and A6 genes and a group of genes functionally related to collagen VI. A cohort of 12 patients with UCMD/BM negative at sequencing analysis and 2 subjects carrying a single COL6 mutation whose clinical phenotype was not explicable by inheritance were selected and the occurrence of allelic and genetic heterogeneity explored. Results A deletion within intron 1A of the COL6A2 gene, occurring in compound heterozygosity with a small deletion in exon 28, previously detected by routine sequencing, was identified in a BM patient. RNA studies showed monoallelic transcription of the COL6A2 gene, thus elucidating the functional effect of the intronic deletion. No pathogenic mutations were identified in the remaining analyzed patients, either within COL6A genes, or in genes functionally related to collagen VI. Conclusions Our custom CGH array may represent a useful complementary diagnostic tool, especially in recessive forms of the disease, when only one mutant allele is detected by standard sequencing. The intronic deletion we identified represents the first example of a pure intronic mutation in COL6A genes. PMID:20302629
Comparative Sequence Analysis of the X-Inactivation Center Region in Mouse, Human, and Bovine

PubMed Central

Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

2002-01-01

We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5′ of Xist that was recently shown to attract histone modification early after the onset of X inactivation. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AJ421478, AJ421479, AJ421480, and AJ421481. Online supplemental data are available at http://pbil.univ-lyon1.fr/datasets/Xic2002/data.html and www.genome.org.] PMID:12045143
Analysis of a new homozygous deletion in the tumor suppressor region at 3p12.3 reveals two novel intronic noncoding RNA genes.

PubMed

Angeloni, Debora; ter Elst, Arja; Wei, Ming Hui; van der Veen, Anneke Y; Braga, Eleonora A; Klimov, Eugene A; Timmer, Tineke; Korobeinikova, Luba; Lerman, Michael I; Buys, Charles H C M

2006-07-01

Homozygous deletions or loss of heterozygosity (LOH) at human chromosome band 3p12 are consistent features of lung and other malignancies, suggesting the presence of a tumor suppressor gene(s) (TSG) at this location. Only one gene has been cloned thus far from the overlapping region deleted in lung and breast cancer cell lines U2020, NCI H2198, and HCC38. It is DUTT1 (Deleted in U Twenty Twenty), also known as ROBO1, FLJ21882, and SAX3, according to HUGO. DUTT1, the human ortholog of the fly gene ROBO, has homology with NCAM proteins. Extensive analyses of DUTT1 in lung cancer have not revealed any mutations, suggesting that another gene(s) at this location could be of importance in lung cancer initiation and progression. Here, we report the discovery of a new, small, homozygous deletion in the small cell lung cancer (SCLC) cell line GLC20, nested in the overlapping, critical region. The deletion was delineated using several polymorphic markers and three overlapping P1 phage clones. Fiber-FISH experiments revealed the deletion was approximately 130 kb. Comparative genomic sequence analysis uncovered short sequence elements highly conserved among mammalian genomes and the chicken genome. The discovery of two EST clusters within the deleted region led to the isolation of two noncoding RNA (ncRNA) genes. These were subsequently found differentially expressed in various tumors when compared to their normal tissues. The ncRNA and other highly conserved sequence elements in the deleted region may represent miRNA targets of importance in cancer initiation or progression. Published 2006 Wiley-Liss, Inc.
Characterization of the intronic portion of cadherin superfamily members, common cancer orchestrators

PubMed Central

Oliveira, Patrícia; Sanges, Remo; Huntsman, David; Stupka, Elia; Oliveira, Carla

2012-01-01

Cadherins are cell–cell adhesion proteins essential for the maintenance of tissue architecture and integrity, and their impairment is often associated with human cancer. Knowledge regarding regulatory mechanisms associated with cadherin misexpression in cancer is scarce. Specific features of the intronic structure and intron-based regulatory mechanisms in the cadherin superfamily are unidentified. This study aims to systematically characterize the intronic portion of cadherin superfamily members and to identify intronic regions constituting putative targets/triggers of regulation, using a bioinformatic approach and biological data mining. Our study demonstrates that the cadherin superfamily genes harbour specific characteristics in comparison to all non-cadherin genes, both from the genomic and transcriptional standpoints. Cadherin superfamily genes display a higher average total intron number and significantly longer introns than other genes across the entire vertebrate lineage. Moreover, in the human genome, we observed an uncommonly high frequency of MIR (mammalian-wide interspersed repeats) and MaLR (mammalian apparent LTR retrotransposons) regulatory-associated repetitive elements at 5′-located introns, concomitantly with increased de novo intronic transcription. Using this approach, we identified cadherin intronic-specific sites that may constitute novel targets/triggers of cadherin superfamily expression regulation. These findings pinpoint the need to identify mechanisms affecting particularly MIR and MaLR elements located in introns 2 and 3 of human cadherin genes, possibly important in the expression modulation of this superfamily in homeostasis and cancer. PMID:22317972
Patterns and rates of intron divergence between humans and chimpanzees

PubMed Central

Gazave, Elodie; Marqués-Bonet, Tomàs; Fernando, Olga; Charlesworth, Brian; Navarro, Arcadi

2007-01-01

Background Introns, which constitute the largest fraction of eukaryotic genes and which had been considered to be neutral sequences, are increasingly acknowledged as having important functions. Several studies have investigated levels of evolutionary constraint along introns and across classes of introns of different length and location within genes. However, thus far these studies have yielded contradictory results. Results We present the first analysis of human-chimpanzee intron divergence, in which differences in the number of substitutions per intronic site (Ki) can be interpreted as the footprint of different intensities and directions of the pressures of natural selection. Our main findings are as follows: there was a strong positive correlation between intron length and divergence; there was a strong negative correlation between intron length and GC content; and divergence rates vary along introns and depending on their ordinal position within genes (for instance, first introns are more GC rich, longer and more divergent, and divergence is lower at the 3' and 5' ends of all types of introns). Conclusion We show that the higher divergence of first introns is related to their larger size. Also, the lower divergence of short introns suggests that they may harbor a relatively greater proportion of regulatory elements than long introns. Moreover, our results are consistent with the presence of functionally relevant sequences near the 5' and 3' ends of introns. Finally, our findings suggest that other parts of introns may also be under selective constraints. PMID:17309804
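The quantities named in the abstract above (substitutions per intronic site, Ki, and GC content) can be illustrated with a minimal Python sketch. The abstract does not say how Ki was estimated, so the Jukes-Cantor correction used here is an assumption chosen for simplicity, and the two sequences are invented toy data, not real human or chimpanzee introns. All function names are hypothetical.

```python
import math


def gc_content(seq):
    """Fraction of G or C bases among the unambiguous A/C/G/T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(1 for b in bases if b in "GC") / len(bases)


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    cols = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not cols:
        raise ValueError("no comparable columns")
    return sum(1 for x, y in cols if x != y) / len(cols)


def jukes_cantor(p):
    """Jukes-Cantor estimate of substitutions per site: K = -(3/4) ln(1 - 4p/3).

    Assumption: the cited study may have used a different substitution model.
    """
    if not 0 <= p < 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Toy aligned "introns" (hypothetical): 2 differences over 20 sites.
toy_human = "ACGTACGTACGTACGTACGT"
toy_chimp = "ACGTACGAACGTACGTACCT"

p = p_distance(toy_human, toy_chimp)
k_i = jukes_cantor(p)
print(f"p = {p:.3f}, Ki (Jukes-Cantor) = {k_i:.4f}, GC = {gc_content(toy_human):.2f}")
```

The correction matters little at the low divergence typical of closely related species, since K approaches p as p approaches zero.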
Enhancer Variants Synergistically Drive Dysfunction of a Gene Regulatory Network In Hirschsprung Disease

DOE PAGES

Chatterjee, Sumantra; Kapoor, Ashish; Akiyama, Jennifer A.; ...

2016-09-29

Common sequence variants in cis-regulatory elements (CREs) are suspected etiological causes of complex disorders. We previously identified an intronic enhancer variant in the RET gene disrupting SOX10 binding and increasing Hirschsprung disease (HSCR) risk 4-fold. We now show that two other functionally independent CRE variants, one binding Gata2 and the other binding Rarb, also reduce Ret expression and increase risk 2- and 1.7-fold. By studying human and mouse fetal gut tissues and cell lines, we demonstrate that reduced RET expression propagates throughout its gene regulatory network, exerting effects on both its positive and negative feedback components. We also provide evidence that the presence of a combination of CRE variants synergistically reduces RET expression and its effects throughout the GRN. These studies show how the effects of functionally independent non-coding variants in a coordinated gene regulatory network amplify their individually small effects, providing a model for complex disorders.
Characterization of the cod (Gadus morhua) steroidogenic acute regulatory protein (StAR) sheds light on StAR gene structure in fish.

PubMed

Goetz, Frederick W; Norberg, Birgitta; McCauley, Linda A R; Iliev, Dimitar B

2004-03-01

The full-length cDNA for the cod (Gadus morhua) StAR was cloned by RT-PCR and library screening using ovarian RNA. From the library screening, 2 size classes of cDNA were obtained: a 1577 bp cDNA (cStAR1) and a 2851 bp cDNA (cStAR2). The cStAR1 cDNA presumably encodes a protein of 286 amino acids. The cStAR2 cDNA was composed of 6 separated sequences that contained all of the coding regions of cStAR1 when added together, but also contained 5 noncoding regions not observed in cStAR1. Polymerase chain reactions of cod genomic DNA produced products slightly larger than cStAR2. The sequences of these products were the same as cStAR2 but revealed one additional noncoding region (intron). Thus, the fish StAR gene contains the same number of exons (7) and introns (6) as observed in mammals, but is approximately half the size of the mammalian gene. Using Northern analysis and RT-PCR, cStAR1 expression was observed only in testes, ovaries and head kidneys. Polymerase chain reaction products were also observed using cDNA from steroidogenic tissues and primers designed to regions specific for cStAR2, indicating that cStAR2 is expressed in tissues and may account for the presence of larger transcripts observed on Northern blots.
Steric antisense inhibition of AMPA receptor Q/R editing reveals tight coupling to intronic editing sites and splicing

PubMed Central

Penn, Andrew C.; Balik, Ales; Greger, Ingo H.

2013-01-01

Adenosine-to-Inosine (A-to-I) RNA editing is a post-transcriptional mechanism, evolved to diversify the transcriptome in metazoa. In addition to widespread editing in non-coding regions, protein recoding by RNA editing allows for fine tuning of protein function. Functional consequences are only known for some editing sites and the combinatorial effect between multiple sites (functional epistasis) is currently unclear. Similarly, the interplay between RNA editing and splicing, which impacts on post-transcriptional gene regulation, has not been resolved. Here, we describe a versatile antisense approach, which will aid in resolving these open questions. We have developed and characterized morpholino oligos targeting the most efficiently edited site—the AMPA receptor GluA2 Q/R site. We show that inhibition of editing closely correlates with intronic editing efficiency, which is linked to splicing efficiency. In addition to providing a versatile tool, our data underscore the unique efficiency of a physiologically pivotal editing site. PMID:23172291
Remarkable sequence conservation of the last intron in the PKD1 gene.

PubMed

Rodova, Marianna; Islam, M Rafiq; Peterson, Kenneth R; Calvet, James P

2003-10-01

The last intron of the PKD1 gene (intron 45) was found to have exceptionally high sequence conservation across four mammalian species: human, mouse, rat, and dog. This conservation did not extend to the comparable intron in pufferfish. Pairwise comparisons for intron 45 showed 91% identity (human vs. dog) to 100% identity (mouse vs. rat) for an average for all four species of 94% identity. In contrast, introns 43 and 44 of the PKD1 gene had average pairwise identities of 57% and 54%, and exons 43, 44, and 45 and the coding region of exon 46 had average pairwise identities of 80%, 84%, 82%, and 80%. Intron 45 is 90 to 95 bp in length, with the major region of sequence divergence being in a central 4-bp to 9-bp variable region. RNA secondary structure analysis of intron 45 predicts a branching stem-loop structure in which the central variable region lies in one loop and the putative branch point sequence lies in another loop, suggesting that the intron adopts a specific stem-loop structure that may be important for its removal. Although intron 45 appears to conform to the class of small, G-triplet-containing introns that are spliced by a mechanism utilizing intron definition, its high sequence conservation may be a reflection of constraints imposed by a unique mechanism that coordinates splicing of this last PKD1 intron with polyadenylation.
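The "average pairwise identity" figures quoted in the abstract above can be illustrated with a minimal Python sketch. The abstract does not describe how gaps were scored or which alignment tool was used, so the gap handling here is an assumption, and the four sequences are invented toy data rather than real PKD1 intron 45 sequences. The function names are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where both sequences have a gap are skipped, and a
    gap opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    cols = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not cols:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in cols if x == y and x != "-")
    return 100.0 * matches / len(cols)


def mean_pairwise_identity(aligned):
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(sorted(aligned), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(percent_identity(aligned[p], aligned[q]) for p, q in pairs)
    return total / len(pairs)


# Toy alignment of four hypothetical species (not real data).
toy_alignment = {
    "species_1": "ACGTACGTAC",
    "species_2": "ACGTACGTAC",
    "species_3": "ACGTTCGTAC",
    "species_4": "ACGAACGTAC",
}

for p, q in combinations(sorted(toy_alignment), 2):
    print(p, q, percent_identity(toy_alignment[p], toy_alignment[q]))
print("mean pairwise identity:", mean_pairwise_identity(toy_alignment))
```

With four species there are six pairwise comparisons, so a single identical pair (as reported for mouse vs. rat) raises the average only modestly.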
The presence, role and clinical use of spermatozoal RNAs

PubMed Central

Jodar, Meritxell; Selvaraju, Sellappan; Sendler, Edward; Diamond, Michael P.; Krawetz, Stephen A.

2013-01-01

BACKGROUND Spermatozoa are highly differentiated, transcriptionally inert cells characterized by a compact nucleus with minimal cytoplasm. Nevertheless they contain a suite of unique RNAs that are delivered to the oocyte upon fertilization. They are likely integrated as part of many different processes including genome recognition, consolidation-confrontation, early embryonic development and epigenetic transgenerational inheritance. Spermatozoal RNAs also provide a window into the developmental history of each sperm thereby providing biomarkers of fertility and pregnancy outcome which are being intensely studied. METHODS Literature searches were performed to review the majority of spermatozoal RNA studies that described potential functions and clinical applications with emphasis on Next-Generation Sequencing. Human, mouse, bovine and stallion were compared as their distribution and composition of spermatozoal RNAs, using these techniques, have been described. RESULTS Comparisons highlighted the complexity of the population of spermatozoal RNAs that comprises rRNA, mRNA and both large and small non-coding RNAs. RNA-seq analysis has revealed that only a fraction of the larger RNAs retain their structure. While rRNAs are the most abundant and are highly fragmented, ensuring a translationally quiescent state, other RNAs including some mRNAs retain their functional potential, thereby increasing the opportunity for regulatory interactions. Abundant small non-coding RNAs retained in spermatozoa include miRNAs and piRNAs. Some, like miR-34c, are essential to the early embryo development required for the first cellular division. Others like the piRNAs are likely part of the genomic dance of confrontation and consolidation. Other non-coding spermatozoal RNAs include transposable elements, annotated lnc-RNAs, intronic retained elements, exonic elements, chromatin-associated RNAs, small-nuclear ILF3/NF30 associated RNAs, quiescent RNAs, mse-tRNAs and YRNAs. Some non-coding RNAs are known to act as epigenetic modifiers, inducing histone modifications and DNA methylation, perhaps playing a role in transgenerational epigenetic inheritance. Transcript profiling holds considerable potential for the discovery of fertility biomarkers for both agriculture and human medicine. Comparing the differential RNA profiles of infertile and fertile individuals as well as assessing species similarities, should resolve the regulatory pathways contributing to male factor infertility. CONCLUSIONS Dad delivers a complex population of RNAs to the oocyte at fertilization that likely influences fertilization, embryo development, the phenotype of the offspring and possibly future generations. Development is continuing on the use of spermatozoal RNA profiles as phenotypic markers of male factor status for use as clinical diagnostics of the father's contribution to the birth of a healthy child. PMID:23856356
Identification and Characterization of Small Noncoding RNAs in Genome Sequences of the Edible Fungus Pleurotus ostreatus

PubMed Central

Zhao, Mengran; Hsiang, Tom; Feng, Xiaoxing

2016-01-01

Noncoding RNAs (ncRNAs) have been identified in many fungi. However, no genome-scale identification of ncRNAs has been inventoried for basidiomycetes. In this research, we detected 254 small noncoding RNAs (sncRNAs) in a genome assembly of an isolate (CCEF00389) of Pleurotus ostreatus, which is a widely cultivated edible basidiomycetous fungus worldwide. The identified sncRNAs include snRNAs, snoRNAs, tRNAs, and miRNAs. SnRNA U1 was not found in CCEF00389 genome assembly and some other basidiomycetous genomes by BLASTn. This implies that if snRNA U1 of basidiomycetes exists, it has a sequence that varies significantly from other organisms. By analyzing the distribution of sncRNA loci, we found that snRNAs and most tRNAs (88.6%) were located in pseudo-UTR regions, while miRNAs are commonly found in introns. To analyze the evolutionary conservation of the sncRNAs in P. ostreatus, we aligned all 254 sncRNAs to the genome assemblies of some other Agaricomycotina fungi. The results suggest that most sncRNAs (77.56%) were highly conserved in P. ostreatus, and 20% were conserved in Agaricomycotina fungi. These findings indicate that most sncRNAs of P. ostreatus were not conserved across Agaricomycotina fungi. PMID:27703969
Differential Expression of Non-Coding RNAs and Continuous Evolution of the X Chromosome in Testicular Transcriptome of Two Mouse Species

PubMed Central

Homolka, David; Ivanek, Robert; Forejt, Jiri; Jansa, Petr

2011-01-01

Background Tight regulation of testicular gene expression is a prerequisite for male reproductive success, while differentiation of gene activity in spermatogenesis is important during speciation. Thus, comparison of testicular transcriptomes between closely related species can reveal unique regulatory patterns and shed light on evolutionary constraints separating the species. Methodology/Principal Findings Here, we compared testicular transcriptomes of two closely related mouse species, Mus musculus and Mus spretus, which diverged more than one million years ago. We analyzed testicular expression using tiling arrays overlapping Chromosomes 2, X, Y and the mitochondrial genome. An excess of differentially regulated non-coding RNAs was found on Chromosome 2 including the intronic antisense RNAs, intergenic RNAs and premature forms of Piwi-interacting RNAs (piRNAs). Moreover, a striking difference was found in the expression of the X-linked G6pdx gene, the parental gene of the autosomal retrogene G6pd2. Conclusions/Significance The prevalence of non-coding RNAs among differentially expressed transcripts indicates their role in species-specific regulation of spermatogenesis. The postmeiotic expression of G6pdx in Mus spretus points towards the continuous evolution of X-chromosome silencing and provides an example of expression change accompanying the out-of-the X-chromosomal retroposition. PMID:21347268
Mitochondrial genome evolution in the Saccharomyces sensu stricto complex.

PubMed

Ruan, Jiangxing; Cheng, Jian; Zhang, Tongcun; Jiang, Huifeng

2017-01-01

Exploring the evolutionary patterns of mitochondrial genomes is important for our understanding of the Saccharomyces sensu stricto (SSS) group, which is a model system for genomic evolution and ecological analysis. In this study, we first obtained the complete mitochondrial sequences of two important species, Saccharomyces mikatae and Saccharomyces kudriavzevii. We then compared the mitochondrial genomes in the SSS group with those of close relatives, and found that the non-coding regions evolved rapidly, including dramatic expansion of intergenic regions, fast evolution of introns and almost 20-fold higher rearrangement rates than those of the nuclear genomes. However, the coding regions, and especially the protein-coding genes, are more conserved than those in the nuclear genomes of the SSS group. The different evolutionary patterns of coding and non-coding regions in the mitochondrial and nuclear genomes may be related to the origin of the aerobic fermentation lifestyle in this group. Our analysis thus provides novel insights into the evolution of mitochondrial genomes.
Interactions between the promoter and first intron are involved in transcriptional control of alpha 1(I) collagen gene expression.

PubMed Central

Bornstein, P; McKay, J; Liska, D J; Apone, S; Devarayalu, S

1988-01-01

The first intron of the human collagen alpha 1(I) gene contains several positively and negatively acting elements. We have studied the transcription of collagen-human growth hormone fusion genes, containing deletions and rearrangements of collagen intronic sequences, by transient transfection of chick tendon fibroblasts and NIH 3T3 cells. In chick tendon fibroblasts, but not in 3T3 cells, inversion of intronic sequences containing a previously studied 274-base-pair segment, A274, resulted in markedly reduced human growth hormone mRNA levels as determined by an RNase protection assay. This inhibitory effect was largely alleviated when deletions were introduced in the collagen promoter of plasmids containing negatively oriented intronic sequences. Evidence for interaction of the promoter with the intronic segment, A274, was obtained by gel mobility shift assays. We suggest that promoter-intron interactions, mediated by DNA-binding proteins, regulate collagen gene transcription. Inversion of intronic segments containing critical interactive elements might then lead to an altered geometry and reduced activity of a transcriptional complex in those cells with sufficiently high levels of appropriate transcription factors. We further suggest that the deleted promoter segment plays a key role in directing DNA interactions involved in transcriptional control. PMID:3211130
Technical Advance: Transcription factor, promoter, and enhancer utilization in human myeloid cells.

PubMed

Joshi, Anagha; Pooley, Christopher; Freeman, Tom C; Lennartsson, Andreas; Babina, Magda; Schmidl, Christian; Geijtenbeek, Teunis; Michoel, Tom; Severin, Jessica; Itoh, Masayoshi; Lassmann, Timo; Kawaji, Hideya; Hayashizaki, Yoshihide; Carninci, Piero; Forrest, Alistair R R; Rehli, Michael; Hume, David A

2015-05-01

The generation of myeloid cells from their progenitors is regulated at the level of transcription by combinatorial control of key transcription factors influencing cell-fate choice. To unravel the global dynamics of this process at the transcript level, we generated transcription profiles for 91 human cell types of myeloid origin by use of CAGE profiling. The CAGE sequencing of these samples has allowed us to investigate diverse aspects of transcription control during myelopoiesis, such as identification of novel transcription factors, miRNAs, and noncoding RNAs specific to the myeloid lineage. We further reconstructed a transcription regulatory network by clustering coexpressed transcripts and associating them with enriched cis-regulatory motifs. With the use of the bidirectional expression as a proxy for enhancers, we predicted over 2000 novel enhancers, including an enhancer 38 kb downstream of IRF8 and an intronic enhancer in the KIT gene locus. Finally, we highlighted relevance of these data to dissect transcription dynamics during progressive maturation of granulocyte precursors. A multifaceted analysis of the myeloid transcriptome is made available (www.myeloidome.roslin.ed.ac.uk). This high-quality dataset provides a powerful resource to study transcriptional regulation during myelopoiesis and to infer the likely functions of unannotated genes in human innate immunity. © The Author(s).
A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

PubMed Central

Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.

2011-01-01

Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348
Branchpoint selection in the splicing of U12-dependent introns in vitro.

PubMed

McConnell, Timothy S; Cho, Soo-Jin; Frilander, Mikko J; Steitz, Joan A

2002-05-01

In metazoans, splicing of introns from pre-mRNAs can occur by two pathways: the major U2-dependent or the minor U12-dependent pathways. Whereas the U2-dependent pathway has been well characterized, much about the U12-dependent pathway remains to be discovered. Most of the information regarding U12-type introns has come from in vitro studies of a very few known introns of this class. To expand our understanding of U12-type splicing, especially to test the hypothesis that the simple base-pairing mechanism between the intron and U12 snRNA defines the branchpoint of U12-dependent introns, additional in vitro splicing substrates were created from three putative U12-type introns: the third intron of the Xenopus RPL1a gene (XRP), the sixth intron of the Xenopus TFIIS.oA gene (XTF), and the first intron of the human Sm E gene (SME). In vitro splicing in HeLa nuclear extract confirmed U12-dependent splicing of each of these introns. Surprisingly, branchpoint mapping of the XRP splicing intermediate shows use of the upstream rather than the downstream of two consecutive adenosines within the branchpoint sequence (BPS), contrary to the prediction based on alignment with the sixth intron of human P120, a U12-dependent intron whose branch site was previously determined. Also, in the SME intron, the position of the branchpoint A residue within the region base paired with U12 differs from that in P120 and XTF. Analysis of these three additional introns therefore rules out simple models for branchpoint selection by the U12-type spliceosome.
Branchpoint selection in the splicing of U12-dependent introns in vitro.

PubMed Central

McConnell, Timothy S; Cho, Soo-Jin; Frilander, Mikko J; Steitz, Joan A

2002-01-01

In metazoans, splicing of introns from pre-mRNAs can occur by two pathways: the major U2-dependent or the minor U12-dependent pathways. Whereas the U2-dependent pathway has been well characterized, much about the U12-dependent pathway remains to be discovered. Most of the information regarding U12-type introns has come from in vitro studies of a very few known introns of this class. To expand our understanding of U12-type splicing, especially to test the hypothesis that the simple base-pairing mechanism between the intron and U12 snRNA defines the branchpoint of U12-dependent introns, additional in vitro splicing substrates were created from three putative U12-type introns: the third intron of the Xenopus RPL1a gene (XRP), the sixth intron of the Xenopus TFIIS.oA gene (XTF), and the first intron of the human Sm E gene (SME). In vitro splicing in HeLa nuclear extract confirmed U12-dependent splicing of each of these introns. Surprisingly, branchpoint mapping of the XRP splicing intermediate shows use of the upstream rather than the downstream of two consecutive adenosines within the branchpoint sequence (BPS), contrary to the prediction based on alignment with the sixth intron of human P120, a U12-dependent intron whose branch site was previously determined. Also, in the SME intron, the position of the branchpoint A residue within the region base paired with U12 differs from that in P120 and XTF. Analysis of these three additional introns therefore rules out simple models for branchpoint selection by the U12-type spliceosome. PMID:12022225
Faster-X evolution of gene expression is driven by recessive adaptive cis-regulatory variation in Drosophila.

PubMed

Llopart, Ana

2018-05-01

The hemizygosity of the X (Z) chromosome fully exposes the fitness effects of mutations on that chromosome and has evolutionary consequences on the relative rates of evolution of X and autosomes. Specifically, several population genetics models predict increased rates of evolution in X-linked loci relative to autosomal loci. This prediction of faster-X evolution has been evaluated and confirmed for both protein coding sequences and gene expression. In the case of faster-X evolution for gene expression divergence, it is often assumed that variation in 5' noncoding sequences is associated with variation in transcript abundance between species but a formal, genomewide test of this hypothesis is still missing. Here, I use whole genome sequence data in Drosophila yakuba and D. santomea to evaluate this hypothesis and report positive correlations between sequence divergence at 5' noncoding sequences and gene expression divergence. I also examine polymorphism and divergence in 9,279 noncoding sequences located at the 5' end of annotated genes and detected multiple signals of positive selection. Notably, I used the traditional synonymous sites as neutral reference to test for adaptive evolution, but I also used bases 8-30 of introns <65 bp, which have been proposed to be a better neutral choice. X-linked genes with high degree of male-biased expression show the most extreme adaptive pattern at 5' noncoding regions, in agreement with faster-X evolution for gene expression divergence and a higher incidence of positively selected recessive mutations. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

Localization of a bacterial group II intron-encoded protein in human cells.

PubMed

Reinoso-Colacio, Mercedes; García-Rodríguez, Fernando Manuel; García-Cañadas, Marta; Amador-Cubero, Suyapa; García Pérez, José Luis; Toro, Nicolás

2015-08-05

Group II introns are mobile retroelements that self-splice from precursor RNAs to form ribonucleoparticles (RNP), which can invade new specific genomic DNA sites. This specificity can be reprogrammed, for insertion into any desired DNA site, making these introns useful tools for bacterial genetic engineering. However, previous studies have suggested that these elements may function inefficiently in eukaryotes. We investigated the subcellular distribution, in cultured human cells, of the protein encoded by the group II intron RmInt1 (IEP) and several mutants. We created fusions with yellow fluorescent protein (YFP) and with a FLAG epitope. We found that the IEP was localized in the nucleus and nucleolus of the cells. Remarkably, it also accumulated at the periphery of the nuclear matrix. We were also able to identify spliced lariat intron RNA, which co-immunoprecipitated with the IEP, suggesting that functional RmInt1 RNPs can be assembled in cultured human cells.
Localization of a bacterial group II intron-encoded protein in human cells

PubMed Central

Reinoso-Colacio, Mercedes; García-Rodríguez, Fernando Manuel; García-Cañadas, Marta; Amador-Cubero, Suyapa; Pérez, José Luis García; Toro, Nicolás

2015-01-01

Group II introns are mobile retroelements that self-splice from precursor RNAs to form ribonucleoparticles (RNP), which can invade new specific genomic DNA sites. This specificity can be reprogrammed, for insertion into any desired DNA site, making these introns useful tools for bacterial genetic engineering. However, previous studies have suggested that these elements may function inefficiently in eukaryotes. We investigated the subcellular distribution, in cultured human cells, of the protein encoded by the group II intron RmInt1 (IEP) and several mutants. We created fusions with yellow fluorescent protein (YFP) and with a FLAG epitope. We found that the IEP was localized in the nucleus and nucleolus of the cells. Remarkably, it also accumulated at the periphery of the nuclear matrix. We were also able to identify spliced lariat intron RNA, which co-immunoprecipitated with the IEP, suggesting that functional RmInt1 RNPs can be assembled in cultured human cells. PMID:26244523
SelTarbase, a database of human mononucleotide-microsatellite mutations and their potential impact to tumorigenesis and immunology

PubMed Central

Woerner, Stefan M.; Yuan, Yan P.; Benner, Axel; Korff, Sebastian; von Knebel Doeberitz, Magnus; Bork, Peer

2010-01-01

About 15% of human colorectal cancers and, at varying degrees, other tumor entities as well as nearly all tumors related to Lynch syndrome are hallmarked by microsatellite instability (MSI) as a result of a defective mismatch repair system. The functional impact of resulting mutations depends on their genomic localization. Alterations within coding mononucleotide repeat tracts (MNRs) can lead to protein truncation and formation of neopeptides, whereas alterations within untranslated MNRs can alter transcription level or transcript stability. These mutations may provide selective advantage or disadvantage to affected cells. They may further concern the biology of microsatellite unstable cells, e.g. by generating immunogenic peptides induced by frameshift mutations. The Selective Targets database (http://www.seltarbase.org) is a curated database of a growing number of public MNR mutation data in microsatellite unstable human tumors. Regression calculations for various MSI–H tumor entities indicating statistically deviant mutation frequencies predict TGFBR2, BAX, ACVR2A and others that are shown or highly suspected to be involved in MSI tumorigenesis. Many useful tools for further analyzing genomic DNA, derived wild-type and mutated cDNAs and peptides are integrated. A comprehensive database of all human coding, untranslated, non-coding RNA- and intronic MNRs (MNR_ensembl) is also included. Herewith, SelTarbase serves as a comprehensive instrument for MSI-carcinogenesis-related research, diagnostics and therapy. PMID:19820113
Cross-species amplification of mitochondrial DNA sequence-tagged-site markers in conifers: the nature of polymorphism and variation within and among species in Picea.

PubMed

Jaramillo-Correa, J P; Bousquet, J; Beaulieu, J; Isabel, N; Perron, M; Bouillé, M

2003-05-01

Primers previously developed to amplify specific non-coding regions of the mitochondrial genome in Angiosperms, and new primers for additional non-coding mtDNA regions, were tested for their ability to direct DNA amplification in 12 conifer taxa and to detect sequence-tagged-site (STS) polymorphisms within and among eight species in Picea. Out of 12 primer pairs, nine were successful at amplifying mtDNA in most of the taxa surveyed. In conifers, indels and substitutions were observed for several loci, allowing them to distinguish between families, genera and, in some cases, between species within genera. In Picea, interspecific polymorphism was detected for four loci, while intraspecific variation was observed for three of the mtDNA regions studied. One of these (SSU rRNA V1 region) exhibited indel polymorphisms, and the two others (nad1 intron b/c and nad5 intron 1) revealed restriction differences after digestion with Sau3AI (PCR-RFLP). A fourth locus, the nad4L-orf25 intergenic region, showed a multibanding pattern for most of the spruce species, suggesting a possible gene duplication. Maternal inheritance, expected for mtDNA in conifers, was observed for all polymorphic markers except the intergenic region nad4L-orf25. Pooling of the variation observed with the remaining three markers resulted in two to six different mtDNA haplotypes within the different species of Picea. Evidence for intra-genomic recombination was observed in at least two taxa. Thus, these mitotypes are likely to be more informative than single-locus haplotypes. They should be particularly useful for the study of biogeography and the dynamics of hybrid zones.
Development and utilization of novel intron length polymorphic markers in foxtail millet (Setaria italica (L.) P. Beauv.).

PubMed

Gupta, Sarika; Kumari, Kajal; Das, Jyotirmoy; Lata, Charu; Puranik, Swati; Prasad, Manoj

2011-07-01

Introns are noncoding sequences in a gene that are transcribed to precursor mRNA but spliced out during mRNA maturation and are abundant in eukaryotic genomes. The availability of codominant molecular markers and saturated genetic linkage maps has been limited in foxtail millet (Setaria italica (L.) P. Beauv.). Here, we describe the development of 98 novel intron length polymorphic (ILP) markers in foxtail millet using sequence information of the model plant rice. A total of 575 nonredundant expressed sequence tag (EST) sequences were obtained, of which 327 and 248 unique sequences were from dehydration- and salinity-stressed suppression subtractive hybridization libraries, respectively. The BLAST analysis of 98 EST sequences suggests a nearly defined function for about 64% of them, and they were grouped into 11 different functional categories. All 98 ILP primer pairs showed a high level of cross-species amplification in two millet and two nonmillet species, ranging from 90% to 100%, with a mean of ∼97%. The mean observed heterozygosity and Nei's average gene diversity (0.016 and 0.171, respectively) established the efficiency of the ILP markers for distinguishing the foxtail millet accessions. Based on 26 ILP markers, a reasonable dendrogram of 45 foxtail millet accessions was constructed, demonstrating the utility of ILP markers in germplasm characterizations and genomic relationships in millet and nonmillet species.
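The two summary statistics quoted in this record can be illustrated with a minimal sketch. It assumes diploid genotypes at a single codominant locus and uses the simple, uncorrected form of Nei's gene diversity, 1 - sum(p_i^2). The function names and the example genotypes are hypothetical and are not taken from the study, whose exact estimator is not stated in the abstract.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles at the locus differ."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


def nei_gene_diversity(genotypes):
    """Uncorrected gene diversity: 1 minus the sum of squared allele frequencies."""
    alleles = [allele for pair in genotypes for allele in pair]
    if not alleles:
        raise ValueError("no genotypes supplied")
    total = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical ILP fragment lengths (bp) scored in four accessions.
example = [(180, 180), (180, 180), (195, 195), (180, 195)]
ho = observed_heterozygosity(example)
h = nei_gene_diversity(example)
```

Averaging each statistic over all scored loci would give per-marker-set means of the kind reported above.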
A screen for nuclear transcripts identifies two linked noncoding RNAs associated with SC35 splicing domains

PubMed Central

Hutchinson, John N; Ensminger, Alexander W; Clemson, Christine M; Lynch, Christopher R; Lawrence, Jeanne B; Chess, Andrew

2007-01-01

Background Noncoding RNA species play a diverse set of roles in the eukaryotic cell. While much recent attention has focused on smaller RNA species, larger noncoding transcripts are also thought to be highly abundant in mammalian cells. To search for large noncoding RNAs that might control gene expression or mRNA metabolism, we used Affymetrix expression arrays to identify polyadenylated RNA transcripts displaying nuclear enrichment. Results This screen identified no more than three transcripts: XIST, and two unique noncoding nuclear enriched abundant transcripts (NEAT) RNAs strikingly located less than 70 kb apart on human chromosome 11: NEAT1, a noncoding RNA from the locus encoding for TncRNA, and NEAT2 (also known as MALAT-1). While the two NEAT transcripts share no significant homology with each other, each is conserved within the mammalian lineage, suggesting significant function for these noncoding RNAs. NEAT2 is extraordinarily well conserved for a noncoding RNA, more so than even XIST. Bioinformatic analyses of publicly available mouse transcriptome data support our findings from human cells as they confirm that the murine homologs of these noncoding RNAs are also nuclear enriched. RNA FISH analyses suggest that these noncoding RNAs function in mRNA metabolism as they demonstrate an intimate association of these RNA species with SC35 nuclear speckles in both human and mouse cells. These studies show that one of these transcripts, NEAT1, localizes to the periphery of such domains, whereas the neighboring transcript, NEAT2, is part of the long-sought polyadenylated component of nuclear speckles. Conclusion Our genome-wide screens in two mammalian species reveal no more than three abundant large non-coding polyadenylated RNAs in the nucleus: the canonical large noncoding RNA XIST, and NEAT1 and NEAT2. The function of these noncoding RNAs in mRNA metabolism is suggested by their high levels of conservation and their intimate association with SC35 splicing domains in multiple mammalian species. PMID:17270048
Biological significance of long non-coding RNA FTX expression in human colorectal cancer.

PubMed

Guo, Xiao-Bo; Hua, Zhu; Li, Chen; Peng, Li-Pan; Wang, Jing-Shen; Wang, Bo; Zhi, Qiao-Ming

2015-01-01

The purpose of this study was to determine the expression of long non-coding RNA (lncRNA) FTX and analyze its prognostic and biological significance in colorectal cancer (CRC). A quantitative reverse transcription PCR was performed to detect the expression of long non-coding RNA FTX in 35 pairs of colorectal cancer and corresponding noncancerous tissues. The expression of long non-coding RNA FTX was detected in 187 colorectal cancer tissues and its correlations with clinicopathological factors of patients were examined. Univariate and multivariate analyses were performed to analyze the prognostic significance of Long Non-coding RNA FTX expression. The effects of long non-coding RNA FTX expression on malignant phenotypes of colorectal cancer cells and its possible biological significances were further determined. Long non-coding RNA FTX was significantly upregulated in colorectal cancer tissues, and low long non-coding RNA FTX expression was significantly correlated with differentiation grade, lymph vascular invasion, and clinical stage. Patients with high long non-coding RNA FTX showed poorer overall survival than those with low long non-coding RNA FTX. Multivariate analyses indicated that status of long non-coding RNA FTX was an independent prognostic factor for patients. Functional analyses showed that upregulation of long non-coding RNA FTX significantly promoted growth, migration, invasion, and increased colony formation in colorectal cancer cells. Therefore, long non-coding RNA FTX may be a potential biomarker for predicting the survival of colorectal cancer patients and might be a molecular target for treatment of human colorectal cancer.
An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome

PubMed Central

Ferlaino, Michael; Rogers, Mark F.; Shihab, Hashem A.; Mort, Matthew; Cooper, David N.; Gaunt, Tom R.; Campbell, Colin

2018-01-01

Background Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. Results We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. Conclusions FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome. PMID:28985712
An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

PubMed

Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

2017-10-06

Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.
Comparative analysis of the 5′ genomic and promoter regions between the mouse (Hdh) and human Huntington disease (HD) gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalchman, M.; Lin, B.; Nasir, J.

1994-09-01

The mouse homologue of the Huntington disease gene (Hdh) has recently been cloned and mapped to a region of synteny with the human, on mouse chromosome 5. The two genes share a high degree of both coding (90% amino acid) and nucleotide (86.2%) identity. We have subsequently performed a detailed comparison of the genomic organization of the 5′ region of the two genes encompassing the promoter region and first five exons of both the human and mouse genes. The comparative sequence analysis of the promoter region between HD and Hdh reveals two highly conserved regions. One region (-56 to -118, where +1 is the ATG start codon) shared 84% nucleotide identity and another region (-130 to -206) had 81% nucleotide identity. Nine putative Sp1 sites appear in the human promoter region contrasted with only 3 in a similar region in the mouse. Furthermore, 17 and 20 base pair direct repeats present in the HD 5′ region are absent in the similar Hdh region. Although both the mouse and human intron/exon boundaries conform to the GT/AG rule, the intron sizes between HD and Hdh are markedly different. The first four introns in Hdh are 15, 7, 5 and 0.5 kb compared to sizes of 10, 15, 7 and 0.5 kb, respectively, in HD. Comparison between the mouse and human intronic sequences immediately adjacent to the first five exons (excluding exon 1) reveals only about 46 to 50% identity within the first 60 bp of intronic sequence. Furthermore, we have identified novel polymorphic di-, tri- and tetra-nucleotide repeats in Hdh introns of various mouse strains that are not present in the human. For example, polymorphic CT repeats are present in introns 2 and 4 of Hdh and a novel mouse 56 AAG trinucleotide repeat (interrupted by an AAGG) is also located within intron 2. This information concerning the promoter and genomic organization of both HD and Hdh is critical for designing appropriate gene targeting vectors for studying the normal function of the HD and Hdh genes in model systems.
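Two of the comparisons mentioned in this record, the GT/AG rule at intron boundaries and percent identity over the first 60 bp of intronic sequence, can be sketched as follows. This is only an illustration under stated assumptions: the identity calculation is an ungapped, position-by-position comparison (the abstract does not say how the sequences were aligned), and the function names and test sequences are hypothetical.

```python
def follows_gt_ag_rule(intron):
    """True if the intron starts with GT (5' donor) and ends with AG (3' acceptor)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def percent_identity(seq_a, seq_b, window=60):
    """Ungapped percent identity over the first `window` bases of two sequences."""
    a = seq_a.upper()[:window]
    b = seq_b.upper()[:window]
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("both sequences must be non-empty")
    # zip stops at the shorter sequence, so exactly n positions are compared.
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / n


# Hypothetical intron fragments, not real HD or Hdh sequence.
human_like = "GTAAGTCCTTAG"
mouse_like = "GTAAGACCTAAG"
identity = percent_identity(human_like, mouse_like)
```

A gapped alignment would generally report higher identity than this ungapped comparison when the two introns differ by insertions or deletions.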
Mitochondrion-to-Chloroplast DNA Transfers and Intragenomic Proliferation of Chloroplast Group II Introns in Gloeotilopsis Green Algae (Ulotrichales, Ulvophyceae).

PubMed

Turmel, Monique; Otis, Christian; Lemieux, Claude

2016-09-19

To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Mitochondrion-to-Chloroplast DNA Transfers and Intragenomic Proliferation of Chloroplast Group II Introns in Gloeotilopsis Green Algae (Ulotrichales, Ulvophyceae)

PubMed Central

Turmel, Monique; Otis, Christian; Lemieux, Claude

2016-01-01

To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming. PMID:27503298
Noncoding RNAs in human intervertebral disc degeneration: An integrated microarray study.

PubMed

Liu, Xu; Che, Lu; Xie, Yan-Ke; Hu, Qing-Jie; Ma, Chi-Jiao; Pei, Yan-Jun; Wu, Zhi-Gang; Liu, Zhi-Heng; Fan, Li-Ying; Wang, Hai-Qiang

2015-09-01

Accumulating evidence indicates that noncoding RNAs play important roles in a multitude of biological processes. The striking findings of miRNAs (microRNAs) and lncRNAs (long noncoding RNAs) as members of noncoding RNAs open up an exciting era in the studies of gene regulation. More recently, the reports of circRNAs (circular RNAs) add fuel to noncoding RNA research. Human intervertebral disc degeneration (IDD) is a main cause of low back pain as a disabling spinal disease. We have addressed the expression profiles of miRNAs, lncRNAs and mRNAs in IDD (Wang et al., J Pathology, 2011 and Wan et al., Arthritis Res Ther, 2014). Furthermore, we thoroughly analysed noncoding RNAs, including miRNAs, lncRNAs and circRNAs in IDD using the very same samples. Here we delineate in detail the contents of the aforementioned microarray analyses. Microarray and sample annotation data were deposited in GEO under accession number GSE67567 as SuperSeries. The integrated analyses of these noncoding RNAs will shed a novel light on coding-noncoding regulatory machinery.
Circular RNAs: A novel type of biomarker and genetic tools in cancer

PubMed Central

Han, Yi-Neng; Xia, Sheng-Qiang; Zhang, Yuan-Yuan; Zheng, Jun-Hua; Li, Wei

2017-01-01

Circular RNAs (circRNAs) are a novel type of universal and diverse endogenous noncoding RNAs (ncRNAs) that, unlike linear RNAs, form a covalently closed continuous loop without 5′ or 3′ tails. Most circRNAs are abundant, stable and conserved, and often exhibit tissue/developmental-stage-specific expression. CircRNAs are generated either from exons or introns by back splicing or lariat introns. CircRNAs play important roles as miRNA sponges, gene transcription and expression regulators, RNA-binding protein (RBP) sponges and protein/peptide translators. Emerging evidence has revealed the function of circRNAs in cancer, and circRNAs may potentially serve as novel biomarkers and therapeutic targets for cancer treatment. In this review, we discuss the origins, characteristics and functions of circRNAs and how they work as miRNA sponges, gene transcription and expression regulators, and RBP sponges in cancer, as well as current research methods for circRNAs, providing evidence for the significance of circRNAs in cancer diagnosis and clinical treatment. PMID:28969093
Microbial and Natural Metabolites That Inhibit Splicing: A Powerful Alternative for Cancer Treatment.

PubMed

Martínez-Montiel, Nancy; Rosas-Murrieta, Nora Hilda; Martínez-Montiel, Mónica; Gaspariano-Cholula, Mayra Patricia; Martínez-Contreras, Rebeca D

2016-01-01

In eukaryotes, genes are frequently interrupted with noncoding sequences named introns. Alternative splicing is a nuclear mechanism by which these introns are removed and flanking coding regions named exons are joined together to generate a message that will be translated in the cytoplasm. This mechanism is catalyzed by a complex machinery known as the spliceosome, which is composed of more than 300 proteins and ribonucleoproteins that activate and regulate the precision of gene expression when assembled. It has been proposed that several genetic diseases are related to defects in the splicing process, including cancer. For this reason, natural products that show the ability to regulate splicing have attracted enormous attention due to their potential use for cancer treatment. Some microbial metabolites have shown the ability to inhibit gene splicing and the molecular mechanism responsible for this inhibition is being studied for future applications. Here, we summarize the main types of natural products that have been characterized as splicing inhibitors, the recent advances regarding molecular and cellular effects related to these molecules, and the applications reported so far in cancer therapeutics.
RNA editing of non-coding RNA and its role in gene regulation.

PubMed

Daniel, Chammiran; Lagergren, Jens; Öhman, Marie

2015-10-01

It has long been known that repetitive elements, particularly Alu sequences in human, are edited by the adenosine deaminases acting on RNA (ADAR) family. The functional interpretation of these events has been even more difficult than that of editing events in coding sequences, but today there is an emerging understanding of their downstream effects. A surprisingly large fraction of the human transcriptome contains inverted Alu repeats, often forming long double stranded structures in RNA transcripts, typically occurring in introns and UTRs of protein coding genes. Alu repeats are also common in other primates, and similar inverted repeats can frequently be found in non-primates, although the latter are less prone to duplex formation. In human, as many as 700,000 Alu elements have been identified as substrates for RNA editing, of which many are edited at several sites. In fact, recent advancements in transcriptome sequencing techniques and bioinformatics have revealed that the human editome comprises at least a hundred million adenosine to inosine (A-to-I) editing sites in Alu sequences. Although substantial additional efforts are required in order to map the editome, already present knowledge provides an excellent starting point for studying cis-regulation of editing. In this review, we will focus on editing of long stem loop structures in the human transcriptome and how it can affect gene expression. Copyright © 2015 Elsevier B.V. and Société Française de Biochimie et Biologie Moléculaire (SFBBM). All rights reserved.
Biological significance of long non-coding RNA FTX expression in human colorectal cancer

PubMed Central

Guo, Xiao-Bo; Hua, Zhu; Li, Chen; Peng, Li-Pan; Wang, Jing-Shen; Wang, Bo; Zhi, Qiao-Ming

2015-01-01

The purpose of this study was to determine the expression of long non-coding RNA (lncRNA) FTX and analyze its prognostic and biological significance in colorectal cancer (CRC). A quantitative reverse transcription PCR was performed to detect the expression of long non-coding RNA FTX in 35 pairs of colorectal cancer and corresponding noncancerous tissues. The expression of long non-coding RNA FTX was detected in 187 colorectal cancer tissues and its correlations with clinicopathological factors of patients were examined. Univariate and multivariate analyses were performed to analyze the prognostic significance of Long Non-coding RNA FTX expression. The effects of long non-coding RNA FTX expression on malignant phenotypes of colorectal cancer cells and its possible biological significances were further determined. Long non-coding RNA FTX was significantly upregulated in colorectal cancer tissues, and low long non-coding RNA FTX expression was significantly correlated with differentiation grade, lymph vascular invasion, and clinical stage. Patients with high long non-coding RNA FTX showed poorer overall survival than those with low long non-coding RNA FTX. Multivariate analyses indicated that status of long non-coding RNA FTX was an independent prognostic factor for patients. Functional analyses showed that upregulation of long non-coding RNA FTX significantly promoted growth, migration, invasion, and increased colony formation in colorectal cancer cells. Therefore, long non-coding RNA FTX may be a potential biomarker for predicting the survival of colorectal cancer patients and might be a molecular target for treatment of human colorectal cancer. PMID:26629053
Genome-wide Discovery of Circular RNAs in the Leaf and Seedling Tissues of Arabidopsis Thaliana

PubMed Central

Dou, Yongchao; Li, Shengjun; Yang, Weilong; Liu, Kan; Du, Qian; Ren, Guodong; Yu, Bin; Zhang, Chi

2017-01-01

Background: Recently, identification and functional studies of circular RNAs, a type of non-coding RNAs arising from a ligation of 3’ and 5’ ends of a linear RNA molecule, were conducted in mammalian cells with the development of RNA-seq technology. Method: Because circular RNAs have been studied less thoroughly in plants than in animals, a genome-wide identification of circular RNA candidates in Arabidopsis was conducted by applying our own bioinformatics tool to several existing RNA-seq datasets generated specifically for non-coding RNAs. Results: A total of 164 circular RNA candidates were identified from RNA-seq data, and 4 circular RNA transcripts, including both exonic and intronic circular RNAs, were experimentally validated. Interestingly, our results show that circular RNA transcripts are enriched in the photosynthesis system for the leaf tissue and correlated to the higher expression levels of their parent genes. Sixteen out of all 40 genes that have circular RNA candidates are related to the photosynthesis system, and out of the total 146 exonic circular RNA candidates, 63 are found in chloroplast. PMID:29081691
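Because a circular RNA arises when the 3' end of a linear molecule is ligated to its own 5' end, RNA-seq reads that cross this back-splice junction show the end of a sequence followed directly by its start. The Python sketch below illustrates that idea under simplifying assumptions: exact matching, and a read that lies entirely across the junction. It is not the authors' tool, and the function name, the `min_anchor` parameter and the example sequences are hypothetical.

```python
def spans_backsplice(read, exon, min_anchor=4):
    """Return True if `read` is the 3' end of `exon` joined to its 5' end.

    `min_anchor` is the minimum number of bases required on each side of
    the junction (a hypothetical parameter for this illustration).
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        # Left part must match the exon's end, right part the exon's start.
        if exon.endswith(left) and exon.startswith(right):
            return True
    return False


exon = "ATGCCGTTAGGCTA"
print(spans_backsplice("GCTAATGC", exon))  # read made of exon end + exon start
print(spans_backsplice("CCGTTAGG", exon))  # read from the linear interior
```

Reads that pass such a check are only candidates; the abstract notes that four circular RNA transcripts were validated experimentally.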
SHOX gene and conserved noncoding element deletions/duplications in Colombian patients with idiopathic short stature.

PubMed

Sandoval, Gloria Tatiana Vinasco; Jaimes, Giovanna Carola; Barrios, Mauricio Coll; Cespedes, Camila; Velasco, Harvy Mauricio

2014-03-01

SHOX gene mutations or haploinsufficiency cause a wide range of phenotypes such as Leri Weill dyschondrosteosis (LWD), Turner syndrome, and disproportionate short stature (DSS). However, this gene has also been found to be mutated in cases of idiopathic short stature (ISS) with a 3-15% frequency. In this study, the multiplex ligation-dependent probe amplification (MLPA) technique was employed to determine the frequency of SHOX gene mutations and their conserved noncoding elements (CNE) in Colombian patients with ISS. Patients were referred from different centers around the country. From a sample of 62 patients, 8.1% deletions and insertions in the intragenic regions and in the CNE were found. This result is similar to others published in other countries. Moreover, an isolated case of CNE 9 duplication and a new intron 6b deletion in another patient, associated with ISS, are described. This is one of the first studies of a Latin American population in which deletions/duplications of the SHOX gene and its CNE are examined in patients with ISS.
SHOX gene and conserved noncoding element deletions/duplications in Colombian patients with idiopathic short stature

PubMed Central

Sandoval, Gloria Tatiana Vinasco; Jaimes, Giovanna Carola; Barrios, Mauricio Coll; Cespedes, Camila; Velasco, Harvy Mauricio

2014-01-01

SHOX gene mutations or haploinsufficiency cause a wide range of phenotypes such as Leri Weill dyschondrosteosis (LWD), Turner syndrome, and disproportionate short stature (DSS). However, this gene has also been found to be mutated in cases of idiopathic short stature (ISS) with a 3–15% frequency. In this study, the multiplex ligation-dependent probe amplification (MLPA) technique was employed to determine the frequency of SHOX gene mutations and their conserved noncoding elements (CNE) in Colombian patients with ISS. Patients were referred from different centers around the country. From a sample of 62 patients, 8.1% deletions and insertions in the intragenic regions and in the CNE were found. This result is similar to others published in other countries. Moreover, an isolated case of CNE 9 duplication and a new intron 6b deletion in another patient, associated with ISS, are described. This is one of the first studies of a Latin American population in which deletions/duplications of the SHOX gene and its CNE are examined in patients with ISS. PMID:24689071

WES homozygosity mapping in a recessive form of Charcot-Marie-Tooth neuropathy reveals intronic GDAP1 variant leading to a premature stop codon.

PubMed

Masingue, Marion; Perrot, Jimmy; Carlier, Robert-Yves; Piguet-Lacroix, Guenaelle; Latour, Philippe; Stojkovic, Tanya

2018-05-01

Charcot-Marie-Tooth disease (CMT) refers to a group of clinically and genetically heterogeneous inherited neuropathies. Ganglioside-induced differentiation-associated protein 1 GDAP1-related CMT has been reported in an autosomal dominant or recessive form in patients presenting either axonal or demyelinating neuropathy. We report two Sri Lankan sisters born to consanguineous parents and presenting with a severe axonal sensorimotor neuropathy. The early onset of the disease, the distal and proximal weakness and atrophy leading to major disability, along with areflexia, and, most notably, vocal cord and diaphragm paralysis were highly evocative of a GDAP1-related CMT. However, sequencing of the coding regions of the gene was normal. Whole-exome sequencing (WES) was performed and revealed that the largest region of homozygosity was around GDAP1 with several variants, mostly in non-coding regions. In view of the high clinical suspicion of GDAP1 gene involvement, we examined the variants in this gene and this, along with functional studies, allowed us to identify an alternative splicing site revealing a cryptic in-frame stop codon in intron 4 responsible for a severe loss of wild-type GDAP1. This work is the first to describe a deleterious mutation in GDAP1 gene outside of coding sequences or intronic junctions and emphasizes the importance of interpreting molecular analysis, and in particular WES results, in light of the clinical and electrophysiological phenotype.
RNA structure in splicing: An evolutionary perspective.

PubMed

Lin, Chien-Ling; Taggart, Allison J; Fairbrother, William G

2016-09-01

Pre-mRNA splicing is a key post-transcriptional regulation process in which introns are excised and exons are ligated together. A novel class of structured intron was recently discovered in fish. Simple expansions of complementary AC and GT dimers at opposite boundaries of an intron were found to form a bridging structure, thereby enforcing correct splice site pairing across the intron. In some fish introns, the RNA structures are strong enough to bypass the need of regulatory protein factors for splicing. Here, we discuss the prevalence and potential functions of highly structured introns. In humans, structured introns usually arise through the co-occurrence of C and G-rich repeats at intron boundaries. We explore the potentially instructive example of the HLA receptor genes. In HLA pre-mRNA, structured introns flank the exons that encode the highly polymorphic β sheet cleft, making the processing of the transcript robust to variants that disrupt splicing factor binding. While selective forces that have shaped HLA receptor are fairly atypical, numerous other highly polymorphic genes that encode receptors contain structured introns. Finally, we discuss how the elevated mutation rate associated with the simple repeats that often compose structured intron can make structured introns themselves rapidly evolving elements.
The Arabidopsis homolog of human minor spliceosomal protein U11-48K plays a crucial role in U12 intron splicing and plant development

PubMed Central

Xu, Tao; Kim, Bo Mi; Kwak, Kyung Jin; Jung, Hyun Ju; Kang, Hunseung

2016-01-01

The minor U12 introns are removed from precursor mRNAs by the U12 intron-specific minor spliceosome. Among the seven ribonucleoproteins unique to the minor spliceosome, denoted as U11/U12-20K, U11/U12-25K, U11/U12-31K, U11/U12-65K, U11-35K, U11-48K, and U11-59K, the roles of only U11/U12-31K and U11/U12-65K have been demonstrated in U12 intron splicing and plant development. Here, the functional role of the Arabidopsis homolog of human U11-48K in U12 intron splicing and the development of Arabidopsis thaliana was examined using transgenic knockdown plants. The u11-48k mutants exhibited several defects in growth and development, such as severely arrested primary inflorescence stems, formation of serrated leaves, production of many rosette leaves after bolting, and delayed senescence. The splicing of most U12 introns analyzed was impaired in the u11-48k mutants. Comparative analysis of the splicing defects and phenotypes among the u11/u12-31k, u11-48k, and u11/u12-65k mutants showed that the severity of abnormal development was closely correlated with the degree of impairment in U12 intron splicing. Taken together, these results provide compelling evidence that the Arabidopsis homolog of human U11-48K protein, as well as U11/U12-31K and U11/U12-65K proteins, is necessary for correct splicing of U12 introns and normal plant growth and development. PMID:27091878
Global Profiling of hnRNP A2/B1-RNA Binding on Chromatin Highlights LncRNA Interactions.

PubMed

Nguyen, Eric D; Balas, Maggie M; Griffin, April M; Roberts, Justin T; Johnson, Aaron M

2018-06-23

Long noncoding RNAs (lncRNAs) often carry out their functions through associations with adaptor proteins. We recently identified heterogeneous nuclear ribonucleoprotein (hnRNP) A2/B1 as an adaptor of the human HOTAIR lncRNA. hnRNP A2 and B1 are splice isoforms of the same gene. The spliced version of HOTAIR preferentially associates with the B1 isoform, which we hypothesize contributes to RNA-RNA matching between HOTAIR and transcripts of target genes in breast cancer. Here we used enhanced cross-linking immunoprecipitation (eCLIP) to map the direct interactions between A2/B1 and RNA in breast cancer cells. Despite differing by only twelve amino acids, the A2 and B1 splice isoforms associate preferentially with distinct populations of RNA in vivo. Through cellular fractionation experiments we characterize the pattern of RNA association in chromatin, nucleoplasm, and cytoplasm. We find that a majority of interactions occur on chromatin, even those that do not contribute to co-transcriptional splicing. A2/B1 binding site locations on multiple RNAs hint at a contribution to the regulation and function of lncRNAs. Surprisingly, the strongest A2/B1 binding site occurs in a retained intron of HOTAIR, which interrupts an RNA-RNA interaction hotspot. In vitro eCLIP experiments highlight additional exonic B1 binding sites in HOTAIR which also surround the RNA-RNA interaction hotspot. Interestingly, a version of HOTAIR with the intron retained is still capable of making RNA-RNA interactions in vitro through the hotspot region. Our data further characterize the multiple functions of a repurposed splicing factor with isoform-biased interactions, and highlight that the majority of these functions occur on chromatin-associated RNA.
Current Research on Non-Coding Ribonucleic Acid (RNA).

PubMed

Wang, Jing; Samuels, David C; Zhao, Shilin; Xiang, Yu; Zhao, Ying-Yong; Guo, Yan

2017-12-05

Non-coding ribonucleic acid (RNA) has without a doubt captured the interest of biomedical researchers. The ability to screen the entire human genome with high-throughput sequencing technology has greatly enhanced the identification, annotation and prediction of the functionality of non-coding RNAs. In this review, we discuss the current landscape of non-coding RNA research and quantitative analysis. Non-coding RNA will be categorized into two major groups by size: long non-coding RNAs and small RNAs. In long non-coding RNA, we discuss regular long non-coding RNA, pseudogenes and circular RNA. In small RNA, we discuss miRNA, transfer RNA, piwi-interacting RNA, small nucleolar RNA, small nuclear RNA, Y RNA, signal recognition particle RNA, and 7SK RNA. We elaborate on the origin, detection method, potential association with disease, putative functional mechanisms, and public resources for these non-coding RNAs. We aim to provide readers with a complete overview of non-coding RNAs and incite additional interest in non-coding RNA research.
BIALLELIC POLYMORPHISM IN THE INTRON REGION OF B-TUBULIN GENE OF CRYPTOSPORIDIUM PARASITES

EPA Science Inventory

Nucleotide sequencing of polymerase chain reaction-amplified intron region of the Cryptosporidium parvum B-tubulin gene in 26 human and 15 animal isolates revealed distinct genetic polymorphism between the human and bovine genotypes. The separation of 2 genotypes of C. parvum is...
Genome-scale deletion screening of human long non-coding RNAs using a paired-guide RNA CRISPR library

PubMed Central

Zhu, Shiyou; Li, Wei; Liu, Jingze; Chen, Chen-Hao; Liao, Qi; Xu, Ping; Xu, Han; Xiao, Tengfei; Cao, Zhongzheng; Peng, Jingyu; Yuan, Pengfei; Brown, Myles; Liu, Xiaole Shirley; Wei, Wensheng

2017-01-01

CRISPR/Cas9 screens have been widely adopted to analyse coding gene functions, but high throughput screening of non-coding elements using this method is more challenging, because indels caused by a single cut in non-coding regions are unlikely to produce a functional knockout. A high-throughput method to produce deletions of non-coding DNA is needed. Herein, we report a high throughput genomic deletion strategy to screen for functional long non-coding RNAs (lncRNAs) that is based on a lentiviral paired-guide RNA (pgRNA) library. Applying our screening method, we identified 51 lncRNAs that can positively or negatively regulate human cancer cell growth. We individually validated 9 lncRNAs using CRISPR/Cas9-mediated genomic deletion and functional rescue, CRISPR activation or inhibition, and gene expression profiling. Our high-throughput pgRNA genome deletion method should enable rapid identification of functional mammalian non-coding elements. PMID:27798563
Refined mapping of autoimmune disease associated genetic variants with gene expression suggests an important role for non-coding RNAs.

PubMed

Ricaño-Ponce, Isis; Zhernakova, Daria V; Deelen, Patrick; Luo, Oscar; Li, Xingwang; Isaacs, Aaron; Karjalainen, Juha; Di Tommaso, Jennifer; Borek, Zuzanna Agnieszka; Zorro, Maria M; Gutierrez-Achury, Javier; Uitterlinden, Andre G; Hofman, Albert; van Meurs, Joyce; Netea, Mihai G; Jonkers, Iris H; Withoff, Sebo; van Duijn, Cornelia M; Li, Yang; Ruan, Yijun; Franke, Lude; Wijmenga, Cisca; Kumar, Vinod

2016-04-01

Genome-wide association and fine-mapping studies in 14 autoimmune diseases (AID) have implicated more than 250 loci in one or more of these diseases. As more than 90% of AID-associated SNPs are intergenic or intronic, pinpointing the causal genes is challenging. We performed a systematic analysis to link 460 SNPs that are associated with 14 AID to causal genes using transcriptomic data from 629 blood samples. We were able to link 71 (39%) of the AID-SNPs to two or more nearby genes, providing evidence that for part of the AID loci multiple causal genes exist. While 54 of the AID loci are shared by one or more AID, 17% of them do not share candidate causal genes. In addition to finding novel genes such as ULK3, we also implicate novel disease mechanisms and pathways like autophagy in celiac disease pathogenesis. Furthermore, 42 of the AID SNPs specifically affected the expression of 53 non-coding RNA genes. To further understand how the non-coding genome contributes to AID, the SNPs were linked to functional regulatory elements, which suggest a model where AID genes are regulated by a network of chromatin looping/non-coding RNA interactions. The looping model also explains how a causal candidate gene is not necessarily the gene closest to the AID SNP, which was the case in nearly 50% of cases. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
The brown algae Pl.LSU/2 group II intron-encoded protein has functional reverse transcriptase and maturase activities.

PubMed

Zerbato, Madeleine; Holic, Nathalie; Moniot-Frin, Sophie; Ingrao, Dina; Galy, Anne; Perea, Javier

2013-01-01

Group II introns are self-splicing mobile elements found in prokaryotes and eukaryotic organelles. These introns propagate by homing into precise genomic locations, following assembly of a ribonucleoprotein complex containing the intron-encoded protein (IEP) and the spliced intron RNA. Engineered group II introns are now commonly used tools for targeted genomic modifications in prokaryotes but not in eukaryotes. We speculate that the catalytic activation of currently known group II introns is limited in eukaryotic cells. The brown algae Pylaiella littoralis Pl.LSU/2 group II intron is uniquely capable of in vitro ribozyme activity at physiological level of magnesium but this intron remains poorly characterized. We purified and characterized recombinant Pl.LSU/2 IEP. Unlike most IEPs, Pl.LSU/2 IEP displayed a reverse transcriptase activity without intronic RNA. The Pl.LSU/2 intron could be engineered to splice accurately in Saccharomyces cerevisiae and splicing efficiency was increased by the maturase activity of the IEP. However, spliced transcripts were not expressed. Furthermore, intron splicing was not detected in human cells. While further tool development is needed, these data provide the first functional characterization of the Pl.LSU/2 IEP and the first evidence that the Pl.LSU/2 group II intron splicing occurs in vivo in eukaryotes in an IEP-dependent manner.
CRISPR/Cas9 Genome Editing Reveals That the Intron Is Not Essential for var2csa Gene Activation or Silencing in Plasmodium falciparum.

PubMed

Bryant, Jessica M; Regnault, Clément; Scheidig-Benatar, Christine; Baumgarten, Sebastian; Guizetti, Julien; Scherf, Artur

2017-07-11

Plasmodium falciparum relies on monoallelic expression of 1 of 60 var virulence genes for antigenic variation and host immune evasion. Each var gene contains a conserved intron which has been implicated in previous studies in both activation and repression of transcription via several epigenetic mechanisms, including interaction with the var promoter, production of long noncoding RNAs (lncRNAs), and localization to repressive perinuclear sites. However, functional studies have relied primarily on artificial expression constructs. Using the recently developed P. falciparum clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system, we directly deleted the var2csa P. falciparum 3D7_1200600 (Pf3D7_1200600) endogenous intron, resulting in an intronless var gene in a natural, marker-free chromosomal context. Deletion of the var2csa intron resulted in an upregulation of transcription of the var2csa gene in ring-stage parasites and subsequent expression of the PfEMP1 protein in late-stage parasites. Intron deletion did not affect the normal temporal regulation and subsequent transcriptional silencing of the var gene in trophozoites but did result in increased rates of var gene switching in some mutant clones. Transcriptional repression of the intronless var2csa gene could be achieved via long-term culture or panning with the CD36 receptor, after which reactivation was possible with chondroitin sulfate A (CSA) panning. These data suggest that the var2csa intron is not required for silencing or activation in ring-stage parasites but point to a subtle role in regulation of switching within the var gene family. IMPORTANCE Plasmodium falciparum is the most virulent species of malaria parasite, causing high rates of morbidity and mortality in those infected. Chronic infection depends on an immune evasion mechanism termed antigenic variation, which in turn relies on monoallelic expression of 1 of ~60 var genes. 
Understanding antigenic variation and the transcriptional regulation of monoallelic expression is important for developing drugs and/or vaccines. The var gene family encodes the antigenic surface proteins that decorate infected erythrocytes. Until recently, studying the underlying genetic elements that regulate monoallelic expression in P. falciparum was difficult, and most studies relied on artificial systems such as episomal reporter genes. Our study was the first to use CRISPR/Cas9 genome editing for the functional study of an important, conserved genetic element of var genes, the intron, in an endogenous, episome-free manner. Our findings shed light on the role of the var gene intron in transcriptional regulation of monoallelic expression. Copyright © 2017 Bryant et al.
Noncoding sequence classification based on wavelet transform analysis: part II

NASA Astrophysics Data System (ADS)

Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez-Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

2017-09-01

DNA sequences in the human genome can be divided into coding and noncoding ones. We hypothesize that the characteristic periodicities of the noncoding sequences are related to their function. We describe the procedure to identify these characteristic periodicities using wavelet analysis. Our results show that three groups of noncoding sequences, each one with a different biological function, may be differentiated by their wavelet coefficients within a specific frequency range.
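As a rough illustration of how wavelet coefficients can expose periodicity in a DNA sequence, the Python sketch below converts a sequence into a 0/1 indicator signal for one base and computes the energy of Haar detail coefficients at each scale. The abstract does not state which wavelet or numerical encoding the authors used, so the Haar wavelet, the indicator encoding and all function names here are assumptions made for illustration only.

```python
import math


def indicator(seq, base):
    """Map a DNA string to a 0/1 signal marking occurrences of `base`."""
    return [1.0 if ch == base else 0.0 for ch in seq.upper()]


def haar_details(signal):
    """Return Haar detail coefficients per level, finest scale first.

    The signal is truncated to the largest power-of-two length.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    n = 1 << (len(signal).bit_length() - 1)  # largest power of two <= len
    approx = list(signal[:n])
    details = []
    while len(approx) > 1:
        next_approx, detail = [], []
        for x, y in zip(approx[0::2], approx[1::2]):
            next_approx.append((x + y) / math.sqrt(2))  # smoothed average
            detail.append((x - y) / math.sqrt(2))       # local difference
        details.append(detail)
        approx = next_approx
    return details


def energy_per_level(signal):
    """Sum of squared detail coefficients at each scale."""
    return [sum(c * c for c in level) for level in haar_details(signal)]


# A strict period-2 pattern concentrates its energy at the finest scale.
print(energy_per_level(indicator("ACACACAC", "A")))
```

Sequences with different dominant periodicities would distribute this energy across scales differently, which is the kind of contrast a classifier could use.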
Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution.

PubMed

Rogozin, Igor B; Wolf, Yuri I; Sorokin, Alexander V; Mirkin, Boris G; Koonin, Eugene V

2003-09-02

Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and protists and constructed parsimonious scenarios of evolution of the exon-intron structure for the respective genes. Approximately one-third of the introns in the malaria parasite Plasmodium falciparum are shared with at least one crown group eukaryote; this number indicates that these introns have been conserved through >1.5 billion years of evolution that separate Plasmodium from the crown group. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with the fly or nematode. The inferred evolutionary scenario holds that the common ancestor of Plasmodium and the crown group and, especially, the common ancestor of animals, plants, and fungi had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, arthropods, and probably Plasmodium. In addition, numerous introns have been inserted into vertebrate and plant genes, whereas, in other lineages, intron gain was much less prominent.
Analysis for complete genomic sequence of HLA-B and HLA-C alleles in the Chinese Han population.

PubMed

Zhu, F; He, Y; Zhang, W; He, J; He, J; Xu, X; Lv, H; Yan, L

2011-08-01

In the present study, we have determined the complete genomic sequence and analysed the intron polymorphism of partial HLA-B and HLA-C alleles in the Chinese Han population. Over 3.0 kb DNA fragments of HLA-B and HLA-C loci were amplified by polymerase chain reaction from partial 5' untranslated region to 3' noncoding region respectively, and then the amplified products were sequenced. Full-length nucleotide sequences of 14 HLA-B alleles and 10 HLA-C alleles were obtained and have been submitted to GenBank and IMGT/HLA database. Two novel alleles, HLA-B*52:01:01:02 and HLA-B*59:01:01:02, were identified, and the complete genomic sequence of HLA-B*52:01:01:01 was reported for the first time. In total, 157 and 167 polymorphic positions were found in the full-length genomic sequence of HLA-B and HLA-C loci respectively. Our results suggested that many single nucleotide polymorphisms existed in the exon and intron regions, and the data can provide useful information for understanding the evolution of HLA-B and HLA-C alleles. © 2011 Blackwell Publishing Ltd.
The Brown Algae Pl.LSU/2 Group II Intron-Encoded Protein Has Functional Reverse Transcriptase and Maturase Activities

PubMed Central

Zerbato, Madeleine; Holic, Nathalie; Moniot-Frin, Sophie; Ingrao, Dina; Galy, Anne; Perea, Javier

2013-01-01

Group II introns are self-splicing mobile elements found in prokaryotes and eukaryotic organelles. These introns propagate by homing into precise genomic locations, following assembly of a ribonucleoprotein complex containing the intron-encoded protein (IEP) and the spliced intron RNA. Engineered group II introns are now commonly used tools for targeted genomic modifications in prokaryotes but not in eukaryotes. We speculate that the catalytic activation of currently known group II introns is limited in eukaryotic cells. The brown algae Pylaiella littoralis Pl.LSU/2 group II intron is uniquely capable of in vitro ribozyme activity at physiological level of magnesium but this intron remains poorly characterized. We purified and characterized recombinant Pl.LSU/2 IEP. Unlike most IEPs, Pl.LSU/2 IEP displayed a reverse transcriptase activity without intronic RNA. The Pl.LSU/2 intron could be engineered to splice accurately in Saccharomyces cerevisiae and splicing efficiency was increased by the maturase activity of the IEP. However, spliced transcripts were not expressed. Furthermore, intron splicing was not detected in human cells. While further tool development is needed, these data provide the first functional characterization of the Pl.LSU/2 IEP and the first evidence that the Pl.LSU/2 group II intron splicing occurs in vivo in eukaryotes in an IEP-dependent manner. PMID:23505475
Intron retention generates ANKRD1 splice variants that are co-regulated with the main transcript in normal and failing myocardium.

PubMed

Torrado, Mario; Iglesias, Raquel; Nespereira, Beatriz; Centeno, Alberto; López, Eduardo; Mikhailov, Alexander T

2009-07-01

The cardiac ankyrin repeat domain 1 protein (ANKRD1, also known as CARP) has been extensively characterized with regard to its proposed functions as a cardio-enriched transcriptional co-factor and stress-inducible myofibrillar protein. The present results show the occurrence of alternative splicing by intron retention events in the pig and human ankrd1 gene. In pig heart, ankrd1 is expressed as four alternatively spliced transcripts, three of which have non-excised introns: ankrd1-contained introns 6, 7 and 8 (i.e., ankrd1-i6,7,8), ankrd1-contained introns 7 and 8 (i.e., ankrd1-i7,8), and ankrd1 retained only intron 8 (i.e., ankrd1-i8). In the human heart, two orthologues of porcine intron-retaining ankrd1 variants (i.e., ankrd1-i8 and ankrd1-i7,8) are detected. We demonstrate that these newly-identified intron-retaining ankrd1 transcripts are functionally intact, efficiently translated into protein in vitro and exported to the cytoplasm in cardiomyocytes in vivo. In the piglet heart, both the intronless and intron-retaining ankrd1 mRNAs are co-expressed in a chamber-dependent manner being more abundant in the left as compared to the right myocardium. Our data further indicate co-upregulation of the ankrd1 spliced variants in myocardium in the porcine model of diastolic heart failure. Most significantly, we demonstrate that in vivo forced expression of recombinant intronless ankrd1 markedly increases the levels of intron-retaining ankrd1 variants (but not of the endogenous main transcript) in piglet myocardium, suggesting that ANKRD1 may positively regulate the expression of its own intron-containing RNAs in response to cardiac stress. Overall, our findings demonstrate that in cardiomyocytes ANKRD1 can exist in multiple isoforms which may contribute to the functional diversity of this factor in heart development and disease.
Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability.

PubMed

Bonnet, Amandine; Grosso, Ana R; Elkaoutari, Abdessamad; Coleno, Emeline; Presle, Adrien; Sridhara, Sreerama C; Janbon, Guilhem; Géli, Vincent; de Almeida, Sérgio F; Palancade, Benoit

2017-08-17

Transcription is a source of genetic instability that can notably result from the formation of genotoxic DNA:RNA hybrids, or R-loops, between the nascent mRNA and its template. Here we report an unexpected function for introns in counteracting R-loop accumulation in eukaryotic genomes. Deletion of endogenous introns increases R-loop formation, while insertion of an intron into an intronless gene suppresses R-loop accumulation and its deleterious impact on transcription and recombination in yeast. Recruitment of the spliceosome onto the mRNA, but not splicing per se, is shown to be critical to attenuate R-loop formation and transcription-associated genetic instability. Genome-wide analyses in a number of distant species differing in their intron content, including human, further revealed that intron-containing genes and the intron-richest genomes are best protected against R-loop accumulation and subsequent genetic instability. Our results thereby provide a possible rationale for the conservation of introns throughout the eukaryotic lineage. Copyright © 2017 Elsevier Inc. All rights reserved.
A systematic evaluation of expression of HERV-W elements; influence of genomic context, viral structure and orientation

PubMed Central

2011-01-01

Background One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this family has also been associated with human disease such as multiple sclerosis (MS) and schizophrenia. It is not known whether this broad expression of HERV-W elements represents transcriptional leakage or specific transcription initiated from the retroviral promoter in the long terminal repeat (LTR) region. Therefore, potential influences of genomic context, structure and orientation on the expression levels of individual HERV-W elements in normal human tissues were systematically investigated. Results Whereas intronic HERV-W elements with a pseudogene structure exhibited a strong anti-sense orientation bias, intronic elements with a proviral structure and solo LTRs did not. Although a highly variable expression across tissues and elements was observed, systematic effects of context, structure and orientation were also observed. Elements located in intronic regions appeared to be expressed at higher levels than elements located in intergenic regions. Intronic elements with proviral structures were expressed at higher levels than those elements bearing hallmarks of processed pseudogenes or solo LTRs. Relative to their corresponding genes, intronic elements integrated on the sense strand appeared to be transcribed at higher levels than those integrated on the anti-sense strand. Moreover, the expression of proviral elements appeared to be independent from that of their corresponding genes. Conclusions Intronic HERV-W provirus integrations on the sense strand appear to have elicited a weaker negative selection than pseudogene integrations of transcripts from such elements. Our current findings suggest that the previously observed diversified and tissue-specific expression of elements in the HERV-W family is the result of both directed transcription (involving both the LTR and internal sequence) and leaky transcription of HERV-W elements in normal human tissues. PMID:21226900
Comparative Analysis of the Complete Plastomes of Apostasia wallichii and Neuwiedia singapureana (Apostasioideae) Reveals Different Evolutionary Dynamics of IR/SSC Boundary among Photosynthetic Orchids.

PubMed

Niu, Zhitao; Pan, Jiajia; Zhu, Shuying; Li, Ludan; Xue, Qingyun; Liu, Wei; Ding, Xiaoyu

2017-01-01

Apostasioideae consists of only two genera, Apostasia and Neuwiedia, which are mainly distributed in Southeast Asia and northern Australia. The floral structure, taxonomy, biogeography, and genome variation of Apostasioideae have been intensively studied. However, detailed analyses of plastome composition and structure and comparisons with those of other orchid subfamilies have not yet been conducted. Here, the complete plastome sequences of Apostasia wallichii and Neuwiedia singapureana were sequenced and compared with 43 previously published photosynthetic orchid plastomes to characterize the plastome structure and evolution in the orchids. Unlike many orchid plastomes (e.g., Paphiopedilum and Vanilla), the plastomes of Apostasioideae contain a full set of 11 functional NADH dehydrogenase (ndh) genes. The distribution of repeat sequences and simple sequence repeat elements enhanced the view that the mutation rate of non-coding regions was higher than that of coding regions. The 10 loci (ndhA intron, matK-5'trnK, clpP-psbB, rps8-rpl14, trnT-trnL, 3'trnK-matK, clpP intron, psbK-trnK, trnS-psbC, and ndhF-rpl32) that had the highest degrees of sequence variability were identified as mutational hotspots for the Apostasia plastome. Furthermore, our results revealed that plastid genes exhibited a variable evolution rate within and among different orchid genera. Considering the diversified evolution of both coding and non-coding regions, we suggested that the plastome-wide evolution of orchid species was disproportional. Additionally, the sequences flanking the inverted repeat/small single copy (IR/SSC) junctions of photosynthetic orchid plastomes were categorized into three types according to the presence/absence of ndh genes. Different evolutionary dynamics for each of the three IR/SSC types of photosynthetic orchid plastomes were also proposed.
Comparative Analysis of the Complete Plastomes of Apostasia wallichii and Neuwiedia singapureana (Apostasioideae) Reveals Different Evolutionary Dynamics of IR/SSC Boundary among Photosynthetic Orchids

PubMed Central

Niu, Zhitao; Pan, Jiajia; Zhu, Shuying; Li, Ludan; Xue, Qingyun; Liu, Wei; Ding, Xiaoyu

2017-01-01

Apostasioideae consists of only two genera, Apostasia and Neuwiedia, which are mainly distributed in Southeast Asia and northern Australia. The floral structure, taxonomy, biogeography, and genome variation of Apostasioideae have been intensively studied. However, detailed analyses of plastome composition and structure and comparisons with those of other orchid subfamilies have not yet been conducted. Here, the complete plastome sequences of Apostasia wallichii and Neuwiedia singapureana were sequenced and compared with 43 previously published photosynthetic orchid plastomes to characterize the plastome structure and evolution in the orchids. Unlike many orchid plastomes (e.g., Paphiopedilum and Vanilla), the plastomes of Apostasioideae contain a full set of 11 functional NADH dehydrogenase (ndh) genes. The distribution of repeat sequences and simple sequence repeat elements enhanced the view that the mutation rate of non-coding regions was higher than that of coding regions. The 10 loci (ndhA intron, matK-5′trnK, clpP-psbB, rps8-rpl14, trnT-trnL, 3′trnK-matK, clpP intron, psbK-trnK, trnS-psbC, and ndhF-rpl32) that had the highest degrees of sequence variability were identified as mutational hotspots for the Apostasia plastome. Furthermore, our results revealed that plastid genes exhibited a variable evolution rate within and among different orchid genera. Considering the diversified evolution of both coding and non-coding regions, we suggested that the plastome-wide evolution of orchid species was disproportional. Additionally, the sequences flanking the inverted repeat/small single copy (IR/SSC) junctions of photosynthetic orchid plastomes were categorized into three types according to the presence/absence of ndh genes. Different evolutionary dynamics for each of the three IR/SSC types of photosynthetic orchid plastomes were also proposed. PMID:29046685
The effects of sex-biased gene expression and X-linkage on rates of sequence evolution in Drosophila.

PubMed

Campos, José Luis; Johnston, Keira; Charlesworth, Brian

2017-12-08

A faster rate of adaptive evolution of X-linked genes compared with autosomal genes (the faster-X effect) can be caused by the fixation of recessive or partially recessive advantageous mutations. This effect should be largest for advantageous mutations that affect only male fitness, and least for mutations that affect only female fitness. We tested these predictions in Drosophila melanogaster by using coding and functionally significant non-coding sequences of genes with different levels of sex-biased expression. Consistent with theory, nonsynonymous substitutions in most male-biased and unbiased genes show faster adaptive evolution on the X. However, genes with very low recombination rates do not show such an effect, possibly as a consequence of Hill-Robertson interference. Contrary to expectation, there was a substantial faster-X effect for female-biased genes. After correcting for recombination rate differences, however, female-biased genes did not show a faster-X effect. Similar analyses of non-coding UTRs and long introns showed a faster-X effect for all groups of genes, other than introns of female-biased genes. Given the strong evidence that deleterious mutations are mostly recessive or partially recessive, we would expect a slower rate of evolution of X-linked genes for slightly deleterious mutations that become fixed by genetic drift. Surprisingly, we found little evidence for this after correcting for recombination rate, implying that weakly deleterious mutations are mostly close to being semidominant. This is consistent with evidence from polymorphism data, which we use to test how models of selection that assume semidominance with no sex-specific fitness effects may bias estimates of purifying selection. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.

Conserved expression of transposon-derived non-coding transcripts in primate stem cells.

PubMed

Ramsay, LeeAnn; Marchetto, Maria C; Caron, Maxime; Chen, Shu-Huang; Busche, Stephan; Kwan, Tony; Pastinen, Tomi; Gage, Fred H; Bourque, Guillaume

2017-02-28

A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, it has been shown that various long non-coding RNAs (lncRNAs), which come from the human endogenous retrovirus subfamily H (HERVH), are not only expressed but required for pluripotency in human embryonic stem cells (hESCs). To identify additional TE-derived functional non-coding transcripts, we generated RNA-seq data from induced pluripotent stem cells (iPSCs) of four primate species (human, chimpanzee, gorilla, and rhesus) and searched for transcripts whose expression was conserved. We observed that about 30% of TE instances expressed in human iPSCs had orthologous TE instances that were also expressed in chimpanzee and gorilla. Notably, our analysis revealed a number of repeat families with highly conserved expression profiles including HERVH but also MER53, which is known to be the source of a placental-specific family of microRNAs (miRNAs). We also identified a number of repeat families from all classes of TEs, including MLT1-type and Tigger families, that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved. Together, these results describe TE families and TE-derived lncRNAs whose conserved expression patterns can be used to identify what are likely functional TE-derived non-coding transcripts in primate iPSCs.
Mobile Bacterial Group II Introns at the Crux of Eukaryotic Evolution

PubMed Central

Lambowitz, Alan M.; Belfort, Marlene

2015-01-01

SUMMARY This review focuses on recent developments in our understanding of group II intron function, the relationships of these introns to retrotransposons and spliceosomes, and how their common features have informed thinking about bacterial group II introns as key elements in eukaryotic evolution. Reverse transcriptase-mediated and host factor-aided intron retrohoming pathways are considered along with retrotransposition mechanisms to novel sites in bacteria, where group II introns are thought to have originated. DNA target recognition and movement by target-primed reverse transcription infer an evolutionary relationship among group II introns, non-LTR retrotransposons, such as LINE elements, and telomerase. Additionally, group II introns are almost certainly the progenitors of spliceosomal introns. Their profound similarities include splicing chemistry extending to RNA catalysis, reaction stereochemistry, and the position of two divalent metals that perform catalysis at the RNA active site. There are also sequence and structural similarities between group II introns and the spliceosome’s small nuclear RNAs (snRNAs) and between a highly conserved core spliceosomal protein Prp8 and a group II intron-like reverse transcriptase. It has been proposed that group II introns entered eukaryotes during bacterial endosymbiosis or bacterial-archaeal fusion, proliferated within the nuclear genome, necessitating evolution of the nuclear envelope, and fragmented giving rise to spliceosomal introns. Thus, these bacterial self-splicing mobile elements have fundamentally impacted the composition of extant eukaryotic genomes, including the human genome, most of which is derived from close relatives of mobile group II introns. PMID:25878921
A common class of transcripts with 5'-intron depletion, distinct early coding sequence features, and N1-methyladenosine modification.

PubMed

Cenik, Can; Chua, Hon Nian; Singh, Guramrit; Akef, Abdalla; Snyder, Michael P; Palazzo, Alexander F; Moore, Melissa J; Roth, Frederick P

2017-03-01

Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions ("5IM" transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5' proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC. © 2017 Cenik et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
The Arabidopsis homolog of human minor spliceosomal protein U11-48K plays a crucial role in U12 intron splicing and plant development.

PubMed

Xu, Tao; Kim, Bo Mi; Kwak, Kyung Jin; Jung, Hyun Ju; Kang, Hunseung

2016-05-01

The minor U12 introns are removed from precursor mRNAs by the U12 intron-specific minor spliceosome. Among the seven ribonucleoproteins unique to the minor spliceosome, denoted as U11/U12-20K, U11/U12-25K, U11/U12-31K, U11/U12-65K, U11-35K, U11-48K, and U11-59K, the roles of only U11/U12-31K and U11/U12-65K have been demonstrated in U12 intron splicing and plant development. Here, the functional role of the Arabidopsis homolog of human U11-48K in U12 intron splicing and the development of Arabidopsis thaliana was examined using transgenic knockdown plants. The u11-48k mutants exhibited several defects in growth and development, such as severely arrested primary inflorescence stems, formation of serrated leaves, production of many rosette leaves after bolting, and delayed senescence. The splicing of most U12 introns analyzed was impaired in the u11-48k mutants. Comparative analysis of the splicing defects and phenotypes among the u11/u12-31k, u11-48k, and u11/12-65k mutants showed that the severity of abnormal development was closely correlated with the degree of impairment in U12 intron splicing. Taken together, these results provide compelling evidence that the Arabidopsis homolog of human U11-48K protein, as well as U11/U12-31K and U11/U12-65K proteins, is necessary for correct splicing of U12 introns and normal plant growth and development. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.
A Bioinformatics-Based Alternative mRNA Splicing Code that May Explain Some Disease Mutations Is Conserved in Animals.

PubMed

Qu, Wen; Cingolani, Pablo; Zeeberg, Barry R; Ruden, Douglas M

2017-01-01

Deep sequencing of cDNAs made from spliced mRNAs indicates that most coding genes in many animals and plants have pre-mRNA transcripts that are alternatively spliced. In pre-mRNAs, in addition to invariant exons that are present in almost all mature mRNA products, there are at least 6 additional types of exons, such as exons from alternative promoters or with alternative polyA sites, mutually exclusive exons, skipped exons, or exons with alternative 5' or 3' splice sites. Our bioinformatics-based hypothesis is that, in analogy to the genetic code, there is an "alternative-splicing code" in introns and flanking exon sequences that directs alternative splicing of many of the 36 types of introns. In humans, we identified 42 different consensus sequences that are each present in at least 100 human introns. 37 of the 42 top consensus sequences are significantly enriched or depleted in at least one of the 36 types of introns. We further supported our hypothesis by showing that 96 out of 96 analyzed human disease mutations that affect RNA splicing, and change alternative splicing from one class to another, can be partially explained by a mutation altering a consensus sequence from one type of intron to that of another type of intron. Some of the alternative splicing consensus sequences, and presumably their small-RNA or protein targets, are evolutionarily conserved across 50 species, from plants to animals. We also noticed that the set of introns within a gene usually shares the same splicing codes, thus arguing that one sub-type of spliceosome might process all (or most) of the introns in a given gene. Our work sheds new light on a possible mechanism for generating the tremendous diversity in protein structure by alternative splicing of pre-mRNAs.
Gene relocations within chloroplast genomes of Jasminum and Menodora (Oleaceae) are due to multiple, overlapping inversions.

PubMed

Lee, Hae-Lim; Jansen, Robert K; Chumley, Timothy W; Kim, Ki-Joong

2007-05-01

The chloroplast (cp) DNA sequence of Jasminum nudiflorum (Oleaceae-Jasmineae) is completed and compared with the large single-copy region sequences from 6 related species. The cp genomes of the tribe Jasmineae (Jasminum and Menodora) show several distinctive rearrangements, including inversions, gene duplications, insertions, inverted repeat expansions, and gene and intron losses. The ycf4-psaI region in Jasminum section Primulina was relocated as a result of 2 overlapping inversions of 21,169 and 18,414 bp. The 1st, larger inversion is shared by all members of the Jasmineae indicating that it occurred in the common ancestor of the tribe. Similar rearrangements were also identified in the cp genome of Menodora. In this case, 2 fragments including ycf4 and rps4-trnS-ycf3 genes were moved by 2 additional inversions of 14 and 59 kb that are unique to Menodora. Other rearrangements in the Oleaceae are confined to certain regions of the Jasminum and Menodora cp genomes, including the presence of highly repeated sequences and duplications of coding and noncoding sequences that are inserted into clpP and between rbcL and psaI. These insertions are correlated with the loss of 2 introns in clpP and a serial loss of segments of accD. The loss of the accD gene and clpP introns in both the monocot family Poaceae and the eudicot family Oleaceae are clearly independent evolutionary events. However, their genome organization is surprisingly similar despite the distant relationship of these 2 angiosperm families.
Transcription Factor KLF5 Binds a Cyclin E1 Polymorphic Intronic Enhancer to Confer Increased Bladder Cancer Risk

PubMed Central

Pattison, Jillian M.; Posternak, Valeriya; Cole, Michael D.

2016-01-01

It is well established that environmental toxins, such as exposure to arsenic, are risk factors in the development of urinary bladder cancer, yet recent genome-wide association studies (GWAS) provide compelling evidence that there is a strong genetic component associated with disease predisposition. A single nucleotide polymorphism (SNP), rs8102137, was identified on chromosome 19q12, residing 6 kb upstream of the important cell cycle regulator and proto-oncogene, Cyclin E1 (CCNE1). However, the functional role of this variant in bladder cancer predisposition has been unclear since it lies within a non-coding region of the genome. Here, it is demonstrated that bladder cancer cells heterozygous for this SNP exhibit biased allelic expression of CCNE1 with 1.5-fold more transcription occurring from the risk allele. Furthermore, using chromatin immunoprecipitation assays, a novel enhancer element was identified within the first intron of CCNE1 that binds Kruppel-like Factor 5 (KLF5), a known transcriptional activator in bladder cancer. Moreover, the data reveal that the presence of rs200996365, a SNP in high linkage disequilibrium with rs8102137 residing in the center of a KLF5 motif, alters KLF5 binding to this genomic region. Through luciferase assays and CRISPR-Cas9 genome editing, a novel polymorphic intronic regulatory element controlling CCNE1 transcription is characterized. These studies uncover how a cancer-associated polymorphism mechanistically contributes to an increased predisposition for bladder cancer development. Implications A polymorphic KLF5 binding site near the CCNE1 gene explains genetic risk identified through genome wide association studies. PMID:27514407
Structure of the human type IV collagen COL4A6 gene, which is mutated in Alport syndrome-associated leiomyomatosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Xu; Zhou, Jing; Reeders, S.T.

1996-05-01

Basement membrane (type IV) collagen, a subfamily of the collagen protein family, is encoded by six distinct genes in mammals. Three of those, COL4A3, COL4A4, and COL4A5, are linked with Alport syndrome (hereditary nephritis). Patients with leiomyomatosis associated with Alport syndrome have been shown to have deletions in the 5' end of the COL4A6 gene, in addition to having deletions in COL4A5. The human COL4A6 gene is reported to be 425 kb as determined by mapping of overlapping YAC clones by probes for its 5' and 3' ends. In the present study we describe the complete exon/intron size pattern of the human COL4A6 gene. The 12 λ phage clones characterized in the study spanned a total of 110 kb, including 85 kb of the actual gene and 25 kb of flanking sequences. The overlapping clones contained all 46 exons of the gene and all introns, except for intron 2. Since the total size of the exons and all introns except for intron 2 is about 85 kb, intron 2 must be about 340 kb. All exons of the gene were assigned to EcoRI restriction fragments to facilitate analysis of the gene in patients with leiomyomatosis associated with Alport syndrome. The exon size pattern of COL4A6 is highly homologous with that of the human and mouse COL4A2 genes, with 27 of the 46 exons of COL4A6 being identical in size between the genes. 42 refs., 2 figs., 3 tabs.
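The intron 2 estimate in the record above follows from a simple subtraction. A minimal sketch of that arithmetic in Python, using only the sizes quoted in the abstract; the function and variable names are illustrative assumptions and do not come from the original study:

```python
# Sizes quoted in the abstract, in kilobases (kb).
GENE_SIZE_KB = 425      # total COL4A6 size estimated from YAC mapping
COVERED_GENE_KB = 85    # exons plus every intron except intron 2 (phage clones)


def remaining_intron_kb(gene_kb: int, covered_kb: int) -> int:
    """Return the size left over for the single intron the clones did not span."""
    if covered_kb > gene_kb:
        raise ValueError("covered region cannot exceed the gene size")
    return gene_kb - covered_kb


intron2_kb = remaining_intron_kb(GENE_SIZE_KB, COVERED_GENE_KB)
print(f"Estimated intron 2 size: about {intron2_kb} kb")
```

The same helper also reproduces the split of the 110 kb cloned region: subtracting the 25 kb of flanking sequence leaves the 85 kb of gene sequence used above.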
LncRNA-DANCR: A valuable cancer related long non-coding RNA for human cancers.

PubMed

Thin, Khaing Zar; Liu, Xuefang; Feng, Xiaobo; Raveendran, Sudheesh; Tu, Jian Cheng

2018-06-01

Long noncoding RNAs (lncRNAs) are a type of noncoding RNA longer than 200 nucleotides. They can regulate chromosome structure and gene expression, and they play an essential role in the pathophysiology of human diseases, especially in tumorigenesis and progression. They are now being investigated as potential biomarkers for various cancer types, and many research studies have indicated that lncRNAs might bring a new era to cancer diagnosis and support treatment management. The purpose of this review was to examine the molecular mechanism and clinical significance of the long non-coding RNA differentiation antagonizing nonprotein coding RNA (DANCR) in various types of human cancers. In this review, we summarize recent research studies concerning the expression and biological mechanisms of lncRNA-DANCR in tumour development. The related studies were obtained through a systematic search of PubMed, Embase and Cochrane Library. LncRNA-DANCR is a valuable cancer-related lncRNA whose dysregulated expression has been found in a variety of malignancies, including hepatocellular carcinoma, breast cancer, glioma, colorectal cancer, gastric cancer, and lung cancer. The aberrant expression of DANCR has been shown to contribute to proliferation, migration and invasion of cancer cells. LncRNA-DANCR likely serves as a useful disease biomarker or therapeutic cancer target. Copyright © 2018 Elsevier GmbH. All rights reserved.
Association of SNCA with Parkinson: replication in the Harvard NeuroDiscovery Center Biomarker Study

PubMed Central

Ding, Hongliu; Sarokhan, Alison K.; Roderick, Sarah S.; Bakshi, Rachit; Maher, Nancy E.; Ashourian, Paymon; Kan, Caroline G.; Chang, Sunny; Santarlasci, Andrea; Swords, Kyleen E.; Ravina, Bernard M.; Hayes, Michael T.; Sohur, U. Shivraj; Wills, Anne-Marie; Flaherty, Alice W.; Unni, Vivek K.; Hung, Albert Y.; Selkoe, Dennis J.; Schwarzschild, Michael A.; Schlossmacher, Michael G.; Sudarsky, Lewis R.; Growdon, John H.; Ivinson, Adrian J.; Hyman, Bradley T.; Scherzer, Clemens R.

2011-01-01

Background Mutations in the α-synuclein gene (SNCA) cause autosomal dominant forms of Parkinson’s disease, but the substantial risk conferred by this locus to the common sporadic disease has only recently emerged from genome-wide association studies. Methods Here we genotyped a prioritized non-coding variant in SNCA intron-4 in 344 patients with Parkinson’s and 275 controls from the longitudinal Harvard NeuroDiscovery Center Biomarker Study. Results The common minor allele of rs2736990 was associated with elevated disease susceptibility (odds ratio = 1.40, P value = 0.0032). Conclusions This result increases confidence in the notion that in many clinically well-characterized patients genetic variation in SNCA contributes to “sporadic” disease. PMID:21953863
Functional understanding of the diverse exon-intron structures of human GPCR genes.

PubMed

Hammond, Dorothy A; Olman, Victor; Xu, Ying

2014-02-01

The GPCR genes have a variety of exon-intron structures even though their proteins are all structurally homologous. We have examined all human GPCR genes with at least two functional protein isoforms, totaling 199, aiming to gain an understanding of what may have contributed to the large diversity of the exon-intron structures of the GPCR genes. The 199 genes have a total of 808 known protein splicing isoforms with experimentally verified functions. Our analysis reveals that 1,301 (80.6%) adjacent exon-exon pairs out of the total of 1,613 in the 199 genes have either exactly one exon skipped or the intron in-between retained in at least one of the 808 protein splicing isoforms. This observation has a statistical significance p-value of 2.051762e-09, assuming that the observed splicing isoforms are independent of the exon-intron structures. Our interpretation of this observation is that the exon boundaries of the GPCR genes are not randomly determined; instead they may be selected to facilitate specific alternative splicing for functional purposes.
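The proportions reported in the record above can be checked directly from the quoted counts. A small Python sketch, assuming nothing beyond the numbers in the abstract; the helper and constant names are hypothetical, and the p-value is not recomputed because the abstract does not describe the test used:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts quoted in the abstract.
GENES = 199
ISOFORMS = 808
EXON_PAIRS_TOTAL = 1613
EXON_PAIRS_WITH_EVENT = 1301  # one exon skipped or the intron retained in >= 1 isoform

pair_share = percent(EXON_PAIRS_WITH_EVENT, EXON_PAIRS_TOTAL)
isoforms_per_gene = ISOFORMS / GENES

print(f"Exon pairs with a splicing event: {pair_share:.2f}%")
print(f"Mean isoforms per gene: {isoforms_per_gene:.2f}")
```

The share works out to roughly 80.66%, in line with the approximately 80.6% quoted in the abstract, and the counts imply about four known isoforms per gene on average.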
Structural and Functional Characterization of Ribosomal Protein Gene Introns in Sponges

PubMed Central

Perina, Drago; Korolija, Marina; Mikoč, Andreja; Roller, Maša; Pleše, Bruna; Imešek, Mirna; Morrow, Christine; Batel, Renato; Ćetković, Helena

2012-01-01

Ribosomal protein genes (RPGs) are a powerful tool for studying intron evolution. They exist in all three domains of life and are highly conserved. Accumulating genomic data suggest that RPG introns in many organisms abound with non-protein-coding-RNAs (ncRNAs). These ancient ncRNAs are small nucleolar RNAs (snoRNAs) essential for ribosome assembly. They are also mobile genetic elements and therefore probably important in diversification and enrichment of transcriptomes through various mechanisms such as intron/exon gain/loss. snoRNAs in basal metazoans are poorly characterized. We examined 449 RPG introns, in total, from four demosponges: Amphimedon queenslandica, Suberites domuncula, Suberites ficus and Suberites pagurorum and showed that RPG introns from A. queenslandica share positional conservation and some structural similarity with “higher” metazoans. Moreover, our study indicates that mobile element insertions play an important role in the evolution of their size. In four sponges 51 snoRNAs were identified. The analysis showed discrepancies between the snoRNA pools of orthologous RPG introns between S. domuncula and A. queenslandica. Furthermore, these two sponges show as much conservation of RPG intron positions between each other as between themselves and human. Sponges from the Suberites genus show consistency in RPG intron position conservation. However, significant differences in some of the orthologous RPG introns of closely related sponges were observed. This indicates that RPG introns are dynamic even on these shorter evolutionary time scales. PMID:22880015
Structural and functional characterization of ribosomal protein gene introns in sponges.

PubMed

Perina, Drago; Korolija, Marina; Mikoč, Andreja; Roller, Maša; Pleše, Bruna; Imešek, Mirna; Morrow, Christine; Batel, Renato; Ćetković, Helena

2012-01-01

Ribosomal protein genes (RPGs) are a powerful tool for studying intron evolution. They exist in all three domains of life and are highly conserved. Accumulating genomic data suggest that RPG introns in many organisms abound with non-protein-coding-RNAs (ncRNAs). These ancient ncRNAs are small nucleolar RNAs (snoRNAs) essential for ribosome assembly. They are also mobile genetic elements and therefore probably important in diversification and enrichment of transcriptomes through various mechanisms such as intron/exon gain/loss. snoRNAs in basal metazoans are poorly characterized. We examined 449 RPG introns, in total, from four demosponges: Amphimedon queenslandica, Suberites domuncula, Suberites ficus and Suberites pagurorum and showed that RPG introns from A. queenslandica share positional conservation and some structural similarity with "higher" metazoans. Moreover, our study indicates that mobile element insertions play an important role in the evolution of their size. In four sponges 51 snoRNAs were identified. The analysis showed discrepancies between the snoRNA pools of orthologous RPG introns between S. domuncula and A. queenslandica. Furthermore, these two sponges show as much conservation of RPG intron positions between each other as between themselves and human. Sponges from the Suberites genus show consistency in RPG intron position conservation. However, significant differences in some of the orthologous RPG introns of closely related sponges were observed. This indicates that RPG introns are dynamic even on these shorter evolutionary time scales.
Characterization of noncoding regulatory DNA in the human genome.

PubMed

Elkon, Ran; Agami, Reuven

2017-08-08

Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.
Interpreting Mammalian Evolution using Fugu Genome Comparisons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stubbs, L; Ovcharenko, I; Loots, G G

2004-04-02

Comparative sequence analysis of the human and the pufferfish Fugu rubripes (fugu) genomes has revealed several novel functional coding and noncoding regions in the human genome. In particular, the fugu genome has been extremely valuable for identifying transcriptional regulatory elements in human loci harboring unusually high levels of evolutionary conservation to rodent genomes. In such regions, the large evolutionary distance between human and fishes provides an additional filter through which functional noncoding elements can be detected with high efficiency.
Non-coding RNAs: Therapeutic Strategies and Delivery Systems.

PubMed

Ling, Hui

The vast majority of the human genome is transcribed into RNA molecules that do not code for proteins. These include small RNAs approximately 20 nucleotides in length, known as microRNAs, and transcripts longer than 200 bp, defined as long noncoding RNAs. The prevalent deregulation of microRNAs in human cancers prompted immediate interest in the therapeutic value of microRNAs as drugs and drug targets. Many features of microRNAs, such as well-defined mechanisms and straightforward oligonucleotide design, further make them attractive candidates for therapeutic development. The intensive efforts to explore microRNA therapeutics are reflected in the large body of preclinical studies using oligonucleotide-based mimicking and blocking, culminating in the recent entry of microRNA therapeutics into clinical trials for several human diseases including cancer. Meanwhile, microRNA therapeutics faces the challenge of effective and safe delivery of nucleic acid therapeutics to the target site. Various chemical modifications of nucleic acids and delivery systems have been developed to increase targeting specificity and efficacy, and to reduce the associated side effects, including activation of the immune response. Recently, long noncoding RNAs have become attractive targets for therapeutic intervention because of their association with complex and delicate phenotypes, and their unconventional pharmaceutical activities such as the capacity to increase the output of proteins. Here I discuss the general therapeutic strategies targeting noncoding RNAs, review delivery systems developed to maximize noncoding RNA therapeutic efficacy, and offer perspectives on the future development of noncoding RNA targeting agents for colorectal cancer.
[Relevance of long non-coding RNAs in tumour biology].

PubMed

Nagy, Zoltán; Szabó, Diána Rita; Zsippai, Adrienn; Falus, András; Rácz, Károly; Igaz, Péter

2012-09-23

The discovery of the biological relevance of non-coding RNA molecules represents one of the most significant advances in contemporary molecular biology. It has turned out that a major fraction of the non-coding part of the genome is transcribed. Besides small RNAs (including microRNAs), a growing body of data concerns long non-coding RNAs, 200 nucleotides to 100 kb in length, that are implicated in the regulation of several basic molecular processes (cell proliferation, chromatin functioning, microRNA-mediated effects, etc.). Some of these long non-coding RNAs, including H19, HOTAIR and MALAT1, have been associated with human tumours, and their differential expression has been noted in various neoplasms relative to healthy tissues. Long non-coding RNAs may represent novel markers for molecular diagnostics and might even turn out to be targets of therapeutic intervention.
Comparative analysis of human protein-coding and noncoding RNAs between brain and 10 mixed cell lines by RNA-Seq.

PubMed

Chen, Geng; Yin, Kangping; Shi, Leming; Fang, Yuanzhang; Qi, Ya; Li, Peng; Luo, Jian; He, Bing; Liu, Mingyao; Shi, Tieliu

2011-01-01

In their expression process, different genes can generate diverse functional products, including various protein-coding or noncoding RNAs. Here, we investigated the protein-coding capacities and isoform expression levels of known human genes, as well as the conservation and disease association of long noncoding RNAs (ncRNAs), using two transcriptome sequencing datasets from human brain tissues and 10 mixed cell lines. Comparative analysis revealed that about two-thirds of the genes expressed in brain and in cell lines are the same, but less than one-third of their isoforms are identical. Besides the genes specifically expressed in brain or in cell lines, about 66% of the genes expressed in common encoded different isoforms. Moreover, most genes dominantly expressed one isoform, and some genes generated protein-coding (or noncoding) RNAs in one sample but not in the other. We found that 282 human genes could encode both protein-coding and noncoding RNAs through alternative splicing in the two samples. We also identified more than 1,000 long ncRNAs, most of which contain elements conserved across 46 vertebrates, 33 placental mammals, or 10 primates. Further analysis showed that some long ncRNAs were differentially expressed in human breast cancer or lung cancer, and several of these were validated by RT-PCR. In addition, the validated differentially expressed long ncRNAs were significantly correlated with certain breast cancer or lung cancer related genes, indicating biological relevance between long ncRNAs and human cancers. Our findings reveal that differences in gene expression profiles between samples result mainly from the expressed gene isoforms, and highlight the importance of studying genes at the isoform level to fully characterize the intricate transcriptome.
Human growth hormone (GH1) gene polymorphism map in a normal-statured adult population

PubMed Central

Esteban, Cristina; Audí, Laura; Carrascosa, Antonio; Fernández-Cancio, Mónica; Pérez-Arroyo, Annalisa; Ulied, Angels; Andaluz, Pilar; Arjona, Rosa; Albisu, Marian; Clemente, María; Gussinyé, Miquel; Yeste, Diego

2007-01-01

Objective GH1 gene presents a complex map of single nucleotide polymorphisms (SNPs) in the entire promoter, coding and noncoding regions. The aim of the study was to establish the complete map of GH1 gene SNPs in our control normal population and to analyse its association with adult height. Design, subjects and measurements A systematic GH1 gene analysis was designed in a control population of 307 adults of both sexes with height normally distributed within normal range for the same population: −2 standard deviation scores (SDS) to +2 SDS. An analysis was performed on individual and combined genotype associations with adult height. Results Twenty-five SNPs presented a frequency over 1%: 11 in the promoter (P1 to P11), three in the 5′UTR region (P12 to P14), one in exon 1 (P15), three in intron 1 (P16 to P18), two in intron 2 (P19 and P20), two in exon 4 (P21 and P22) and three in intron 4 (P23 to P25). Twenty-nine additional changes with frequencies under 1% were found in 29 subjects. P8, P19, P20 and P25 had not been previously described. P6, P12, P17 and P25 accounted for 6·2% of the variation in adult height (P = 0·0007) in this population with genotypes A/G at P6, G/G at P6 and A/G at P12 decreasing height SDS (−0·063 ± 0·031, −0·693 ± 0·350 and −0·489 ± 0·265, Mean ± SE) and genotypes A/T at P17 and T/G at P25 increasing height SDS (+1·094 ± 0·456 and +1·184 ± 0·432). Conclusions This study established the GH1 gene sequence variation map in a normal adult height control population confirming the high density of SNPs in a relatively small gene. Our study shows that the more frequent SNPs did not significantly contribute to height determination, while only one promoter and two intronic SNPs contributed significantly to it. Studies in larger populations will have to confirm the associations and in vitro functional studies will elucidate the mechanisms involved. 
Systematic GH1 gene analysis in patients with growth delay and suspected GH deficiency/insufficiency will clarify whether different SNP frequencies and/or the presence of different sequence changes may be associated with phenotypes in them. PMID:17223997
Gene-specific cell labeling using MiMIC transposons

PubMed Central

Gnerer, Joshua P.; Venken, Koen J. T.; Dierick, Herman A.

2015-01-01

Binary expression systems such as GAL4/UAS, LexA/LexAop and QF/QUAS have greatly enhanced the power of Drosophila as a model organism by allowing spatio-temporal manipulation of gene function as well as cell and neural circuit function. Tissue-specific expression of these heterologous transcription factors relies on random transposon integration near enhancers or promoters that drive the binary transcription factor embedded in the transposon. Alternatively, gene-specific promoter elements are directly fused to the binary factor within the transposon followed by random or site-specific integration. However, such insertions do not consistently recapitulate endogenous expression. We used Minos-Mediated Integration Cassette (MiMIC) transposons to convert host loci into reliable gene-specific binary effectors. MiMIC transposons allow recombinase-mediated cassette exchange to modify the transposon content. We developed novel exchange cassettes to convert coding intronic MiMIC insertions into gene-specific binary factor protein-traps. In addition, we expanded the set of binary factor exchange cassettes available for non-coding intronic MiMIC insertions. We show that binary factor conversions of different insertions in the same locus have indistinguishable expression patterns, suggesting that they reliably reflect endogenous gene expression. We show the efficacy and broad applicability of these new tools by dissecting the cellular expression patterns of the Drosophila serotonin receptor gene family. PMID:25712101

Quantifying the mechanisms of domain gain in animal proteins.

PubMed

Buljan, Marija; Frankish, Adam; Bateman, Alex

2010-01-01

Protein domains are protein regions that are shared among different proteins and are frequently functionally and structurally independent from the rest of the protein. Novel domain combinations have a major role in evolutionary innovation. However, the relative contributions of the different molecular mechanisms that underlie domain gains in animals are still unknown. By using animal gene phylogenies we were able to identify a set of high confidence domain gain events and by looking at their coding DNA investigate the causative mechanisms. Here we show that the major mechanism for gains of new domains in metazoan proteins is likely to be gene fusion through joining of exons from adjacent genes, possibly mediated by non-allelic homologous recombination. Retroposition and insertion of exons into ancestral introns through intronic recombination are, in contrast to previous expectations, only minor contributors to domain gains and have accounted for less than 1% and 10% of high confidence domain gain events, respectively. Additionally, exonization of previously non-coding regions appears to be an important mechanism for addition of disordered segments to proteins. We observe that gene duplication has preceded domain gain in at least 80% of the gain events. The interplay of gene duplication and domain gain demonstrates an important mechanism for fast neofunctionalization of genes.
Role of genomic architecture in the expression dynamics of long noncoding RNAs during differentiation of human neuroblastoma cells.

PubMed

Batagov, Arsen O; Yarmishyn, Aliaksandr A; Jenjaroenpun, Piroon; Tan, Jovina Z; Nishida, Yuichiro; Kurochkin, Igor V

2013-10-16

Mammalian genomes are extensively transcribed, producing thousands of long non-protein-coding RNAs (lncRNAs). The biological significance and function of the vast majority of lncRNAs remain unclear. Recent studies have implicated several lncRNAs as playing important roles in embryonic development and cancer progression. LncRNAs are characterized by different genomic architectures in relation to their associated protein-coding genes. Our study aimed to link lncRNA architecture with the dynamic patterns of their expression using a differentiating human neuroblastoma cell model. LncRNA expression was studied in a 120-hour time course of differentiation of human neuroblastoma SH-SY5Y cells into neurons upon treatment with retinoic acid (RA), a compound used for the treatment of neuroblastoma. A custom microarray chip was used to interrogate expression levels of 9,267 lncRNAs in the course of differentiation. We categorized lncRNAs into 19 architecture classes according to their position relative to protein-coding genes. For each architecture class, the dynamics of lncRNA expression were studied in association with their protein-coding partners. This allowed us to demonstrate positive correlation of lncRNAs with their associated protein-coding genes at bidirectional promoters and for sense-antisense transcript pairs. In contrast, lncRNAs located in the introns and downstream of the protein-coding genes were characterized by negative correlation modes. We further classified the lncRNAs by the temporal patterns of their expression dynamics. We found that intronic and bidirectional promoter architectures are associated with rapid RA-dependent induction or repression of the corresponding lncRNAs, followed by their constant expression. At the same time, lncRNAs expressed downstream of protein-coding genes are characterized by rapid induction, followed by transcriptional repression.
Quantitative RT-PCR analysis confirmed the discovered functional modes for several selected lncRNAs associated with proteins involved in cancer and embryonic development. This is the first report detailing dynamical changes of multiple lncRNAs during RA-induced neuroblastoma differentiation. Integration of genomic and transcriptomic levels of information allowed us to demonstrate specific behavior of lncRNAs organized in different genomic architectures. This study also provides a list of lncRNAs with possible roles in neuroblastoma.
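The positive and negative "correlation modes" described in this abstract can be illustrated with a minimal sketch. It assumes a plain Pearson correlation over a time course and a cutoff of 0.5; the abstract does not state which correlation measure or threshold the authors used, and the function names and expression values below are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)

def correlation_mode(lnc_expr, gene_expr, threshold=0.5):
    """Label a lncRNA/partner pair by the sign of its correlation.

    The threshold is an illustrative assumption, not a value from the study.
    """
    r = pearson(lnc_expr, gene_expr)
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "none"

# Hypothetical expression values over five time points of a time course.
lnc = [1.0, 2.0, 3.5, 4.0, 4.1]
partner_up = [0.5, 1.1, 2.0, 2.4, 2.5]    # rises with the lncRNA
partner_down = [3.0, 2.2, 1.0, 0.6, 0.5]  # falls as the lncRNA rises

print(correlation_mode(lnc, partner_up))
print(correlation_mode(lnc, partner_down))
```

A rank-based measure such as Spearman correlation would be a reasonable alternative when expression changes are monotonic but not linear.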
Associations between variants of FADS genes and omega-3 and omega-6 milk fatty acids of Canadian Holstein cows

PubMed Central

2014-01-01

Background Fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code respectively for the enzymes delta-5 and delta-6 desaturases which are rate limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and-6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in FADS1 and 2 genes with plasma and tissue concentrations of omega-3 and-6 FAs. The aim of this study was to evaluate the extent of sequence variations within these two genes in Canadian Holstein cows as well as the association between sequence variants and health promoting FAs in milk. Results Thirty three SNPs were detected within the studied regions of genes including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3′UTR SNP (FADS2-23, rs109772589), and another 3′UTR SNP with an effect on a microRNA binding site within FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three out of seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomogamma linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and one intronic SNP, FADS1-01 (rs136261927) and C20:3n6. Conclusion Our study has demonstrated positive associations between three SNPs within FADS1 and FADS2 genes (a SNP within the 3’UTR, a synonymous SNP and an intronic SNP), with three milk PUFAs of Canadian Holstein cows thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis. 
These SNPs may serve as potential genetic markers in breeding programs to increase milk FAs that are of benefit to human health. PMID:24533445
Associations between variants of FADS genes and omega-3 and omega-6 milk fatty acids of Canadian Holstein cows.

PubMed

Ibeagha-Awemu, Eveline M; Akwanji, Kingsley A; Beaudoin, Frédéric; Zhao, Xin

2014-02-17

Fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code respectively for the enzymes delta-5 and delta-6 desaturases which are rate limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and-6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in FADS1 and 2 genes with plasma and tissue concentrations of omega-3 and-6 FAs. The aim of this study was to evaluate the extent of sequence variations within these two genes in Canadian Holstein cows as well as the association between sequence variants and health promoting FAs in milk. Thirty three SNPs were detected within the studied regions of genes including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3'UTR SNP (FADS2-23, rs109772589), and another 3'UTR SNP with an effect on a microRNA binding site within FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three out of seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomogamma linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and one intronic SNP, FADS1-01 (rs136261927) and C20:3n6. Our study has demonstrated positive associations between three SNPs within FADS1 and FADS2 genes (a SNP within the 3'UTR, a synonymous SNP and an intronic SNP), with three milk PUFAs of Canadian Holstein cows thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis. 
These SNPs may serve as potential genetic markers in breeding programs to increase milk FAs that are of benefit to human health.
An Intron 9 CYP19 Gene Variant (IVS9+5G>A), Present in an Aromatase-Deficient Girl, Affects Normal Splicing and Is Also Present in Normal Human Steroidogenic Tissues.

PubMed

Saraco, Nora; Nesi-Franca, Suzana; Sainz, Romina; Marino, Roxana; Marques-Pereira, Rosana; La Pastina, Julia; Perez Garrido, Natalia; Sandrini, Romolo; Rivarola, Marco Aurelio; de Lacerda, Luiz; Belgorosky, Alicia

2015-01-01

CYP19 gene splicing variants causing aromatase deficiency in 46,XX disorder of sexual development (DSD) patients have been reported in a few cases. An imbalance between normal and aberrant splicing variants was proposed to explain spontaneous pubertal breast development with incomplete progression of sexual maturation. The aim of this study was to functionally characterize a novel homozygous CYP19A1 intronic mutation (IVS9+5G>A) in a 46,XX DSD girl presenting spontaneous breast development and primary amenorrhea, and to evaluate similar splicing variant expression in normal steroidogenic tissues. Genomic DNA analysis, splicing prediction programs, splicing assays, and in vitro protein expression and enzyme activity analyses were carried out. CYP19A1 mRNA expression in human steroidogenic tissues was also studied. A novel homozygous IVS9+5G>A mutation was found. In silico analysis predicted the disappearance of the splicing donor site in intron 9, which was confirmed by patient peripheral leukocyte cP450arom and in vitro studies. Protein analysis showed a shorter and inactive protein. The intron 9 transcript variant was also found in human steroidogenic tissues. The mutation IVS9+5G>A generates a splicing variant that includes intron 9 and that is also present in normal human steroidogenic tissues, suggesting that an imbalance between normal and aberrant splicing variants might occur in target tissues, explaining the clinical phenotype in the affected patient. © 2015 S. Karger AG, Basel.
Circular RNA: A new star of noncoding RNAs.

PubMed

Qu, Shibin; Yang, Xisheng; Li, Xiaolei; Wang, Jianlin; Gao, Yuan; Shang, Runze; Sun, Wei; Dou, Kefeng; Li, Haimin

2015-09-01

Circular RNAs (circRNAs) are a novel type of RNA that, unlike linear RNAs, form a covalently closed continuous loop and are highly represented in the eukaryotic transcriptome. Recent studies have discovered thousands of endogenous circRNAs in mammalian cells. CircRNAs are largely generated from exonic or intronic sequences, and reverse complementary sequences or RNA-binding proteins (RBPs) are necessary for circRNA biogenesis. The majority of circRNAs are conserved across species, are stable and resistant to RNase R, and often exhibit tissue/developmental-stage-specific expression. Recent research has revealed that circRNAs can function as microRNA (miRNA) sponges, regulators of splicing and transcription, and modifiers of parental gene expression. Emerging evidence indicates that circRNAs might play important roles in atherosclerotic vascular disease risk, neurological disorders, prion diseases and cancer; exhibit aberrant expression in colorectal cancer (CRC) and pancreatic ductal adenocarcinoma (PDAC); and serve as diagnostic or predictive biomarkers of some diseases. Similar to miRNAs and long noncoding RNAs (lncRNAs), circRNAs are becoming a new research hotspot in the field of RNA and could be widely involved in the processes of life. Herein, we review the formation and properties of circRNAs, their functions, and their potential significance in disease. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Evidence of function for conserved noncoding sequences in Arabidopsis thaliana.

PubMed

Spangler, Jacob B; Subramaniam, Sabarinath; Freeling, Michael; Feltus, F Alex

2012-01-01

• Whole genome duplication events provide a lineage with a large reservoir of genes that can be molded by evolutionary forces into phenotypes that fit alternative environments. A well-studied whole genome duplication, the α-event, occurred in an ancestor of the model plant Arabidopsis thaliana. Retained segments of the α-event have been defined in recent years in the form of duplicate protein coding sequences (α-pairs) and associated conserved noncoding DNA sequences (CNSs). Our aim was to identify any association between CNSs and α-pair co-functionality at the gene expression level. • Here, we tested for correlation between CNS counts and α-pair co-expression and expression intensity across nine expression datasets: aerial tissue, flowers, leaves, roots, rosettes, seedlings, seeds, shoots and whole plants. • We provide evidence for a putative regulatory role of the CNSs. The association of CNSs with α-pair co-expression and expression intensity varied by gene function, subgene position and the presence of transcription factor binding motifs. A range of possible CNS regulatory mechanisms, including intron-mediated enhancement, messenger RNA fold stability and transcriptional regulation, are discussed. • This study provides a framework to understand how CNS motifs are involved in the maintenance of gene expression after a whole genome duplication event. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.
Whole-genome resequencing reveals candidate mutations for pig prolificacy.

PubMed

Li, Wen-Ting; Zhang, Meng-Meng; Li, Qi-Gang; Tang, Hui; Zhang, Li-Fan; Wang, Ke-Jun; Zhu, Mu-Zhen; Lu, Yun-Feng; Bao, Hai-Gang; Zhang, Yuan-Ming; Li, Qiu-Yan; Wu, Ke-Liang; Wu, Chang-Xin

2017-12-20

Changes in pig fertility have occurred as a result of domestication, but are not understood at the level of genetic variation. To identify variations potentially responsible for prolificacy, we sequenced the genomes of the highly prolific Taihu pig breed and four control breeds. Genes involved in embryogenesis and morphogenesis were targeted in the Taihu pig, consistent with the morphological differences observed between the Taihu pig and others during pregnancy. Additionally, excessive functional non-coding mutations have been specifically fixed or nearly fixed in the Taihu pig. We focused attention on an oestrogen response element (ERE) within the first intron of the bone morphogenetic protein receptor type-1B gene (BMPR1B) that overlaps with a known quantitative trait locus (QTL) for pig fecundity. Using 242 pigs from 30 different breeds, we confirmed that the genotype of the ERE was nearly fixed in the Taihu pig. ERE function was assessed by luciferase assays, examination of histological sections, chromatin immunoprecipitation, quantitative polymerase chain reactions, and western blots. The results suggest that the ERE may control pig prolificacy via the cis-regulation of BMPR1B expression. This study provides new insight into changes in reproductive performance and highlights the role of non-coding mutations in generating phenotypic diversity between breeds. © 2017 The Author(s).
Using a Euclid distance discriminant method to find protein coding genes in the yeast genome.

PubMed

Zhang, Chun-Ting; Wang, Ju; Zhang, Ren

2002-02-01

The Euclid distance discriminant method is used to find protein coding genes in the yeast genome, based on the single nucleotide frequencies at three codon positions in the ORFs. The method is extremely simple and may be extended to find genes in prokaryotic genomes or eukaryotic genomes with fewer introns. Six-fold cross-validation tests have demonstrated that the accuracy of the algorithm is better than 93%. Based on this, it is found that the total number of protein coding genes in the yeast genome is at most 5579, about 3.8-7.0% less than the currently widely accepted 5800-6000. The base compositions at three codon positions are analyzed in detail using a graphic method. The result shows that the preferred codons adopted by yeast genes are of the RGW type, where R, G and W indicate the bases of purine, non-G and A/T, whereas the 'codons' in the intergenic sequences are of the form NNN, where N denotes any base. This fact constitutes the basis of the algorithm to distinguish between coding and non-coding ORFs in the yeast genome. The names of putative non-coding ORFs are listed here in detail.
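A distance-based discriminant on codon-position base frequencies, as described in this abstract, might be sketched as follows. This is an illustrative reading only: it assumes a 12-component feature vector (four bases at each of three codon positions) and a nearest-centroid decision rule, and the abstract does not confirm that these are the exact features or rule used. The training sequences are toy examples, not yeast ORFs.

```python
from math import sqrt

BASES = "ACGT"

def codon_position_freqs(orf):
    """Return 12 numbers: frequency of A, C, G, T at codon positions 1, 2 and 3."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3          # ignore a trailing partial codon
    counts = [[0] * 4 for _ in range(3)]
    for i in range(usable):
        base = orf[i]
        if base in BASES:                     # skip ambiguous symbols such as N
            counts[i % 3][BASES.index(base)] += 1
    freqs = []
    for pos_counts in counts:
        total = sum(pos_counts)
        freqs.extend(c / total if total else 0.0 for c in pos_counts)
    return freqs

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def euclid(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(orf, coding_centroid, noncoding_centroid):
    """Assign the ORF to whichever class centroid is nearer (assumed rule)."""
    f = codon_position_freqs(orf)
    if euclid(f, coding_centroid) < euclid(f, noncoding_centroid):
        return "coding"
    return "non-coding"

# Toy training sets (hypothetical): a position-biased "coding" example and a
# "non-coding" example whose base usage is uniform at every codon position.
coding_c = centroid([codon_position_freqs("GCTGAAGATGCA")])
noncoding_c = centroid([codon_position_freqs("ACGTACGTACGT")])

print(classify("GATGCTGAAGCT", coding_c, noncoding_c))
print(classify("ACGTACGTACGT", coding_c, noncoding_c))
```

In practice the centroids would be estimated from many known coding ORFs and a matched set of non-coding sequences, and accuracy would be checked by cross-validation as the abstract describes.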
Non-coding RNA may be associated with cytoplasmic male sterility in Silene vulgaris

PubMed Central

Stone, James D.; Koloušková, Pavla; Sloan, Daniel B.

2017-01-01

Cytoplasmic male sterility (CMS) is a widespread phenomenon in flowering plants caused by mitochondrial (mt) genes. CMS genes typically encode novel proteins that interfere with mt functions and can be silenced by nuclear fertility-restorer genes. Although the molecular basis of CMS is well established in a number of crop systems, our understanding of it in natural populations is far more limited. To identify CMS genes in a gynodioecious plant, Silene vulgaris, we constructed mt transcriptomes and compared transcript levels and RNA editing patterns in floral bud tissue from female and hermaphrodite full siblings. The transcriptomes from female and hermaphrodite individuals were very similar overall with respect to variation in levels of transcript abundance across the genome, the extent of RNA editing, and the order in which RNA editing and intron splicing events occurred. We found only a single genomic region that was highly overexpressed and differentially edited in females relative to hermaphrodites. This region is not located near any other transcribed elements and lacks an open-reading frame (ORF) of even moderate size. To our knowledge, this transcript would represent the first non-coding mt RNA associated with CMS in plants and is, therefore, an important target for future functional validation studies. PMID:28369520
Transcriptome-wide discovery of circular RNAs in Archaea

PubMed Central

Danan, Miri; Schwartz, Schraga; Edelheit, Sarit; Sorek, Rotem

2012-01-01

Circular RNA forms have been described in all domains of life. Such RNAs were shown to have diverse biological functions, including roles in the life cycle of viral and viroid genomes, and in maturation of permuted tRNA genes. Despite their potentially important biological roles, discovery of circular RNAs has so far been mostly serendipitous. We have developed circRNA-seq, a combined experimental/computational approach that enriches for circular RNAs and allows profiling their prevalence in a whole-genome, unbiased manner. Application of this approach to the archaeon Sulfolobus solfataricus P2 revealed multiple circular transcripts, a subset of which was further validated independently. The identified circular RNAs included expected forms, such as excised tRNA introns and rRNA processing intermediates, but were also enriched with non-coding RNAs, including C/D box RNAs and RNase P, as well as circular RNAs of unknown function. Many of the identified circles were conserved in Sulfolobus acidocaldarius, further supporting their functional significance. Our results suggest that circular RNAs, and particularly circular non-coding RNAs, are more prevalent in archaea than previously recognized, and might have yet unidentified biological roles. Our study establishes a specific and sensitive approach for identification of circular RNAs using RNA-seq, and can readily be applied to other organisms. PMID:22140119
SMN deficiency in severe models of spinal muscular atrophy causes widespread intron retention and DNA damage

PubMed Central

Jangi, Mohini; Fleet, Christina; Cullen, Patrick; Gupta, Shipra V.; Mekhoubad, Shila; Chiao, Eric; Allaire, Norm; Bennett, C. Frank; Rigo, Frank; Krainer, Adrian R.; Hurt, Jessica A.; Carulli, John P.; Staropoli, John F.

2017-01-01

Spinal muscular atrophy (SMA), an autosomal recessive neuromuscular disease, is the leading monogenic cause of infant mortality. Homozygous loss of the gene survival of motor neuron 1 (SMN1) causes the selective degeneration of lower motor neurons and subsequent atrophy of proximal skeletal muscles. The SMN1 protein product, survival of motor neuron (SMN), is ubiquitously expressed and is a key factor in the assembly of the core splicing machinery. The molecular mechanisms by which disruption of the broad functions of SMN leads to neurodegeneration remain unclear. We used an antisense oligonucleotide (ASO)-based inducible mouse model of SMA to investigate the SMN-specific transcriptome changes associated with neurodegeneration. We found evidence of widespread intron retention, particularly of minor U12 introns, in the spinal cord of mice 30 d after SMA induction, which was then rescued by a therapeutic ASO. Intron retention was concomitant with a strong induction of the p53 pathway and DNA damage response, manifesting as γ-H2A.X positivity in neurons of the spinal cord and brain. Widespread intron retention and markers of the DNA damage response were also observed with SMN depletion in human SH-SY5Y neuroblastoma cells and human induced pluripotent stem cell-derived motor neurons. We also found that retained introns, high in GC content, served as substrates for the formation of transcriptional R-loops. We propose that defects in intron removal in SMA promote DNA damage in part through the formation of RNA:DNA hybrid structures, leading to motor neuron death. PMID:28270613
CRNDE: An important oncogenic long non-coding RNA in human cancers.

PubMed

Zhang, Jiaming; Yin, Minuo; Peng, Gang; Zhao, Yingchao

2018-06-01

Aberrant overexpression of the long non-coding RNA CRNDE (Colorectal Neoplasia Differentially Expressed) has been confirmed in various human cancers and is correlated with advanced clinicopathological features and poor prognosis. CRNDE promotes cancer cell proliferation, migration and invasion, and suppresses apoptosis through complex mechanisms, contributing to the initiation and development of human cancers. In this review, we provide an overview of the oncogenic role and potential clinical applications of CRNDE. © 2018 John Wiley & Sons Ltd.
Whole-exome/genome sequencing and genomics.

PubMed

Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne

2013-12-01

As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.
Discovering weighted patterns in intron sequences using self-adaptive harmony search and back-propagation algorithms.

PubMed

Huang, Yin-Fu; Wang, Chia-Ming; Liou, Sing-Wu

2013-01-01

A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete.
Discovering Weighted Patterns in Intron Sequences Using Self-Adaptive Harmony Search and Back-Propagation Algorithms

PubMed Central

Wang, Chia-Ming; Liou, Sing-Wu

2013-01-01

A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete. PMID:23737711
The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

PubMed Central

Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

2015-01-01

Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. 
In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, ncCADD and ncGWAVA, and find both scores are significantly predictive of human dosage sensitive genes and appear to carry information beyond conservation, as assessed by ncGERP. These results highlight that the intolerance of noncoding sequence stretches in the human genome can provide a critical complementary tool to other genome annotation approaches to help identify the parts of the human genome increasingly likely to harbor mutations that influence risk of disease. PMID:26332131
Association of keratin 8/18 variants with non-alcoholic fatty liver disease and insulin resistance in Chinese patients: A case-control study.

PubMed

Li, Rui; Liao, Xian-Hua; Ye, Jun-Zhao; Li, Min-Rui; Wu, Yan-Qin; Hu, Xuan; Zhong, Bi-Hui

2017-06-14

To test the hypothesis that K8/K18 variants predispose humans to non-alcoholic fatty liver disease (NAFLD) progression and its metabolic phenotypes. We selected a total of 373 unrelated adult subjects from our Physical Examination Department, including 200 unrelated NAFLD patients and 173 controls of both genders and different ages. Diagnoses of NAFLD were established according to ultrasonic signs of fatty liver. All subjects were tested for population characteristics, lipid profile, liver tests, as well as glucose tests. Genomic DNA was obtained from peripheral blood with a DNeasy Tissue Kit. K8/K18 coding regions were analyzed, including 15 exons and exon-intron boundaries. Among 200 NAFLD patients, 10 (5%) heterozygous carriers of keratin variants were identified. There were 5 amino-acid-altering heterozygous variants and 6 non-coding heterozygous variants. One novel amino-acid-altering heterozygous variant (K18 N193S) and three novel non-coding variants were observed (K8 IVS5-9A→G, K8 IVS6+19G→A, K18 T195T). A total of 9 patients had a single variant and 1 patient had compound variants (K18 N193S+K8 IVS3-15C→G). Only one R341H variant was found in the control group (1 of 173, 0.58%). The frequency of keratin variants in NAFLD patients was significantly higher than that in the control group (5% vs 0.58%, P = 0.015). Notably, the keratin variants were significantly associated with insulin resistance (IR) in NAFLD patients (8.86% in NAFLD patients with IR vs 2.5% in NAFLD patients without IR, P = 0.043). K8/K18 variants are overrepresented in Chinese NAFLD patients and might accelerate liver fat storage through IR.
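The carrier-frequency comparison in the record above (10 of 200 NAFLD patients versus 1 of 173 controls) can be re-examined with a small contingency-table test. The following is a minimal sketch using a two-sided Fisher's exact test; the abstract does not say which test the authors used, so the choice of test and the function name `fisher_exact_two_sided` are assumptions made here for illustration, and the result need not match the reported P value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Counts taken from the abstract: carriers / non-carriers in each group.
p_value = fisher_exact_two_sided(10, 190, 1, 172)
print(f"carrier frequency {10 / 200:.2%} vs {1 / 173:.2%}, p = {p_value:.4f}")
```

With only 11 carriers in total, an exact test is a common choice over a chi-square approximation, which is why it is used in this sketch.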
Genetic variation in eleven phase I drug metabolism genes in an ethnically diverse population.

PubMed

Solus, Joseph F; Arietta, Brenda J; Harris, James R; Sexton, David P; Steward, John Q; McMunn, Chara; Ihrie, Patrick; Mehall, Janelle M; Edwards, Todd L; Dawson, Elliott P

2004-10-01

The extent of genetic variation found in drug metabolism genes and its contribution to interindividual variation in response to medication remains incompletely understood. To better determine the identity and frequency of variation in 11 phase I drug metabolism genes, the exons and flanking intronic regions of the cytochrome P450 (CYP) isoenzyme genes CYP1A1, CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4 and CYP3A5 were amplified from genomic DNA and sequenced. A total of 60 kb of bi-directional sequence was generated from each of 93 human DNAs, which included Caucasian, African-American and Asian samples. There were 388 different polymorphisms identified. These included 269 non-coding, 45 synonymous and 74 non-synonymous polymorphisms. Of these, 54% were novel and included 176 non-coding, 14 synonymous and 21 non-synonymous polymorphisms. Of the novel variants observed, 85 were represented by single occurrences of the minor allele in the sample set. Much of the variation observed was from low-frequency alleles. Comparatively, these genes are variation-rich. Calculations measuring genetic diversity revealed that while the values for the individual genes are widely variable, the overall nucleotide diversity of 7.7 x 10(-4) and polymorphism parameter of 11.5 x 10(-4) are higher than those previously reported for other gene sets. Several independent measurements indicate that these genes are under selective pressure, particularly for polymorphisms corresponding to non-synonymous amino acid changes. There is relatively little difference in measurements of diversity among the ethnic groups, but there are large differences among the genes and gene subfamilies themselves. Of the three CYP subfamilies involved in phase I drug metabolism (1, 2, and 3), subfamily 2 displays the highest levels of genetic diversity.
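The "polymorphism parameter" quoted in the record above is commonly estimated as Watterson's theta, the number of segregating sites divided by a sample-size correction and the sequence length. The sketch below is an illustration only: it assumes that all 388 polymorphisms count as segregating sites, that the 93 diploid samples contribute 186 chromosomes, and that the surveyed length is exactly 60,000 bp. None of these assumptions is confirmed by the abstract, so the output should be read as an order-of-magnitude check rather than a reproduction of the authors' calculation.

```python
def watterson_theta(segregating_sites, n_chromosomes, length_bp):
    """Per-site Watterson estimator: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    if n_chromosomes < 2 or length_bp <= 0:
        raise ValueError("need at least two chromosomes and a positive length")
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return segregating_sites / (a_n * length_bp)


# Hypothetical inputs derived from the abstract (see assumptions above).
theta = watterson_theta(segregating_sites=388, n_chromosomes=2 * 93, length_bp=60_000)
print(f"theta_W per site = {theta:.2e}")
```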
Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16.

PubMed

Hamilton, Natasha A; Tammen, Imke; Raadsma, Herman W

2013-01-01

Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.

Multi-Species Comparative Analysis of the Equine ACE Gene Identifies a Highly Conserved Potential Transcription Factor Binding Site in Intron 16

PubMed Central

Hamilton, Natasha A.; Tammen, Imke; Raadsma, Herman W.

2013-01-01

Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism. PMID:23408978
Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data

PubMed Central

Xu, Shuhua

2015-01-01

Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection that typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) of Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the natural selection footprints in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint of variations in NCS, while positive selection is more likely to be specific to some particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants, reported as risk alleles by genome-wide association studies, showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potentials that play important roles in disease etiology and human evolution. PMID:26053627
Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

PubMed Central

Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

2014-01-01

Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae)—a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. 
Sequencing whole plastid genomes to find markers for evolutionary analyses is therefore particularly useful when overall genetic distances are low. PMID:25405773
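The p-distance reported in the record above is the proportion of aligned sites at which two sequences differ. A minimal sketch follows, using toy sequences rather than the Pyrus data; skipping alignment columns that contain a gap is an assumption made here, and the authors may have handled indels differently.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences, ignoring gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gap columns are excluded from the comparison
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment (hypothetical, not taken from the study): one substitution in ten sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

A value such as 0.00145 corresponds to roughly 1.45 substitutions per 1,000 compared sites, which illustrates how few informative characters are available between closely related plastid genomes.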
Human intron-encoded Alu RNAs are processed and packaged into Wdr79-associated nucleoplasmic box H/ACA RNPs

PubMed Central

Jády, Beáta E.; Ketele, Amandine; Kiss, Tamás

2012-01-01

Alu repetitive sequences are the most abundant short interspersed DNA elements in the human genome. Full-length Alu elements are composed of two tandem sequence monomers, the left and right Alu arms, both derived from the 7SL signal recognition particle RNA. Since Alu elements are common in protein-coding genes, they are frequently transcribed into pre-mRNAs. Here, we demonstrate that the right arms of nascent Alu transcripts synthesized within pre-mRNA introns are processed into metabolically stable small RNAs. The intron-encoded Alu RNAs, termed AluACA RNAs, are structurally highly reminiscent of box H/ACA small Cajal body (CB) RNAs (scaRNAs). They are composed of two hairpin units followed by the essential H (AnAnnA) and ACA box motifs. The mature AluACA RNAs associate with the four H/ACA core proteins: dyskerin, Nop10, Nhp2, and Gar1. Moreover, the 3′ hairpin of AluACA RNAs carries two closely spaced CB localization motifs, CAB boxes (UGAG), which bind Wdr79 in a cumulative fashion. In contrast to canonical H/ACA scaRNPs, which concentrate in CBs, the AluACA RNPs accumulate in the nucleoplasm. Identification of 348 human AluACA RNAs demonstrates that intron-encoded AluACA RNAs represent a novel, large subgroup of H/ACA RNAs, which are apparently confined to human or primate cells. PMID:22892240
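The sequence motifs named in the record above, the H box (AnAnnA), the ACA box, and the CAB box (UGAG), are short enough to scan for with regular expressions. The sketch below is illustrative only: the function name `find_box_motifs` and the toy RNA are hypothetical, and a real H/ACA RNA search would also need to check hairpin structure and motif positions, which this code does not attempt.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "H": re.compile(r"(?=A[ACGU]A[ACGU]{2}A)"),  # H box: AnAnnA
    "ACA": re.compile(r"(?=ACA)"),               # ACA box
    "CAB": re.compile(r"(?=UGAG)"),              # CAB box
}


def find_box_motifs(sequence):
    """Return 0-based start positions of each motif in an RNA (or DNA) sequence."""
    rna = sequence.upper().replace("T", "U")
    return {name: [m.start() for m in pattern.finditer(rna)]
            for name, pattern in MOTIFS.items()}


# Toy sequence (hypothetical, not a real AluACA RNA).
hits = find_box_motifs("GGAGAUUAGCUGAGCCACAUUU")
print(hits)
```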
The human myelin oligodendrocyte glycoprotein (MOG) gene: Complete nucleotide sequence and structural characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paule Roth, M.; Malfroy, L.; Offer, C.

1995-07-20

Human myelin oligodendrocyte glycoprotein (MOG), a myelin component of the central nervous system, is a candidate target antigen for autoimmune-mediated demyelination. We have isolated and sequenced part of a cosmid clone that contains the entire human MOG gene. The primary nuclear transcript, extending from the putative start of transcription to the site of poly(A) addition, is 15,561 nucleotides in length. The human MOG gene contains 8 exons, separated by 7 introns; canonical intron/exon boundary sites are observed at each junction. The introns vary in size from 242 to 6484 bp and contain numerous repetitive DNA elements, including 14 Alu sequences within 3 introns. Another Alu element is located in the 3'-untranslated region of the gene. Alu sequences were classified with respect to subfamily assignment. Seven hundred sixty-three nucleotides 5' of the transcription start and 1214 nucleotides 3' of the poly(A) addition sites were also sequenced. The 5'-flanking region revealed the presence of several consensus sequences that could be relevant in the transcription of the MOG gene, in particular binding sites in common with other myelin gene promoters. Two polymorphic intragenic dinucleotide (CA)n and tetranucleotide (TAAA)n repeats were identified and may provide genetic marker tools for association and linkage studies. 50 refs., 3 figs., 3 tabs.
Progressive changes in non-coding RNA profile in leucocytes with age

PubMed Central

Muñoz-Culla, Maider; Irizar, Haritz; Gorostidi, Ana; Alberro, Ainhoa; Osorio-Querejeta, Iñaki; Ruiz-Martínez, Javier; Olascoaga, Javier; de Munain, Adolfo López; Otaegui, David

2017-01-01

It has been observed that immune cell deterioration occurs in the elderly, as well as a chronic low-grade inflammation called inflammaging. These cellular changes must be driven by numerous changes in gene expression and in fact, both protein-coding and non-coding RNA expression alterations have been observed in peripheral blood mononuclear cells from elderly people. In the present work we have studied the expression of small non-coding RNA (microRNA and small nucleolar RNA, snoRNA) from healthy individuals from 24 to 79 years old. We have observed that the expression of 69 non-coding RNAs (56 microRNAs and 13 snoRNAs) changes progressively with chronological age. According to our results, the age range from 47 to 54 is critical given that it is the period when the expression trend (increasing or decreasing) of age-related small non-coding RNAs is more pronounced. Furthermore, age-related miRNAs regulate genes that are involved in immune, cell cycle and cancer-related processes, which had already been associated with human aging. Therefore, human aging could be studied as a result of progressive molecular changes, and different age ranges should be analysed to cover the whole aging process. PMID:28448962
Selective inhibitors of trypanosomal uridylyl transferase RET1 establish druggability of RNA post-transcriptional modifications

PubMed Central

Cording, Amy; Gormally, Michael; Bond, Peter J.; Carrington, Mark; Balasubramanian, Shankar; Miska, Eric A.; Thomas, Beth

2017-01-01

Non-coding RNAs are crucial regulators for a vast array of cellular processes and have been implicated in human disease. These biological processes represent a hitherto untapped resource in our fight against disease. In this work we identify small molecule inhibitors of a non-coding RNA uridylylation pathway. The TUTase family of enzymes is important for modulating non-coding RNA pathways in both human cancer and pathogen systems. We demonstrate that this new class of drug target can be accessed with traditional drug discovery techniques. Using the Trypanosoma brucei TUTase, RET1, we identify TUTase inhibitors and lay the groundwork for the use of this new target class as a therapeutic opportunity for the under-served disease area of African Trypanosomiasis. In a broader sense this work demonstrates the therapeutic potential for targeting RNA post-transcriptional modifications with small molecules in human disease. PMID:26786754
Selective inhibitors of trypanosomal uridylyl transferase RET1 establish druggability of RNA post-transcriptional modifications.

PubMed

Cording, Amy; Gormally, Michael; Bond, Peter J; Carrington, Mark; Balasubramanian, Shankar; Miska, Eric A; Thomas, Beth

2017-05-04

Non-coding RNAs are crucial regulators for a vast array of cellular processes and have been implicated in human disease. These biological processes represent a hitherto untapped resource in our fight against disease. In this work we identify small molecule inhibitors of a non-coding RNA uridylylation pathway. The TUTase family of enzymes is important for modulating non-coding RNA pathways in both human cancer and pathogen systems. We demonstrate that this new class of drug target can be accessed with traditional drug discovery techniques. Using the Trypanosoma brucei TUTase, RET1, we identify TUTase inhibitors and lay the groundwork for the use of this new target class as a therapeutic opportunity for the under-served disease area of African Trypanosomiasis. In a broader sense this work demonstrates the therapeutic potential for targeting RNA post-transcriptional modifications with small molecules in human disease.
Aberrant splicing in maize rough endosperm3 reveals a conserved role for U12 splicing in eukaryotic multicellular development

PubMed Central

Barbazuk, W. Brad

2017-01-01

RNA splicing of U12-type introns functions in human cell differentiation, but it is not known whether this class of introns has a similar role in plants. The maize ROUGH ENDOSPERM3 (RGH3) protein is orthologous to the human splicing factor, ZRSR2. ZRSR2 mutations are associated with myelodysplastic syndrome (MDS) and cause U12 splicing defects. Maize rgh3 mutants have aberrant endosperm cell differentiation and proliferation. We found that most U12-type introns are retained or misspliced in rgh3. Genes affected in rgh3 and ZRSR2 mutants identify cell cycle and protein glycosylation as common pathways disrupted. Transcripts with retained U12-type introns can be found in polysomes, suggesting that splicing efficiency can alter protein isoforms. The rgh3 mutant protein disrupts colocalization with a known ZRSR2-interacting protein, U2AF2. These results indicate conserved function for RGH3/ZRSR2 in U12 splicing and a deeply conserved role for the minor spliceosome to promote cell differentiation from stem cells to terminal fates. PMID:28242684
Sensing Self and Foreign Circular RNAs by Intron Identity.

PubMed

Chen, Y Grace; Kim, Myoungjoo V; Chen, Xingqi; Batista, Pedro J; Aoyama, Saeko; Wilusz, Jeremy E; Iwasaki, Akiko; Chang, Howard Y

2017-07-20

Circular RNAs (circRNAs) are single-stranded RNAs that are joined head to tail with largely unknown functions. Here we show that transfection of purified in vitro generated circRNA into mammalian cells led to potent induction of innate immunity genes and confers protection against viral infection. The nucleic acid sensor RIG-I is necessary to sense foreign circRNA, and RIG-I and foreign circRNA co-aggregate in cytoplasmic foci. CircRNA activation of innate immunity is independent of a 5' triphosphate, double-stranded RNA structure, or the primary sequence of the foreign circRNA. Instead, self-nonself discrimination depends on the intron that programs the circRNA. Use of a human intron to express a foreign circRNA sequence abrogates immune activation, and mature human circRNA is associated with diverse RNA binding proteins reflecting its endogenous splicing and biogenesis. These results reveal innate immune sensing of circRNA and highlight introns (the predominant output of mammalian transcription) as arbiters of self-nonself identity.
Gene regulation by noncoding RNAs

PubMed Central

Patil, Veena S.; Zhou, Rui; Rana, Tariq M.

2015-01-01

The past two decades have seen an explosion in research on noncoding RNAs and their physiological and pathological functions. Several classes of small (20–30 nucleotides) and long (>200 nucleotides) noncoding RNAs have been firmly established as key regulators of gene expression in myriad processes ranging from embryonic development to innate immunity. In this review, we focus on our current understanding of the molecular mechanisms underlying the biogenesis and function of small interfering RNAs (siRNAs), microRNAs (miRNAs), and Piwi-interacting RNAs (piRNAs). In addition, we briefly review the relevance of small and long noncoding RNAs to human physiology and pathology and their potential to be exploited as therapeutic agents. PMID:24164576
Integrating non-coding RNAs in JAK-STAT regulatory networks

PubMed Central

Witte, Steven; Muljo, Stefan A

2014-01-01

Being a well-characterized pathway, JAK-STAT signaling serves as a valuable paradigm for studying the architecture of gene regulatory networks. The discovery of untranslated or non-coding RNAs, namely microRNAs and long non-coding RNAs, provides an opportunity to elucidate their roles in such networks. In principle, these regulatory RNAs can act as downstream effectors of the JAK-STAT pathway and/or affect signaling by regulating the expression of JAK-STAT components. Examples of interactions between signaling pathways and non-coding RNAs have already emerged in basic cell biology and human diseases such as cancer, and can potentially guide the identification of novel biomarkers or drug targets for medicine. PMID:24778925
Genomic organization and expression of the human MSH3 gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Watanabe, Atsushi; Ikejima, Miyoko; Suzuki, Noriko

1996-02-01

We have studied the expression and genomic organization of the human MSH3 gene, which encodes a human homologue of the bacterial DNA mismatch repair protein MutS. This gene is located upstream of the dihydrofolate reductase (DHFR) gene. Northern analysis has demonstrated that the hMSH3 gene is expressed in a variety of human tissues at low levels, like the DHFR gene. Characterization of cosmid clones has shown that the hMSH3 gene consists of 24 exons spanning at least 160 kb. All exon-intron junction sequences match the classical GT/AG rule, except that intron 6 has AT and AA at the ends. Two major transcripts of 5.0 and 3.8 kb have been shown to be derived from the differential use of two polyadenylation sites. Elucidation of the complete genomic organization and the nucleotide sequences of the introns of the hMSH3 gene should be useful for studying the function of this gene and the possible involvement of specific mutations of the hMSH3 gene in some diseases. 34 refs., 5 figs., 1 tab.
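The GT/AG rule mentioned in the record above can be checked mechanically by reading the first and last two nucleotides of each intron. The sketch below is a minimal illustration; the function name `classify_intron_ends` and both toy sequences are hypothetical and are not taken from the hMSH3 gene.

```python
def classify_intron_ends(intron_seq):
    """Return (donor, acceptor, label) from the terminal dinucleotides of an intron."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron sequence is too short to have distinct ends")
    donor, acceptor = seq[:2], seq[-2:]
    label = "canonical GT/AG" if (donor, acceptor) == ("GT", "AG") else "non-canonical"
    return donor, acceptor, label


# Toy introns (hypothetical): one follows the GT/AG rule, one has AT and AA ends.
for name, seq in [("toy_canonical", "GTAAGTCCCTTTCAG"), ("toy_at_aa", "ATATCCTTTTCCAA")]:
    print(name, classify_intron_ends(seq))
```

Applied to every annotated intron of a gene, a check like this flags exceptions such as the AT and AA ends reported for intron 6.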
Allopolyploidization and evolution of species with reduced floral structures in Lepidium L. (Brassicaceae)

PubMed Central

Lee, Ji-Young; Mummenhoff, Klaus; Bowman, John L.

2002-01-01

Understanding the pattern of speciation in a group of plants is critical for understanding its morphological evolution. Lepidium is the genus with the largest variation in floral structure in Brassicaceae, a family in which the floral ground plan is remarkably stable. However, flowers in more than half of Lepidium species have reduced stamen numbers, and most of these also have reduced petals. The species with reduced flowers are geographically biased, distributed mostly in the Americas and Australia/ New Zealand. Previous phylogenetic studies using noncoding regions of chloroplast DNA and rDNA internal transcribed spacer were incongruent in most New World species relationships. These data, combined with the presence of many polyploid Lepidium species, implied a reticulate history of the genus but did not provide enough information to infer the evolutionary pattern of flower structures. To address this question more thoroughly, sequences of the first intron of a single copy nuclear gene, PISTILLATA, were determined from 43 species. Phylogenetic analysis of the PI intron suggests that many species in the New World have originated from allopolyploidization, and that this is correlated with floral reduction. Interspecific hybrids were generated to understand why allopolyploidization is associated with reduced flowers. The phenotypes of F1 flowers indicate allelic dominance of the absence of lateral stamens, suggesting that propagation of dominant alleles through interspecific hybridization could account for the abundance of the allopolyploid species without lateral stamens. PMID:12481035
Gene-specific cell labeling using MiMIC transposons.

PubMed

Gnerer, Joshua P; Venken, Koen J T; Dierick, Herman A

2015-04-30

Binary expression systems such as GAL4/UAS, LexA/LexAop and QF/QUAS have greatly enhanced the power of Drosophila as a model organism by allowing spatio-temporal manipulation of gene function as well as cell and neural circuit function. Tissue-specific expression of these heterologous transcription factors relies on random transposon integration near enhancers or promoters that drive the binary transcription factor embedded in the transposon. Alternatively, gene-specific promoter elements are directly fused to the binary factor within the transposon followed by random or site-specific integration. However, such insertions do not consistently recapitulate endogenous expression. We used Minos-Mediated Integration Cassette (MiMIC) transposons to convert host loci into reliable gene-specific binary effectors. MiMIC transposons allow recombinase-mediated cassette exchange to modify the transposon content. We developed novel exchange cassettes to convert coding intronic MiMIC insertions into gene-specific binary factor protein-traps. In addition, we expanded the set of binary factor exchange cassettes available for non-coding intronic MiMIC insertions. We show that binary factor conversions of different insertions in the same locus have indistinguishable expression patterns, suggesting that they reliably reflect endogenous gene expression. We show the efficacy and broad applicability of these new tools by dissecting the cellular expression patterns of the Drosophila serotonin receptor gene family.
COL1A1 transgene expression in stably transfected osteoblastic cells. Relative contributions of first intron, 3'-flanking sequences, and sequences derived from the body of the human COL1A1 minigene

NASA Technical Reports Server (NTRS)

Breault, D. T.; Lichtler, A. C.; Rowe, D. W.

1997-01-01

Collagen reporter gene constructs have been used to identify cell-specific sequences needed for transcriptional activation. The elements required for endogenous levels of COL1A1 expression, however, have not been elucidated. The human COL1A1 minigene is expressed at high levels and likely harbors sequence elements required for endogenous levels of activity. Using stably transfected osteoblastic Py1a cells, we studied a series of constructs (pOBColCAT) designed to characterize further the elements required for high levels of expression. pOBColCAT, which contains the COL1A1 first intron, was expressed at 50-100-fold higher levels than ColCAT 3.6, which lacks the first intron. This difference is best explained by improved mRNA processing rather than a transcriptional effect. Furthermore, variation in activity observed with the intron deletion constructs is best explained by altered mRNA splicing. Two major regions of the human COL1A1 minigene, the 3'-flanking sequences and the minigene body, were introduced into pOBColCAT to assess both transcriptional enhancing activity and the effect on mRNA stability. Analysis of the minigene body, which includes the first five exons and introns fused with the terminal six introns and exons, revealed an orientation-independent 5-fold increase in CAT activity. In contrast, the 3'-flanking sequences gave rise to a modest 61% increase in CAT activity. Neither region increased the mRNA half-life of the parent construct, suggesting that CAT-specific mRNA instability elements may serve as dominant negative regulators of stability. This study suggests that other sites within the body of the COL1A1 minigene are important for high expression, e.g. during periods of rapid extracellular matrix production.
Ftx is a non-coding RNA which affects Xist expression and chromatin structure within the X-inactivation center region.

PubMed

Chureau, Corinne; Chantalat, Sophie; Romito, Antonio; Galvani, Angélique; Duret, Laurent; Avner, Philip; Rougeulle, Claire

2011-02-15

X chromosome inactivation (XCI) is an essential epigenetic process which involves several non-coding RNAs (ncRNAs), including Xist, the master regulator of X-inactivation initiation. Xist is flanked in its 5' region by a large heterochromatic hotspot, which contains several transcription units including a gene of unknown function, Ftx (five prime to Xist). In this article, we describe the characterization and functional analysis of murine Ftx. We present evidence that Ftx produces a conserved functional long ncRNA, and additionally hosts microRNAs (miR) in its introns. Strikingly, Ftx partially escapes X-inactivation and is upregulated specifically in female ES cells at the onset of X-inactivation, an expression profile which closely follows that of Xist. We generated Ftx null ES cells to address the function of this gene. In these cells, only local changes in chromatin marks are detected within the hotspot, indicating that Ftx is not involved in the global maintenance of the heterochromatic structure of this region. The Ftx mutation, however, results in widespread alteration of transcript levels within the X-inactivation center (Xic) and particularly important decreases in Xist RNA levels, which were correlated with increased DNA methylation at the Xist CpG island. Altogether our results indicate that Ftx is a positive regulator of Xist and lead us to propose that Ftx is a novel ncRNA involved in XCI.
Aurora-A as a Modifier of Breast Cancer Risk in BRCA 1/2 Mutation Carriers

DTIC Science & Technology

2007-06-01

Dieter Schaefer, Institute of Human Genetics, University of Frankfurt, Frankfurt, Germany; Norbert Arnold, University of Schleswig-Holstein, Campus... [Figure 3 | The FGFR2 locus. a, Map of the whole FGFR2 gene, viewed relative to common SNPs on HapMap. Labels: Intron 1, Intron 2; Opossum, Mouse, Rat, Cow, Dog.]
Deep intronic GPR143 mutation in a Japanese family with ocular albinism

PubMed Central

Naruto, Takuya; Okamoto, Nobuhiko; Masuda, Kiyoshi; Endo, Takao; Hatsukawa, Yoshikazu; Kohmoto, Tomohiro; Imoto, Issei

2015-01-01

Deep intronic mutations are often ignored as possible causes of human disease. Using whole-exome sequencing, we analysed genomic DNAs of a Japanese family with two male siblings affected by ocular albinism and congenital nystagmus. Although mutations or copy number alterations of coding regions were not identified in candidate genes, the novel intronic mutation c.659-131 T > G within GPR143 intron 5 was identified as hemizygous in affected siblings and as heterozygous in the unaffected mother. This mutation was predicted to create a cryptic splice donor site within intron 5 and activate a cryptic acceptor site at 41nt upstream, causing the insertion into the coding sequence of an out-of-frame 41-bp pseudoexon with a premature stop codon in the aberrant transcript, which was confirmed by minigene experiments. This result expands the mutational spectrum of GPR143 and suggests the utility of next-generation sequencing integrated with in silico and experimental analyses for improving the molecular diagnosis of this disease. PMID:26061757
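The frameshift reasoning in the record above comes down to simple arithmetic: an inserted pseudoexon preserves the reading frame only if its length is a multiple of three. The following Python sketch is illustrative only. The function name `insertion_frame_effect` is hypothetical and not taken from the study, and it checks only the length rule, not whether a premature stop codon actually arises.

```python
def insertion_frame_effect(inserted_bp: int) -> str:
    """Classify an insertion into a coding sequence by its effect on the reading frame."""
    if inserted_bp <= 0:
        raise ValueError("inserted_bp must be a positive number of base pairs")
    remainder = inserted_bp % 3  # codons are 3 nt long
    if remainder == 0:
        return "in-frame"
    return f"out-of-frame (shift of {remainder} nt)"


# The 41-bp pseudoexon reported for GPR143 intron 5
print(insertion_frame_effect(41))
```

For the 41-bp pseudoexon, 41 = 13 × 3 + 2, so the insertion shifts the downstream frame by two nucleotides, which is consistent with the out-of-frame insertion described in the abstract.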
Deep intronic GPR143 mutation in a Japanese family with ocular albinism.

PubMed

Naruto, Takuya; Okamoto, Nobuhiko; Masuda, Kiyoshi; Endo, Takao; Hatsukawa, Yoshikazu; Kohmoto, Tomohiro; Imoto, Issei

2015-06-10

Deep intronic mutations are often ignored as possible causes of human disease. Using whole-exome sequencing, we analysed genomic DNAs of a Japanese family with two male siblings affected by ocular albinism and congenital nystagmus. Although mutations or copy number alterations of coding regions were not identified in candidate genes, the novel intronic mutation c.659-131 T > G within GPR143 intron 5 was identified as hemizygous in affected siblings and as heterozygous in the unaffected mother. This mutation was predicted to create a cryptic splice donor site within intron 5 and activate a cryptic acceptor site at 41nt upstream, causing the insertion into the coding sequence of an out-of-frame 41-bp pseudoexon with a premature stop codon in the aberrant transcript, which was confirmed by minigene experiments. This result expands the mutational spectrum of GPR143 and suggests the utility of next-generation sequencing integrated with in silico and experimental analyses for improving the molecular diagnosis of this disease.

Sexual dimorphism of AMBRA1-related autistic features in human and mouse.

PubMed

Mitjans, M; Begemann, M; Ju, A; Dere, E; Wüstefeld, L; Hofer, S; Hassouna, I; Balkenhol, J; Oliveira, B; van der Auwera, S; Tammer, R; Hammerschmidt, K; Völzke, H; Homuth, G; Cecconi, F; Chowdhury, K; Grabe, H; Frahm, J; Boretius, S; Dandekar, T; Ehrenreich, H

2017-10-10

Ambra1 is linked to autophagy and neurodevelopment. Heterozygous Ambra1 deficiency induces autism-like behavior in a sexually dimorphic manner. Extraordinarily, autistic features are seen in female mice only, combined with stronger Ambra1 protein reduction in brain compared to males. However, the significance of AMBRA1 for autistic phenotypes in humans and, apart from behavior, for other autism-typical features, namely early brain enlargement or increased seizure propensity, has remained unexplored. Here we show in two independent human samples that a single normal AMBRA1 genotype, the intronic SNP rs3802890-AA, is associated with autistic features in women, who also display lower AMBRA1 mRNA expression in peripheral blood mononuclear cells relative to female GG carriers. Located within a non-coding RNA, likely relevant for mRNA and protein interaction, rs3802890 (A versus G allele) may affect its stability through modification of folding, as predicted by in silico analysis. Searching for further autism-relevant characteristics in Ambra1 +/- mice, we observe reduced interest of female but not male mutants regarding pheromone signals of the respective other gender in the social intellicage set-up. Moreover, altered pentylenetetrazol-induced seizure propensity, an in vivo readout of neuronal excitation-inhibition imbalance, becomes obvious exclusively in female mutants. Magnetic resonance imaging reveals mild prepubertal brain enlargement in both genders, uncoupling enhanced brain dimensions from the primarily female expression of all other autistic phenotypes investigated here. These data support a role of AMBRA1/Ambra1 partial loss-of-function genotypes for female autistic traits. Moreover, they suggest Ambra1 heterozygous mice as a novel multifaceted and construct-valid genetic mouse model for female autism.
The RNA-binding landscape of RBM10 and its role in alternative splicing regulation in models of mouse early development.

PubMed

Rodor, Julie; FitzPatrick, David R; Eyras, Eduardo; Cáceres, Javier F

2017-01-02

Mutations in the RNA-binding protein, RBM10, result in a human syndromic form of cleft palate, termed TARP syndrome. A role for RBM10 in alternative splicing regulation has been previously demonstrated in human cell lines. To uncover the cellular functions of RBM10 in a cell line that is relevant to the phenotype observed in TARP syndrome, we used iCLIP to identify its endogenous RNA targets in a mouse embryonic mandibular cell line. We observed that RBM10 binds to pre-mRNAs with significant enrichment in intronic regions, in agreement with a role for this protein in pre-mRNA splicing. In addition to protein-coding transcripts, RBM10 also binds to a variety of cellular RNAs, including non-coding RNAs, such as spliceosomal small nuclear RNAs, U2 and U12. RNA-seq was used to investigate changes in gene expression and alternative splicing in RBM10 KO mouse mandibular cells and also in mouse ES cells. We uncovered a role for RBM10 in the regulation of alternative splicing of common transcripts in both cell lines but also identified cell-type specific events. Importantly, those pre-mRNAs that display changes in alternative splicing also contain RBM10 iCLIP tags, suggesting a direct role of RBM10 in these events. Finally, we show that depletion of RBM10 in mouse ES cells leads to proliferation defects and to gross alterations in their differentiation potential. These results demonstrate a role for RBM10 in the regulation of alternative splicing in two cell models of mouse early development and suggest that mutations in RBM10 could lead to splicing changes that affect normal palate development and cause human disease.
Isolation and identification of gene-specific microRNAs.

PubMed

Lin, Shi-Lung; Chang, Donald C; Ying, Shao-Yao

2006-01-01

Prediction of microRNA (miRNA) candidates using computer programming has identified hundreds and hundreds of genomic hairpin sequences, of which, the functions remain to be determined. Because direct transfection of hairpin-like miRNA precursors (pre)-miRNAs in mammalian cells is not always sufficient to trigger effective RNA-induced gene-silencing complex (RISC) assembly, a key step for RNA interference (RNAi)-related gene silencing, we developed an intronic miRNA-expressing system to overcome this problem, and successfully increased the efficiency and effectiveness of miRNA-associated RNAi induction in vitro and in vivo. By insertion of a hairpin-like pre-miRNA structure into the intron region of a gene, this intronic miRNA biogenesis system has been found to depend on a coupled interaction of nascent precursor messenger RNA transcription and intron excision within a specific nuclear region proximal to genomic perichromatin fibrils. The intronic miRNA was transcribed by RNA type II polymerases, coexpressed with a primary gene transcript, and excised out of its encoding gene transcript by intracellular RNA splicing and processing mechanisms. Currently, some ribonuclease III endonucleases have been found to be involved in the processing of spliced introns and probably facilitating the intronic miRNA maturation. Using this miRNA-expressing system, we have shown for the first time that the intron-derived miRNAs were able to induce strong RNAi effects in not only human and mouse cells but also zebrafish, chicken embryos, and adult mice. Based on the strand complementarity between the designed miRNA and its target gene sequence, we have also developed a miRNA isolation protocol to purify and identify the mature miRNAs generated by the intronic miRNA-expressing system. Several intronic miRNA identities and structures are currently confirmed to be active in vitro and in vivo. 
According to this proof-of-principle method, we now have the knowledge to design pre-miRNA inserts that are more efficient and effective for the intronic miRNA-expressing system.
Isolation and identification of gene-specific microRNAs.

PubMed

Lin, Shi-Lung; Chang, Donald C; Ying, Shao-Yao

2013-01-01

Computer programming has identified hundreds of genomic hairpin sequences, many with functions that remain to be determined. Because direct transfection of hairpin-like miRNA precursors (pre-miRNAs) in mammalian cells is not always sufficient to trigger effective RNA-induced gene silencing complex (RISC) assembly, a key step for RNA interference (RNAi)-related gene silencing, we developed an intronic miRNA-expressing system to overcome this problem by inserting a hairpin-like pre-miRNA structure into the intron region of a gene and successfully increased the efficiency and effectiveness of miRNA-associated RNAi induction in vitro and in vivo. This intronic miRNA biogenesis has been found to depend on a coupled interaction of nascent precursor messenger RNA transcription and intron excision within a specific nuclear region proximal to genomic perichromatin fibrils. The intronic miRNA was transcribed by type II RNA polymerases, coexpressed with a primary gene transcript, and excised out of its encoding gene transcript by intracellular RNA splicing and processing mechanisms. Currently, some ribonuclease III endonucleases have been found to be involved in the processing of spliced introns and probably facilitate intronic miRNA maturation. Using this miRNA generation system, we have shown for the first time that intron-derived miRNAs were able to induce strong RNAi effects not only in human and mouse cells but also in zebrafish, chicken embryos, and adult mice. We have also developed an miRNA isolation protocol, based on the complementarity between the designed miRNA and its target gene sequence, to purify and identify the mature miRNAs generated by the intronic miRNA-expressing system. Several intronic miRNA identities and structures are currently confirmed to be active in vitro and in vivo.
According to this proof-of-principle method, we now have full knowledge to design pre-miRNA inserts that are more efficient and effective for the intronic miRNA-expressing systems.
Role of non-coding RNAs in non-aging-related neurological disorders.

PubMed

Vieira, A S; Dogini, D B; Lopes-Cendes, I

2018-06-11

Protein coding sequences represent only 2% of the human genome. Recent advances have demonstrated that a significant portion of the genome is actively transcribed as non-coding RNA molecules. These non-coding RNAs are emerging as key players in the regulation of biological processes, and act as "fine-tuners" of gene expression. Neurological disorders are caused by a wide range of genetic mutations, epigenetic and environmental factors, and the exact pathophysiology of many of these conditions is still unknown. It is currently recognized that dysregulations in the expression of non-coding RNAs are present in many neurological disorders and may be relevant in the mechanisms leading to disease. In addition, circulating non-coding RNAs are emerging as potential biomarkers with great potential impact in clinical practice. In this review, we discuss mainly the role of microRNAs and long non-coding RNAs in several neurological disorders, such as epilepsy, Huntington disease, fragile X-associated ataxia, spinocerebellar ataxias, amyotrophic lateral sclerosis (ALS), and pain. In addition, we give information about the conditions in which microRNAs have been demonstrated to be potential biomarkers, such as epilepsy, pain, and ALS.
Imprinted and X-linked non-coding RNAs as potential regulators of human placental function

PubMed Central

Buckberry, Sam; Bianco-Miotto, Tina; Roberts, Claire T

2014-01-01

Pregnancy outcome is inextricably linked to placental development, which is strictly controlled temporally and spatially through mechanisms that are only partially understood. However, increasing evidence suggests non-coding RNAs (ncRNAs) direct and regulate a considerable number of biological processes and therefore may constitute a previously hidden layer of regulatory information in the placenta. Many ncRNAs, including both microRNAs and long non-coding transcripts, show almost exclusive or predominant expression in the placenta compared with other somatic tissues and display altered expression patterns in placentas from complicated pregnancies. In this review, we explore the results of recent genome-scale and single gene expression studies using human placental tissue, but include studies in the mouse where human data are lacking. Our review focuses on the ncRNAs epigenetically regulated through genomic imprinting or X-chromosome inactivation and includes recent evidence surrounding the H19 lincRNA, the imprinted C19MC cluster microRNAs, and X-linked miRNAs associated with pregnancy complications. PMID:24081302
Circular non-coding RNA ANRIL modulates ribosomal RNA maturation and atherosclerosis in humans

PubMed Central

Holdt, Lesca M.; Stahringer, Anika; Sass, Kristina; Pichler, Garwin; Kulak, Nils A.; Wilfert, Wolfgang; Kohlmaier, Alexander; Herbst, Andreas; Northoff, Bernd H.; Nicolaou, Alexandros; Gäbel, Gabor; Beutner, Frank; Scholz, Markus; Thiery, Joachim; Musunuru, Kiran; Krohn, Knut; Mann, Matthias; Teupser, Daniel

2016-01-01

Circular RNAs (circRNAs) are broadly expressed in eukaryotic cells, but their molecular mechanism in human disease remains obscure. Here we show that circular antisense non-coding RNA in the INK4 locus (circANRIL), which is transcribed at a locus of atherosclerotic cardiovascular disease on chromosome 9p21, confers atheroprotection by controlling ribosomal RNA (rRNA) maturation and modulating pathways of atherogenesis. CircANRIL binds to pescadillo homologue 1 (PES1), an essential 60S-preribosomal assembly factor, thereby impairing exonuclease-mediated pre-rRNA processing and ribosome biogenesis in vascular smooth muscle cells and macrophages. As a consequence, circANRIL induces nucleolar stress and p53 activation, resulting in the induction of apoptosis and inhibition of proliferation, which are key cell functions in atherosclerosis. Collectively, these findings identify circANRIL as a prototype of a circRNA regulating ribosome biogenesis and conferring atheroprotection, thereby showing that circularization of long non-coding RNAs may alter RNA function and protect from human disease. PMID:27539542
Variation in the Oxytocin Receptor Gene Predicts Brain Region Specific Expression and Social Attachment

PubMed Central

King, Lanikea B.; Walum, Hasse; Inoue, Kiyoshi; Eyrich, Nicholas W.; Young, Larry J.

2015-01-01

Background Oxytocin (OXT) modulates several aspects of social behavior. Intranasal OXT is a leading candidate for treating social deficits in autism spectrum disorder (ASD) and common genetic variants in the human oxytocin receptor (OXTR) are associated with emotion recognition, relationship quality and ASD. Animal models have revealed that individual differences in Oxtr expression in the brain drive social behavior variation. Our understanding of how genetic variation contributes to brain OXTR expression is very limited. Methods We investigated Oxtr expression in monogamous prairie voles, which have a well characterized OXT system. We quantified brain region-specific levels of Oxtr mRNA and OXTR protein with established neuroanatomical methods. We used pyrosequencing to investigate allelic imbalance of Oxtr mRNA, a molecular signature of polymorphic genetic regulatory elements. We performed next-generation sequencing to discover variants in and near the Oxtr gene. We investigated social attachment using the partner preference test. Results Our allelic imbalance data demonstrate that genetic variants contribute to individual differences in Oxtr expression, but only in particular brain regions, including the nucleus accumbens (NAcc), where OXTR signaling facilitates social attachment. Next-generation sequencing identified one polymorphism in the Oxtr intron, near a putative cis-regulatory element, explaining 74% of the variance in striatal Oxtr expression specifically. Males homozygous for the high expressing allele display enhanced social attachment. Discussion Taken together, these findings provide convincing evidence for robust genetic influence on Oxtr expression and provide novel insights into how non-coding polymorphisms in the OXTR might influence individual differences in human social cognition and behavior. PMID:26893121
RNA Sequencing of the Exercise Transcriptome in Equine Athletes

PubMed Central

Verini-Supplizi, Andrea; Barcaccia, Gianni; Albiero, Alessandro; D'Angelo, Michela; Campagna, Davide; Valle, Giorgio; Felicetti, Michela; Silvestrelli, Maurizio; Cappelli, Katia

2013-01-01

The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of unannotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream) of the genes, suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases), as well as “nucleic acid binding” and “signal transduction activity” functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress.
In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise-induced stress, including those involved in inflammation, cell signaling, and immune interactions. PMID:24391776
Identification of single nucleotide polymorphisms in the agouti signaling protein (ASIP) gene in some goat breeds in tropical and temperate climates.

PubMed

Adefenwa, Mufliat A; Peters, Sunday O; Agaviezor, Brilliant O; Wheto, Matthew; Adekoya, Khalid O; Okpeku, Moses; Oboh, Bola; Williams, Gabriel O; Adebambo, Olufunmilayo A; Singh, Mahipal; Thomas, Bolaji; De Donato, Marcos; Imumorin, Ikhide G

2013-07-01

The agouti-signaling protein (ASIP) plays a major role in mammalian pigmentation as an antagonist to melanocortin-1 receptor gene to stimulate pheomelanin synthesis, a major pigment conferring mammalian coat color. We sequenced a 352 bp fragment of ASIP gene spanning part of exon 2 and part of intron 2 in 215 animals representing six goat breeds from Nigeria and the United States: West African Dwarf, predominantly black; Red Sokoto, mostly red; and Sahel, mostly white from Nigeria; black and white Alpine, brown and white Spanish and white Saanen from the US. Twenty haplotypes from nine mutations representing three intronic, one silent and five missense (p.S19R, p.N35K, p.L36V, p.M42L and p.L45W) mutations were identified in Nigerian goats. Approximately 89% of Nigerian goats carry haplotype 1 (TGCCATCCG) which seems to be the wild type configuration of mutations in this region of the gene. Although we found no association between these polymorphisms in the ASIP gene and coat color in Nigerian goats, in-silico functional analysis predicts putative deleterious functional impact of the p.L45W mutation on the basic amino-terminal domain of ASIP. In the American goats, two intronic mutations, g.293G>A and g.327C>A, were identified in the Alpine breed, although the g.293G>A mutation is common to American and Nigerian goat populations. All Saanen and Sahel goats in this study belong to haplotype 1 of both populations, which seems to be the wild-type composite ASIP haplotype. Overall, there was no clear association of this portion of the ASIP gene interrogated in this study with coat color variation. Therefore, additional genomic analyses of promoter sequence, the entire coding and non-coding regions of the ASIP gene will be required to obtain a definite conclusion.
Ultraconserved regions encoding ncRNAs are altered in human leukemias and carcinomas.

PubMed

Calin, George A; Liu, Chang-gong; Ferracin, Manuela; Hyslop, Terry; Spizzo, Riccardo; Sevignani, Cinzia; Fabbri, Muller; Cimmino, Amelia; Lee, Eun Joo; Wojcik, Sylwia E; Shimizu, Masayoshi; Tili, Esmerina; Rossi, Simona; Taccioli, Cristian; Pichiorri, Flavia; Liu, Xiuping; Zupo, Simona; Herlea, Vlad; Gramantieri, Laura; Lanza, Giovanni; Alder, Hansjuerg; Rassenti, Laura; Volinia, Stefano; Schmittgen, Thomas D; Kipps, Thomas J; Negrini, Massimo; Croce, Carlo M

2007-09-01

Noncoding RNA (ncRNA) transcripts are thought to be involved in human tumorigenesis. We report that a large fraction of genomic ultraconserved regions (UCRs) encode a particular set of ncRNAs whose expression is altered in human cancers. Genome-wide profiling revealed that UCRs have distinct signatures in human leukemias and carcinomas. UCRs are frequently located at fragile sites and genomic regions involved in cancers. We identified certain UCRs whose expression may be regulated by microRNAs abnormally expressed in human chronic lymphocytic leukemia, and we proved that the inhibition of an overexpressed UCR induces apoptosis in colon cancer cells. Our findings argue that ncRNAs and interaction between noncoding genes are involved in tumorigenesis to a greater extent than previously thought.
Rpl13a small nucleolar RNAs regulate systemic glucose metabolism

PubMed Central

Lee, Jiyeon; Harris, Alexis N.; Holley, Christopher L.; Mahadevan, Jana; Pyles, Kelly D.; Lavagnino, Zeno; Scherrer, David E.; Fujiwara, Hideji; Sidhu, Rohini; Zhang, Jessie; Huang, Stanley Ching-Cheng; Piston, David W.; Remedi, Maria S.; Urano, Fumihiko; Ory, Daniel S.

2016-01-01

Small nucleolar RNAs (snoRNAs) are non-coding RNAs that form ribonucleoproteins to guide covalent modifications of ribosomal and small nuclear RNAs in the nucleus. Recent studies have also uncovered additional non-canonical roles for snoRNAs. However, the physiological contributions of these small RNAs are largely unknown. Here, we selectively deleted four snoRNAs encoded within the introns of the ribosomal protein L13a (Rpl13a) locus in a mouse model. Loss of Rpl13a snoRNAs altered mitochondrial metabolism and lowered reactive oxygen species tone, leading to increased glucose-stimulated insulin secretion from pancreatic islets and enhanced systemic glucose tolerance. Islets from mice lacking Rpl13a snoRNAs demonstrated blunted oxidative stress responses. Furthermore, these mice were protected against diabetogenic stimuli that cause oxidative stress damage to islets. Our study illuminates a previously unrecognized role for snoRNAs in metabolic regulation. PMID:27820699
Analysis of methylated patterns and quality-related genes in tobacco (Nicotiana tabacum) cultivars.

PubMed

Jiao, Junna; Jia, Yanlong; Lv, Zhuangwei; Sun, Chuanfei; Gao, Lijie; Yan, Xiaoxiao; Cui, Liusu; Tang, Zongxiang; Yan, Benju

2014-08-01

Methylation-sensitive amplified polymorphism was used in this study to investigate epigenetic information of four tobacco cultivars: Yunyan 85, NC89, K326, and Yunyan 87. The DNA fragments with methylated information were cloned by reamplified PCR and sequenced. The results of Blast alignments showed that the genes with methylation information included chitinase, nitrate reductase, chloroplast DNA, mitochondrial DNA, ornithine decarboxylase, ribulose carboxylase, and promoter sequences. Homologous comparison in three cloned gene sequences (nitrate reductase, ornithine decarboxylase, and ribulose carboxylase) indicated that geographic factors had significant influence on the whole genome methylation. Introns also contained different information in different tobacco cultivars. These findings suggest that synthetic mechanisms for tobacco aromatic components could be affected by different environmental factors leading to variation of noncoding regions in the genome, which finally results in different fragrance and taste in different tobacco cultivars.
Novel pre-mRNA splicing of intronically integrated HBV generates oncogenic chimera in hepatocellular carcinoma.

PubMed

Chiu, Yung-Tuen; Wong, John K L; Choi, Shing-Wan; Sze, Karen M F; Ho, Daniel W H; Chan, Lo-Kong; Lee, Joyce M F; Man, Kwan; Cherny, Stacey; Yang, Wan-Ling; Wong, Chun-Ming; Sham, Pak-Chung; Ng, Irene O L

2016-06-01

Hepatitis B virus (HBV) integration is common in HBV-associated hepatocellular carcinoma (HCC) and may play an important pathogenic role through the production of chimeric HBV-human transcripts. We aimed to screen the transcriptome for HBV integrations in HCCs. Transcriptome sequencing was performed on paired HBV-associated HCCs and corresponding non-tumorous liver tissues to identify viral-human chimeric sites. Validation was further performed in an expanded cohort of human HCCs. Here we report the discovery of a novel pre-mRNA splicing mechanism in generating HBV-human chimeric protein. This mechanism was exemplified by the formation of a recurrent HBV-cyclin A2 (CCNA2) chimeric transcript (A2S), as detected in 12.5% (6 of 48) of HCC patients, but in none of the 22 non-HCC HBV-associated cirrhotic liver samples examined. Upon the integration of HBV into the intron of the CCNA2 gene, the mammalian splicing machinery utilized the foreign splice sites at nucleotides 282 and 458 of the HBV genome to generate a pseudo-exon, forming an in-frame chimeric fusion with CCNA2. The A2S chimeric protein gained a non-degradable property and promoted cell cycle progression, demonstrating its potential oncogenic functions. A pre-mRNA splicing mechanism is involved in the formation of HBV-human chimeric proteins. This represents a novel and possibly common mechanism underlying the formation of HBV-human chimeric transcripts from intronically integrated HBV genome with functional impact. HBV is involved in the mammalian pre-mRNA splicing machinery in the generation of potential tumorigenic HBV-human chimeras. This study also provided insight on the impact of intronic HBV integration with the gain of splice sites in the development of HBV-associated HCC. Copyright © 2016 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Design of retrovirus vectors for transfer and expression of the human β-globin gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, A.D.; Bender, M.A.; Harris, E.A.S.

1988-11-01

Regulated expression of the human β-globin gene has been demonstrated in cultured murine erythroleukemia cells and in mice after retrovirus-mediated gene transfer. However, the low titer of recombinant viruses described to date results in relatively inefficient gene transfer, which limits their usefulness for animal studies and for potential gene therapy in humans for diseases involving defective β-globin genes. The authors found regions that interfered with virus production within intron 2 of the β-globin gene and on both sides of the gene. The flanking regions could be removed, but intron 2 was required for β-globin expression. Inclusion of β-globin introns necessitates an antisense orientation of the gene within the retrovirus vector. However, they found no effect of the antisense β-globin transcription on virus production. A region downstream of the β-globin gene that stimulates expression of the gene in transgenic mice was included in the viruses without detrimental effects on virus titer. Virus titers of over 10⁶ CFU/ml were obtained with the final vector design, which retained the ability to direct regulated expression of human β-globin in murine erythroleukemia cells. The vector also allowed transfer and expression of the human β-globin gene in hematopoietic cells (CFU-S cells) in mice.
Chromosomal localization and partial genomic structure of the human peroxisome proliferator activated receptor-gamma (hPPAR gamma) gene.

PubMed

Beamer, B A; Negri, C; Yen, C J; Gavrilova, O; Rumberger, J M; Durcan, M J; Yarnall, D P; Hawkins, A L; Griffin, C A; Burns, D K; Roth, J; Reitman, M; Shuldiner, A R

1997-04-28

We determined the chromosomal localization and partial genomic structure of the coding region of the human PPAR gamma gene (hPPAR gamma), a nuclear receptor important for adipocyte differentiation and function. Sequence analysis and long PCR of human genomic DNA with primers that span putative introns revealed that intron positions and sizes of hPPAR gamma are similar to those previously determined for the mouse PPAR gamma gene[13]. Fluorescent in situ hybridization localized hPPAR gamma to chromosome 3, band 3p25. Radiation hybrid mapping with two independent primer pairs was consistent with hPPAR gamma being within 1.5 Mb of marker D3S1263 on 3p25-p24.2. These sequences of the intron/exon junctions of the 6 coding exons shared by hPPAR gamma 1 and hPPAR gamma 2 will facilitate screening for possible mutations. Furthermore, D3S1263 is a suitable polymorphic marker for linkage analysis to evaluate PPAR gamma's potential contribution to genetic susceptibility to obesity, lipoatrophy, insulin resistance, and diabetes.
The miR-199-dynamin regulatory axis controls receptor-mediated endocytosis.

PubMed

Aranda, Juan F; Canfrán-Duque, Alberto; Goedeke, Leigh; Suárez, Yajaira; Fernández-Hernando, Carlos

2015-09-01

Small non-coding RNAs (microRNAs) are important regulators of gene expression that modulate many physiological processes; however, their role in regulating intracellular transport remains largely unknown. Intriguingly, we found that the dynamin (DNM) genes, a GTPase family of proteins responsible for endocytosis in eukaryotic cells, encode the conserved miR-199a and miR-199b family of miRNAs within their intronic sequences. Here, we demonstrate that miR-199a and miR-199b regulate endocytic transport by controlling the expression of important mediators of endocytosis such as clathrin heavy chain (CLTC), Rab5A, low-density lipoprotein receptor (LDLR) and caveolin-1 (Cav-1). Importantly, miR-199a-5p and miR-199b-5p overexpression markedly inhibits CLTC, Rab5A, LDLR and Cav-1 expression, thus preventing receptor-mediated endocytosis in human cell lines (Huh7 and HeLa). Of note, miR-199a-5p inhibition increases target gene expression and receptor-mediated endocytosis. Taken together, our work identifies a new mechanism by which microRNAs regulate intracellular trafficking. In particular, we demonstrate that the DNM, miR-199a-5p and miR-199b-5p genes act as a bifunctional locus that regulates endocytosis, thus adding an unexpected layer of complexity in the regulation of intracellular trafficking. © 2015. Published by The Company of Biologists Ltd.
Isolation and Identification of Gene-Specific MicroRNAs.

PubMed

Lin, Shi-Lung; Chang, Donald C; Ying, Shao-Yao

2018-01-01

Computer programming has identified hundreds of genomic hairpin sequences, many with functions yet to be determined. Because transfection of hairpin-like microRNA precursors (pre-miRNAs) into mammalian cells is not always sufficient to trigger RNA-induced gene silencing complex (RISC) assembly, a key step for inducing RNA interference (RNAi)-related gene silencing, we have developed an intronic miRNA expression system to overcome this problem by inserting a hairpin-like pre-miRNA structure into the intron region of a gene, and hence successfully increase the efficiency and effectiveness of miRNA-associated RNAi induction in vitro and in vivo. This intronic miRNA biogenesis mechanism has been found to depend on a coupled interaction of nascent messenger RNA transcription and intron excision within a specific nuclear region proximal to genomic perichromatin fibrils. The intronic miRNA so obtained is transcribed by type-II RNA polymerases, coexpressed within a primary gene transcript, and then excised out of the gene transcript by intracellular RNA splicing and processing machineries. After that, ribonuclease III (RNaseIII) endonucleases further process the spliced introns into mature miRNAs. Using this intronic miRNA expression system, we have shown for the first time that the intron-derived miRNAs are able to elicit strong RNAi effects in not only human and mouse cells in vitro but also in zebrafishes, chicken embryos, and adult mice in vivo. We have also developed a miRNA isolation protocol, based on the complementarity between the designed miRNA and its targeted gene sequence, to purify and identify the mature miRNAs generated. As a result, several intronic miRNA identities and structures have been confirmed. According to this proof-of-principle methodology, we now have full knowledge to design various intronic pre-miRNA inserts that are more efficient and effective for inducing specific gene silencing effects in vitro and in vivo.
Many human accelerated regions are developmental enhancers

PubMed Central

Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.

2013-01-01

The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637
Identification of Novel Long Non-coding and Circular RNAs in Human Papillomavirus-Mediated Cervical Cancer

PubMed Central

Wang, Hongbo; Zhao, Yingchao; Chen, Mingyue; Cui, Jie

2017-01-01

Cervical cancer is the third most common cancer worldwide and the fourth leading cause of cancer-associated mortality in women. Accumulating evidence indicates that long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) may play key roles in the carcinogenesis of different cancers; however, little is known about the mechanisms of lncRNAs and circRNAs in the progression and metastasis of cervical cancer. In this study, we explored the expression profiles of lncRNAs, circRNAs, miRNAs, and mRNAs in HPV16 (human papillomavirus genotype 16) mediated cervical squamous cell carcinoma and matched adjacent non-tumor (ATN) tissues from three patients with high-throughput RNA sequencing (RNA-seq). In total, we identified 19 lncRNAs, 99 circRNAs, 28 miRNAs, and 304 mRNAs that were commonly differentially expressed (DE) in different patients. Among the non-coding RNAs, 3 lncRNAs and 44 circRNAs are novel to our knowledge. Functional enrichment analysis showed that DE lncRNAs, miRNAs, and mRNAs were enriched in pathways crucial to cancer as well as other gene ontology (GO) terms. Furthermore, the co-expression network and function prediction suggested that all 19 DE lncRNAs could play different roles in the carcinogenesis and development of cervical cancer. The competing endogenous RNA (ceRNA) network based on DE coding and non-coding RNAs showed that each miRNA targeted a number of lncRNAs and circRNAs. The link between part of the miRNAs in the network and cervical cancer has been validated in previous studies, and these miRNAs targeted the majority of the novel non-coding RNAs, thus suggesting that these novel non-coding RNAs may be involved in cervical cancer. Taken together, our study shows that DE non-coding RNAs could be further developed as diagnostic and therapeutic biomarkers of cervical cancer. The complex ceRNA network also lays the foundation for future research of the roles of coding and non-coding RNAs in cervical cancer. PMID:28970820

Identification of novel mRNAs and lncRNAs associated with mouse experimental colitis and human inflammatory bowel disease.

PubMed

Rankin, Carl Robert; Theodorou, Evangelos; Law, Ivy Ka Man; Rowe, Lorraine; Kokkotou, Efi; Pekow, Joel; Wang, Jiafang; Martin, Martin G; Pothoulakis, Charalabos; Padua, David Miguel

2018-06-28

Inflammatory bowel disease (IBD) is a complex disorder that is associated with significant morbidity. While many recent advances have been made with new diagnostic and therapeutic tools, a deeper understanding of its basic pathophysiology is needed to continue this trend towards improving treatments. By utilizing an unbiased, high-throughput transcriptomic analysis of two well-established mouse models of colitis, we set out to uncover novel coding and non-coding RNAs that are differentially expressed in the setting of colonic inflammation. RNA-seq analysis was performed using colonic tissue from two mouse models of colitis, a dextran sodium sulfate induced model and a genetic-induced model in mice lacking IL-10. We identified 81 coding RNAs that were commonly altered in both experimental models. Of these coding RNAs, 12 of the human orthologs were differentially expressed in a transcriptomic analysis of IBD patients. Interestingly, 5 of the 12 of human differentially expressed genes have not been previously identified as IBD-associated genes, including ubiquitin D. Our analysis also identified 15 non-coding RNAs that were differentially expressed in either mouse model. Surprisingly, only three non-coding RNAs were commonly dysregulated in both of these models. The discovery of these new coding and non-coding RNAs expands our transcriptional knowledge of mouse models of IBD and offers additional targets to deepen our understanding of the pathophysiology of IBD.
Short interspersed nuclear elements (SINEs) are abundant in Solanaceae and have a family-specific impact on gene structure and genome organization.

PubMed

Seibt, Kathrin M; Wenke, Torsten; Muders, Katja; Truberg, Bernd; Schmidt, Thomas

2016-05-01

Short interspersed nuclear elements (SINEs) are highly abundant non-autonomous retrotransposons that are widespread in plants. They are short in size, non-coding, show high sequence diversity, and are therefore mostly not or not correctly annotated in plant genome sequences. Hence, comparative studies on genomic SINE populations are rare. To explore the structural organization and impact of SINEs, we comparatively investigated the genome sequences of the Solanaceae species potato (Solanum tuberosum), tomato (Solanum lycopersicum), wild tomato (Solanum pennellii), and two pepper cultivars (Capsicum annuum). Based on 8.5 Gbp sequence data, we annotated 82 983 SINE copies belonging to 10 families and subfamilies on a base pair level. Solanaceae SINEs are dispersed over all chromosomes with enrichments in distal regions. Depending on the genome assemblies and gene predictions, 30% of all SINE copies are associated with genes, particularly frequent in introns and untranslated regions (UTRs). The close association with genes is family specific. More than 10% of all genes annotated in the Solanaceae species investigated contain at least one SINE insertion, and we found genes harbouring up to 16 SINE copies. We demonstrate the involvement of SINEs in gene and genome evolution including the donation of splice sites, start and stop codons and exons to genes, enlargement of introns and UTRs, generation of tandem-like duplications and transduction of adjacent sequence regions. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
WGSSAT: A High-Throughput Computational Pipeline for Mining and Annotation of SSR Markers From Whole Genomes.

PubMed

Pandey, Manmohan; Kumar, Ravindra; Srivastava, Prachi; Agarwal, Suyash; Srivastava, Shreya; Nagpure, Naresh S; Jena, Joy K; Kushwaha, Basdeo

2018-03-16

Mining and characterization of Simple Sequence Repeat (SSR) markers from whole genomes provide valuable information about biological significance of SSR distribution and also facilitate development of markers for genetic analysis. Whole genome sequencing (WGS)-SSR Annotation Tool (WGSSAT) is a graphical user interface pipeline developed using Java Netbeans and Perl scripts which facilitates in simplifying the process of SSR mining and characterization. WGSSAT takes input in FASTA format and automates the prediction of genes, noncoding RNA (ncRNA), core genes, repeats and SSRs from whole genomes followed by mapping of the predicted SSRs onto a genome (classified according to genes, ncRNA, repeats, exonic, intronic, and core gene region) along with primer identification and mining of cross-species markers. The program also generates a detailed statistical report along with visualization of mapped SSRs, genes, core genes, and RNAs. The features of WGSSAT were demonstrated using Takifugu rubripes data. This yielded a total of 139 057 SSR, out of which 113 703 SSR primer pairs were uniquely amplified in silico onto a T. rubripes (fugu) genome. Out of 113 703 mined SSRs, 81 463 were from coding region (including 4286 exonic and 77 177 intronic), 7 from RNA, 267 from core genes of fugu, whereas 105 641 SSR and 601 SSR primer pairs were uniquely mapped onto the medaka genome. WGSSAT is tested under Ubuntu Linux. The source code, documentation, user manual, example dataset and scripts are available online at https://sourceforge.net/projects/wgssat-nbfgr.
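The core SSR-mining step described in the abstract above can be illustrated with a minimal sketch. This is not WGSSAT's actual implementation (the abstract does not describe its algorithm); the function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen only for illustration. The sketch scans a DNA string for perfect tandem repeats of 1 to 6 bp motifs using a regular expression.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    The default thresholds are illustrative assumptions, not WGSSAT's.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # since those are already reported under the shorter motif length.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence: a 7-copy AT repeat and a 12-base poly-A run.
example = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T"
print(find_ssrs(example))
```

A real pipeline would additionally handle imperfect and compound repeats and map each hit to gene annotations, as the abstract describes for WGSSAT.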
Polymorphisms of clip domain serine proteinase and serine proteinase homolog in the swimming crab Portunus trituberculatus and their association with Vibrio alginolyticus

NASA Astrophysics Data System (ADS)

Liu, Meng; Liu, Yuan; Hui, Min; Song, Chengwen; Cui, Zhaoxia

2017-03-01

Clip domain serine proteases (cSPs) and their homologs (SPHs) play an important role in various biological processes that are essential components of extracellular signaling cascades, especially in the innate immune responses of invertebrates. Here, polymorphisms of PtcSP and PtSPH from the swimming crab Portunus trituberculatus were investigated to explore their association with resistance/susceptibility to Vibrio alginolyticus. Polymorphic loci were identified using Clustal X, and characterized with SPSS 16.0 software, and then the significance of genotype and allele frequencies between resistant and susceptible stocks was determined by a χ² test. A total of 109 and 77 single nucleotide polymorphisms (SNPs) were identified in the genomic fragments of PtcSP and PtSPH, respectively. Notably, nearly half of PtSPH polymorphisms were found in the non-coding exon 1. Fourteen SNPs investigated were significantly associated with susceptibility/resistance to V. alginolyticus (P < 0.05). Among them, eight SNPs were observed in introns, and one synonymous, four non-synonymous SNPs and one ins-del were found in coding exons. In addition, five simple sequence repeats (SSRs) were detected in intron 3 of PtcSP. Although there was no statistically significant difference of allele frequencies, the SSRs showed different polymorphic alleles on the basis of the repeat number between resistant and susceptible stocks. After further validation, polymorphisms investigated here might be applied to select potential molecular markers of P. trituberculatus with resistance to V. alginolyticus.
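The allele-frequency comparison mentioned in the abstract above can be sketched as a Pearson χ² test on a 2×2 table of allele counts (two alleles by resistant/susceptible stock). The authors report using SPSS 16.0; the code below is only an illustrative stand-in, the function name `chi2_2x2` is hypothetical, and the counts in the example are invented for demonstration, not taken from the study.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical allele counts: rows are resistant and susceptible stocks,
# columns are the two alleles at one SNP.
stat, p = chi2_2x2(30, 10, 15, 25)
print(f"chi2 = {stat:.3f}, p = {p:.5f}, significant at 0.05: {p < 0.05}")
```

With small expected counts, an exact test would usually be preferred over this approximation.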
Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data.

PubMed

Huang, Yi-Fei; Gulko, Brad; Siepel, Adam

2017-04-01

Many genetic variants that influence phenotypes of interest are located outside of protein-coding genes, yet existing methods for identifying such variants have poor predictive power. Here we introduce a new computational method, called LINSIGHT, that substantially improves the prediction of noncoding nucleotide sites at which mutations are likely to have deleterious fitness consequences, and which, therefore, are likely to be phenotypically important. LINSIGHT combines a generalized linear model for functional genomic data with a probabilistic model of molecular evolution. The method is fast and highly scalable, enabling it to exploit the 'big data' available in modern genomics. We show that LINSIGHT outperforms the best available methods in identifying human noncoding variants associated with inherited diseases. In addition, we apply LINSIGHT to an atlas of human enhancers and show that the fitness consequences at enhancers depend on cell type, tissue specificity, and constraints at associated promoters.
Identification and characterization of long non-coding RNAs in subcutaneous adipose tissue from castrated and intact full-sib pair Huainan male pigs

USDA-ARS's Scientific Manuscript database

Testosterone deficiency is associated with obesity in humans. It has been proven that long non-coding RNAs (lncRNAs) regulate adipose tissue metabolism; therefore, we first study the role of lncRNAs on testosterone deficiency-induced fat deposition using castrated male pigs as the model animal. The ...
Hiding in Plain Sight: Rediscovering the Importance of Noncoding RNA in Human Malignancy.

PubMed

Feeley, Kyle P; Edmonds, Mick D

2018-05-01

At the time of its construction in the 1950s, the central dogma of molecular biology was a useful model that represented the current state of knowledge for the flow of genetic information after a period of prolific scientific discovery. Unknowingly, it also biased many of our assumptions going forward. Whether intentional or not, genomic elements not fitting into this paradigm were deemed unimportant and emphasis on the study of protein-coding genes prevailed for decades. The phrase "Junk DNA," first popularized in the 1960s, is still used with alarming frequency to describe the entirety of noncoding DNA. It has since become apparent that RNA molecules not coding for protein are vitally important in both normal development and human malignancy. Cancer researchers have been pioneers in determining noncoding RNA function and developing new technologies to study these molecules. In this review, we will discuss well known and newly emerging species of noncoding RNAs, their functions in cancer, and new technologies being utilized to understand their mechanisms of action in cancer. Cancer Res; 78(9); 2149-58. ©2018 AACR . ©2018 American Association for Cancer Research.
The active gene that encodes human High Mobility Group 1 protein (HMG1) contains introns and maps to chromosome 13

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ferrari, S.; Finelli, P.; Rocchi, M.

The human genome contains a large number of sequences related to the cDNA for High Mobility Group 1 protein (HMG1), which so far has hampered the cloning and mapping of the active HMG1 gene. We show that the human HMG1 gene contains introns, while the HMG1-related sequences do not and most likely are retrotransposed pseudogenes. We identified eight YACs from the ICI and CEPH libraries that contain the human HMG1 gene. The HMG1 gene is similar in structure to the previously characterized murine homologue and maps to human chromosome 13q12, as determined by in situ hybridization. The mouse Hmg1 gene maps to the telomeric region of murine Chromosome 5, which is syntenic to the human 13q12 band. 18 refs., 3 figs.
Cis-encoded non-coding antisense RNAs in streptococci and other low GC Gram (+) bacterial pathogens

PubMed Central

Cho, Kyu Hong; Kim, Jeong-Ho

2015-01-01

Due to recent advances of bioinformatics and high throughput sequencing technology, discovery of regulatory non-coding RNAs in bacteria has been increased to a great extent. Based on this bandwagon, many studies searching for trans-acting small non-coding RNAs in streptococci have been performed intensively, especially in the important human pathogen, group A and B streptococci. However, studies for cis-encoded non-coding antisense RNAs in streptococci have been scarce. A recent study shows antisense RNAs are involved in virulence gene regulation in group B streptococcus, S. agalactiae. This suggests antisense RNAs could have important roles in the pathogenesis of streptococcal pathogens. In this review, we describe recent discoveries of chromosomal cis-encoded antisense RNAs in streptococcal pathogens and other low GC Gram (+) bacteria to provide a guide for future studies. PMID:25859258
MYC Targeted Long Noncoding RNA DANCR Promotes Cancer in Part by Reducing p21 Levels.

PubMed

Lu, Yunqi; Hu, Zhongyi; Mangala, Lingegowda S; Stine, Zachary E; Hu, Xiaowen; Jiang, Dahai; Xiang, Yan; Zhang, Youyou; Pradeep, Sunila; Rodriguez-Aguayo, Cristian; Lopez-Berestein, Gabriel; DeMarzo, Angelo M; Sood, Anil K; Zhang, Lin; Dang, Chi V

2018-01-01

The MYC oncogene broadly promotes transcription mediated by all nuclear RNA polymerases, thereby acting as a positive modifier of global gene expression. Here, we report that MYC stimulates the transcription of DANCR, a long noncoding RNA (lncRNA) that is widely overexpressed in human cancer. We identified DANCR through its overexpression in a transgenic model of MYC-induced lymphoma, but found that it was broadly upregulated in many human cancer cell lines and cancers, including most notably in prostate and ovarian cancers. Mechanistic investigations indicated that DANCR limited the expression of cell-cycle inhibitor p21 (CDKN1A) and that the inhibitory effects of DANCR loss on cell proliferation could be partially rescued by p21 silencing. In a xenograft model of human ovarian cancer, a nanoparticle-mediated siRNA strategy to target DANCR in vivo was sufficient to strongly inhibit tumor growth. Our observations expand knowledge of how MYC drives cancer cell proliferation by identifying DANCR as a critical lncRNA widely overexpressed in human cancers. Significance: These findings expand knowledge of how MYC drives cancer cell proliferation by identifying an oncogenic long noncoding RNA that is widely overexpressed in human cancers. Cancer Res; 78(1); 64-74. ©2017 AACR . ©2017 American Association for Cancer Research.
Genetic modification of bone-marrow mesenchymal stem cells and hematopoietic cells with human coagulation factor IX-expressing plasmids.

PubMed

Sam, Mohammad Reza; Azadbakhsh, Azadeh Sadat; Farokhi, Farrah; Rezazadeh, Kobra; Sam, Sohrab; Zomorodipour, Alireza; Haddad-Mashadrizeh, Aliakbar; Delirezh, Nowruz; Mokarizadeh, Aram

2016-05-01

Ex-vivo gene therapy of hemophilias requires suitable bioreactors for secretion of hFIX into the circulation and stem cells hold great potential in this regard. Viral vectors are widely manipulated and used to transfer hFIX gene into stem cells. However, little attention has been paid to the manipulation of hFIX transgene itself. Concurrently, the efficacy of such a therapeutic approach depends on determination of which vectors give maximal transgene expression. With this in mind, TF-1 (primary hematopoietic lineage) and rat-bone marrow mesenchymal stem cells (BMSCs) were transfected with five hFIX-expressing plasmids containing different combinations of two human β-globin (hBG) introns inside the hFIX-cDNA and Kozak element and hFIX expression was evaluated by different methods. In BMSCs and TF-1 cells, the highest hFIX level was obtained from the intron-less and hBG intron-I,II containing plasmids respectively. The highest hFIX activity was obtained from the cells carrying the hBG intron-I,II containing plasmids. BMSCs were able to produce higher hFIX by 1.4 to 4.7-fold increase with activity by 2.4 to 4.4-fold increase compared to TF-1 cells transfected with the same constructs. BMSCs and TF-1 cells could be effectively bioengineered without the use of viral vectors and hFIX minigene containing hBG introns could represent a particular interest in stem cell-based gene therapy of hemophilias. Copyright © 2016 International Alliance for Biological Standardization. Published by Elsevier Ltd. All rights reserved.
HOTAIR: An Oncogenic Long Non-Coding RNA in Human Cancer.

PubMed

Tang, Qing; Hann, Swei Sunny

2018-05-24

Long non-coding RNAs (LncRNAs) represent a novel class of noncoding RNAs that are longer than 200 nucleotides without protein-coding potential and function as novel master regulators in various human diseases, including cancer. Accumulating evidence shows that lncRNAs are dysregulated and implicated in various aspects of cellular homeostasis, such as proliferation, apoptosis, mobility, invasion, metastasis, chromatin remodeling, gene transcription, and post-transcriptional processing. However, the mechanisms by which lncRNAs regulate various biological functions in human diseases have yet to be determined. HOX antisense intergenic RNA (HOTAIR) is a recently discovered lncRNA and plays a critical role in various areas of cancer, such as proliferation, survival, migration, drug resistance, and genomic stability. In this review, we briefly introduce the concept, identification, and biological functions of HOTAIR. We then describe the involvement of HOTAIR that has been associated with tumorigenesis, growth, invasion, cancer stem cell differentiation, metastasis, and drug resistance in cancer. We also discuss emerging insights into the role of HOTAIR as potential biomarkers and therapeutic targets for novel treatment paradigms in cancer. © 2018 The Author(s). Published by S. Karger AG, Basel.
Splicing regulation and dysregulation of cholinergic genes expressed at the neuromuscular junction.

PubMed

Ohno, Kinji; Rahman, Mohammad Alinoor; Nazim, Mohammad; Nasrin, Farhana; Lin, Yingni; Takeda, Jun-Ichi; Masuda, Akio

2017-08-01

We humans have evolved by acquiring diversity of alternative RNA metabolisms including alternative means of splicing and transcribing non-coding genes, and not by acquiring new coding genes. Tissue-specific and developmental stage-specific alternative RNA splicing is achieved by tightly regulated spatiotemporal regulation of expressions and activations of RNA-binding proteins that recognize their cognate splicing cis-elements on nascent RNA transcripts. Genes expressed at the neuromuscular junction are also alternatively spliced. In addition, germline mutations provoke aberrant splicing by compromising binding of RNA-binding proteins, and cause congenital myasthenic syndromes (CMS). We present physiological splicing mechanisms of genes for agrin (AGRN), acetylcholinesterase (ACHE), MuSK (MUSK), acetylcholine receptor (AChR) α1 subunit (CHRNA1), and collagen Q (COLQ) in human, and their aberration in diseases. Splicing isoforms of AChE_T, AChE_H, and AChE_R are generated by hnRNP H/F. Skipping of MUSK exon 10 makes a Wnt-insensitive MuSK isoform, which is unique to human. Skipping of exon 10 is achieved by coordinated binding of hnRNP C, YB-1, and hnRNP L to exon 10. Exon P3A of CHRNA1 is alternatively included to generate a non-functional AChR α1 subunit in human. Molecular dissection of splicing mutations in patients with CMS reveals that exon P3A is alternatively skipped by hnRNP H, polypyrimidine tract-binding protein 1, and hnRNP L. Similarly, analysis of an exonic mutation in COLQ exon 16 in a CMS patient discloses that constitutive splicing of exon 16 requires binding of serine arginine-rich splicing factor 1. Intronic and exonic splicing mutations in CMS enable us to dissect molecular mechanisms underlying alternative and constitutive splicing of genes expressed at the neuromuscular junction. This is an article for the special issue XVth International Symposium on Cholinergic Mechanisms. © 2017 International Society for Neurochemistry.
Intron retention and nuclear loss of SFPQ are molecular hallmarks of ALS.

PubMed

Luisier, Raphaelle; Tyzack, Giulia E; Hall, Claire E; Mitchell, Jamie S; Devine, Helen; Taha, Doaa M; Malik, Bilal; Meyer, Ione; Greensmith, Linda; Newcombe, Jia; Ule, Jernej; Luscombe, Nicholas M; Patani, Rickie

2018-05-22

Mutations causing amyotrophic lateral sclerosis (ALS) strongly implicate ubiquitously expressed regulators of RNA processing. To understand the molecular impact of ALS-causing mutations on neuronal development and disease, we analysed transcriptomes during in vitro differentiation of motor neurons (MNs) from human control and patient-specific VCP mutant induced-pluripotent stem cells (iPSCs). We identify increased intron retention (IR) as a dominant feature of the splicing programme during early neural differentiation. Importantly, IR occurs prematurely in VCP mutant cultures compared with control counterparts. These aberrant IR events are also seen in independent RNAseq data sets from SOD1- and FUS-mutant MNs. The most significant IR is seen in the SFPQ transcript. The SFPQ protein binds extensively to its retained intron, exhibits lower nuclear abundance in VCP mutant cultures and is lost from nuclei of MNs in mouse models and human sporadic ALS. Collectively, we demonstrate SFPQ IR and nuclear loss as molecular hallmarks of familial and sporadic ALS.
cncRNAs: Bi-functional RNAs with protein coding and non-coding functions

PubMed Central

Kumari, Pooja; Sampath, Karuna

2015-01-01

For many decades, the major function of mRNA was thought to be to provide protein-coding information embedded in the genome. The advent of high-throughput sequencing has led to the discovery of pervasive transcription of eukaryotic genomes and opened the world of RNA-mediated gene regulation. Many regulatory RNAs have been found to be incapable of protein coding and are hence termed as non-coding RNAs (ncRNAs). However, studies in recent years have shown that several previously annotated non-coding RNAs have the potential to encode proteins, and conversely, some coding RNAs have regulatory functions independent of the protein they encode. Such bi-functional RNAs, with both protein coding and non-coding functions, which we term as ‘cncRNAs’, have emerged as new players in cellular systems. Here, we describe the functions of some cncRNAs identified from bacteria to humans. Because the functions of many RNAs across genomes remains unclear, we propose that RNAs be classified as coding, non-coding or both only after careful analysis of their functions. PMID:26498036
A dynamic intron retention program enriched in RNA processing genes regulates gene expression during terminal erythropoiesis

DOE PAGES

Pimentel, Harold; Parra, Marilyn; Gee, Sherry L.; ...

2015-11-03

Differentiating erythroblasts execute a dynamic alternative splicing program shown here to include extensive and diverse intron retention (IR) events. Cluster analysis revealed hundreds of developmentally-dynamic introns that exhibit increased IR in mature erythroblasts, and are enriched in functions related to RNA processing such as SF3B1 spliceosomal factor. Distinct, developmentally-stable IR clusters are enriched in metal-ion binding functions and include mitoferrin genes SLC25A37 and SLC25A28 that are critical for iron homeostasis. Some IR transcripts are abundant, e.g. comprising ~50% of highly-expressed SLC25A37 and SF3B1 transcripts in late erythroblasts, and thereby limiting functional mRNA levels. IR transcripts tested were predominantly nuclear-localized. Splice site strength correlated with IR among stable but not dynamic intron clusters, indicating distinct regulation of dynamically-increased IR in late erythroblasts. Retained introns were preferentially associated with alternative exons with premature termination codons (PTCs). High IR was observed in disease-causing genes including SF3B1 and the RNA binding protein FUS. Comparative studies demonstrated that the intron retention program in erythroblasts shares features with other tissues but ultimately is unique to erythropoiesis. Finally, we conclude that IR is a multi-dimensional set of processes that post-transcriptionally regulate diverse gene groups during normal erythropoiesis, misregulation of which could be responsible for human disease.
A dynamic intron retention program enriched in RNA processing genes regulates gene expression during terminal erythropoiesis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pimentel, Harold; Parra, Marilyn; Gee, Sherry L.

Differentiating erythroblasts execute a dynamic alternative splicing program shown here to include extensive and diverse intron retention (IR) events. Cluster analysis revealed hundreds of developmentally-dynamic introns that exhibit increased IR in mature erythroblasts, and are enriched in functions related to RNA processing such as SF3B1 spliceosomal factor. Distinct, developmentally-stable IR clusters are enriched in metal-ion binding functions and include mitoferrin genes SLC25A37 and SLC25A28 that are critical for iron homeostasis. Some IR transcripts are abundant, e.g. comprising ~50% of highly-expressed SLC25A37 and SF3B1 transcripts in late erythroblasts, and thereby limiting functional mRNA levels. IR transcripts tested were predominantly nuclear-localized. Splice site strength correlated with IR among stable but not dynamic intron clusters, indicating distinct regulation of dynamically-increased IR in late erythroblasts. Retained introns were preferentially associated with alternative exons with premature termination codons (PTCs). High IR was observed in disease-causing genes including SF3B1 and the RNA binding protein FUS. Comparative studies demonstrated that the intron retention program in erythroblasts shares features with other tissues but ultimately is unique to erythropoiesis. Finally, we conclude that IR is a multi-dimensional set of processes that post-transcriptionally regulate diverse gene groups during normal erythropoiesis, misregulation of which could be responsible for human disease.
Towards barcode markers in Fungi: an intron map of Ascomycota mitochondria.

PubMed

Santamaria, Monica; Vicario, Saverio; Pappadà, Graziano; Scioscia, Gaetano; Scazzocchio, Claudio; Saccone, Cecilia

2009-06-16

A standardized and cost-effective molecular identification system is now an urgent need for Fungi owing to their wide involvement in human life quality. In particular, the potential use of mitochondrial DNA species markers has been taken into account. Unfortunately, a serious difficulty in the PCR and bioinformatic surveys is due to the presence of mobile introns in almost all the fungal mitochondrial genes. The aim of this work is to verify the incidence of this phenomenon in Ascomycota, testing, at the same time, a new bioinformatic tool for extracting and managing sequence database annotations, in order to identify the mitochondrial gene regions where introns are missing so as to propose them as species markers. The general trend towards a large occurrence of introns in the mitochondrial genome of Fungi has been confirmed in Ascomycota by an extensive bioinformatic analysis, performed on all the entries concerning 11 mitochondrial protein coding genes and 2 mitochondrial rRNA (ribosomal RNA) specifying genes, belonging to this phylum, available in public nucleotide sequence databases. A new query approach has been developed to effectively retrieve the intron information included in these entries. After comparing the new query-based approach with a blast-based procedure, with the aim of designing a faithful Ascomycota mitochondrial intron map, the former appeared clearly the more accurate. Within this map, despite the large pervasiveness of introns, it is possible to distinguish specific regions comprised in several genes, including the full NADH dehydrogenase subunit 6 (ND6) gene, which could be considered as barcode candidates for Ascomycota due to their paucity of introns and to their length, above 400 bp, comparable to the lower end size of the length range of barcodes successfully used in animals. 
The development of the new query system described here would answer the pressing requirement to improve drastically the bioinformatics support to the DNA Barcode Initiative. The large scale investigation of Ascomycota mitochondrial introns performed through this tool, allowing to exclude the introns-rich sequences from the barcode candidates exploration, could be the first step towards a mitochondrial barcoding strategy for these organisms, similar to the standard approach employed in metazoans.
Emergence of the Noncoding Cancer Genome: A Target of Genetic and Epigenetic Alterations.

PubMed

Zhou, Stanley; Treloar, Aislinn E; Lupien, Mathieu

2016-11-01

The emergence of whole-genome annotation approaches is paving the way for the comprehensive annotation of the human genome across diverse cell and tissue types exposed to various environmental conditions. This has already unmasked the positions of thousands of functional cis-regulatory elements integral to transcriptional regulation, such as enhancers, promoters, and anchors of chromatin interactions that populate the noncoding genome. Recent studies have shown that cis-regulatory elements are commonly the targets of genetic and epigenetic alterations associated with aberrant gene expression in cancer. Here, we review these findings to showcase the contribution of the noncoding genome and its alteration in the development and progression of cancer. We also highlight the opportunities to translate the biological characterization of genetic and epigenetic alterations in the noncoding cancer genome into novel approaches to treat or monitor disease. The majority of genetic and epigenetic alterations accumulate in the noncoding genome throughout oncogenesis. Discriminating driver from passenger events is a challenge that holds great promise to improve our understanding of the etiology of different cancer types. Advancing our understanding of the noncoding cancer genome may thus identify new therapeutic opportunities and accelerate our capacity to find improved biomarkers to monitor various stages of cancer development. Cancer Discov; 6(11); 1215-29. ©2016 AACR. ©2016 American Association for Cancer Research.
Efficiency of introns from various origins in fish cells.

PubMed

Bétancourt, O H; Attal, J; Théron, M C; Puissant, C; Houdebine, L M

1993-06-01

Several vectors containing (1) regulatory regions from Rous sarcoma virus (RSV), human cytomegalovirus (CMV), and herpes simplex thymidine kinase (TK); (2) introns from early or late SV40 genes and from trout growth hormone gene (tGH); (3) chloramphenicol acetyltransferase gene (CAT); and (4) transcription terminators from SV40 were transfected into carp EPC cells, salmon CHSE cells, tilapia TO2 cells, quail QT6 cells, and hamster CHO cells. CAT activity was measured in extracts from several cell lines 3 days after transfection and in the fish EPC stable clones. The CMV and RSV promoters were the most potent in all cell types. The intron from late SV40 genes (VP1 intron) worked properly in QT6 and CHO cells but not in EPC and very weakly in TO2 cells. The tGH intron was efficient in all cell types but preferentially in fish cells. The small t intron from SV40 was processed in all cell types. The small t and, to a lesser extent, the tGH introns amplified expression of cat gene in stable clones, in comparison to the transiently transfected cells. These results indicate that elements from mammalian genes may not be properly recognized by the fish cellular machinery and in an unpredictable manner. This finding suggests that vectors prepared to express foreign genes in transfected cultured fish cells and transgenic fish should preferably contain DNA sequences from fish genes or, alternatively, those sequences from mammalian genes that have been previously proved to be compatible with the fish cellular machinery.

ERβ regulates miR-21 expression and inhibits invasion and metastasis in cancer cells

NASA Astrophysics Data System (ADS)

Tian, Junmei; Tu, Zhenzhen; Chen, Wei R.; Gu, Yueqing

2012-03-01

In humans, estrogens play important roles in many physiological processes and are also connected with numerous cancers. In these diseases, estrogen mediates its effects through the estrogen receptor (ER), which serves as the basis for many current clinical diagnoses. Two forms of the estrogen receptor have been identified, ERα and ERβ, which show different and specific functions. The two estrogen receptors belong to a family of ligand-regulated transcription factors. Estrogen via ERα stimulates proliferation in the breast, uterus, and developing prostate, while estrogen via ERβ inhibits proliferation and promotes differentiation in the prostate, mammary gland, colon, lung, and bone marrow stem cells. MicroRNAs (miRs) are small non-coding RNA molecules that occur naturally and downregulate protein expression by translational blockade of the target mRNA or by promoting mRNA decay. MiR-21 is one of the most studied miRNAs in cancers. MiR-21 is overexpressed in most solid tumors, promoting progression and metastasis. The miR-21 gene is located on chromosome 17, in the 10th intron of a protein-coding gene, TMEM49, whose function is currently unknown. Our experiment was designed to identify the relationship between miR-21 and ERβ in cancer progression. Human cancer cells were transfected with ERβ. Real-time PCR analysis showed that the expression level of miR-21 was significantly reduced by ERβ treatment. An MTT assay showed that the tumor cell survival rate was also significantly inhibited. G0/G1 phase cell cycle arrest was found and tumor cell apoptosis was induced in the ERβ group.
Distinct patterns of alteration of myc genes associated with integration of human papillomavirus type 16 or type 45 DNA in two genital tumours.

PubMed

Sastre-Garau, X; Favre, M; Couturier, J; Orth, G

2000-08-01

We previously described two genital carcinomas (IC2, IC4) containing human papillomavirus type 16 (HPV-16)- or HPV-18-related sequences integrated in chromosomal bands containing the c-myc (8q24) or N-myc (2p24) gene, respectively. The c-myc gene was rearranged and amplified in IC2 cells without evidence of overexpression. The N-myc gene was amplified and highly transcribed in IC4 cells. Here, the sequence of an 8039 bp IC4 DNA fragment containing the integrated viral sequences and the cellular junctions is reported. A 3948 bp segment of the genome of HPV-45 encompassing the upstream regulatory region and the E6 and E7 ORFs was integrated into the untranslated part of N-myc exon 3, upstream of the N-myc polyadenylation signal. Both N-myc and HPV-45 sequences were amplified 10- to 20-fold. The 3' ends of the major N-myc transcript were mapped upstream of the 5' junction. A minor N-myc/HPV-45 fusion transcript was also identified, as well as two abundant transcripts from the HPV-45 E6-E7 region. Large amounts of N-myc protein were detected in IC4 cells. A major alteration of c-myc sequences in IC2 cells involved the insertion of a non-coding sequence into the second intron and their co-amplification with the third exon, without any evidence for the integration of HPV-16 sequences within or close to the gene. Different patterns of myc gene alterations may thus be associated with integration of HPV DNA in genital tumours, including the activation of the protooncogene via a mechanism of insertional mutagenesis and/or gene amplification.
Functional annotation of the vlinc class of non-coding RNAs using systems biology approach

PubMed Central

Laurent, Georges St.; Vyatkin, Yuri; Antonets, Denis; Ri, Maxim; Qi, Yao; Saik, Olga; Shtokalo, Dmitry; de Hoon, Michiel J.L.; Kawaji, Hideya; Itoh, Masayoshi; Lassmann, Timo; Arner, Erik; Forrest, Alistair R.R.; Nicolas, Estelle; McCaffrey, Timothy A.; Carninci, Piero; Hayashizaki, Yoshihide; Wahlestedt, Claes; Kapranov, Philipp

2016-01-01

Functionality of the non-coding transcripts encoded by the human genome is a coveted goal of modern genomics research. While this has commonly relied on the classical methods of forward genetics, integration of different genomics datasets in a global Systems Biology fashion presents a more productive avenue of achieving this very complex aim. Here we report application of a Systems Biology-based approach to dissect functionality of a newly identified vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlincRNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlincRNA–gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlincRNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions also happening at early stages of development such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs. PMID:27001520
The circulating non-coding RNA landscape for biomarker research: lessons and prospects from cardiovascular diseases.

PubMed

Stępień, Ewa; Costa, Marina C; Kurc, Szczepan; Drożdż, Anna; Cortez-Dias, Nuno; Enguita, Francisco J

2018-06-07

Pervasive transcription of the human genome is responsible for the production of a myriad of non-coding RNA molecules (ncRNAs), some of them with regulatory functions. The pivotal role of ncRNAs in cardiovascular biology has been unveiled in the last decade, starting from the characterization of the involvement of micro-RNAs in cardiovascular development and function, and followed by the use of circulating ncRNAs as biomarkers of cardiovascular diseases. The human non-coding secretome is composed of several RNA species that circulate in body fluids and could be used as biomarkers for diagnosis and outcome prediction. In cardiovascular diseases, secreted ncRNAs have been described as biomarkers of several conditions including myocardial infarction, cardiac failure, and atrial fibrillation. Among circulating ncRNAs, micro-RNAs (miRNAs), long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) have been proposed as biomarkers in different cardiovascular diseases. In comparison with standard biomarkers, the biochemical nature of ncRNAs offers better stability and flexible storage conditions of the samples, and increased sensitivity and specificity. In this review we describe the current trends and future prospects of the use of the ncRNA secretome components as biomarkers of cardiovascular diseases, including the open questions related to their secretion mechanisms and regulatory actions.
Paraspeckles: nuclear bodies built on long noncoding RNA

PubMed Central

Bond, Charles S.

2009-01-01

Paraspeckles are ribonucleoprotein bodies found in the interchromatin space of mammalian cell nuclei. These structures play a role in regulating the expression of certain genes in differentiated cells by nuclear retention of RNA. The core paraspeckle proteins (PSF/SFPQ, P54NRB/NONO, and PSPC1 [paraspeckle protein 1]) are members of the DBHS (Drosophila melanogaster behavior, human splicing) family. These proteins, together with the long nonprotein-coding RNA NEAT1 (MEN-ϵ/β), associate to form paraspeckles and maintain their integrity. Given the large numbers of long noncoding transcripts currently being discovered through whole transcriptome analysis, paraspeckles may be a paradigm for a class of subnuclear bodies formed around long noncoding RNA. PMID:19720872
Mechanisms and Regulation of Alternative Pre-mRNA Splicing

PubMed Central

Lee, Yeon

2015-01-01

Precursor messenger RNA (pre-mRNA) splicing is a critical step in the posttranscriptional regulation of gene expression, providing significant expansion of the functional proteome of eukaryotic organisms with limited gene numbers. Split eukaryotic genes contain intervening sequences or introns disrupting protein-coding exons, and intron removal occurs by repeated assembly of a large and highly dynamic ribonucleoprotein complex termed the spliceosome, which is composed of five small nuclear ribonucleoprotein particles, U1, U2, U4/U6, and U5. Biochemical studies over the past 10 years have allowed the isolation as well as compositional, functional, and structural analysis of splicing complexes at distinct stages along the spliceosome cycle. The average human gene contains eight exons and seven introns, producing an average of three or more alternatively spliced mRNA isoforms. Recent high-throughput sequencing studies indicate that 100% of human genes produce at least two alternative mRNA isoforms. Mechanisms of alternative splicing include RNA–protein interactions of splicing factors with regulatory sites termed silencers or enhancers, RNA–RNA base-pairing interactions, or chromatin-based effects that can change or determine splicing patterns. Disease-causing mutations can often occur in splice sites near intron borders or in exonic or intronic RNA regulatory silencer or enhancer elements, as well as in genes that encode splicing factors. Together, these studies provide mechanistic insights into how spliceosome assembly, dynamics, and catalysis occur; how alternative splicing is regulated and evolves; and how splicing can be disrupted by cis- and trans-acting mutations leading to disease states. These findings make the spliceosome an attractive new target for small-molecule, antisense, and genome-editing therapeutic interventions. PMID:25784052
Differential splicing of human androgen receptor pre-mRNA in X-linked reifenstein syndrome, because of a deletion involving a putative branch site

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ris-Stalpers, C.; Verleun-Mooijman, M.C.T.; Blaeij, T.J.P. de

1994-04-01

The analysis of the androgen receptor (AR) gene, mRNA, and protein in a subject with X-linked Reifenstein syndrome (partial androgen insensitivity) is reported. The presence of two mature AR transcripts in genital skin fibroblasts of the patient is established, and, by reverse transcriptase-PCR and RNase transcription analysis, the wild-type transcript and a transcript in which exon 3 sequences are absent without disruption of the translational reading frame are identified. Sequencing and hybridization analysis show a deletion of >6 kb in intron 2 of the human AR gene, starting 18 bp upstream of exon 3. The deletion includes the putative branch-point sequence (BPS) but not the acceptor splice site on the intron 2/exon 3 boundary. The deletion of the putative intron 2 BPS results in 90% inhibition of wild-type splicing. The mutant transcript encodes an AR protein lacking the second zinc finger of the DNA-binding domain. Western/immunoblotting analysis is used to show that the mutant AR protein is expressed in genital skin fibroblasts of the patient. The residual 10% wild-type transcript can be the result of the use of a cryptic BPS located 63 bp upstream of the intron 2/exon 3 boundary of the mutant AR gene. The mutated AR protein has no transcription-activating potential and does not influence the transactivating properties of the wild-type AR, as tested in cotransfection studies. It is concluded that the partial androgen-insensitivity syndrome of this patient is the consequence of the limited amount of wild-type AR protein expressed in androgen target cells, resulting from the deletion of the intron 2 putative BPS. 42 refs., 6 figs., 1 tab.
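The statement that exon 3 can be absent "without disruption of the translational reading frame" rests on simple arithmetic, which a minimal Python sketch can illustrate. It assumes the standard rule that removing a coding exon keeps downstream codons in frame only when the exon length is a multiple of three. The function name and the example lengths are hypothetical and are not measurements from the AR gene.

```python
def skipping_preserves_frame(exon_length_nt: int) -> bool:
    """Return True if removing a coding exon of this length keeps downstream codons in frame."""
    if exon_length_nt <= 0:
        raise ValueError("exon length must be a positive number of nucleotides")
    # Codons are triplets, so only a multiple-of-three deletion leaves the frame intact.
    return exon_length_nt % 3 == 0


# Hypothetical exon lengths, chosen only to show both outcomes.
for length in (120, 121):
    print(length, skipping_preserves_frame(length))
```

An in-frame skip of this kind removes the amino acids encoded by the exon but leaves the rest of the protein sequence unchanged, which is consistent with the abstract's description of a mutant protein lacking one zinc finger.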
Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins

PubMed Central

Kramer, Marianne C.; Liang, Dongming; Tatomer, Deirdre C.; Gold, Beth; March, Zachary M.; Cherry, Sara; Wilusz, Jeremy E.

2015-01-01

Thousands of eukaryotic protein-coding genes are noncanonically spliced to produce circular RNAs. Bioinformatics has indicated that long introns generally flank exons that circularize in Drosophila, but the underlying mechanisms by which these circular RNAs are generated are largely unknown. Here, using extensive mutagenesis of expression plasmids and RNAi screening, we reveal that circularization of the Drosophila laccase2 gene is regulated by both intronic repeats and trans-acting splicing factors. Analogous to what has been observed in humans and mice, base-pairing between highly complementary transposable elements facilitates backsplicing. Long flanking repeats (∼400 nucleotides [nt]) promote circularization cotranscriptionally, whereas pre-mRNAs containing minimal repeats (<40 nt) generate circular RNAs predominately after 3′ end processing. Unlike the previously characterized Muscleblind (Mbl) circular RNA, which requires the Mbl protein for its biogenesis, we found that Laccase2 circular RNA levels are not controlled by Mbl or the Laccase2 gene product but rather by multiple hnRNP (heterogeneous nuclear ribonucleoprotein) and SR (serine–arginine) proteins acting in a combinatorial manner. hnRNP and SR proteins also regulate the expression of other Drosophila circular RNAs, including Plexin A (PlexA), suggesting a common strategy for regulating backsplicing. Furthermore, the laccase2 flanking introns support efficient circularization of diverse exons in Drosophila and human cells, providing a new tool for exploring the functional consequences of circular RNA expression across eukaryotes. PMID:26450910
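The idea that base-pairing between complementary flanking repeats brings splice sites together can be sketched in a few lines of Python. This is an illustrative toy, not the authors' analysis: it assumes a DNA alphabet, gap-free antiparallel alignment, and Watson-Crick pairs only (no G-U wobble), and the function names and sequences are hypothetical.

```python
# Translation table for Watson-Crick complements on a DNA alphabet.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def pairing_fraction(upstream_repeat: str, downstream_repeat: str) -> float:
    """Fraction of positions that can pair when the two repeats are aligned antiparallel, without gaps."""
    query = upstream_repeat.upper()
    target = reverse_complement(downstream_repeat).upper()
    n = min(len(query), len(target))
    if n == 0:
        return 0.0
    # zip stops at the shorter sequence, so exactly n positions are compared.
    matches = sum(1 for a, b in zip(query, target) if a == b)
    return matches / n


# Hypothetical repeats: the second is the exact reverse complement of the first.
print(pairing_fraction("GGATCCAA", "TTGGATCC"))
```

Real repeat pairs of the lengths mentioned in the abstract would normally be compared with a proper local alignment that tolerates gaps; the gap-free comparison here only shows the principle.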
Identification of novel non-coding small RNAs from Streptococcus pneumoniae TIGR4 using high-resolution genome tiling arrays

PubMed Central

2010-01-01

Background The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome wide mRNA profiling as well as identification of novel transcripts at a high-resolution. Results Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome. Conclusions In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence. PMID:20525227
Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng

2005-09-10

Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.
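The conservation criterion quoted in this abstract (>100 bp, >70 percent identity in a human-mouse comparison) can be expressed as a short Python sketch. The thresholds come from the abstract; everything else is an assumption made for illustration, including the function names, the use of pre-aligned sequences with "-" as the gap character, and the choice to compute identity over ungapped alignment columns only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


def is_conserved_element(aligned_human: str, aligned_mouse: str,
                         min_length: int = 100, min_identity: float = 70.0) -> bool:
    """Apply the >100 bp and >70 percent identity thresholds to one aligned pair."""
    human_length = sum(1 for base in aligned_human if base != "-")
    return (human_length > min_length
            and percent_identity(aligned_human, aligned_mouse) > min_identity)


# Hypothetical 120 bp element that is identical in both species.
element = "ACGT" * 30
print(is_conserved_element(element, element))
```

Different tools define percent identity with different denominators (for example, including gapped columns), so a borderline element could pass under one convention and fail under another.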
3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

PubMed

Goldfarb, Katherine C; Cech, Thomas R

2013-09-21

Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.
Noncoding RNAs and the control of signalling via nuclear receptor regulation in health and disease.

PubMed

Cathcart, Paul; Lucchesi, Walter; Ottaviani, Silvia; De Giorgio, Alex; Krell, Jonathan; Stebbing, Justin; Castellano, Leandro

2015-08-01

Nuclear receptors belong to a superfamily of proteins that play central roles in human biology, orchestrating a large variety of biological functions in both health and disease. Understanding the interactions and regulatory pathways of NRs will allow development of potential therapeutic interventions for a multitude of disease processes. Non-coding RNAs have recently been discovered to have significant interactions with NR signalling pathways via a variety of biological connections. This review summarises the known interactions between ncRNAs and the NR superfamily in health, embryogenesis and a plethora of human diseases. Copyright © 2015 Elsevier Ltd. All rights reserved.
Towards a complete map of the human long non-coding RNA transcriptome.

PubMed

Uszczynska-Ratajczak, Barbara; Lagarde, Julien; Frankish, Adam; Guigó, Roderic; Johnson, Rory

2018-05-23

Gene maps, or annotations, enable us to navigate the functional landscape of our genome. They are a resource upon which virtually all studies depend, from single-gene to genome-wide scales and from basic molecular biology to medical genetics. Yet present-day annotations suffer from trade-offs between quality and size, with serious but often unappreciated consequences for downstream studies. This is particularly true for long non-coding RNAs (lncRNAs), which are poorly characterized compared to protein-coding genes. Long-read sequencing technologies promise to improve current annotations, paving the way towards a complete annotation of lncRNAs expressed throughout a human lifetime.
Comprehensive Reconstruction and Visualization of Non-Coding Regulatory Networks in Human

PubMed Central

Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

2014-01-01

Research attention has been powered to understand the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them a great potential as non-invasive biomarkers. However, non-coding RNAs have been relatively recently discovered and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNAs interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNAs data interactions from a large number of well established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape. PMID:25540777
Structural architecture of the human long non-coding RNA, steroid receptor RNA activator

PubMed Central

Novikova, Irina V.; Hennelly, Scott P.; Sanbonmatsu, Karissa Y.

2012-01-01

While functional roles of several long non-coding RNAs (lncRNAs) have been determined, the molecular mechanisms are not well understood. Here, we report the first experimentally derived secondary structure of a human lncRNA, the steroid receptor RNA activator (SRA), 0.87 kB in size. The SRA RNA is a non-coding RNA that coactivates several human sex hormone receptors and is strongly associated with breast cancer. Coding isoforms of SRA are also expressed to produce proteins, making the SRA gene a unique bifunctional system. Our experimental findings (SHAPE, in-line, DMS and RNase V1 probing) reveal that this lncRNA has a complex structural organization, consisting of four domains, with a variety of secondary structure elements. We examine the coevolution of the SRA gene at the RNA structure and protein structure levels using comparative sequence analysis across vertebrates. Rapid evolutionary stabilization of RNA structure, combined with frame-disrupting mutations in conserved regions, suggests that evolutionary pressure preserves the RNA structural core rather than its translational product. We perform similar experiments on alternatively spliced SRA isoforms to assess their structural features. PMID:22362738
Characterization of the human gene (TBXAS1) encoding thromboxane synthase.

PubMed

Miyata, A; Yokoyama, C; Ihara, H; Bandoh, S; Takeda, O; Takahashi, E; Tanabe, T

1994-09-01

The gene encoding human thromboxane synthase (TBXAS1) was isolated from a human EMBL3 genomic library using human platelet thromboxane synthase cDNA as a probe. Nucleotide sequencing revealed that the human thromboxane synthase gene spans more than 75 kb and consists of 13 exons and 12 introns, whose splice donor and acceptor sites conform to the GT/AG rule. The exon-intron boundaries of the thromboxane synthase gene were similar to those of the human cytochrome P450 nifedipine oxidase gene (CYP3A4) except for introns 9 and 10, although the primary sequences of these enzymes exhibited 35.8% identity to each other. The 1.2 kb of 5'-flanking sequence contained potential binding sites for several transcription factors (AP-1, AP-2, GATA-1, CCAAT box, xenobiotic-response element, PEA-3, LF-A1, myb, basic transcription element and cAMP-response element). Primer-extension analysis indicated multiple transcription-start sites, and the major start site was identified as an adenine residue located 142 bases upstream of the translation-initiation site. However, neither a typical TATA box nor a typical CAAT box is found within the 100 bases upstream of the translation-initiation site. Southern-blot analysis revealed the presence of one copy of the thromboxane synthase gene per haploid genome. Furthermore, a fluorescence in situ hybridization study revealed that the human gene for thromboxane synthase is localized to band q33-q34 of the long arm of chromosome 7. A tissue-distribution study demonstrated that thromboxane synthase mRNA is widely expressed in human tissues and is particularly abundant in peripheral blood leukocytes, spleen, lung and liver. Low but significant levels of mRNA were observed in kidney, placenta and thymus.
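As an illustration of the GT/AG rule mentioned in this abstract, the following minimal sketch checks whether an intron sequence begins with the GT donor dinucleotide and ends with the AG acceptor dinucleotide. The function name and the toy sequences are hypothetical and are not taken from TBXAS1 or from the authors' analysis.

```python
def conforms_to_gt_ag(intron):
    """Return True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    # Accept DNA or RNA letters in either case; U is treated as T.
    seq = intron.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical toy introns for illustration only.
toy_introns = {
    "canonical": "GTAAGTCTTTCCAG",
    "noncanonical": "CCAAGTCTTTCCAC",
}

for name, seq in toy_introns.items():
    print(name, conforms_to_gt_ag(seq))
```

A real check would take intron coordinates from an annotation file and read the flanking dinucleotides from the genome sequence; the sketch only shows the rule itself.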
Germline genetics of cancer of unknown primary (CUP) and its specific subtypes.

PubMed

Hemminki, Kari; Chen, Bowang; Kumar, Abhishek; Melander, Olle; Manjer, Jonas; Hallmans, Göran; Pettersson-Kymmer, Ulrika; Ohlsson, Claes; Folprecht, Gunnar; Löffler, Harald; Krämer, Alwin; Försti, Asta

2016-04-19

Cancer of unknown primary site (CUP) is a fatal cancer diagnosed through metastases at various organs. Little is known about the germline genetics of CUP, which appears worth investigating in view of reported familial associations in CUP. In the present study, samples from CUP patients were identified from 2 Swedish biobanks and a German clinical trial, totaling 578 CUP patients and 7628 regionally matched controls. Diagnostic data specified the organ where metastases were diagnosed. We carried out a genome-wide association study on CUP cases and controls. In the whole sample set, 6 loci reached an allelic p-value in the range of 10^-7 and were supported by data from the three centers. Three associations were located next to non-coding RNA genes. rs2660852 flanked the 5'UTR of LTA4H (leukotriene A4 hydrolase), rs477145 was intronic to TIAM1 (T-cell lymphoma invasion and metastases) and rs2835931 was intronic to KCNJ6 (potassium channel, inwardly rectifying subfamily J, member 6). In analysis of subgroups of CUP patients (smokers, non-smokers and CUP with liver metastases), genome-wide significant associations were noted. For patients with liver metastases, associations were found on chromosomes 6 and 11, the latter including a cluster of genes: DHCR7 and NADSYN1, encoding key enzymes in cholesterol and NAD synthesis, and KRTAP5-7, encoding a keratin-associated protein. This first GWAS on CUP provides preliminary evidence that germline genes relating to inflammation (LTA4H) and metastatic promotion (TIAM1), in association with lipid metabolic disturbance (chromosome 11 cluster), may contribute to the risk of CUP.
Generation of a neuro-specific microarray reveals novel differentially expressed noncoding RNAs in mouse models for neurodegenerative diseases.

PubMed

Gstir, Ronald; Schafferer, Simon; Scheideler, Marcel; Misslinger, Matthias; Griehl, Matthias; Daschil, Nina; Humpel, Christian; Obermair, Gerald J; Schmuckermair, Claudia; Striessnig, Joerg; Flucher, Bernhard E; Hüttenhofer, Alexander

2014-12-01

We have generated a novel, neuro-specific ncRNA microarray, covering 1472 ncRNA species, to investigate their expression in different mouse models for central nervous system diseases. We analyzed ncRNA expression in two mouse models with impaired calcium channel activity, implicated in epilepsy or Parkinson's disease, respectively, as well as in a mouse model mimicking pathophysiological aspects of Alzheimer's disease. We identified well over a hundred differentially expressed ncRNAs, either from known classes of ncRNAs, such as miRNAs or snoRNAs, or representing entirely novel ncRNA species. Several differentially expressed ncRNAs in the calcium channel mouse models were assigned as miRNAs and target genes involved in calcium signaling, thus suggesting feedback regulation of miRNAs by calcium signaling. In the Alzheimer mouse model, we identified two snoRNAs whose expression was deregulated prior to amyloid plaque formation. Interestingly, the presence of snoRNAs could be detected in cerebrospinal fluid samples in humans, so that they could potentially serve as early diagnostic markers for Alzheimer's disease. In addition to known ncRNA species, we also identified 63 differentially expressed, entirely novel ncRNA candidates, located in intronic or intergenic regions of the mouse genome, genomic locations that have previously been shown to harbor the majority of functional ncRNAs. © 2014 Gstir et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Screening of Variations in CD22 Gene in Children with B-Precursor Acute Lymphoblastic Leukemia.

PubMed

Aslar Oner, Deniz; Akin, Dilara Fatma; Sipahi, Kadir; Mumcuoglu, Mine; Ezer, Ustun; Kürekci, A Emin; Akar, Nejat

2016-09-01

CD22 is expressed on the surface of B-cell lineage cells from the early progenitor stage of pro-B cell until terminal differentiation to mature B cells. It plays a role in signal transduction and as a regulator of B-cell receptor signaling in B-cell development. We aimed to screen exons 9-14 of the CD22 gene, which is a mutational hot spot region in B-precursor acute lymphoblastic leukemia (pre-B ALL) patients, to find possible genetic variants that could play a role in the pathogenesis of pre-B ALL in Turkish children. This study included 109 Turkish children with pre-B ALL who were diagnosed at Losante Hospital for Children with Leukemia. Genomic DNA was extracted from both peripheral blood and bone marrow leukocytes. Gene amplification was performed with PCR, and all samples were screened for variants by single strand conformation polymorphism. Samples showing band shifts were sequenced on an automated sequencer. In our patient group a total of 9 variants were identified in the CD22 gene by sequencing: a novel variant in intron 10 (T2199G); a missense variant in exon 12; 5 intronic variants between exon 12 and intron 13; a novel intronic variant (C2424T); and a synonymous variant in exon 13. Thirteen of 109 children (11.9%) carried the T2199G novel intronic variant located in intron 10, and 17 of 109 children (15.6%) carried the C2424T novel intronic variant. Novel variants in the CD22 gene that are not present in the Human Gene Mutation Database or NCBI SNP database were found in children with pre-B ALL in Turkey.

Dysferlin rescue by spliceosome-mediated pre-mRNA trans-splicing targeting introns harbouring weakly defined 3' splice sites.

PubMed

Philippi, Susanne; Lorain, Stéphanie; Beley, Cyriaque; Peccate, Cécile; Précigout, Guillaume; Spuler, Simone; Garcia, Luis

2015-07-15

The modification of the pre-mRNA cis-splicing process employing a pre-mRNA trans-splicing molecule (PTM) is an attractive strategy for the in situ correction of genes whose careful transcription regulation and full-length expression are determinative for protein function, as is the case for the dysferlin (DYSF, Dysf) gene. Loss-of-function mutations of DYSF result in different types of muscular dystrophy mainly manifesting as limb girdle muscular dystrophy 2B (LGMD2B) and Miyoshi muscular dystrophy 1 (MMD1). We established a 3' replacement strategy for mutated DYSF pre-mRNAs induced by spliceosome-mediated pre-mRNA trans-splicing (SmaRT) by the use of a PTM. In contrast to previously established SmaRT strategies, we particularly focused on the identification of a suitable pre-mRNA target intron rather than on the optimization of the PTM design. By targeting DYSF pre-mRNA introns harbouring differentially defined 3' splice sites (3' SS), we found that target introns encoding weakly defined 3' SSs were trans-spliced successfully in vitro in human LGMD2B myoblasts as well as in vivo in skeletal muscle of wild-type and Dysf(-/-) mice. For the first time, we demonstrate rescue of Dysf protein by SmaRT in vivo. Moreover, we identified concordant qualities among the successfully targeted Dysf introns and targeted endogenous introns in previously reported SmaRT approaches that might facilitate a selective choice of target introns in future SmaRT strategies. © The Author 2015. Published by Oxford University Press. All rights reserved.
Alternative Polyadenylation Allows Differential Negative Feedback of Human miRNA miR-579 on Its Host Gene ZFR

PubMed Central

Hinske, Ludwig Christian; Galante, Pedro A. F.; Limbeck, Elisabeth; Möhnle, Patrick; Parmigiani, Raphael B.; Ohno-Machado, Lucila; Camargo, Anamaria A.; Kreth, Simone

2015-01-01

About half of the known miRNA genes are located within protein-coding host genes, and are thus subject to co-transcription. Accumulating data indicate that this coupling may be an intrinsic mechanism to directly regulate the host gene’s expression, constituting a negative feedback loop. Inevitably, the cell requires a yet largely unknown repertoire of methods to regulate this control mechanism. We propose alternative polyadenylation (APA) as one possible mechanism by which negative feedback of intronic miRNAs on their host genes might be regulated. Using in-silico analyses, we found that host genes that contain seed matching sites for their intronic miRNAs yield longer 3'UTRs with more polyadenylation sites. Additionally, the distribution of polyadenylation signals differed significantly between these host genes and host genes of miRNAs that do not contain potential miRNA binding sites. We then transferred these in-silico results to a biological example and investigated the relationship between ZFR and its intronic miRNA miR-579 in a U87 cell line model. We found that ZFR is targeted by its intronic miRNA miR-579 and that alternative polyadenylation allows differential targeting. We additionally used bioinformatics analyses and RNA-Seq to evaluate a potential cross-talk between intronic miRNAs and alternative polyadenylation. CPSF2, a gene previously associated with alternative polyadenylation signal recognition, might be linked to intronic miRNA negative feedback by altering polyadenylation signal utilization. PMID:25799583
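The idea of seed-matching sites, and of how a shorter 3'UTR isoform can lose them, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the common 7mer-m8 convention (target site = reverse complement of miRNA nucleotides 2-8), and both the miRNA and the UTR sequences are hypothetical toy strings, not miR-579 or ZFR.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """Reverse complement of miRNA nucleotides 2-8 (assumed 7mer-m8 seed match)."""
    seed = mirna.upper().replace("T", "U")[1:8]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of seed-match sites in a 3'UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


# Hypothetical sequences for illustration only.
toy_mirna = "UACGGAUCCAGUCAGUCAGUCA"
long_utr = "AAAGAUCCGUAAACCCGAUCCGUUU"   # distal poly(A) site used
short_utr = long_utr[:12]                # proximal poly(A) site used

print(find_seed_matches(toy_mirna, long_utr))   # sites in the long isoform
print(find_seed_matches(toy_mirna, short_utr))  # sites left in the short isoform
```

In this toy case the long isoform carries two seed-match sites and the truncated isoform keeps only the proximal one, which is the kind of differential targeting the abstract attributes to alternative polyadenylation.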
Antisense Masking of an hnRNP A1/A2 Intronic Splicing Silencer Corrects SMN2 Splicing in Transgenic Mice

PubMed Central

Hua, Yimin; Vickers, Timothy A.; Okunola, Hazeem L.; Bennett, C. Frank; Krainer, Adrian R.

2008-01-01

survival of motor neuron 2, centromeric (SMN2) is a gene that modifies the severity of spinal muscular atrophy (SMA), a motor-neuron disease that is the leading genetic cause of infant mortality. Increasing inclusion of SMN2 exon 7, which is predominantly skipped, holds promise to treat or possibly cure SMA; one practical strategy is the disruption of splicing silencers that impair exon 7 recognition. By using an antisense oligonucleotide (ASO)-tiling method, we systematically screened the proximal intronic regions flanking exon 7 and identified two intronic splicing silencers (ISSs): one in intron 6 and a recently described one in intron 7. We analyzed the intron 7 ISS by mutagenesis, coupled with splicing assays, RNA-affinity chromatography, and protein overexpression, and found two tandem hnRNP A1/A2 motifs within the ISS that are responsible for its inhibitory character. Mutations in these two motifs, or ASOs that block them, promote very efficient exon 7 inclusion. We screened 31 ASOs in this region and selected two optimal ones to test in human SMN2 transgenic mice. Both ASOs strongly increased hSMN2 exon 7 inclusion in the liver and kidney of the transgenic animals. Our results show that the high-resolution ASO-tiling approach can identify cis-elements that modulate splicing positively or negatively. Most importantly, our results highlight the therapeutic potential of some of these ASOs in the context of SMA. PMID:18371932
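The ASO-tiling idea described above can be sketched in a few lines: slide a fixed-length window along a target region and emit the antisense (reverse-complement) sequence of each window. The oligo length, step size, function names and target sequence below are all hypothetical choices for illustration; the abstract does not state the parameters the authors used, and oligonucleotide chemistry is ignored.

```python
def reverse_complement(rna):
    """Reverse complement of an RNA sequence (A, C, G, U)."""
    return rna.upper().translate(str.maketrans("ACGU", "UGCA"))[::-1]


def tile_asos(target, length=15, step=5):
    """Return (start, antisense_oligo) pairs for windows tiled along the target."""
    oligos = []
    for start in range(0, len(target) - length + 1, step):
        window = target[start:start + length]
        oligos.append((start, reverse_complement(window)))
    return oligos


# Hypothetical 30-nt intronic target, for illustration only.
toy_target = "GUAAGUCUGCCAGCAUUAUGAAAGUGAAUC"

for start, oligo in tile_asos(toy_target):
    print(start, oligo)
```

A smaller step gives higher resolution at the cost of more oligos to screen.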
Phase distribution of spliceosomal introns: implications for intron origin

PubMed Central

Nguyen, Hung D; Yoshihama, Maki; Kenmochi, Naoya

2006-01-01

Background The origin of spliceosomal introns is the central subject of the introns-early versus introns-late debate. The distribution of intron phases is non-uniform, with an excess of phase-0 introns. Introns-early explains this by speculating that a fraction of present-day introns were present between minigenes in the progenote and therefore must lie in phase-0. In contrast, introns-late predicts that the nonuniformity of intron phase distribution reflects the nonrandomness of intron insertions. Results In this paper, we tested the two theories using analyses of intron phase distribution. We inferred the evolution of intron phase distribution from a dataset of 684 gene orthologs from seven eukaryotes using a maximum likelihood method. We also tested whether the observed intron phase distributions from 10 eukaryotes can be explained by intron insertions on a genome-wide scale. In contrast to the prediction of introns-early, the inferred evolution of intron phase distribution showed that the proportion of phase-0 introns increased over evolution. Consistent with introns-late, the observed intron phase distributions matched those predicted by an intron insertion model quite well. Conclusion Our results strongly support the introns-late hypothesis of the origin of spliceosomal introns. PMID:16959043
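For readers unfamiliar with intron phase: an intron is phase 0 if it lies between two codons, phase 1 if it falls after the first nucleotide of a codon, and phase 2 if it falls after the second. The following minimal sketch computes phases from the coding lengths of consecutive exons and tallies their distribution. The exon lengths are hypothetical, and this is only the definition in code, not the maximum likelihood analysis the authors performed.

```python
from collections import Counter


def intron_phases(coding_exon_lengths):
    """Phase of each intron, given coding lengths (in nt) of consecutive exons."""
    phases = []
    coding_bases = 0
    # There is one intron after every exon except the last.
    for length in coding_exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical gene with four coding exons, for illustration only.
toy_exons = [90, 100, 62, 48]
phases = intron_phases(toy_exons)
distribution = Counter(phases)

print(phases)
print(dict(distribution))
```

Tallying such counts over many genes gives the observed phase distribution that the two hypotheses try to explain.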
Chemical Approaches for Structure and Function of RNA in Postgenomic Era

PubMed Central

Ro-Choi, Tae Suk; Choi, Yong Chun

2012-01-01

In the study of cellular RNA chemistry, research focused for decades on sequence determination. Structures of snRNAs (4.5S RNA I (Alu), U1, U2, U3, U4, U5, and U6) were determined at Baylor College of Medicine, Houston, Texas, early in the pregenomic era. They show novel modifications including base methylation, sugar methylation, 5′-cap structures (types 0–III) and sequence heterogeneity. This work raised the problem of posttranscriptional modification, which has seen numerous significant advances through technological revolutions during the pregenomic, genomic, and postgenomic eras. Presently, snRNA research is progressing in the enzymology of snRNA modifications, molecular evolution, the mechanism of spliceosome assembly, the chemical mechanism of intron removal, the high-order structure of snRNA in the spliceosome, and the pathology of splicing. These efforts are converging on the “Function and Structure of Spliceosome”, alongside new exploration of other noncoding RNAs in all aspects of regulatory function. PMID:22347623
Restless legs syndrome-associated intronic common variant in Meis1 alters enhancer function in the developing telencephalon.

PubMed

Spieler, Derek; Kaffe, Maria; Knauf, Franziska; Bessa, José; Tena, Juan J; Giesert, Florian; Schormair, Barbara; Tilch, Erik; Lee, Heekyoung; Horsch, Marion; Czamara, Darina; Karbalai, Nazanin; von Toerne, Christine; Waldenberger, Melanie; Gieger, Christian; Lichtner, Peter; Claussnitzer, Melina; Naumann, Ronald; Müller-Myhsok, Bertram; Torres, Miguel; Garrett, Lillian; Rozman, Jan; Klingenspor, Martin; Gailus-Durner, Valérie; Fuchs, Helmut; Hrabě de Angelis, Martin; Beckers, Johannes; Hölter, Sabine M; Meitinger, Thomas; Hauck, Stefanie M; Laumen, Helmut; Wurst, Wolfgang; Casares, Fernando; Gómez-Skarmeta, Jose Luis; Winkelmann, Juliane

2014-04-01

Genome-wide association studies (GWAS) identified the MEIS1 locus for Restless Legs Syndrome (RLS), but causal single nucleotide polymorphisms (SNPs) and their functional relevance remain unknown. This locus contains a large number of highly conserved noncoding regions (HCNRs) potentially functioning as cis-regulatory modules. We analyzed these HCNRs for allele-dependent enhancer activity in zebrafish and mice and found that the risk allele of the lead SNP rs12469063 reduces enhancer activity in the Meis1 expression domain of the murine embryonic ganglionic eminences (GE). CREB1 binds this enhancer and rs12469063 affects its binding in vitro. In addition, MEIS1 target genes suggest a role in the specification of neuronal progenitors in the GE, and heterozygous Meis1-deficient mice exhibit hyperactivity, resembling the RLS phenotype. Thus, in vivo and in vitro analysis of a common SNP with small effect size showed allele-dependent function in the prospective basal ganglia representing the first neurodevelopmental region implicated in RLS.
Landscape of somatic mutations in 560 breast cancer whole-genome sequences

DOE PAGES

Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; ...

2016-05-02

Here, we analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.
Landscape of somatic mutations in 560 breast cancer whole genome sequences

PubMed Central

Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; Ramakrishna, Manasa; Glodzik, Dominik; Zou, Xueqing; Martincorena, Inigo; Alexandrov, Ludmil B.; Martin, Sancha; Wedge, David C.; Van Loo, Peter; Ju, Young Seok; Smid, Marcel; Brinkman, Arie B; Morganella, Sandro; Aure, Miriam R.; Lingjærde, Ole Christian; Langerød, Anita; Ringnér, Markus; Ahn, Sung-Min; Boyault, Sandrine; Brock, Jane E.; Broeks, Annegien; Butler, Adam; Desmedt, Christine; Dirix, Luc; Dronov, Serge; Fatima, Aquila; Foekens, John A.; Gerstung, Moritz; Hooijer, Gerrit KJ; Jang, Se Jin; Jones, David R.; Kim, Hyung-Yong; King, Tari A.; Krishnamurthy, Savitri; Lee, Hee Jin; Lee, Jeong-Yeon; Li, Yilong; McLaren, Stuart; Menzies, Andrew; Mustonen, Ville; O’Meara, Sarah; Pauporté, Iris; Pivot, Xavier; Purdie, Colin A.; Raine, Keiran; Ramakrishnan, Kamna; Rodríguez-González, F. Germán; Romieu, Gilles; Sieuwerts, Anieta M.; Simpson, Peter T; Shepherd, Rebecca; Stebbings, Lucy; Stefansson, Olafur A; Teague, Jon; Tommasi, Stefania; Treilleux, Isabelle; Van den Eynden, Gert G.; Vermeulen, Peter; Vincent-Salomon, Anne; Yates, Lucy; Caldas, Carlos; van’t Veer, Laura; Tutt, Andrew; Knappskog, Stian; Tan, Benita Kiat Tee; Jonkers, Jos; Borg, Åke; Ueno, Naoto T; Sotiriou, Christos; Viari, Alain; Futreal, P. Andrew; Campbell, Peter J; Span, Paul N.; Van Laere, Steven; Lakhani, Sunil R; Eyfjord, Jorunn E.; Thompson, Alastair M.; Birney, Ewan; Stunnenberg, Hendrik G; van de Vijver, Marc J; Martens, John W.M.; Børresen-Dale, Anne-Lise; Richardson, Andrea L.; Kong, Gu; Thomas, Gilles; Stratton, Michael R.

2016-01-01

We analysed whole genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. 93 protein-coding cancer genes carried likely driver mutations. Some non-coding regions exhibited high mutation frequencies but most have distinctive structural features probably causing elevated mutation rates and do not harbour driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed 12 base substitution and six rearrangement signatures. Three rearrangement signatures, characterised by tandem duplications or deletions, appear associated with defective homologous recombination based DNA repair: one with deficient BRCA1 function; another with deficient BRCA1 or BRCA2 function; the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operative, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer. PMID:27135926
Non-coding RNA generated following lariat-debranching mediates targeting of AID to DNA

PubMed Central

Zheng, Simin; Vuong, Bao Q.; Vaidyanathan, Bharat; Lin, Jia-Yu; Huang, Feng-Ting; Chaudhuri, Jayanta

2015-01-01

SUMMARY Transcription through immunoglobulin switch (S) regions is essential for class switch recombination (CSR) but no molecular function of the transcripts has been described. Likewise, recruitment of activation-induced cytidine deaminase (AID) to S regions is critical for CSR; however, the underlying mechanism has not been fully elucidated. Here, we demonstrate that intronic switch RNA acts in trans to target AID to S region DNA. AID binds directly to switch RNA through G-quadruplexes formed by the RNA molecules. Disruption of this interaction by mutation of a key residue in the putative RNA-binding domain of AID impairs recruitment of AID to S region DNA, thereby abolishing CSR. Additionally, inhibition of RNA lariat processing leads to loss of AID localization to S regions and compromises CSR; both defects can be rescued by exogenous expression of switch transcripts in a sequence-specific manner. These studies uncover an RNA-mediated mechanism of targeting AID to DNA. PMID:25957684
Non-coding RNAs in lung cancer

PubMed Central

Ricciuti, Biagio; Mecca, Carmen; Crinò, Lucio; Baglivo, Sara; Cenci, Matteo; Metro, Giulio

2014-01-01

The discovery that protein-coding genes represent less than 2% of the human genome, and the evidence that more than 90% of it is actively transcribed, changed the classical view of the central dogma of molecular biology, which had always been based on the assumption that RNA functions mainly as an intermediate bridge between DNA sequences and the protein synthesis machinery. Accumulating data indicate that non-coding RNAs are involved in different physiological processes, providing for the maintenance of cellular homeostasis. They are important regulators of gene expression, cellular differentiation, proliferation, migration, apoptosis, and stem cell maintenance. Alterations and disruptions of their expression or activity have increasingly been associated with pathological changes of cancer cells. This evidence, and the prospect of using these molecules as diagnostic markers and therapeutic targets, currently make non-coding RNAs among the most relevant molecules in cancer research. In this paper we provide an overview of non-coding RNA function and disruption in lung cancer biology, also focusing on their potential as diagnostic, prognostic and predictive biomarkers. PMID:25593996
Regulatory variation: an emerging vantage point for cancer biology.

PubMed

Li, Luolan; Lorzadeh, Alireza; Hirst, Martin

2014-01-01

Transcriptional regulation involves complex and interdependent interactions of noncoding and coding regions of the genome with proteins that interact with and modify them. Genetic variation/mutation in coding and noncoding regions of the genome can drive aberrant transcription and disease. Although noncoding DNA accounts for nearly 98% of the genome, comparatively little is known about the contribution of noncoding DNA elements to disease. Genome-wide association studies of complex human diseases including cancer have revealed enrichment for variants in the noncoding genome. A striking finding of recent cancer genome re-sequencing efforts has been the previously underappreciated frequency of mutations in epigenetic modifiers across a wide range of cancer types. Taken together, these results point to the importance of dysregulation in transcriptional regulatory control in the genesis of cancer. Powered by recent technological advancements in functional genomic profiling, exploration of normal and transformed regulatory networks will provide novel insight into the initiation and progression of cancer and open new windows to future prognostic and diagnostic tools. © 2013 Wiley Periodicals, Inc.
Genomic organization of the human mi-er1 gene and characterization of alternatively spliced isoforms: regulated use of a facultative intron determines subcellular localization.

PubMed

Paterno, Gary D; Ding, Zhihu; Lew, Yuan-Y; Nash, Gord W; Mercer, F Corinne; Gillespie, Laura L

2002-07-24

mi-er1 (previously called er1) is a fibroblast growth factor-inducible early response gene activated during mesoderm induction in Xenopus embryos and encoding a nuclear protein that functions as a transcriptional activator. The human orthologue of mi-er1 was shown to be upregulated in breast carcinoma cell lines and breast tumours when compared to normal breast cells. In this report, we investigate the structure of the human mi-er1 (hmi-er1) gene and characterize the alternatively spliced transcripts and protein isoforms. hmi-er1 is a single copy gene located at 1p31.2 and spanning 63 kb. It contains 17 exons and includes one skipped exon, a facultative intron and three polyadenylation signals to produce 12 transcripts encoding six distinct proteins. hmi-er1 transcripts were expressed at very low levels in most human adult tissues and the mRNA isoform pattern varied with the tissue. The 12 transcripts encode proteins containing a common internal sequence with variable N- and C-termini. Three distinct N- and two distinct C-termini were identified, giving rise to six protein isoforms. The two C-termini differ significantly in size and sequence and arise from alternate use of a facultative intron to produce hMI-ER1alpha and hMI-ER1beta. In all tissues except testis, transcripts encoding the beta isoform were predominant. hMI-ER1alpha lacks the predicted nuclear localization signal and transfection assays revealed that, unlike hMI-ER1beta, it is not a nuclear protein, but remains in the cytoplasm. Our results demonstrate that alternate use of a facultative intron regulates the subcellular localization of hMI-ER1 proteins and this may have important implications for hMI-ER1 function.
The PBII gene of the human salivary proline-rich protein P-B produces another protein, Q504X8, with an opiorphin homolog, QRGPR.

PubMed

Saitoh, Eiichi; Sega, Takuya; Imai, Akane; Isemura, Satoko; Kato, Tetsuo; Ochiai, Akihito; Taniguchi, Masayuki

2018-04-01

The NCBI gene database and human-transcriptome database for alternative splicing were used to determine the expression of mRNAs for P-B (SMR3B) and a variant form of P-B. The translational product from the former mRNA was identified as the protein named P-B, whereas that from the latter has not yet been elucidated. In the present study, we investigated the expression of P-B and its variant form at the protein level. To identify the variant protein of P-B, (1) cationic proteins with a higher isoelectric point in human pooled whole saliva were purified by two-dimensional liquid chromatography; (2) the peptide fragments generated by in-solution tryptic digestion of all proteins were separated and analyzed by MALDI-TOF-MS; and (3) the presence or absence of P-B in individual saliva was examined by 15% SDS-PAGE. The peptide sequences (I 37 PPPYSCTPNMNNCSR 52, C 53 HHHHKRHHYPCNYCFCYPK 72, R 59 HHYPCNYCFCYPK 72 and H 60 HYPCNYCFCYPK 72) present in the variant protein of P-B were identified. The peptide sequence (G 6 PYPPGPLAPPQPFGPGFVPPPPPPPYGPGR 36) in P-B (or the variant) and sequence (I 37 PPPPPAPYGPGIFPPPPPQP 57) in P-B were identified. The sum of the sequences identified indicated a 91.23% sequence identity for P-B and 79.76% for the variant. P-B was present in the saliva of some individuals but absent from that of others. The variant protein is produced by excising a non-canonical intron (CC-AC pair) from the 3'-noncoding sequence of the PBII gene. Both P-B and the variant are subject to proteolysis in the oral cavity. Copyright © 2018 Elsevier Ltd. All rights reserved.
Study characterizes long non-coding RNA’s response to DNA damage in colon cancer cells | Center for Cancer Research

Cancer.gov

Researchers led by Ashish Lal, Ph.D., Investigator in the Genetics Branch, have shown that when the DNA in human colon cancer cells is damaged, a long non-coding RNA (lncRNA) regulates the expression of genes that halt growth, which allows the cells to repair the damage and promote survival. Their findings suggest an important pro-survival function of a lncRNA in cancer cells.
An expanding universe of the non-coding genome in cancer biology.

PubMed

Xue, Bin; He, Lin

2014-06-01

Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of the cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genomes consist of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Functional annotation of the vlinc class of non-coding RNAs using systems biology approach.

PubMed

St Laurent, Georges; Vyatkin, Yuri; Antonets, Denis; Ri, Maxim; Qi, Yao; Saik, Olga; Shtokalo, Dmitry; de Hoon, Michiel J L; Kawaji, Hideya; Itoh, Masayoshi; Lassmann, Timo; Arner, Erik; Forrest, Alistair R R; Nicolas, Estelle; McCaffrey, Timothy A; Carninci, Piero; Hayashizaki, Yoshihide; Wahlestedt, Claes; Kapranov, Philipp

2016-04-20

Determining the functionality of the non-coding transcripts encoded by the human genome is a coveted goal of modern genomics research. While this goal is commonly pursued with the classical methods of forward genetics, integration of different genomics datasets in a global Systems Biology fashion presents a more productive avenue for achieving this very complex aim. Here we report application of a Systems Biology-based approach to dissect the functionality of a newly identified vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlinc RNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlinc RNA-gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlinc RNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions that also occur at early stages of development, such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Functional importance of cardiac enhancer-associated noncoding RNAs in heart development and disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ounzain, Samir; Pezzuto, Iole; Micheletti, Rudi

We report here that the key information processing units within gene regulatory networks are enhancers. Enhancer activity is associated with the production of tissue-specific noncoding RNAs, yet the existence of such transcripts during cardiac development has not been established. Using an integrated genomic approach, we demonstrate that fetal cardiac enhancers generate long noncoding RNAs (lncRNAs) during cardiac differentiation and morphogenesis. Enhancer expression correlates with the emergence of active enhancer chromatin states, the initiation of RNA polymerase II at enhancer loci and expression of target genes. Orthologous human sequences are also transcribed in fetal human hearts and cardiac progenitor cells. Through a systematic bioinformatic analysis, we identified and characterized, for the first time, a catalog of lncRNAs that are expressed during embryonic stem cell differentiation into cardiomyocytes and associated with active cardiac enhancer sequences. RNA-sequencing demonstrates that many of these transcripts are polyadenylated, multi-exonic long noncoding RNAs. Moreover, knockdown of two enhancer-associated lncRNAs resulted in the specific downregulation of their predicted target genes. Interestingly, the reactivation of the fetal gene program, a hallmark of the stress response in the adult heart, is accompanied by increased expression of fetal cardiac enhancer transcripts. Altogether, these findings demonstrate that the activity of cardiac enhancers and expression of their target genes are associated with the production of enhancer-derived lncRNAs.
Functional importance of cardiac enhancer-associated noncoding RNAs in heart development and disease

DOE PAGES

Ounzain, Samir; Pezzuto, Iole; Micheletti, Rudi; ...

2014-08-19

We report here that the key information processing units within gene regulatory networks are enhancers. Enhancer activity is associated with the production of tissue-specific noncoding RNAs, yet the existence of such transcripts during cardiac development has not been established. Using an integrated genomic approach, we demonstrate that fetal cardiac enhancers generate long noncoding RNAs (lncRNAs) during cardiac differentiation and morphogenesis. Enhancer expression correlates with the emergence of active enhancer chromatin states, the initiation of RNA polymerase II at enhancer loci and expression of target genes. Orthologous human sequences are also transcribed in fetal human hearts and cardiac progenitor cells. Through a systematic bioinformatic analysis, we identified and characterized, for the first time, a catalog of lncRNAs that are expressed during embryonic stem cell differentiation into cardiomyocytes and associated with active cardiac enhancer sequences. RNA-sequencing demonstrates that many of these transcripts are polyadenylated, multi-exonic long noncoding RNAs. Moreover, knockdown of two enhancer-associated lncRNAs resulted in the specific downregulation of their predicted target genes. Interestingly, the reactivation of the fetal gene program, a hallmark of the stress response in the adult heart, is accompanied by increased expression of fetal cardiac enhancer transcripts. Altogether, these findings demonstrate that the activity of cardiac enhancers and expression of their target genes are associated with the production of enhancer-derived lncRNAs.
α satellite DNA variation and function of the human centromere

PubMed Central

Sullivan, Lori L.; Chew, Kimberline

2017-01-01

ABSTRACT Genomic variation is a source of functional diversity that is typically studied in genic and non-coding regulatory regions. However, the extent of variation within noncoding portions of the human genome, particularly highly repetitive regions, and the functional consequences are not well understood. Satellite DNA, including α satellite DNA found at human centromeres, comprises up to 10% of the genome, but is difficult to study because its repetitive nature hinders contiguous sequence assemblies. We recently described variation within α satellite DNA that affects centromere function. On human chromosome 17 (HSA17), we showed that size and sequence polymorphisms within primary array D17Z1 are associated with chromosome aneuploidy and defective centromere architecture. However, HSA17 can counteract this instability by assembling the centromere at a second, “backup” array lacking variation. Here, we discuss our findings in a broader context of human centromere assembly, and highlight areas of future study to uncover links between genomic and epigenetic features of human centromeres. PMID:28406740

Long noncoding RNAs as enhancers of gene expression.

PubMed

Ørom, U A; Derrien, T; Guigo, R; Shiekhattar, R

2010-01-01

The human genome contains thousands of long noncoding RNAs (ncRNAs) transcribed from diverse genomic locations. A large set of long ncRNAs is transcribed independent of protein-coding genes. We have used the GENCODE annotation of the human genome to identify 3019 long ncRNAs expressed in various human cell lines and tissue. This set of long ncRNAs responds to differentiation signals in primary human keratinocytes and is coexpressed with important regulators of keratinocyte development. Depletion of a number of these long ncRNAs leads to the repression of specific genes in their surrounding locus, supportive of an activating function for ncRNAs. Using reporter assays, we confirmed such activating function and show that such transcriptional enhancement is mediated through the long ncRNA transcripts. Our studies show that long ncRNAs exhibit functions similar to classically defined enhancers, through an RNA-dependent mechanism.
Combinatorial control of Drosophila circular RNA expression by intronic repeats, hnRNPs, and SR proteins.

PubMed

Kramer, Marianne C; Liang, Dongming; Tatomer, Deirdre C; Gold, Beth; March, Zachary M; Cherry, Sara; Wilusz, Jeremy E

2015-10-15

Thousands of eukaryotic protein-coding genes are noncanonically spliced to produce circular RNAs. Bioinformatics has indicated that long introns generally flank exons that circularize in Drosophila, but the underlying mechanisms by which these circular RNAs are generated are largely unknown. Here, using extensive mutagenesis of expression plasmids and RNAi screening, we reveal that circularization of the Drosophila laccase2 gene is regulated by both intronic repeats and trans-acting splicing factors. Analogous to what has been observed in humans and mice, base-pairing between highly complementary transposable elements facilitates backsplicing. Long flanking repeats (∼ 400 nucleotides [nt]) promote circularization cotranscriptionally, whereas pre-mRNAs containing minimal repeats (<40 nt) generate circular RNAs predominately after 3' end processing. Unlike the previously characterized Muscleblind (Mbl) circular RNA, which requires the Mbl protein for its biogenesis, we found that Laccase2 circular RNA levels are not controlled by Mbl or the Laccase2 gene product but rather by multiple hnRNP (heterogeneous nuclear ribonucleoprotein) and SR (serine-arginine) proteins acting in a combinatorial manner. hnRNP and SR proteins also regulate the expression of other Drosophila circular RNAs, including Plexin A (PlexA), suggesting a common strategy for regulating backsplicing. Furthermore, the laccase2 flanking introns support efficient circularization of diverse exons in Drosophila and human cells, providing a new tool for exploring the functional consequences of circular RNA expression across eukaryotes. © 2015 Kramer et al.; Published by Cold Spring Harbor Laboratory Press.
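The base-pairing between complementary flanking repeats described in this record can be illustrated with a small Python sketch. It is only an illustration, not the authors' analysis: the function names are hypothetical, sequences are written in the DNA alphabet, and the comparison is ungapped and requires equal-length repeats, which real intronic repeats need not satisfy.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def paired_fraction(upstream_repeat: str, downstream_repeat: str) -> float:
    """Fraction of positions at which two equal-length repeats could base-pair.

    The upstream repeat is compared, position by position, with the reverse
    complement of the downstream repeat (hypothetical, ungapped comparison).
    """
    if len(upstream_repeat) != len(downstream_repeat):
        raise ValueError("this sketch only handles equal-length repeats")
    if not upstream_repeat:
        raise ValueError("repeats must be non-empty")
    target = reverse_complement(downstream_repeat)
    matches = sum(a == b for a, b in zip(upstream_repeat.upper(), target))
    return matches / len(upstream_repeat)
```

A value near 1.0 means the two repeats are almost perfectly complementary in inverted orientation, which is the arrangement the abstract associates with backsplicing.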
Structure of the human gene encoding the protein repair L-isoaspartyl (D-aspartyl) O-methyltransferase.

PubMed

DeVry, C G; Tsai, W; Clarke, S

1996-11-15

The protein L-isoaspartyl/D-aspartyl O-methyltransferase (EC 2.1.1.77) catalyzes the first step in the repair of proteins damaged in the aging process by isomerization or racemization reactions at aspartyl and asparaginyl residues. A single gene has been localized to human chromosome 6 and multiple transcripts arising through alternative splicing have been identified. Restriction enzyme mapping, subcloning, and DNA sequence analysis of three overlapping clones from a human genomic library in bacteriophage P1 indicate that the gene spans approximately 60 kb and is composed of 8 exons interrupted by 7 introns. Analysis of intron/exon splice junctions reveals that all of the donor and acceptor splice sites are in agreement with the mammalian consensus splicing sequence. Determination of transcription initiation sites by primer extension analysis of poly(A)+ mRNA from human brain identifies multiple start sites, with a major site 159 nucleotides upstream from the ATG start codon. Sequence analysis of the 5'-untranslated region demonstrates several potential cis-acting DNA elements including SP1, ETF, AP1, AP2, ARE, XRE, CREB, MED-1, and half-palindromic ERE motifs. The promoter of this methyltransferase gene lacks an identifiable TATA box but is characterized by a CpG island which begins approximately 723 nucleotides upstream of the major transcriptional start site and extends through exon 1 and into the first intron. These features are characteristic of housekeeping genes and are consistent with the wide tissue distribution observed for this methyltransferase activity.
CRISPR/Cas9-mediated noncoding RNA editing in human cancers.

PubMed

Yang, Jie; Meng, Xiaodan; Pan, Jinchang; Jiang, Nan; Zhou, Chengwei; Wu, Zhenhua; Gong, Zhaohui

2018-01-02

Cancer is characterized by multiple genetic and epigenetic alterations, including a higher prevalence of mutations of oncogenes and/or tumor suppressors. Mounting evidence has shown that noncoding RNAs (ncRNAs) are involved in the epigenetic regulation of cancer genes and their associated pathways. The clustered regularly interspaced short palindromic repeats (CRISPR)-associated nuclease 9 (CRISPR/Cas9) system, a revolutionary genome-editing technology, has shed light on ncRNA-based cancer therapy. Here, we briefly introduce the classifications and mechanisms of the CRISPR/Cas9 system. We mainly focus on the applications of the CRISPR/Cas9 system as a molecular tool for ncRNA (microRNA, long noncoding RNA and circular RNA, etc.) editing in human cancers, and on the novel techniques that are based on the CRISPR/Cas9 system. Additionally, the off-target effects and the corresponding solutions, as well as the challenges facing CRISPR/Cas9, are evaluated and discussed. Long and short ncRNAs have been employed as targets in precision oncology, and CRISPR/Cas9-mediated ncRNA editing may provide an excellent way to cure cancer.
Expression of the cervical carcinoma expressed PCNA regulatory (CCEPR) long noncoding RNA is driven by the human papillomavirus E6 protein and modulates cell proliferation independent of PCNA.

PubMed

Sharma, Surendra; Munger, Karl

2018-05-01

Modulation of expression of noncoding RNAs is an important aspect of the oncogenic activities of high-risk human papillomavirus (HPV) E6 and E7 proteins. While HPV E6/E7-mediated alterations of microRNAs (miRNAs) have been studied in detail, there are fewer reports on HPV-mediated dysregulation of long noncoding RNAs (lncRNAs). The cervical carcinoma expressed PCNA regulatory (CCEPR) lncRNA is highly expressed in cervical cancers, and its expression correlates with tumor size and patient outcome. We report that CCEPR is a nuclear lncRNA and that HPV16 E6 oncogene expression causes increased CCEPR expression through a mechanism that is not directly dependent on TP53 inactivation. CCEPR depletion in cervical carcinoma cell lines reduces viability, while overexpression enhances viability. In contrast to what was published and inspired its designation, there is no evidence for PCNA mRNA stabilization, and hence CCEPR likely functions through a different mechanism. Copyright © 2018 Elsevier Inc. All rights reserved.
Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.

PubMed

Hezroni, Hadas; Koppstein, David; Schwartz, Matthew G; Avrutin, Alexandra; Bartel, David P; Ulitsky, Igor

2015-05-19

The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5'-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

PubMed Central

2013-01-01

Background Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768
[Long non-coding RNAs in the pathophysiology of atherosclerosis].

PubMed

Novak, Jan; Vašků, Julie Bienertová; Souček, Miroslav

2018-01-01

The human genome contains about 22 000 protein-coding genes that are transcribed to an even larger number of messenger RNAs (mRNA). Interestingly, the results of the ENCODE project from 2012 show that, despite up to 90 % of our genome being actively transcribed, protein-coding mRNAs make up only 2-3 % of the total amount of transcribed RNA. The rest of the RNA transcripts are not translated to proteins, which is why they are referred to as "non-coding RNAs". Earlier, non-coding RNA was considered "the dark matter of the genome", or "junk" whose genes had accumulated in our DNA during the course of evolution. Today we know that non-coding RNAs fulfil a variety of regulatory functions in our body - they intervene in epigenetic processes from chromatin remodelling to histone methylation, in the transcription process itself, and even in post-transcriptional processes. Long non-coding RNAs (lncRNA) are one of the classes of non-coding RNAs and are more than 200 nucleotides in length (non-coding RNAs less than 200 nucleotides in length are called small non-coding RNAs). lncRNAs represent a large and widely varied group of molecules with diverse regulatory functions. They can be identified in virtually all cell types and tissues, and even in the extracellular space, which includes blood, specifically plasma. Their levels change during the course of organogenesis, they are specific to different tissues, and their levels also change with the development of different illnesses, including atherosclerosis. This review article aims to present the topic of lncRNAs in general and then focuses on some specific representatives in relation to the process of atherosclerosis (i.e. we describe lncRNA involvement in the biology of endothelial cells, vascular smooth muscle cells or immune cells), and we further describe the possible clinical potential of lncRNA, whether in diagnostics or therapy of atherosclerosis and its clinical manifestations. Key words: atherosclerosis - lincRNA - lncRNA - MALAT - MIAT.
Present Scenario of Long Non-Coding RNAs in Plants

PubMed Central

Bhatia, Garima; Goyal, Neetu; Sharma, Shailesh; Upadhyay, Santosh Kumar; Singh, Kashmir

2017-01-01

Small non-coding RNAs have been extensively studied in plants over the last decade. In contrast, genome-wide identification of plant long non-coding RNAs (lncRNAs) has only recently gained momentum. LncRNAs are now being recognized as important players in gene regulation, and their potent regulatory roles are being studied comprehensively in eukaryotes. LncRNAs were first reported in humans in 1992. Since then, research in animals, particularly in humans, has rapidly progressed, and a vast amount of data has been generated, collected, and organized using computational approaches. Additionally, numerous studies have been conducted to understand the roles of these long RNA species in several diseases. However, the status of lncRNA investigation in plants lags behind that in animals (especially humans). Efforts are being made in this direction using computational tools and high-throughput technologies, such as the lncRNA microarray technique, RNA-sequencing (RNA-seq), RNA capture sequencing (RNA CaptureSeq), etc. As a result, significant amounts of data have been produced regarding plant lncRNAs, and this amount is likely to increase in the coming years. In this review we provide brief information about lncRNAs and the status of their research in plants, along with the plant-specific resources/databases for information retrieval on lncRNAs. PMID:29657289
Factor IX gene haplotypes in Amerindians.

PubMed

Franco, R F; Araújo, A G; Zago, M A; Guerreiro, J F; Figueiredo, M S

1997-02-01

We have determined the haplotypes of the factor IX gene for 95 Indians from 5 Brazilian Amazon tribes: Wayampí, Wayana-Apalaí, Kayapó, Arára, and Yanomámi. Eight polymorphisms linked to the factor IX gene were investigated: MseI (at 5', nt -698), BamHI (at 5', nt -561), DdeI (intron 1), BamHI (intron 2), XmnI (intron 3), TaqI (intron 4), MspI (intron 4), and HhaI (at 3', approximately 8 kb). The results of the haplotype distribution and the allele frequencies for each of the factor IX gene polymorphisms in Amerindians were similar to the results reported for Asian populations but differed from results for other ethnic groups. Only five haplotypes were identified within the entire Amerindian study population, and the haplotype distribution was significantly different among the five tribes, with one (Arára) to four (Wayampí) haplotypes being found per tribe. These findings indicate a significant heterogeneity among the Indian tribes and contrast with the homogeneous distribution of the beta-globin gene cluster haplotypes but agree with our recent findings on the distribution of alpha-globin gene cluster haplotypes and the allele frequencies for six VNTRs in the same Amerindian tribes. Our data represent the first study of factor IX-associated polymorphisms in Amerindian populations and emphasize the applicability of these genetic markers for population and human evolution studies.
Group I introns are widespread in archaea.

PubMed

Nawrocki, Eric P; Jones, Thomas A; Eddy, Sean R

2018-05-18

Group I catalytic introns have been found in bacterial, viral, organellar, and some eukaryotic genomes, but not in archaea. All known archaeal introns are bulge-helix-bulge (BHB) introns, with the exception of a few group II introns. It has been proposed that BHB introns arose from extinct group I intron ancestors, much like eukaryotic spliceosomal introns are thought to have descended from group II introns. However, group I introns have little sequence conservation, making them difficult to detect with standard sequence similarity searches. Taking advantage of recent improvements in a computational homology search method that accounts for both conserved sequence and RNA secondary structure, we have identified 39 group I introns in a wide range of archaeal phyla, including examples of group I introns and BHB introns in the same host gene.
The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate?

PubMed Central

Koonin, Eugene V

2006-01-01

Background Ever since the discovery of 'genes in pieces' and mRNA splicing in eukaryotes, origin and evolution of spliceosomal introns have been considered within the conceptual framework of the 'introns early' versus 'introns late' debate. The 'introns early' hypothesis, which is closely linked to the so-called exon theory of gene evolution, posits that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. Under this scenario, the absence of spliceosomal introns in prokaryotes is considered to be a result of "genome streamlining". The 'introns late' hypothesis counters that spliceosomal introns emerged only in eukaryotes, and moreover, have been inserted into protein-coding genes continuously throughout the evolution of eukaryotes. Beyond the formal dilemma, the more substantial side of this debate has to do with possible roles of introns in the evolution of eukaryotes. Results I argue that several lines of evidence now suggest a coherent solution to the introns-early versus introns-late debate, and the emerging picture of intron evolution integrates aspects of both views although, formally, there seems to be no support for the original version of introns-early. Firstly, there is growing evidence that spliceosomal introns evolved from group II self-splicing introns which are present, usually, in small numbers, in many bacteria, and probably, moved into the evolving eukaryotic genome from the α-proteobacterial progenitor of the mitochondria. Secondly, the concept of a primordial pool of 'virus-like' genetic elements implies that self-splicing introns are among the most ancient genetic entities. Thirdly, reconstructions of the ancestral state of eukaryotic genes suggest that the last common ancestor of extant eukaryotes had an intron-rich genome. 
Thus, it appears that ancestors of spliceosomal introns, indeed, have existed since the earliest stages of life's evolution, in a formal agreement with the introns-early scenario. However, there is no evidence that these ancient introns ever became widespread before the emergence of eukaryotes, hence, the central tenet of introns-early, the role of introns in early evolution of proteins, has no support. However, the demonstration that numerous introns invaded eukaryotic genes at the outset of eukaryotic evolution and that subsequent intron gain has been limited in many eukaryotic lineages implicates introns as an ancestral feature of eukaryotic genomes and refutes radical versions of introns-late. Perhaps, most importantly, I argue that the intron invasion triggered other pivotal events of eukaryogenesis, including the emergence of the spliceosome, the nucleus, the linear chromosomes, the telomerase, and the ubiquitin signaling system. This concept of eukaryogenesis, in a sense, revives some tenets of the exon hypothesis, by assigning to introns crucial roles in eukaryotic evolutionary innovation. Conclusion The scenario of the origin and evolution of introns that is best compatible with the results of comparative genomics and theoretical considerations goes as follows: self-splicing introns since the earliest stages of life's evolution – numerous spliceosomal introns invading genes of the emerging eukaryote during eukaryogenesis – subsequent lineage-specific loss and gain of introns. The intron invasion, probably, spawned by the mitochondrial endosymbiont, might have critically contributed to the emergence of the principal features of the eukaryotic cell. This scenario combines aspects of the introns-early and introns-late views. Reviewers This article was reviewed by W. Ford Doolittle, James Darnell (nominated by W. Ford Doolittle), William Martin, and Anthony Poole. PMID:16907971
Horizontal transfer and gene conversion as an important driving force in shaping the landscape of mitochondrial introns.

PubMed

Wu, Baojun; Hao, Weilong

2014-04-16

Group I introns are highly dynamic and mobile, featuring extensive presence-absence variation and widespread horizontal transfer. Group I introns can invade intron-lacking alleles via intron homing powered by their own encoded homing endonuclease gene (HEG) after horizontal transfer or via reverse splicing through an RNA intermediate. After successful invasion, the intron and HEG are subject to degeneration and sequential loss. It remains unclear whether these mechanisms can fully address the high dynamics and mobility of group I introns. Here, we found that HEGs undergo a fast gain-and-loss turnover comparable with introns in the yeast mitochondrial 21S-rRNA gene, which is unexpected, as the intron and HEG are generally believed to move together as a unit. We further observed extensively mosaic sequences in both the introns and HEGs, and evidence of gene conversion between HEG-containing and HEG-lacking introns. Our findings suggest horizontal transfer and gene conversion can accelerate HEG/intron degeneration and loss, or rescue and propagate HEG/introns, and ultimately result in high HEG/intron turnover rate. Given that up to 25% of the yeast mitochondrial genome is composed of introns and most mitochondrial introns are group I introns, horizontal transfer and gene conversion could have served as an important mechanism in introducing mitochondrial intron diversity, promoting intron mobility and consequently shaping mitochondrial genome architecture.
ATP-binding cassette subfamily A, member 4 intronic variants c.4773+3A>G and c.5461-10T>C cause Stargardt disease due to defective splicing.

PubMed

Jonsson, Frida; Westin, Ida Maria; Österman, Lennart; Sandgren, Ola; Burstedt, Marie; Holmberg, Monica; Golovleva, Irina

2018-02-20

Inherited retinal dystrophies (IRDs) represent a group of progressive conditions affecting the retina. IRDs are genetically highly heterogeneous, and to date more than 260 genes are associated with them. Stargardt disease type 1 (STGD1), or macular degeneration with flecks, is a disease with early onset, central visual impairment, frequent appearance of yellowish flecks and mutations in the ATP-binding cassette subfamily A, member 4 (ABCA4) gene. A large number of intronic sequence variants in ABCA4 have been considered pathogenic although their functional effect was seldom demonstrated. In this study, we aimed to reveal how intronic variants present in patients with Stargardt disease from the same Swedish family affect splicing. The splicing of the ABCA4 gene was studied in human embryonic kidney cells, HEK293T, and in human retinal pigment epithelium cells, ARPE-19, using a minigene system containing variants c.4773+3A>G and c.5461-10T>C. We showed that both ABCA4 variants, c.4773+3A>G and c.5461-10T>C, cause aberrant splicing of the ABCA4 minigene resulting in exon skipping. We also demonstrated that splicing of ABCA4 has different outcomes depending on the transfected cell type. The two intronic variants c.4773+3A>G and c.5461-10T>C, both predicted to affect splicing, are indeed disease-causing mutations due to skipping of exons 33, 34, 39 and 40 of the ABCA4 gene. The experimental proof that ABCA4 mutations in STGD patients affect protein function is crucial for their inclusion in future clinical trials; therefore, functional testing of all ABCA4 intronic variants associated with Stargardt disease by minigene technology is desirable. © 2018 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
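The variant names in this record follow HGVS coding-DNA notation, in which an intronic position is written as the nearest exonic nucleotide plus or minus an offset into the intron. The Python sketch below shows how such names can be read. It is a minimal illustration that handles only simple intronic substitutions of the form seen here; the class and function names are hypothetical and it is not a full HGVS parser.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IntronicSubstitution:
    anchor: int   # nearest exonic coding position (e.g. 4773)
    offset: int   # signed distance into the intron (+3 or -10)
    ref: str      # reference base
    alt: str      # variant base

    @property
    def splice_side(self) -> str:
        # "+" offsets count downstream from the last base of an exon (donor side);
        # "-" offsets count upstream from the first base of an exon (acceptor side).
        return "donor" if self.offset > 0 else "acceptor"


# Only simple intronic substitutions such as "c.4773+3A>G" are recognised.
_PATTERN = re.compile(r"c\.(\d+)([+-])([1-9]\d*)([ACGT])>([ACGT])")


def parse_intronic_substitution(name: str) -> IntronicSubstitution:
    match = _PATTERN.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {name!r}")
    anchor, sign, distance, ref, alt = match.groups()
    offset = int(distance) if sign == "+" else -int(distance)
    return IntronicSubstitution(int(anchor), offset, ref, alt)
```

Read this way, c.4773+3A>G lies three bases into the intron on the donor side of the exon ending at c.4773, and c.5461-10T>C lies ten bases upstream of the exon beginning at c.5461, on the acceptor side.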
Characterizing the strand-specific distribution of non-CpG methylation in human pluripotent cells.

PubMed

Guo, Weilong; Chung, Wen-Yu; Qian, Minping; Pellegrini, Matteo; Zhang, Michael Q

2014-03-01

DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.
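The CpG, CHG and CHH contexts mentioned in this record are defined by the two bases following a cytosine on the same strand (H stands for A, C or T), so a cytosine on the reverse strand is classified from the reverse complement. The Python sketch below illustrates only that classification and a per-strand tally; it is not the authors' divide-and-compare strategy, and the function name `classify_cytosine_contexts` is an assumption made for this example.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_cytosine_contexts(seq):
    """Return (forward_position, strand, context) for each cytosine with a full 3-nt context.

    Context is 'CG', 'CHG' or 'CHH'. Cytosines whose context contains a
    non-ACGT symbol, or that lie within two bases of the 3' end of their
    strand, are skipped.
    """
    seq = seq.upper()
    n = len(seq)
    reverse = seq.translate(_COMPLEMENT)[::-1]
    calls = []
    for strand, s in (("+", seq), ("-", reverse)):
        for i in range(n - 2):
            tri = s[i:i + 3]
            if tri[0] != "C" or set(tri) - set("ACGT"):
                continue
            if tri[1] == "G":
                context = "CG"
            elif tri[2] == "G":
                context = "CHG"
            else:
                context = "CHH"
            # Report coordinates on the forward strand for both strands.
            position = i if strand == "+" else n - 1 - i
            calls.append((position, strand, context))
    return calls


calls = classify_cytosine_contexts("CAGCTTCGA")
per_strand = Counter((strand, context) for _, strand, context in calls)
print(calls)
print(per_strand)
```

Comparing the per-strand counts of CHG and CHH sites within a region is one simple way to see the kind of sequence skew the abstract refers to.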
Sequencing of mitochondrial genomes of nine Aspergillus and Penicillium species identifies mobile introns and accessory genes as main sources of genome size variability.

PubMed

Joardar, Vinita; Abrams, Natalie F; Hostetler, Jessica; Paukstelis, Paul J; Pakala, Suchitra; Pakala, Suman B; Zafar, Nikhat; Abolude, Olukemi O; Payne, Gary; Andrianopoulos, Alex; Denning, David W; Nierman, William C

2012-12-12

The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated. Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A. fumigatus, A. clavatus, A. oryzae, A. flavus, Neosartorya fischeri (A. fischerianus), A. terreus, P. chrysogenum, P. marneffei, and Talaromyces stipitatus (P. stipitatum). The accompanying comparative analysis of these and related publicly available mitochondrial genomes reveals wide variation in size (25-36 Kb) among these closely related fungi. The sources of genome expansion include group I introns and accessory genes encoding putative homing endonucleases, DNA and RNA polymerases (presumed to be of plasmid origin) and hypothetical proteins. The two smallest sequenced genomes (A. terreus and P. chrysogenum) do not contain introns in protein-coding genes, whereas the largest genome (T. stipitatus), contains a total of eleven introns. All of the sequenced genomes have a group I intron in the large ribosomal subunit RNA gene, suggesting that this intron is fixed in these species. Subsequent analysis of several A. fumigatus strains showed low intraspecies variation. This study also includes a phylogenetic analysis based on 14 concatenated core mitochondrial proteins. The phylogenetic tree has a different topology from published multilocus trees, highlighting the challenges still facing the Aspergillus systematics. 
The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations for future genetic, evolutionary and population studies. Despite the conservation of the core genes, the mitochondrial genomes of Aspergillus and Penicillium species examined here exhibit a significant amount of interspecies variation. Most of this variation can be attributed to accessory genes and mobile introns, presumably acquired by horizontal gene transfer of mitochondrial plasmids and intron homing.
Functional and comparative genomics analyses of pmp22 in medaka fish

PubMed Central

Itou, Junji; Suyama, Mikita; Imamura, Yukio; Deguchi, Tomonori; Fujimori, Kazuhiro; Yuba, Shunsuke; Kawarabayasi, Yutaka; Kawasaki, Takashi

2009-01-01

Background Pmp22, a member of the junction protein family Claudin/EMP/PMP22, plays an important role in myelin formation. Increase of pmp22 transcription causes peripheral neuropathy, Charcot-Marie-Tooth disease type 1A (CMT1A). The pathophysiological phenotype of CMT1A is aberrant axonal myelination which induces a reduction in nerve conduction velocity (NCV). Several CMT1A model rodents have been established by overexpressing pmp22. Thus, it is thought that pmp22 expression must be tightly regulated for correct myelin formation in mammals. Interestingly, the myelin sheath is also present in other jawed vertebrates. The purpose of this study is to analyze the evolutionary conservation of the association between pmp22 transcription level and vertebrate myelin formation, and to find the conserved non-coding sequences for pmp22 regulation by comparative genomics analyses between jawed fishes and mammals. Results A transgenic pmp22 over-expression medaka fish line was established. The transgenic fish had approximately one fifth the peripheral NCV values of controls, and aberrant myelination in the peripheral nervous system (PNS) of the transgenic fish was observed. We successfully confirmed that medaka fish pmp22 has the same exon-intron structure as mammals, and identified some known conserved regulatory motifs. Furthermore, we found novel conserved sequences in the first intron and 3'UTR. Conclusion Medaka fish undergo abnormalities in the PNS when pmp22 transcription increases. This result indicates that an adequate pmp22 transcription level is necessary for correct myelination of jawed vertebrates. Comparison of pmp22 orthologs between distantly related species identifies evolutionary conserved sequences that contribute to precise regulation of pmp22 expression. PMID:19534778
LncRNApred: Classification of Long Non-Coding RNAs and Protein-Coding Transcripts by the Ensemble Algorithm with a New Hybrid Feature.

PubMed

Pian, Cong; Zhang, Guangle; Chen, Zhi; Chen, Yuanyuan; Zhang, Jin; Yang, Tao; Zhang, Liangyun

2016-01-01

As a novel class of noncoding RNAs, long noncoding RNAs (lncRNAs) have been verified to be associated with various diseases. As large numbers of transcripts are generated every year, it is important to accurately and quickly identify lncRNAs among thousands of assembled transcripts. To accurately discover new lncRNAs, we developed a random forest (RF) classification tool named LncRNApred based on a new hybrid feature. This hybrid feature set includes three newly proposed features, which are MaxORF, RMaxORF and SNR. LncRNApred is effective for classifying lncRNAs and protein coding transcripts accurately and quickly. Moreover, our RF model requires training only on human coding and non-coding transcripts, yet transcripts from other species can also be classified with LncRNApred. The result shows that our method is more effective compared with the Coding Potential Calculator (CPC). The web server of LncRNApred is available for free at http://mm20132014.wicp.net:57203/LncRNApred/home.jsp.
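The abstract names MaxORF as one feature but does not define it. Assuming it refers to the length of the longest open reading frame in a transcript (an assumption, not a statement of the LncRNApred implementation), an ORF-length feature of that kind can be sketched in Python as follows. The function name `longest_orf_length` is hypothetical, only the three forward frames are scanned, and an ORF is counted only if it runs from an ATG to an in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(transcript):
    """Length in nucleotides (start codon through stop codon) of the longest forward-strand ORF.

    Illustrative feature only; ORFs lacking an in-frame stop codon are ignored.
    """
    seq = transcript.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                     # open an ORF at the first ATG seen
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None                  # look for the next ORF in this frame
    return best


print(longest_orf_length("CCATGAAATAGGG"))
```

A feature like this tends to separate the two classes because protein-coding transcripts usually carry one long ORF, whereas lncRNAs usually contain only short ones; the actual classifier combines it with the other features listed in the abstract.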
Long non-coding RNA CASC2 regulates cell biological behaviour through the MAPK signalling pathway in hepatocellular carcinoma.

PubMed

Gan, Yuanyuan; Han, Nana; He, Xiaoqin; Yu, Jiajun; Zhang, Meixia; Zhou, Yujie; Liang, Huiling; Deng, Junjian; Zheng, Yongfa; Ge, Wei; Long, Zhixiong; Xu, Ximing

2017-06-01

Long non-coding RNAs have previously been demonstrated to play important roles in regulating human diseases, especially cancer. However, the biological functions and molecular mechanisms of long non-coding RNAs in hepatocellular carcinoma have not been extensively studied. The long non-coding RNA CASC2 (cancer susceptibility candidate 2) has been characterised as a tumour suppressor in endometrial cancer and gliomas. However, the role and function of CASC2 in hepatocellular carcinoma remain unknown. In this study, using quantitative real-time polymerase chain reaction, we confirmed that CASC2 expression was downregulated in 50 hepatocellular carcinoma cases (62%) and in hepatocellular carcinoma cell lines compared with the paired adjacent tissues and normal liver cells. In vitro experiments further demonstrated that overexpressed CASC2 decreased hepatocellular carcinoma cell proliferation, migration and invasion as well as promoted apoptosis via inactivating the mitogen-activated protein kinase signalling pathway. Our findings demonstrate that CASC2 could be a useful tumour suppressor factor and a promising therapeutic target for hepatocellular carcinoma.

Non-coding RNAs: new biomarkers and therapeutic targets for esophageal cancer

PubMed Central

Ren, Zhipeng; Zhang, Guoliang

2017-01-01

Esophageal cancer is one of the most common gastrointestinal malignant diseases and there is still no effective treatment. The incidence of esophageal cancer in the world is relatively high and is increasing year by year. Thus, elucidating the carcinogenesis of esophageal cancer and identifying new biomarkers and therapeutic targets would help to optimize the current therapeutic regimen for this deadly disease. More and more evidence has shown that non-coding RNAs play an important role in the development and progression of multiple human cancers, including esophageal cancer. microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) are two functional kinds of non-coding RNAs that have been well investigated. They exert tumor-suppressive or tumor-promoting effects by specifically regulating the expression of certain downstream target genes, in a tumor-specific manner. It has also been shown that miRNA and lncRNA levels in tissue and plasma from esophageal cancer patients are closely correlated with survival and disease progression, and could therefore be used as prognostic factors and therapeutic targets for esophageal cancer. PMID:28388588
Non-coding RNAs: new biomarkers and therapeutic targets for esophageal cancer.

PubMed

Hou, Xiaobin; Wen, Jiaxin; Ren, Zhipeng; Zhang, Guoliang

2017-06-27

Esophageal cancer is one of the most common gastrointestinal malignant diseases and there is still no effective treatment. The incidence of esophageal cancer in the world is relatively high and is increasing year by year. Thus, elucidating the carcinogenesis of esophageal cancer and identifying new biomarkers and therapeutic targets would help to optimize the current therapeutic regimen for this deadly disease. More and more evidence has shown that non-coding RNAs play an important role in the development and progression of multiple human cancers, including esophageal cancer. microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) are two functional kinds of non-coding RNAs that have been well investigated. They exert tumor-suppressive or tumor-promoting effects by specifically regulating the expression of certain downstream target genes, in a tumor-specific manner. It has also been shown that miRNA and lncRNA levels in tissue and plasma from esophageal cancer patients are closely correlated with survival and disease progression, and could therefore be used as prognostic factors and therapeutic targets for esophageal cancer.
Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term.

PubMed

Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard

2014-09-01

To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.
Gene encoding the human β-hexosaminidase β chain: Extensive homology of intron placement in the α- and β-chain genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Proia, R.L.

1988-03-01

Lysosomal β-hexosaminidase is composed of two structurally similar chains, α and β, that are the products of different genes. Mutations in either gene causing β-hexosaminidase deficiency result in the lysosomal storage disease GM2-gangliosidosis. To enable the investigation of the molecular lesions in this disorder and to study the evolutionary relationship between the α and β chains, the β-chain gene was isolated, and its organization was characterized. The β-chain coding region is divided into 14 exons distributed over ≈40 kilobases of DNA. Comparison with the α-chain gene revealed that 12 of the 13 introns interrupt the coding regions at homologous positions. This extensive sharing of intron placement demonstrates that the α and β chains evolved by way of the duplication of a common ancestor.
Identification and analysis of multigene families by comparison of exon fingerprints.

PubMed

Brown, N P; Whittaker, A J; Newell, W R; Rawlings, C J; Beck, S

1995-06-02

Gene families are often recognised by sequence homology using similarity searching to find relationships; however, genomic sequence data provide gene architectural information not used by conventional search methods. In particular, intron positions and phases are expected to be relatively conserved features, because mis-splicing and reading frame shifts should be selected against. A fast search technique capable of detecting possible weak sequence homologies apparent at the intron/exon level of gene organization is presented for comparing spliceosomal genes and gene fragments. FINEX compares strings of exons delimited by intron/exon boundary positions and intron phases (exon fingerprint) using a global dynamic programming algorithm with a combined intron phase identity and exon size dissimilarity score. Exon fingerprints are typically two orders of magnitude smaller than their nucleic acid sequence counterparts, giving rise to fast search times: a ranked search against a library of 6755 fingerprints for a typical three exon fingerprint completes in under 30 seconds on an ordinary workstation, while a worst case largest fingerprint of 52 exons completes in just over one minute. The short "sequence" length of exon fingerprints in comparisons is compensated for by the large exon alphabet compounded of intron phase types and a wide range of exon sizes, the latter contributing the most information to alignments. FINEX performs better in some searches than conventional methods, finding matches with similar exon organization, but low sequence homology. A search using a human serum albumin fingerprint finds all members of the multigene family in the FINEX database at the top of the search ranking, despite very low amino acid percentage identities between family members. The method should complement conventional sequence searching and alignment techniques, offering a means of identifying otherwise hard to detect homologies where genomic data are available.
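The comparison described in this record, a global dynamic programming alignment of exon fingerprints scored on intron phase identity and exon size dissimilarity, can be illustrated with a short Python sketch. The scoring used here is an assumption made for the example (a fixed penalty for a phase mismatch plus the absolute log-ratio of exon sizes, with a constant gap cost); the abstract does not give the actual FINEX scoring function, and the name `fingerprint_distance` is hypothetical.

```python
import math


def fingerprint_distance(a, b, gap=1.0, phase_penalty=1.0):
    """Minimum-cost global alignment of two exon fingerprints.

    Each fingerprint is a list of (exon_size_in_nt, intron_phase) tuples with
    positive sizes. The cost terms are illustrative assumptions, not FINEX's own.
    """
    def pair_cost(x, y):
        size_term = abs(math.log(x[0] / y[0]))            # exon size dissimilarity
        phase_term = 0.0 if x[1] == y[1] else phase_penalty
        return size_term + phase_term

    n, m = len(a), len(b)
    # table[i][j] = cheapest alignment of the first i exons of a with the first j of b
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = i * gap
    for j in range(1, m + 1):
        table[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = min(
                table[i - 1][j - 1] + pair_cost(a[i - 1], b[j - 1]),  # align two exons
                table[i - 1][j] + gap,                                # exon of a unmatched
                table[i][j - 1] + gap,                                # exon of b unmatched
            )
    return table[n][m]


query = [(120, 0), (90, 1), (200, 2)]
library = {"same": query, "truncated": query[:2], "shifted": [(118, 0), (95, 2), (210, 2)]}
ranking = sorted(library, key=lambda name: fingerprint_distance(query, library[name]))
print(ranking)
```

Because each fingerprint holds one entry per exon rather than one per nucleotide, the table stays small, which is consistent with the fast search times reported in the abstract.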
Genome-wide analysis of long non-coding RNAs at the mature stage of sea buckthorn (Hippophae rhamnoides Linn) fruit.

PubMed

Zhang, Guoyun; Duan, Aiguo; Zhang, Jianguo; He, Caiyun

2017-01-05

Long non-coding RNAs (lncRNAs), which are transcripts longer than 200 nt, potentially play important roles in almost all biological processes in plants and mammals. However, the functions and profiles of lncRNAs in fruit are less well understood. Therefore, it is urgent and necessary to identify and analyze the functions of lncRNAs in sea buckthorn. Using RNA-sequencing, we systematically identified lncRNAs in mature fruit from the red and yellow sea buckthorn. We obtained 567,778,938 clean reads from six samples and identified 3428 lncRNAs in mature fruit, including 2498 intergenic lncRNAs, 593 anti-sense lncRNAs, and 337 intronic lncRNAs. We also identified 3819 and 2295 circular RNAs in red and yellow sea buckthorn fruit, respectively. Our results showed significant differences in gene architecture and expression among the three lncRNA subtypes. We also investigated the effect of lncRNAs on their cis and trans target genes. Based on target gene analysis, we obtained 61 differentially expressed lncRNAs (DE-lncRNAs) between these two sea buckthorns, including 23 lncRNAs specifically expressed in red fruit and 22 specifically expressed in yellow fruit. Importantly, we found that a few DE-lncRNAs play cis and trans roles for genes in the carotenoid biosynthesis, ascorbate and aldarate metabolism, and fatty acid metabolism pathways. Our study provides a resource for lncRNA studies in mature fruit. It may encourage researchers to study fruit coloring in greater depth, and it expands our knowledge about lncRNA biology and the annotation of the sea buckthorn genome. Copyright © 2016 Elsevier B.V. All rights reserved.
New target genes of MITF-induced microRNA-211 contribute to melanoma cell invasion.

PubMed

Margue, Christiane; Philippidou, Demetra; Reinsbach, Susanne E; Schmitt, Martina; Behrmann, Iris; Kreis, Stephanie

2013-01-01

The non-coding microRNAs (miRNA) have tissue- and disease-specific expression patterns. They down-regulate target mRNAs, which likely impacts on most fundamental cellular processes. Differential expression patterns of miRNAs are currently being exploited for identification of biomarkers for early disease diagnosis, prediction of progression for melanoma and other cancers and as promising drug targets, since they can easily be inhibited or replaced in a given cellular context. Before successfully manipulating miRNAs in clinical settings, their precise expression levels, endogenous functions and thus their target genes have to be determined. MiR-211, a melanocyte lineage-specific small non-coding miRNA, is located in an intron of TRPM1, a target gene of the microphthalmia-associated transcription factor (MITF). By transcriptionally up-regulating TRPM1, MITF, which is critical for both melanocyte differentiation and survival and for melanoma progression, indirectly drives the expression of miR-211. Expression of this miRNA is often reduced in melanoma samples. Here, we investigated functional roles of miR-211 by identifying and studying new target genes. We show that MITF-correlated miR-211 expression levels are mostly but not always reduced in a panel of 11 melanoma cell lines and in primary and metastatic melanoma compared to normal melanocytes and nevi, respectively. MiR-211 itself only marginally impacted on cell invasion and migration, while perturbation of some new miR-211 target genes, such as AP1S2, SOX11, IGFBP5, and SERINC3 significantly increased invasion. These results and the variable expression levels of miR-211 raise serious doubts on the value of miR-211 as a melanoma tumor-suppressing miRNA and/or as a biomarker for melanoma.
Dynamic and Widespread lncRNA Expression in a Sponge and the Origin of Animal Complexity

PubMed Central

Gaiti, Federico; Fernandez-Valverde, Selene L.; Nakanishi, Nagayasu; Calcino, Andrew D.; Yanai, Itai; Tanurdzic, Milos; Degnan, Bernard M.

2015-01-01

Long noncoding RNAs (lncRNAs) are important developmental regulators in bilaterian animals. A correlation has been claimed between the lncRNA repertoire expansion and morphological complexity in vertebrate evolution. However, this claim has not been tested by examining morphologically simple animals. Here, we undertake a systematic investigation of lncRNAs in the demosponge Amphimedon queenslandica, a morphologically simple, early-branching metazoan. We combine RNA-Seq data across multiple developmental stages of Amphimedon with a filtering pipeline to conservatively predict 2,935 lncRNAs. These include intronic overlapping lncRNAs, exonic antisense overlapping lncRNAs, long intergenic nonprotein coding RNAs, and precursors for small RNAs. Sponge lncRNAs are remarkably similar to their bilaterian counterparts in being relatively short with few exons and having low primary sequence conservation relative to protein-coding genes. As in bilaterians, a majority of sponge lncRNAs exhibit typical hallmarks of regulatory molecules, including high temporal specificity and dynamic developmental expression. Specific lncRNA expression profiles correlate tightly with conserved protein-coding genes likely involved in a range of developmental and physiological processes, such as the Wnt signaling pathway. Although the majority of Amphimedon lncRNAs appears to be taxonomically restricted with no identifiable orthologs, we find a few cases of conservation between demosponges in lncRNAs that are antisense to coding sequences. Based on the high similarity in the structure, organization, and dynamic expression of sponge lncRNAs to their bilaterian counterparts, we propose that these noncoding RNAs are an ancient feature of the metazoan genome. These results are consistent with lncRNAs regulating the development of animals, regardless of their level of morphological complexity. PMID:25976353
Role of non-coding RNAs in maintaining primary airway smooth muscle cells

PubMed Central

2014-01-01

Background The airway smooth muscle (ASM) cell maintains its own proliferative rate and contributes to the inflammatory response in the airways, effects that are inhibited by corticosteroids, used in the treatment of airways diseases. Objective We determined the differential expression of mRNAs, microRNAs (miRNAs) and long noncoding RNA species (lncRNAs) in primary ASM cells following treatment with a corticosteroid, dexamethasone, and fetal calf serum (FCS). Methods mRNA, miRNA and lncRNA expression was measured by microarray and quantitative real-time PCR. Results A small number of miRNAs (including miR-150, −371-5p, −718, −940, −1181, −1207-5p, −1915, and −3663-3p) were decreased following exposure to dexamethasone and FCS. The mRNA targets of these miRNAs were increased in expression. The changes in mRNA expression were associated with regulation of ASM actin cytoskeleton. We also observed changes in expression of lncRNAs, including natural antisense, pseudogenes, intronic lncRNAs, and intergenic lncRNAs following dexamethasone and FCS. We confirmed the change in expression of three of these, LINC00882, LINC00883, PVT1, and its transcriptional activator, c-MYC. We propose that four of these lincRNAs (RP11-46A10.4, LINC00883, BCYRN1, and LINC00882) act as miRNA ‘sponges’ for 4 miRNAs (miR-150, −371-5p, −940, −1207-5p). Conclusion This in-vitro model of primary ASM cell phenotype was associated with the regulation of several ncRNAs. Their identification allows for in-vitro functional experimentation to establish causality with the primary ASM phenotype, and in airway diseases such as asthma and chronic obstructive pulmonary disease (COPD). PMID:24886442
Noncoding RNAs in DNA Repair and Genome Integrity

PubMed Central

Wan, Guohui; Liu, Yunhua; Han, Cecil; Zhang, Xinna

2014-01-01

Abstract Significance: The well-studied sequences in the human genome are those of protein-coding genes, which account for only 1%–2% of the total genome. However, with the advent of high-throughput transcriptome sequencing technology, we now know that about 90% of our genome is extensively transcribed and that the vast majority of them are transcribed into noncoding RNAs (ncRNAs). It is of great interest and importance to decipher the functions of these ncRNAs in humans. Recent Advances: In the last decade, it has become apparent that ncRNAs play a crucial role in regulating gene expression in normal development, in stress responses to internal and environmental stimuli, and in human diseases. Critical Issues: In addition to those constitutively expressed structural RNA, such as ribosomal and transfer RNAs, regulatory ncRNAs can be classified as microRNAs (miRNAs), Piwi-interacting RNAs (piRNAs), small interfering RNAs (siRNAs), small nucleolar RNAs (snoRNAs), and long noncoding RNAs (lncRNAs). However, little is known about the biological features and functional roles of these ncRNAs in DNA repair and genome instability, although a number of miRNAs and lncRNAs are regulated in the DNA damage response. Future Directions: A major goal of modern biology is to identify and characterize the full profile of ncRNAs with regard to normal physiological functions and roles in human disorders. Clinically relevant ncRNAs will also be evaluated and targeted in therapeutic applications. Antioxid. Redox Signal. 20, 655–677. PMID:23879367
Long non-coding RNAs in hepatocellular carcinoma: Potential roles and clinical implications

PubMed Central

Niu, Zhao-Shan; Niu, Xiao-Jun; Wang, Wen-Hong

2017-01-01

Long non-coding RNAs (lncRNAs) are a subgroup of non-coding RNA transcripts greater than 200 nucleotides in length with little or no protein-coding potential. Emerging evidence indicates that lncRNAs may play important regulatory roles in the pathogenesis and progression of human cancers, including hepatocellular carcinoma (HCC). Certain lncRNAs may be used as diagnostic or prognostic markers for HCC, a serious malignancy with increasing morbidity and high mortality rates worldwide. Therefore, elucidating the functional roles of lncRNAs in tumors can contribute to a better understanding of the molecular mechanisms of HCC and may help in developing novel therapeutic targets. In this review, we summarize the recent progress regarding the functional roles of lncRNAs in HCC and explore their clinical implications as diagnostic or prognostic biomarkers and molecular therapeutic targets for HCC. PMID:28932078
Activity-Dependent Human Brain Coding/Noncoding Gene Regulatory Networks

PubMed Central

Lipovich, Leonard; Dachet, Fabien; Cai, Juan; Bagla, Shruti; Balan, Karina; Jia, Hui; Loeb, Jeffrey A.

2012-01-01

While most gene transcription yields RNA transcripts that code for proteins, a sizable proportion of the genome generates RNA transcripts that do not code for proteins, but may have important regulatory functions. The brain-derived neurotrophic factor (BDNF) gene, a key regulator of neuronal activity, is overlapped by a primate-specific, antisense long noncoding RNA (lncRNA) called BDNFOS. We demonstrate reciprocal patterns of BDNF and BDNFOS transcription in highly active regions of human neocortex removed as a treatment for intractable seizures. A genome-wide analysis of activity-dependent coding and noncoding human transcription using a custom lncRNA microarray identified 1288 differentially expressed lncRNAs, of which 26 had expression profiles that matched activity-dependent coding genes and an additional 8 were adjacent to or overlapping with differentially expressed protein-coding genes. The functions of most of these protein-coding partner genes, such as ARC, include long-term potentiation, synaptic activity, and memory. The nuclear lncRNAs NEAT1, MALAT1, and RPPH1, composing an RNAse P-dependent lncRNA-maturation pathway, were also upregulated. As a means to replicate human neuronal activity, repeated depolarization of SY5Y cells resulted in sustained CREB activation and produced an inverse pattern of BDNF-BDNFOS co-expression that was not achieved with a single depolarization. RNAi-mediated knockdown of BDNFOS in human SY5Y cells increased BDNF expression, suggesting that BDNFOS directly downregulates BDNF. Temporal expression patterns of other lncRNA-messenger RNA pairs validated the effect of chronic neuronal activity on the transcriptome and implied various lncRNA regulatory mechanisms. lncRNAs, some of which are unique to primates, thus appear to have potentially important regulatory roles in activity-dependent human brain plasticity. PMID:22960213
Intermediate introns in nuclear genes of euglenids - are they a distinct type?

PubMed

Milanowski, Rafał; Gumińska, Natalia; Karnkowska, Anna; Ishikawa, Takao; Zakryś, Bożena

2016-02-29

Nuclear genes of euglenids contain two major types of introns: conventional spliceosomal and nonconventional introns. The latter are characterized by variable non-canonical borders, RNA secondary structure that brings intron ends together, and an unknown mechanism of removal. Some researchers also distinguish intermediate introns, which combine features of both types. They form a stable RNA secondary structure and are classified into two subtypes depending on whether they contain one (intermediate/nonconventional subtype) or both (conventional/intermediate subtype) canonical spliceosomal borders. However, it has been also postulated that most introns classified as intermediate could simply be special cases of conventional or nonconventional introns. Sequences of tubB, hsp90 and gapC genes from six strains of Euglena agilis were obtained. They contain four, six, and two or three introns, respectively (the third intron in the gapC gene is unique for just one strain). Conventional introns were present at three positions: two in the tubB gene (at one position conventional/intermediate introns were also found) and one in the gapC gene. Nonconventional introns are present at ten positions: two in the tubB gene (at one position intermediate/nonconventional introns were also found), six in hsp90 (at four positions intermediate/nonconventional introns were also found), and two in the gapC gene. Sequence and RNA secondary structure analyses of nonconventional introns confirmed that their most strongly conserved elements are base pairing nucleotides at positions +4, +5 and +6/ -8, -7 and -6 (in most introns CAG/CTG nucleotides were observed). It was also confirmed that the presence of the 5' GT/C end in intermediate/nonconventional introns is not the result of kinship with conventional introns, but is due to evolutionary pressure to preserve the purine at the 5' end. 
However, an example of a nonconventional intron with GC-AG ends was shown, suggesting the possibility of intron type conversion between nonconventional and conventional. Furthermore, an analysis of conventional introns revealed that the ability to form a stable RNA secondary structure by some introns is probably not a result of their relationship with nonconventional introns. It was also shown that acquisition of new nonconventional introns is an ongoing process and can be observed at the level of a single species. In the recently acquired intron in the gapC gene, extended direct repeats are present at the intron-exon junctions, suggesting that a double-strand break repair process could be the source of new nonconventional introns.
Splicing-Related Features of Introns Serve to Propel Evolution

PubMed Central

Luo, Yuping; Li, Chun; Gong, Xi; Wang, Yanlu; Zhang, Kunshan; Cui, Yaru; Sun, Yi Eve; Li, Siguang

2013-01-01

The role that spliceosomal intronic structures play in evolution has only begun to be elucidated. Comparative genomic analyses of fungal snoRNA sequences, which are often contained within introns and/or exons, revealed that about one-third of snoRNA-associated introns in three major snoRNA gene clusters manifested polymorphisms, likely resulting from intron loss and gain events during fungal evolution. Genomic deletions can clearly be observed as one mechanism underlying intron and exon loss, as well as generation of complex introns where several introns lie in juxtaposition without intercalating exons. Strikingly, by tracking conserved snoRNAs in introns, we found that some introns had moved from one position to another by excision from donor sites and insertion into target sites elsewhere in the genome without needing transposon structures. This study revealed the origin of many newly gained introns. Moreover, our analyses suggested that intron-containing sequences were more prone to sustainable structural changes than DNA sequences without introns due to introns' ability to jump within the genome via unknown mechanisms. We propose that splicing-related structural features of introns serve as an additional motor to propel evolution. PMID:23516505
Variants in the human intestinal fatty acid binding protein 2 gene in obese subjects.

PubMed

Sipiläinen, R; Uusitupa, M; Heikkinen, S; Rissanen, A; Laakso, M

1997-08-01

Fatty acid binding protein 2 gene (FABP2) has been proposed to be an important candidate gene for insulin resistance; therefore, it also could be a promising candidate gene for obesity. We screened the whole coding region of the FABP2 gene in 40 obese nondiabetic Finnish subjects. Furthermore, we investigated the effects of the codon 54 polymorphism of this gene (Ala-->Thr) on insulin levels and basal metabolic rate in 170 obese subjects. The frequencies of the variants found in exon 4 (GTA-->GTG) and 3'-noncoding region (GCGCA-->GCACA), as well as the allele frequencies for the variable lengths of the ATT repeat sequence in intron 2 did not differ between the obese subjects and nonobese controls. The frequency of threonine-encoding allele in codon 54 of the FABP2 gene did not differ between obese and control subjects (28 vs. 29%, respectively). In the obese group there were no differences in gender distribution, age, weight, body mass index, lean body mass, percentage of body fat, waist circumference, and waist-to-hip ratio among the individuals homozygous for Ala54, heterozygous for Thr54, and homozygous for Thr54-encoding alleles. Similarly, fasting serum insulin, glucose, lipids and lipoprotein concentrations, basal metabolic rate (adjusted for lean body mass and age), respiratory quotient, and rates of glucose and lipid oxidation did not differ among the groups. We conclude that obesity is not associated with specific variants in the FABP2 gene. Furthermore, the codon 54 Ala to Thr polymorphism of this gene does not influence insulin levels or basal metabolic rate in obese Finns.
Introns: The Functional Benefits of Introns in Genomes.

PubMed

Jo, Bong-Seok; Choi, Sun Shim

2015-12-01

The intron has been a major biological mystery in several respects since it was first discovered. First, all of the completely sequenced eukaryotes harbor introns in their genomic structure, whereas no prokaryotes identified so far carry introns. Second, the total amount of introns varies among species. Third, the length and number of introns vary among genes, even within the same species genome. Fourth, all introns are copied into RNA by transcription and into DNA by replication, but intron sequences do not participate in protein-coding sequences. The existence of introns in the genome should be a burden to some cells, because cells have to consume a great deal of energy to copy introns and excise them at exactly the correct positions with the help of complicated spliceosomal machinery. Their persistence throughout a long evolutionary history can be explained only if carrying introns is assumed to give cells selective advantages that overcome the negative effects of introns. In that regard, we summarize previous research on the functional roles or benefits of introns. Additionally, we introduce several other studies strongly suggesting that introns should not be regarded as junk.
Decoding the non-coding RNAs in Alzheimer's disease.

PubMed

Schonrock, Nicole; Götz, Jürgen

2012-11-01

Non-coding RNAs (ncRNAs) are integral components of biological networks with fundamental roles in regulating gene expression. They can integrate sequence information from the DNA code, epigenetic regulation and functions of multimeric protein complexes to potentially determine the epigenetic status and transcriptional network in any given cell. Humans potentially contain more ncRNAs than any other species, especially in the brain, where they may well play a significant role in human development and cognitive ability. This review discusses their emerging role in Alzheimer's disease (AD), a human pathological condition characterized by the progressive impairment of cognitive functions. We discuss the complexity of the ncRNA world and how this is reflected in the regulation of the amyloid precursor protein and Tau, two proteins with central functions in AD. By understanding this intricate regulatory network, there is hope for a better understanding of disease mechanisms and ultimately developing diagnostic and therapeutic tools.
Intron Retention Identifies a Malaria Vector within the Anopheles (Nyssorhynchus) Albitarsis Complex (Diptera: Culicidae)

DTIC Science & Technology

2005-03-09

variation in local environments including changes driven by human activity. For example, Anopheles (Nyssorhynchus) marajoara Galvao, and Damasceno...Linthicum, 1988) is the principal malaria vector in northeastern Amazonia, replacing An. darlingi Root, perhaps as a result of changes in human activity (Conn
Recurrent Loss of Specific Introns during Angiosperm Evolution

PubMed Central

Wang, Hao; Devos, Katrien M.; Bennetzen, Jeffrey L.

2014-01-01

Numerous instances of presence/absence variations for introns have been documented in eukaryotes, and some cases of recurrent loss of the same intron have been suggested. However, there has been no comprehensive or phylogenetically deep analysis of recurrent intron loss. Of 883 cases of intron presence/absence variation that we detected in five sequenced grass genomes, 93 were confirmed as recurrent losses and the rest could be explained by single losses (652) or single gains (118). No case of recurrent intron gain was observed. Deep phylogenetic analysis often indicated that apparent intron gains were actually numerous independent losses of the same intron. Recurrent loss exhibited extreme non-randomness, in that some introns were removed independently in many lineages. The two larger genomes, maize and sorghum, were found to have a higher rate of both recurrent loss and overall loss and/or gain than foxtail millet, rice or Brachypodium. Adjacent introns and small introns were found to be preferentially lost. Intron loss genes exhibited a high frequency of germ line or early embryogenesis expression. In addition, flanking exon A+T-richness and intron TG/CG ratios were higher in retained introns. This last result suggests that epigenetic status, as evidenced by a loss of methylated CG dinucleotides, may play a role in the process of intron loss. This study provides the first comprehensive analysis of recurrent intron loss, makes a series of novel findings on the patterns of recurrent intron loss during the evolution of the grass family, and provides insight into the molecular mechanism(s) underlying intron loss. PMID:25474210
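The intron TG/CG ratio mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the ratio is simply the count of TG dinucleotides divided by the count of CG dinucleotides on one strand; the abstract does not give the authors' exact counting procedure (strand handling, normalization), so the function names and the example sequence here are hypothetical.

```python
from typing import Optional


def dinucleotide_count(seq: str, dinuc: str) -> int:
    """Count overlapping occurrences of a dinucleotide in a DNA sequence."""
    seq = seq.upper()
    dinuc = dinuc.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def tg_cg_ratio(seq: str) -> Optional[float]:
    """Return count(TG) / count(CG), or None when the sequence has no CG.

    Assumption: single-strand counts, no length normalization.
    """
    cg = dinucleotide_count(seq, "CG")
    if cg == 0:
        return None
    return dinucleotide_count(seq, "TG") / cg


# Hypothetical toy intron, not taken from the study.
example_intron = "GTAAGTTGCGTGTTGACGTTTCAG"
print(tg_cg_ratio(example_intron))  # 3 TG / 2 CG = 1.5
```

A higher ratio means fewer CG dinucleotides relative to TG, which is the pattern the authors link to loss of methylated CG sites.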
Limited MHC class I intron 2 repertoire variation in bonobos.

PubMed

de Groot, Natasja G; Heijmans, Corrine M C; Helsen, Philippe; Otting, Nel; Pereboom, Zjef; Stevens, Jeroen M G; Bontrop, Ronald E

2017-10-01

Common chimpanzees (Pan troglodytes) experienced a selective sweep, probably caused by a SIV-like virus, which targeted their MHC class I repertoire. Based on MHC class I intron 2 data analyses, this selective sweep took place about 2-3 million years ago. As a consequence, common chimpanzees have a skewed MHC class I repertoire that is enriched for allotypes that are able to recognise conserved regions of the SIV proteome. The bonobo (Pan paniscus) shared an ancestor with common chimpanzees approximately 1.5 to 2 million years ago. To investigate whether the signature of this selective sweep is also detectable in bonobos, the MHC class I gene repertoire of two bonobo panels comprising in total 29 animals was investigated by Sanger sequencing. We identified 14 Papa-A, 20 Papa-B and 11 Papa-C alleles, of which eight, five and eight alleles, respectively, have not been reported previously. Within this pool of MHC class I variation, we recovered only 2 Papa-A, 3 Papa-B and 6 Papa-C intron 2 sequences. As compared to humans, bonobos appear to have an even more diminished MHC class I intron 2 lineage repertoire than common chimpanzees. This supports the notion that the selective sweep may have predated the speciation of common chimpanzees and bonobos. The further reduction of the MHC class I intron 2 lineage repertoire observed in bonobos as compared to the common chimpanzee may be explained by a founding effect or other subsequent selective processes.

Noncoding copy-number variations are associated with congenital limb malformation.

PubMed

Flöttmann, Ricarda; Kragesteen, Bjørt K; Geuer, Sinje; Socha, Magdalena; Allou, Lila; Sowińska-Seidler, Anna; Bosquillon de Jarcy, Laure; Wagner, Johannes; Jamsheer, Aleksander; Oehl-Jaschkowitz, Barbara; Wittler, Lars; de Silva, Deepthi; Kurth, Ingo; Maya, Idit; Santos-Simarro, Fernando; Hülsemann, Wiebke; Klopocki, Eva; Mountford, Roger; Fryer, Alan; Borck, Guntram; Horn, Denise; Lapunzina, Pablo; Wilson, Meredith; Mascrez, Bénédicte; Duboule, Denis; Mundlos, Stefan; Spielmann, Malte

2017-10-12

Purpose Copy-number variants (CNVs) are generally interpreted by linking the effects of gene dosage with phenotypes. The clinical interpretation of noncoding CNVs remains challenging. We investigated the percentage of disease-associated CNVs in patients with congenital limb malformations that affect noncoding cis-regulatory sequences versus genes sensitive to gene dosage effects. Methods We applied high-resolution copy-number analysis to 340 unrelated individuals with isolated limb malformation. To investigate novel candidate CNVs, we re-engineered human CNVs in mice using clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing. Results Of the individuals studied, 10% harbored CNVs segregating with the phenotype in the affected families. We identified 31 CNVs previously associated with congenital limb malformations and four novel candidate CNVs. Most of the disease-associated CNVs (57%) affected the noncoding cis-regulatory genome, while only 43% included a known disease gene and were likely to result from gene dosage effects. In transgenic mice harboring four novel candidate CNVs, we observed altered gene expression in all cases, indicating that the CNVs had a regulatory effect either by changing the enhancer dosage or altering the topological associating domain architecture of the genome. Conclusion Our findings suggest that CNVs affecting noncoding regulatory elements are a major cause of congenital limb malformations. Genetics in Medicine advance online publication, 12 October 2017; doi:10.1038/gim.2017.154.
Chimeric mitochondrial minichromosomes of the human body louse, Pediculus humanus: evidence for homologous and non-homologous recombination.

PubMed

Shao, Renfu; Barker, Stephen C

2011-02-15

The mitochondrial (mt) genome of the human body louse, Pediculus humanus, consists of 18 minichromosomes. Each minichromosome is 3 to 4 kb long and has 1 to 3 genes. There is unequivocal evidence for recombination between different mt minichromosomes in P. humanus. It is not known, however, how these minichromosomes recombine. Here, we report the discovery of eight chimeric mt minichromosomes in P. humanus. We classify these chimeric mt minichromosomes into two groups: Group I and Group II. Group I chimeric minichromosomes contain parts of two different protein-coding genes that are from different minichromosomes. The two parts of protein-coding genes in each Group I chimeric minichromosome are joined at a microhomologous nucleotide sequence; microhomologous nucleotide sequences are hallmarks of non-homologous recombination. Group II chimeric minichromosomes contain all of the genes and the non-coding regions of two different minichromosomes. The conserved sequence blocks in the non-coding regions of Group II chimeric minichromosomes resemble the "recombination repeats" in the non-coding regions of the mt genomes of higher plants. These repeats are essential to homologous recombination in higher plants. Our analyses of the nucleotide sequences of chimeric mt minichromosomes indicate both homologous and non-homologous recombination between minichromosomes in the mitochondria of the human body louse. Copyright © 2010 Elsevier B.V. All rights reserved.
Prosurvival long noncoding RNA PINCR regulates a subset of p53 targets in human colorectal cancer cells by binding to Matrin 3

PubMed Central

Chaudhary, Ritu; Gryder, Berkley; Woods, Wendy S; Subramanian, Murugan; Jones, Matthew F; Li, Xiao Ling; Jenkins, Lisa M; Shabalina, Svetlana A; Mo, Min; Dasso, Mary; Yang, Yuan; Wakefield, Lalage M; Zhu, Yuelin; Frier, Susan M; Moriarity, Branden S; Prasanth, Kannanganattu V; Perez-Pinera, Pablo; Lal, Ashish

2017-01-01

Thousands of long noncoding RNAs (lncRNAs) have been discovered, yet the function of the vast majority remains unclear. Here, we show that a p53-regulated lncRNA which we named PINCR (p53-induced noncoding RNA), is induced ~100-fold after DNA damage and exerts a prosurvival function in human colorectal cancer cells (CRC) in vitro and tumor growth in vivo. Targeted deletion of PINCR in CRC cells significantly impaired G1 arrest and induced hypersensitivity to chemotherapeutic drugs. PINCR regulates the induction of a subset of p53 targets involved in G1 arrest and apoptosis, including BTG2, RRM2B and GPX1. Using a novel RNA pulldown approach that utilized endogenous S1-tagged PINCR, we show that PINCR associates with the enhancer region of these genes by binding to RNA-binding protein Matrin 3 that, in turn, associates with p53. Our findings uncover a critical prosurvival function of a p53/PINCR/Matrin 3 axis in response to DNA damage in CRC cells. DOI: http://dx.doi.org/10.7554/eLife.23244.001 PMID:28580901
Gene organization and alternative splicing of human prohormone convertase PC8.

PubMed Central

Goodge, K A; Thomas, R J; Martin, T J; Gillespie, M T

1998-01-01

The mammalian Ca2+-dependent serine protease prohormone convertase PC8 is expressed ubiquitously, being transcribed as 3.5, 4.3 and 6.0 kb mRNA isoforms in various tissues. To determine the origin of these various mRNA isoforms we report the characterization of the human PC8 gene, which has been previously localized to chromosome 11q23-24. Consisting of 16 exons, the human PC8 gene spans approx. 27 kb. A comparison of the position of intron-exon junctions of the human PC8 gene with the gene structures of previously reported prohormone convertase genes demonstrated a divergence of the human PC8 from the highly conserved nature of the gene organization of this enzyme family. The nucleotide sequence of the 5'-flanking region of the human PC8 is reported and possesses putative promoter elements characteristic of a GC-rich promoter. Further supporting the potential role of a GC-rich promoter element, multiple transcriptional initiation sites within a 200 bp region were demonstrated. We propose that the various mRNA isoforms of PC8 result from the inclusion of intronic sequences within transcripts. PMID:9820811
Extensive intron gain in the ancestor of placental mammals

PubMed Central

2011-01-01

Background Genome-wide studies of intron dynamics in mammalian orthologous genes have found convincing evidence for loss of introns but very little for intron turnover. Similarly, large-scale analysis of intron dynamics in a few vertebrate genomes has identified only intron losses and no gains, indicating that intron gain is an extremely rare event in vertebrate evolution. These studies suggest that the intron-rich genomes of vertebrates do not allow intron gain. The aim of this study was to search for evidence of de novo intron gain in domesticated genes from an analysis of their exon/intron structures. Results A phylogenomic approach has been used to analyse all domesticated genes in mammals and chordates that originated from the coding parts of transposable elements. Gain of introns in domesticated genes has been reconstructed on well established mammalian, vertebrate and chordate phylogenies, and examined as to where and when the gain events occurred. The locations, sizes and amounts of de novo introns gained in the domesticated genes during the evolution of mammals and chordates have been analyzed. A significant amount of intron gain was found only in domesticated genes of placental mammals, where more than 70 cases were identified. De novo gained introns show clear positional bias, since they are distributed mainly in 5' UTR and coding regions, while 3' UTR introns are very rare. In the coding regions of some domesticated genes up to 8 de novo gained introns have been found. Intron densities in Eutheria-specific domesticated genes and in older domesticated genes that originated early in vertebrates are lower than those for normal mammalian and vertebrate genes. Surprisingly, the majority of intron gains have occurred in the ancestor of placentals. Conclusions This study provides the first evidence for numerous intron gains in the ancestor of placental mammals and demonstrates that adequate taxon sampling is crucial for reconstructing intron evolution. 
The findings of this comprehensive study slightly challenge the current view on the evolutionary stasis in intron dynamics during the last 100 - 200 My. Domesticated genes could constitute an excellent system on which to analyse the mechanisms of intron gain in placental mammals. Reviewers: this article was reviewed by Dan Graur, Eugene V. Koonin and Jürgen Brosius. PMID:22112745
Origin and evolution of spliceosomal introns

PubMed Central

2012-01-01

Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section. PMID:22507701
Evaluation of the mechanisms of intron loss and gain in the social amoebae Dictyostelium.

PubMed

Ma, Ming-Yue; Che, Xun-Ru; Porceddu, Andrea; Niu, Deng-Ke

2015-12-18

Spliceosomal introns are a common feature of eukaryotic genomes. To approach a comprehensive understanding of intron evolution on Earth, studies should look beyond repeatedly studied groups such as animals, plants, and fungi. The slime mold Dictyostelium belongs to a supergroup of eukaryotes not covered in previous studies. We found 441 precise intron losses in Dictyostelium discoideum and 202 precise intron losses in Dictyostelium purpureum. Consistent with these observations, Dictyostelium discoideum was found to have significantly more copies of reverse transcriptase genes than Dictyostelium purpureum. We also found that the lost introns are significantly further from the 5' end of genes than the conserved introns. Adjacent introns were prone to be lost simultaneously in Dictyostelium discoideum. In both Dictyostelium species, the exonic sequences flanking lost introns were found to have a significantly higher GC content than those flanking conserved introns. Together, these observations support a reverse-transcription model of intron loss in which intron losses were caused by gene conversion between genomic DNA and cDNA reverse transcribed from mature mRNA. We also identified two imprecise intron losses in Dictyostelium discoideum that may have resulted from genomic deletions. Ninety-eight putative intron gains were also observed. Consistent with previous studies of other lineages, the source sequences were found in only a small number of cases, with only two instances of intron gain identified in Dictyostelium discoideum. Although they diverged very early from animals and fungi, Dictyostelium species have similar mechanisms of intron loss.
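The comparison of GC content in exonic sequences flanking lost versus conserved introns, described in the abstract above, can be sketched as follows. This is a minimal illustration only: the window size, the function names, and the toy sequences are assumptions, since the abstract does not state how much flanking sequence the authors examined.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return (seq.count("G") + seq.count("C")) / total


def flanking_gc(upstream_exon: str, downstream_exon: str, window: int = 50) -> float:
    """GC content of the exonic bases adjacent to an intron position.

    Takes the last `window` bases of the upstream exon and the first
    `window` bases of the downstream exon (hypothetical window choice).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    flank = upstream_exon[-window:] + downstream_exon[:window]
    return gc_content(flank)


# Hypothetical toy exons flanking one intron position.
print(flanking_gc("AAAAGGCC", "GCGCTTTT", window=4))  # flank is GGCCGCGC, so 1.0
```

In practice one would compute this value for every lost and every conserved intron position and then compare the two distributions with a statistical test.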
Genetic diversity of tyrosine hydroxylase (TH) and dopamine β-hydroxylase (DBH) genes in cattle breeds

PubMed Central

Lourenco-Jaramillo, Diana Lelidett; Sifuentes-Rincón, Ana María; Parra-Bracamonte, Gaspar Manuel; de la Rosa-Reyna, Xochitl Fabiola; Segura-Cabrera, Aldo; Arellano-Vera, Williams

2012-01-01

DNA from four cattle breeds was used to re-sequence all of the exons and 56% of the introns of the bovine tyrosine hydroxylase (TH) gene and 97% and 13% of the bovine dopamine β-hydroxylase (DBH) coding and non-coding sequences, respectively. Two novel single nucleotide polymorphisms (SNPs) and a microsatellite motif were found in the TH sequences. The DBH sequences contained 62 nucleotide changes, including eight non-synonymous SNPs (nsSNPs) that are of particular interest because they may alter protein function and therefore affect the phenotype. These DBH nsSNPs resulted in amino acid substitutions that were predicted to destabilize the protein structure. Six SNPs (one from TH and five from DBH non-synonymous SNPs) were genotyped in 140 animals; all of them were polymorphic and had a minor allele frequency of > 9%. There were significant differences in the intra- and inter-population haplotype distributions. The haplotype differences between Brahman cattle and the three B. t. taurus breeds (Charolais, Holstein and Lidia) were interesting from a behavioural point of view because of the differences in temperament between these breeds. PMID:22888292
Analysis of alternative cleavage and polyadenylation by 3′ region extraction and deep sequencing

PubMed Central

Hoque, Mainul; Ji, Zhe; Zheng, Dinghai; Luo, Wenting; Li, Wencheng; You, Bei; Park, Ji Yeon; Yehia, Ghassan; Tian, Bin

2012-01-01

Alternative cleavage and polyadenylation (APA) leads to mRNA isoforms with different coding sequences (CDS) and/or 3′ untranslated regions (3′UTRs). Using 3′ Region Extraction And Deep Sequencing (3′READS), a method which addresses the internal priming and oligo(A) tail issues that commonly plague polyA site (pA) identification, we comprehensively mapped pAs in the mouse genome, thoroughly annotating 3′ ends of genes and revealing over five thousand pAs (~8% of total) flanked by A-rich sequences, which have hitherto been overlooked. About 79% of mRNA genes and 66% of long non-coding RNA (lncRNA) genes have APA; but these two gene types have distinct usage patterns for pAs in introns and upstream exons. Promoter-distal pAs become relatively more abundant during embryonic development and cell differentiation, a trend affecting pAs in both 3′-most exons and upstream regions. Upregulated isoforms generally have stronger pAs, suggesting global modulation of the 3′ end processing activity in development and differentiation. PMID:23241633
Connecting the dots: chromatin and alternative splicing in EMT.

PubMed

Warns, Jessica A; Davie, James R; Dhasarathy, Archana

2016-02-01

Nature has devised sophisticated cellular machinery to process mRNA transcripts produced by RNA Polymerase II, removing intronic regions and connecting exons together, to produce mature RNAs. This process, known as splicing, is very closely linked to transcription. Alternative splicing, or the ability to produce different combinations of exons that are spliced together from the same genomic template, is a fundamental means of regulating protein complexity. Similar to transcription, both constitutive and alternative splicing can be regulated by chromatin and its associated factors in response to various signal transduction pathways activated by external stimuli. This regulation can vary between different cell types, and interference with these pathways can lead to changes in splicing, often resulting in aberrant cellular states and disease. The epithelial to mesenchymal transition (EMT), which leads to cancer metastasis, is influenced by alternative splicing events of chromatin remodelers and epigenetic factors such as DNA methylation and non-coding RNAs. In this review, we will discuss the role of epigenetic factors including chromatin, chromatin remodelers, DNA methyltransferases, and microRNAs in the context of alternative splicing, and discuss their potential involvement in alternative splicing during the EMT process.
The Genetics of the Thyroid Stimulating Hormone Receptor: History and Relevance

PubMed Central

Yin, Xiaoming; Latif, Rauf

2010-01-01

Background The thyroid stimulating hormone receptor (TSHR) is the key regulator of thyrocyte function. The gene for the TSHR on chromosome 14q31 has been implicated as coding for the major autoantigen in the autoimmune hyperthyroidism of Graves' disease (GD) to which T cells and autoantibodies are directed. Summary The TSHR is a seven-transmembrane domain receptor that undergoes complex posttranslational processing. In this brief review, we look at the genetics of this important autoantigen and its influence on a variety of tissue functions in addition to its role in the induction of GD. Conclusions There is convincing evidence that the TSH receptor gene confers increased susceptibility for GD, but not Hashimoto's thyroiditis. GD is associated with polymorphisms in the intron 1 gene region. How such noncoding nucleotide changes influence disease susceptibility remains uncertain, but is likely to involve TSHR splicing variants and/or microRNAs arising from this gene region. Whether such influences are confined to the thyroid gland or whether they influence cell function in the many extrathyroidal sites of TSHR expression remains unknown. PMID:20578897
Single-cell full-length total RNA sequencing uncovers dynamics of recursive splicing and enhancer RNAs.

PubMed

Hayashi, Tetsutaro; Ozaki, Haruka; Sasagawa, Yohei; Umeda, Mana; Danno, Hiroki; Nikaido, Itoshi

2018-02-12

Total RNA sequencing has been used to reveal poly(A) and non-poly(A) RNA expression, RNA processing and enhancer activity. To date, no method for full-length total RNA sequencing of single cells has been developed despite the potential of this technology for single-cell biology. Here we describe random displacement amplification sequencing (RamDA-seq), the first full-length total RNA-sequencing method for single cells. Compared with other methods, RamDA-seq shows high sensitivity to non-poly(A) RNA and near-complete full-length transcript coverage. Using RamDA-seq with differentiation time course samples of mouse embryonic stem cells, we reveal hundreds of dynamically regulated non-poly(A) transcripts, including histone transcripts and long noncoding RNA Neat1. Moreover, RamDA-seq profiles recursive splicing in >300-kb introns. RamDA-seq also detects enhancer RNAs and their cell type-specific activity in single cells. Taken together, we demonstrate that RamDA-seq could help investigate the dynamics of gene expression, RNA-processing events and transcriptional regulation in single cells.
Computation of direct and inverse mutations with the SEGM web server (Stochastic Evolution of Genetic Motifs): an application to splice sites of human genome introns.

PubMed

Benard, Emmanuel; Michel, Christian J

2009-08-01

We present here the SEGM web server (Stochastic Evolution of Genetic Motifs) in order to study the evolution of genetic motifs both in the direct evolutionary sense (past-present) and in the inverse evolutionary sense (present-past). The genetic motifs studied can be nucleotides, dinucleotides and trinucleotides. As an example of an application of SEGM and to illustrate its functionalities, we give an analysis of inverse mutations of splice sites of human genome introns. SEGM is freely accessible at http://lsiit-bioinfo.u-strasbg.fr:8080/webMathematica/SEGM/SEGM.html directly or via the web site http://dpt-info.u-strasbg.fr/~michel/. To our knowledge, this SEGM web server is to date the only computational biology software that takes this evolutionary approach.
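The idea of computing motif evolution in both the direct (past to present) and inverse (present to past) sense can be illustrated with a small Python sketch. This is not the SEGM model: as a stand-in it assumes the one-parameter Jukes-Cantor substitution model for single nucleotides, under which each frequency relaxes toward 1/4 at rate 4*alpha, so the inverse direction is obtained by reversing the sign of the exponent. All names and parameter values are hypothetical.

```python
import math
from typing import List


def jc69_forward(p0: List[float], alpha: float, t: float) -> List[float]:
    """Direct sense: nucleotide frequencies at time t from past frequencies p0.

    Under Jukes-Cantor, p_i(t) = 1/4 + (p_i(0) - 1/4) * exp(-4 * alpha * t).
    """
    decay = math.exp(-4.0 * alpha * t)
    return [0.25 + (p - 0.25) * decay for p in p0]


def jc69_inverse(pt: List[float], alpha: float, t: float) -> List[float]:
    """Inverse sense: past frequencies that evolve into pt after time t.

    The result can fall outside [0, 1] when pt is not reachable from any
    valid past distribution within time t.
    """
    growth = math.exp(4.0 * alpha * t)
    return [0.25 + (p - 0.25) * growth for p in pt]


# Hypothetical frequencies of A, C, G, T at one splice-site position.
past = [0.4, 0.3, 0.2, 0.1]
present = jc69_forward(past, alpha=0.01, t=10.0)
recovered = jc69_inverse(present, alpha=0.01, t=10.0)
print(present)
print(recovered)
```

Running the inverse function on the output of the forward function recovers the starting frequencies up to floating-point error, which is the round trip between the two evolutionary senses.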
Human Variation in Short Regions Predisposed to Deep Evolutionary Conservation

PubMed Central

Loots, Gabriela G.; Ovcharenko, Ivan

2010-01-01

The landscape of the human genome consists of millions of short islands of conservation that are 100% conserved across multiple vertebrate genomes (termed “bricks”), the majority of which are located in noncoding regions. Several hundred thousand bricks are deeply conserved reaching the genomes of amphibians and fish. Deep phylogenetic conservation of noncoding DNA has been reported to be strongly associated with the presence of gene regulatory elements, introducing bricks as a proxy to the functional noncoding landscape of the human genome. Here, we report a significant overrepresentation of bricks in the promoters of transcription factors and developmental genes, where the high level of phylogenetic conservation correlates with an increase in brick overrepresentation. We also found that the presence of a brick dictates a predisposition to evolutionary constraint, with only 0.7% of the amniota brick central nucleotides being diverged within the primate lineage, an 11-fold reduction in the divergence rate compared with random expectation. Human single-nucleotide polymorphism (SNP) data explain only 3% of primate-specific variation in amniota bricks, thus arguing for a widespread fixation of brick mutations within the primate lineage and prior to human radiation. This variation, in turn, might have been utilized as a driving force for primate- and hominoid-specific adaptation. We also discovered a pronounced deviation from the evolutionary predisposition in the human lineage, with over 20-fold increase in the substitution rate at brick SNP sites over expected values. In addition, contrary to typical brick mutations, brick variation commonly encountered in the human population displays limited, if any, signatures of negative selection as measured by the minor allele frequency and population differentiation (F-statistic) measures. These observations argue for the plasticity of gene regulatory mechanisms in vertebrates, with evidence of strong purifying selection acting on the gene regulatory landscape of the human genome, where widespread advantageous mutations in putative regulatory elements are likely utilized in functional diversification and adaptation of species. PMID:20093432
Structural analysis of the 5′ region of mouse and human Huntington disease genes reveals conservation of putative promoter region and di- and trinucleotide polymorphisms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Biaoyang; Nasir, J.; Kalchman, M.A.

1995-02-10

We have previously cloned and characterized the murine homologue of the Huntington disease (HD) gene and shown that it maps to mouse chromosome 5 within a region of conserved synteny with human chromosome 4p16.3. Here we present a detailed comparison of the sequence of the putative promoter and the organization of the 5′ genomic region of the murine (Hdh) and human HD genes encompassing the first five exons. We show that in this region these two genes share identical exon boundaries, but have different-size introns. Two dinucleotide (CT) and one trinucleotide intronic polymorphism in Hdh and an intronic CA polymorphism in the HD gene were identified. Comparison of 940-bp sequence 5′ to the putative translation start site reveals a highly conserved region (78.8% nucleotide identity) between Hdh and the HD gene from nucleotide -56 to -206 (of Hdh). Neither Hdh nor the HD gene has typical TATA or CCAAT elements, but both show one putative AP2 binding site and numerous potential Sp1 binding sites. The high sequence identity between Hdh and the HD gene for approximately 200 bp 5′ to the putative translation start site indicates that these sequences may play a role in regulating expression of the Huntington disease gene. 30 refs., 4 figs., 2 tabs.
Molecular Regulatory Pathways Link Sepsis With Metabolic Syndrome: Non-coding RNA Elements Underlying the Sepsis/Metabolic Cross-Talk.

PubMed

Meydan, Chanan; Bekenstein, Uriya; Soreq, Hermona

2018-01-01

Sepsis and metabolic syndrome (MetS) are both inflammation-related entities with high impact for human health and the consequences of concussions. Both represent imbalanced parasympathetic/cholinergic response to insulting triggers and variably uncontrolled inflammation that indicates shared upstream regulators, including short microRNAs (miRs) and long non-coding RNAs (lncRNAs). These may cross talk across multiple systems, leading to complex molecular and clinical outcomes. Notably, biomedical and RNA-sequencing based analyses both highlight new links between the acquired and inherited pathogenic, cardiac and inflammatory traits of sepsis/MetS. Those include the HOTAIR and MIAT lncRNAs and their targets, such as miR-122, -150, -155, -182, -197, -375, -608 and HLA-DRA. Implicating non-coding RNA regulators in sepsis and MetS may delineate novel high-value biomarkers and targets for intervention.
miRNA-dependent gene silencing involving Ago2-mediated cleavage of a circular antisense RNA

PubMed Central

Hansen, Thomas B; Wiklund, Erik D; Bramsen, Jesper B; Villadsen, Sune B; Statham, Aaron L; Clark, Susan J; Kjems, Jørgen

2011-01-01

MicroRNAs (miRNAs) are ∼22 nt non-coding RNAs that typically bind to the 3′ UTR of target mRNAs in the cytoplasm, resulting in mRNA destabilization and translational repression. Here, we report that miRNAs can also regulate gene expression by targeting non-coding antisense transcripts in human cells. Specifically, we show that miR-671 directs cleavage of a circular antisense transcript of the Cerebellar Degeneration-Related protein 1 (CDR1) locus in an Ago2-slicer-dependent manner. The resulting downregulation of circular antisense has a concomitant decrease in CDR1 mRNA levels, independently of heterochromatin formation. This study provides the first evidence for non-coding antisense transcripts as functional miRNA targets, and a novel regulatory mechanism involving a positive correlation between mRNA and antisense circular RNA levels. PMID:21964070
Regulation of neural macroRNAs by the transcriptional repressor REST

PubMed Central

Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J.; Stanton, Lawrence W.; Lipovich, Leonard

2009-01-01

The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs (“macroRNAs”), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer. PMID:19050060
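
The proximity criterion in the record above (a REST binding site counted when it lies within 10 kb of a macroRNA gene) can be illustrated with a minimal Python sketch. This is not the authors' annotation pipeline; the function names and the toy coordinates are hypothetical, and a single chromosome with point-like binding sites is assumed.

```python
def distance_to_gene(site, gene):
    """Distance in bp from a point site to a gene interval (0 if the site is inside)."""
    start, end = gene
    if site < start:
        return start - site
    if site > end:
        return site - end
    return 0


def fraction_near_genes(sites, genes, window=10_000):
    """Fraction of sites lying within `window` bp of at least one gene interval."""
    if not sites:
        return 0.0
    near = sum(
        1 for s in sites
        if any(distance_to_gene(s, g) <= window for g in genes)
    )
    return near / len(sites)


# Hypothetical coordinates on one chromosome, for illustration only.
toy_genes = [(50_000, 60_000), (200_000, 210_000)]
toy_sites = [45_000, 75_000, 120_000, 205_000]
print(fraction_near_genes(toy_sites, toy_genes))
```

The quadratic scan is adequate for a toy example; a genome-scale analysis would more plausibly use sorted intervals or an interval tree.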
High-throughput sequencing of human plasma RNA by using thermostable group II intron reverse transcriptases

PubMed Central

Qin, Yidan; Yao, Jun; Wu, Douglas C.; Nottingham, Ryan M.; Mohr, Sabine; Hunicke-Smith, Scott; Lambowitz, Alan M.

2016-01-01

Next-generation RNA-sequencing (RNA-seq) has revolutionized transcriptome profiling, gene expression analysis, and RNA-based diagnostics. Here, we developed a new RNA-seq method that exploits thermostable group II intron reverse transcriptases (TGIRTs) and used it to profile human plasma RNAs. TGIRTs have higher thermostability, processivity, and fidelity than conventional reverse transcriptases, plus a novel template-switching activity that can efficiently attach RNA-seq adapters to target RNA sequences without RNA ligation. The new TGIRT-seq method enabled construction of RNA-seq libraries from <1 ng of plasma RNA in <5 h. TGIRT-seq of RNA in 1-mL plasma samples from a healthy individual revealed RNA fragments mapping to a diverse population of protein-coding gene and long ncRNAs, which are enriched in intron and antisense sequences, as well as nearly all known classes of small ncRNAs, some of which have never before been seen in plasma. Surprisingly, many of the small ncRNA species were present as full-length transcripts, suggesting that they are protected from plasma RNases in ribonucleoprotein (RNP) complexes and/or exosomes. This TGIRT-seq method is readily adaptable for profiling of whole-cell, exosomal, and miRNAs, and for related procedures, such as HITS-CLIP and ribosome profiling. PMID:26554030

Analysis of nonuniformity in intron phase distribution.

PubMed Central

Fedorov, A; Suboch, G; Bujakov, M; Fedorova, L

1992-01-01

The distribution of different intron groups with respect to phases has been analyzed. It has been established that group II introns and nuclear introns have a minimum frequency of phase 2 introns. Since the phase of introns is an extremely conservative measure, the observed minimum reflects evolutionary processes. The sample of all known group I introns was too small to provide a valid characteristic of their phase distribution. The findings observed for the unequal distribution of phases cannot be explained solely on the basis of the mobile properties of introns. One of the most likely explanations for this nonuniformity in the intron phase distribution is the process of exon shuffling. It is proposed that group II introns originated at the early stages of evolution and were involved in the process of exon shuffling. PMID:1598214
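
Intron phase, the quantity analyzed in the record above, is conventionally the number of coding nucleotides upstream of the intron taken modulo 3: phase 0 introns fall between codons, phase 1 after the first codon position, and phase 2 after the second. The following Python sketch tallies phases for one gene; the function name and the exon lengths are hypothetical and not taken from the paper.

```python
from collections import Counter


def intron_phases(cds_exon_lengths):
    """Return the phase (0, 1 or 2) of each intron between consecutive coding exons.

    Phase = number of coding nucleotides upstream of the intron, modulo 3.
    """
    phases = []
    coding_so_far = 0
    for length in cds_exon_lengths[:-1]:  # no intron follows the last exon
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


toy_gene = [120, 91, 200, 64]  # hypothetical coding-exon lengths in nt
phases = intron_phases(toy_gene)
print(phases)           # one phase per intron
print(Counter(phases))  # phase distribution for this gene
```

Summing such counters over many genes gives the kind of phase distribution whose nonuniformity the abstract discusses.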
Tissue- and case-specific retention of intron 40 in mature dystrophin mRNA.

PubMed

Nishida, Atsushi; Minegishi, Maki; Takeuchi, Atsuko; Niba, Emma Tabe Eko; Awano, Hiroyuki; Lee, Tomoko; Iijima, Kazumoto; Takeshima, Yasuhiro; Matsuo, Masafumi

2015-06-01

The dystrophin gene, which is mutated in Duchenne muscular dystrophy (DMD), comprises 79 exons that show multiple alternative splicing events. Intron retention, a type of alternative splicing, may control gene expression. We examined intron retention in dystrophin introns by reverse-transcription PCR from skeletal muscle, focusing on the nine shortest (all <1000 bp), because these are more likely to be retained. Only one, intron 40, was retained in mRNA; sequencing revealed insertion of a complete intron 40 (851 nt) between exons 40 and 41. The intron 40 retention product accounted for 1.2% of the total product but had a premature stop codon at the fifth intronic codon. Intron 40 retention was most strongly observed in the kidney (36.6%) and was not obtained from the fetal liver, lung, spleen or placenta. This indicated that intron retention is a tissue-specific event whose level varies among tissues. In two DMD patients, intron 40 retention was observed in one patient but not in the other. Examination of splicing regulatory factors revealed that intron 40 had the highest guanine-cytosine content of all examined introns in a 30-nt segment at its 3' end. Further studies are needed to clarify the biological role of intron 40-retained dystrophin mRNA.
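
The guanine-cytosine comparison mentioned in the record above (GC content of a 30-nt segment at the 3' end of each intron) reduces to a simple count. A minimal Python sketch follows, assuming intron sequences are available as plain strings; the function names and toy sequences are hypothetical and are not the dystrophin sequences.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def three_prime_gc(intron_seq, segment=30):
    """GC fraction of the last `segment` nucleotides of an intron."""
    return gc_fraction(intron_seq[-segment:])


# Hypothetical toy introns, for illustration only.
toy_introns = {
    "intron_a": "AT" * 20 + "GC" * 15,
    "intron_b": "A" * 10 + "G" * 15 + "T" * 15,
}
for name, seq in toy_introns.items():
    print(name, three_prime_gc(seq))
```

Ranking introns by this value would reproduce the kind of comparison described, though the abstract does not state how the authors computed it.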
Splicing predictions reliably classify different types of alternative splicing

PubMed Central

Busch, Anke; Hertel, Klemens J.

2015-01-01

Alternative splicing is a key player in the creation of complex mammalian transcriptomes and its misregulation is associated with many human diseases. Multiple mRNA isoforms are generated from most human genes, a process mediated by the interplay of various RNA signature elements and trans-acting factors that guide spliceosomal assembly and intron removal. Here, we introduce a splicing predictor that evaluates hundreds of RNA features simultaneously to successfully differentiate between exons that are constitutively spliced, exons that undergo alternative 5′ or 3′ splice-site selection, and alternative cassette-type exons. Surprisingly, the splicing predictor did not feature strong discriminatory contributions from binding sites for known splicing regulators. Rather, the ability of an exon to be involved in one or multiple types of alternative splicing is dictated by its immediate sequence context, mainly driven by the identity of the exon's splice sites, the conservation around them, and its exon/intron architecture. Thus, the splicing behavior of human exons can be reliably predicted based on basic RNA sequence elements. PMID:25805853
Human obesity associated with an intronic SNP in the brain-derived neurotrophic factor locus

USDA-ARS's Scientific Manuscript database

Brain-derived neurotrophic factor (BDNF) plays a key role in energy balance. In population studies, SNPs of the BDNF locus have been linked to obesity, but the mechanism by which these variants cause weight gain is unknown. Here, we examined human hypothalamic BDNF expression in association with 44 ...
Roles of polypyrimidine tract binding proteins in major immediate-early gene expression and viral replication of human cytomegalovirus.

PubMed

Cosme, Ruth S Cruz; Yamamura, Yasuhiro; Tang, Qiyi

2009-04-01

Human cytomegalovirus (HCMV), a member of the beta subgroup of the family Herpesviridae, causes serious health problems worldwide. HCMV gene expression in host cells is a well-defined sequential process: immediate-early (IE) gene expression, early-gene expression, DNA replication, and late-gene expression. The most abundant IE gene, major IE (MIE) gene pre-mRNA, needs to be spliced before being exported to the cytoplasm for translation. In this study, the regulation of MIE gene splicing was investigated; in so doing, we found that polypyrimidine tract binding proteins (PTBs) strongly repressed MIE gene production in cotransfection assays. In addition, we discovered that the repressive effects of PTB could be rescued by splicing factor U2AF. Taken together, the results suggest that PTBs inhibit MIE gene splicing by competing with U2AF65 for binding to the polypyrimidine tract in pre-mRNA. In intron deletion mutation assays and RNA detection experiments (reverse transcription [RT]-PCR and real-time RT-PCR), we further observed that PTBs target all the introns of the MIE gene, especially intron 2, and affect gene splicing, which was reflected in the variation in the ratio of pre-mRNA to mRNA. Using transfection assays, we demonstrated that PTB knockdown cells induce a higher degree of MIE gene splicing/expression. Consistently, HCMV can produce more viral proteins and viral particles in PTB knockdown cells after infection. We conclude that PTB inhibits HCMV replication by interfering with MIE gene splicing through competition with U2AF for binding to the polypyrimidine tract in MIE gene introns.
Roles of Polypyrimidine Tract Binding Proteins in Major Immediate-Early Gene Expression and Viral Replication of Human Cytomegalovirus

PubMed Central

Cosme, Ruth S. Cruz; Yamamura, Yasuhiro; Tang, Qiyi

2009-01-01

Human cytomegalovirus (HCMV), a member of the β subgroup of the family Herpesviridae, causes serious health problems worldwide. HCMV gene expression in host cells is a well-defined sequential process: immediate-early (IE) gene expression, early-gene expression, DNA replication, and late-gene expression. The most abundant IE gene, major IE (MIE) gene pre-mRNA, needs to be spliced before being exported to the cytoplasm for translation. In this study, the regulation of MIE gene splicing was investigated; in so doing, we found that polypyrimidine tract binding proteins (PTBs) strongly repressed MIE gene production in cotransfection assays. In addition, we discovered that the repressive effects of PTB could be rescued by splicing factor U2AF. Taken together, the results suggest that PTBs inhibit MIE gene splicing by competing with U2AF65 for binding to the polypyrimidine tract in pre-mRNA. In intron deletion mutation assays and RNA detection experiments (reverse transcription [RT]-PCR and real-time RT-PCR), we further observed that PTBs target all the introns of the MIE gene, especially intron 2, and affect gene splicing, which was reflected in the variation in the ratio of pre-mRNA to mRNA. Using transfection assays, we demonstrated that PTB knockdown cells induce a higher degree of MIE gene splicing/expression. Consistently, HCMV can produce more viral proteins and viral particles in PTB knockdown cells after infection. We conclude that PTB inhibits HCMV replication by interfering with MIE gene splicing through competition with U2AF for binding to the polypyrimidine tract in MIE gene introns. PMID:19144709
An intronic open reading frame was released from one of group II introns in the mitochondrial genome of the haptophyte Chrysochromulina sp. NIES-1333

PubMed Central

Nishimura, Yuki; Kamikawa, Ryoma; Hashimoto, Tetsuo; Inagaki, Yuji

2014-01-01

Mitochondrial (mt) genome sequences, which often bear introns, have been sampled from phylogenetically diverse eukaryotes. Thus, we can anticipate novel insights into intron evolution from previously unstudied mt genomes. We here investigated the origins and evolution of three introns in the mt genome of the haptophyte Chrysochromulina sp. NIES-1333, which was sequenced completely in this study. All the three introns were characterized as group II, on the basis of predicted secondary structure, and the conserved sequence motifs at the 5′ and 3′ termini. Our comparative studies on diverse mt genomes prompt us to propose that the Chrysochromulina mt genome laterally acquired the introns from mt genomes in distantly related eukaryotes. Many group II introns harbor intronic open reading frames for the proteins (intron-encoded proteins or IEPs), which likely facilitate the splicing of their host introns. However, we propose that a “free-standing,” IEP-like protein, which is not encoded within any introns in the Chrysochromulina mt genome, is involved in the splicing of the first cox1 intron that lacks any open reading frames. PMID:25054084
Three distinct modes of intron dynamics in the evolution of eukaryotes.

PubMed

Carmel, Liran; Wolf, Yuri I; Rogozin, Igor B; Koonin, Eugene V

2007-07-01

Several contrasting scenarios have been proposed for the origin and evolution of spliceosomal introns, a hallmark of eukaryotic genes. A comprehensive probabilistic model to obtain a definitive reconstruction of intron evolution was developed and applied to 391 sets of conserved genes from 19 eukaryotic species. It is inferred that a relatively high intron density was reached early, i.e., the last common ancestor of eukaryotes contained >2.15 introns/kilobase, and the last common ancestor of multicellular life forms harbored approximately 3.4 introns/kilobase, a greater intron density than in most of the extant fungi and in some animals. The rates of intron gain and intron loss appear to have been dropping during the last approximately 1.3 billion years, with the decline in the gain rate being much steeper. Eukaryotic lineages exhibit three distinct modes of evolution of the intron-exon structure. The primary, balanced mode, apparently, operates in all lineages. In this mode, intron gain and loss are strongly and positively correlated, in contrast to previous reports on inverse correlation between these processes. The second mode involves an elevated rate of intron loss and is prevalent in several lineages, such as fungi and insects. The third mode, characterized by elevated rate of intron gain, is seen only in deep branches of the tree, indicating that bursts of intron invasion occurred at key points in eukaryotic evolution, such as the origin of animals. Intron dynamics could depend on multiple mechanisms, and in the balanced mode, gain and loss of introns might share common mechanistic features.
A Trio of Human Molecular Genetics PCR Assays

ERIC Educational Resources Information Center

Reinking, Jeffrey L.; Waldo, Jennifer T.; Dinsmore, Jannett

2013-01-01

This laboratory exercise demonstrates three different analytical forms of the polymerase chain reaction (PCR) that allow students to genotype themselves at four different loci. Here, we present protocols to allow students to a) genotype a non-coding polymorphic Variable Number of Tandem Repeat (VNTR) locus on human chromosome 5 using conventional…
Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term

PubMed Central

Romero, Roberto; Tarca, Adi; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S.; Kalita, Cynthia A.; Cai, Juan; Yeo, Lami; Lipovich, Leonard

2014-01-01

Objective The mechanisms responsible for normal and abnormal parturition are poorly understood. Myometrial activation leading to regular uterine contractions is a key component of labor. Dysfunctional labor (arrest of dilatation and/or descent) is a leading indication for cesarean delivery. Compelling evidence suggests that most of these disorders are functional in nature, and not the result of cephalopelvic disproportion. The methodology and the datasets afforded by the post-genomic era provide novel opportunities to understand and target gene functions in these disorders. In 2012, the ENCODE Consortium elucidated the extraordinary abundance and functional complexity of long non-coding RNA genes in the human genome. The purpose of the study was to identify differentially expressed long non-coding RNA genes in human myometrium in women in spontaneous labor at term. Materials and Methods Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n=19) and women in spontaneous labor at term (n=20). RNA was extracted and profiled using an Illumina® microarray platform. The analysis of the protein coding genes from this study has been previously reported. Here, we have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. Results Upon considering more than 18,498 distinct lncRNA genes compiled nonredundantly from public experimental data sources, and interrogating 2,634 that matched Illumina microarray probes, we identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an independent experimental method. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site that lacked evolutionary conservation beyond primates. Conclusions We provide for the first time evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known, as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term. PMID:24168098
Characterization of the apolipoprotein AI and CIII genes in the domestic pig

DOE Office of Scientific and Technical Information (OSTI.GOV)

Birchbauer, A.; Knipping, G.; Juritsch, B.

1993-03-01

The apolipoproteins (apo) AI and CIII are important constituents of triglyceride-rich lipoproteins and high-density lipoproteins. In humans, apo AI is believed to play an important protective role in the pathogenesis of arteriosclerosis, whereas apo CIII might be involved in the development of hypertriglyceridemia. Both human genes are located within a gene cluster on chromosome 11. Although the domestic pig has been widely used as an animal model in arteriosclerosis and lipid research, the porcine apolipoprotein genes are poorly characterized. In this report, the complete nucleotide sequences of the porcine apo AI and CIII genes are presented and the authors demonstrate, for the first time, apo CIII expression in the pig. Both genes are composed of four exons and three introns and closely resemble their human counterparts with regard to the transcriptional start sites, exon sizes, intron sizes, exon-intron borders, and the size of the intergenic region. The predicted pig apo AI is a protein of 241 amino acids, which is 2 amino acids shorter than human apo AI. The protein sequence was found to be very homologous to apo AI sequences in other mammalian species. Apo AI expression was detected on the mRNA level in porcine liver and intestine. The apo CIII gene encodes a protein with 73 amino acids, which is 6 amino acids shorter than human apo CIII. In contrast to the three isoforms of apo CIII found in humans, only one major isoform was detected in the pig. Presumably this isoform is unglycosylated. In addition to apo CIII expression in the liver and the intestine, a truncated form of apo CIII mRNA was also found in porcine kidney. The studies demonstrate the presence of an apo CIII gene, an apo CIII mRNA, and an apo CIII protein in the pig and, therefore, exclude a hypothesized apo CIII deficiency in these animals. 53 refs., 5 figs.
Physical structure and chromosomal localization of a gene encoding human p58^clk-1, a cell division control related protein kinase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Eipers, P.G.

1992-01-01

The gene for the human p58^clk-1 protein kinase, a cell division control-related gene, has been mapped by somatic cell hybrid analyses, in situ localization with the chromosomal gene, and nested polymerase chain reaction amplification of microdissected chromosomes. These studies indicate that the expressed p58^clk-1 chromosomal gene maps to 1p36, while a highly related p58^clk-1 sequence of unknown nature maps to chromosome 15. Assignment of a p34^cdc2-related gene to 1p36 region, including neuroblastoma, ductal carcinoma of the breast, malignant melanoma, Merkel cell carcinoma and endocrine neoplasia among others. Aberrant expression of this protein kinase negatively regulates normal cellular growth. The p58^clk-1 protein contains a central domain of 299 amino acids that is 46% identical to human p34^cdc2, the master mitotic protein kinase. This dissertation details the complete structure of the p58^clk-1 chromosomal gene, including its putative promoter region, transcriptional start sites, exonic sequences, and intron/exon boundary sequences. The gene is 10 kb in size and contains 12 exons and 11 introns. Interestingly, the rather large 2.0 kb 3′ untranslated region is interrupted by an intron that separates a region containing numerous AUUUA destabilization motifs from the coding region. Furthermore, the expression of this gene in normal human tissues, as well as several human tumor cell samples and lines, is examined. The origin of multiple human transcripts from the same chromosomal gene, and the possible differential stability of these various transcripts, is discussed with regard to the transcriptional and post-transcriptional regulation of this gene. This is the first report of the chromosomal gene structure of a member of the p34^cdc2 supergene family.
Intron open reading frames as mobile elements and evolution of a group I intron.

PubMed

Sellem, C H; Belcour, L

1997-05-01

Group I introns are proposed to have become mobile following the acquisition of open reading frames (ORFs) that encode highly specific DNA endonucleases. This proposal implies that intron ORFs could behave as autonomously mobile entities. This was supported by abundant circumstantial evidence but no experiment of ORF transfer from an ORF-containing intron to its ORF-less counterpart has been described. In this paper we present such experiments, which demonstrate the efficient mobility of the mitochondrial nad1-i4-orf1 between two Podospora strains. The homing of this mobile ORF was accompanied by a bidirectional co-conversion that did not systematically involve the whole intron sequence. Orf1 acquisition would be the most recent step in the evolution of the nad1-i4 intron, which has resulted in many strains of Podospora having an intron with two ORFs (biorfic) and four splicing pathways. We show that two of the splicing events that operate in this biorfic intron, as evidenced by PCR experiments, are generated by a 5'-alternative splice site, which is most probably a remnant of the monoorfic ancestral form of the intron. We propose a sequential evolution model that is consistent with the four organizations of the corresponding nad1 locus that we found among various species of the Pyrenomycete family; these organizations consist of no intron, an intron alone, a monoorfic intron, and a biorfic intron.
Tissue-specific alternative splicing of TCF7L2

PubMed Central

Prokunina-Olsson, Ludmila; Welch, Cullan; Hansson, Ola; Adhikari, Neeta; Scott, Laura J.; Usher, Nicolle; Tong, Maurine; Sprau, Andrew; Swift, Amy; Bonnycastle, Lori L.; Erdos, Michael R.; He, Zhi; Saxena, Richa; Harmon, Brennan; Kotova, Olga; Hoffman, Eric P.; Altshuler, David; Groop, Leif; Boehnke, Michael; Collins, Francis S.; Hall, Jennifer L.

2009-01-01

Common variants in the transcription factor 7-like 2 (TCF7L2) gene have been identified as the strongest genetic risk factors for type 2 diabetes (T2D). However, the mechanisms by which these non-coding variants increase risk for T2D are not well-established. We used 13 expression assays to survey mRNA expression of multiple TCF7L2 splicing forms in up to 380 samples from eight types of human tissue (pancreas, pancreatic islets, colon, liver, monocytes, skeletal muscle, subcutaneous adipose tissue and lymphoblastoid cell lines) and observed a tissue-specific pattern of alternative splicing. We tested whether the expression of TCF7L2 splicing forms was associated with single nucleotide polymorphisms (SNPs), rs7903146 and rs12255372, located within introns 3 and 4 of the gene and most strongly associated with T2D. Expression of two splicing forms was lower in pancreatic islets with increasing counts of T2D-associated alleles of the SNPs: a ubiquitous splicing form (P = 0.018 for rs7903146 and P = 0.020 for rs12255372) and a splicing form found in pancreatic islets, pancreas and colon but not in other tissues tested here (P = 0.009 for rs12255372 and P = 0.053 for rs7903146). Expression of this form in glucose-stimulated pancreatic islets correlated with expression of proinsulin (r2 = 0.84–0.90, P < 0.00063). In summary, we identified a tissue-specific pattern of alternative splicing of TCF7L2. After adjustment for multiple tests, no association between expression of TCF7L2 in eight types of human tissue samples and T2D-associated genetic variants remained significant. Alternative splicing of TCF7L2 in pancreatic islets warrants future studies. GenBank Accession Numbers: FJ010164–FJ010174. PMID:19602480
Identification of an Intronic Splicing Enhancer Essential for the Inclusion of FGFR2 Exon IIIc

PubMed Central

Seth, Puneet; Miller, Heather B.; Lasda, Erika L.; Pearson, James L.; Garcia-Blanco, Mariano A.

2008-01-01

The ligand specificity of fibroblast growth factor receptor 2 (FGFR2) is determined by the alternative splicing of exons 8 (IIIb) or 9 (IIIc). Exon IIIb is included in epithelial cells, whereas exon IIIc is included in mesenchymal cells. Although a number of cis elements and trans factors have been identified that play a role in exon IIIb inclusion in epithelium, little is known about the activation of exon IIIc in mesenchyme. We report here the identification of a splicing enhancer required for IIIc inclusion. This 24-nucleotide (nt) downstream intronic splicing enhancer (DISE) is located within intron 9 immediately downstream of exon IIIc. DISE was able to activate the inclusion of heterologous exons rat FGFR2 IIIb and human β-globin exon 2 in cell lines from different tissues and species and also in HeLa cell nuclear extracts in vitro. DISE was capable of replacing the intronic activator sequence 1 (IAS1), a known IIIb splicing enhancer and vice versa. This fact, together with the requirement for DISE to be close to the 5′-splice site and the ability of DISE to promote binding of U1 snRNP, suggested that IAS1 and DISE belong to the same class of cis-acting elements. PMID:18256031
In vitro mapping of Myotonic Dystrophy (DM) gene promoter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Storbeck, C.J.; Sabourin, L.; Baird, S.

1994-09-01

The Myotonic Dystrophy Kinase (DMK) gene has been cloned and shares homology with serine/threonine protein kinases. Overexpression of this gene in stably transfected mouse myoblasts has been shown to inhibit fusion into myotubes, while myoblasts stably transfected with an antisense construct show increased fusion potential. These experiments, along with data showing that the DM gene is highly expressed in muscle, have highlighted the possibility of DMK being involved in myogenesis. The promoter region of the DM gene lacks a consensus TATA box and CAAT box, but harbours numerous transcription factor binding sites. Clones containing extended 5′ upstream sequences (UPS) of DMK only weakly drive the reporter gene chloramphenicol acetyl transferase (CAT) when transfected into C2C12 mouse myoblasts. However, four E-boxes are present in the first intron of the DM gene, and transient assays show increased expression of the CAT gene when the first intron is present downstream of these 5′ UPS in an orientation-dependent manner. Comparison between mouse and human sequence reveals that the regions in the first intron where the E-boxes are located are highly conserved. The mapping of the promoter and the importance of the first intron in the control of DMK expression will be presented.
Identification and functional analysis of long non-coding RNAs in human and mouse early embryos based on single-cell transcriptome data

PubMed Central

Qiu, Jia-jun; Ren, Zhao-rui; Yan, Jing-bin

2016-01-01

Epigenetic regulation has an important role in fertilization and proper embryonic development, and several human diseases are associated with epigenetic modification disorders, such as Rett syndrome, Beckwith-Wiedemann syndrome and Angelman syndrome. However, the dynamics and functions of long non-coding RNAs (lncRNAs), one type of epigenetic regulator, in human pre-implantation development have not yet been demonstrated. In this study, a comprehensive analysis of human and mouse early-stage embryonic lncRNAs was performed based on public single-cell RNA sequencing data. Expression profile analysis revealed that lncRNAs are expressed in a developmental stage–specific manner during human early-stage embryonic development, whereas a more temporal-specific expression pattern was identified in mouse embryos. Weighted gene co-expression network analysis suggested that lncRNAs involved in human early-stage embryonic development are associated with several important functions and processes, such as oocyte maturation, zygotic genome activation and mitochondrial functions. We also found that the network of lncRNAs involved in zygotic genome activation was highly preserved between human and mouse embryos, whereas in other stages no strong correlation between human and mouse embryos was observed. This study provides insight into the molecular mechanism underlying lncRNA involvement in human pre-implantation embryonic development. PMID:27542205
Deregulation of RB1 expression by loss of imprinting in human hepatocellular carcinoma.

PubMed

Anwar, Sumadi Lukman; Krech, Till; Hasemeier, Britta; Schipper, Elisa; Schweitzer, Nora; Vogel, Arndt; Kreipe, Hans; Lehmann, Ulrich

2014-08-01

The tumour suppressor gene RB1 is frequently silenced in many different types of human cancer, including hepatocellular carcinoma (HCC). However, mutations of the RB1 gene are relatively rare in HCC. A systematic screen for the identification of imprinted genes deregulated in human HCC revealed that RB1 shows imprint abnormalities in a high proportion of primary patient samples. Altogether, 40% of the HCC specimens (16/40) showed hyper- or hypomethylation at the CpG island in intron 2 of the RB1 gene. Re-analysis of publicly available genome-wide DNA methylation data confirmed these findings in two independent HCC cohorts. Loss of correct DNA methylation patterns at the RB1 locus leads to the aberrant expression of an alternative RB1-E2B transcript, as measured by quantitative real-time PCR. Demethylation at the intron 2 CpG island by DNMT1 knock-down or aza-deoxycytidine (DAC) treatment stimulated expression of the RB1-E2B transcript, accompanied by diminished RB1 main transcript expression. No aberrant DNA methylation was found at the RB1 locus in hepatocellular adenoma (HCA, n = 10), focal nodular hyperplasia (FNH, n = 5) and their corresponding adjacent liver tissue specimens. Deregulated RB1 expression due to hyper- or hypomethylation in intron 2 of the RB1 gene is found in tumours without loss of heterozygosity and is associated with a decrease in overall survival (p = 0.032) if caused by hypermethylation of CpG85. This unequivocally demonstrates that loss of imprinting represents an important additional mechanism for RB1 pathway inactivation in human HCC, complementing well-described molecular defects. Copyright © 2014 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
Mechanisms Used for Genomic Proliferation by Thermophilic Group II Introns

PubMed Central

Mohr, Georg; Ghanem, Eman; Lambowitz, Alan M.

2010-01-01

Mobile group II introns, which are found in bacterial and organellar genomes, are site-specific retroelements hypothesized to be evolutionary ancestors of spliceosomal introns and retrotransposons in higher organisms. Most bacteria, however, contain no more than one or a few group II introns, making it unclear how introns could have proliferated to higher copy numbers in eukaryotic genomes. An exception is the thermophilic cyanobacterium Thermosynechococcus elongatus, which contains 28 closely related copies of a group II intron, constituting ∼1.3% of the genome. Here, by using a combination of bioinformatics and mobility assays at different temperatures, we identified mechanisms that contribute to the proliferation of T. elongatus group II introns. These mechanisms include divergence of DNA target specificity to avoid target site saturation; adaptation of some intron-encoded reverse transcriptases to splice and mobilize multiple degenerate introns that do not encode reverse transcriptases, leading to a common splicing apparatus; and preferential insertion within other mobile introns or insertion elements, which provide new unoccupied sites in expanding non-essential DNA regions. Additionally, unlike mesophilic group II introns, the thermophilic T. elongatus introns rely on elevated temperatures to help promote DNA strand separation, enabling access to a larger number of DNA target sites by base pairing of the intron RNA, with minimal constraint from the reverse transcriptase. Our results provide insight into group II intron proliferation mechanisms and show that higher temperatures, which are thought to have prevailed on Earth during the emergence of eukaryotes, favor intron proliferation by increasing the accessibility of DNA target sites. We also identify actively mobile thermophilic introns, which may be useful for structural studies, gene targeting in thermophiles, and as a source of thermostable reverse transcriptases. PMID:20543989
Evolution of group I introns in Porifera: new evidence for intron mobility and implications for DNA barcoding.

PubMed

Schuster, Astrid; Lopez, Jose V; Becking, Leontine E; Kelly, Michelle; Pomponi, Shirley A; Wörheide, Gert; Erpenbeck, Dirk; Cárdenas, Paco

2017-03-20

Mitochondrial introns interrupt coding regions of genes and feature characteristic secondary structures and splicing mechanisms. In metazoans, mitochondrial introns have only been detected in sponges, cnidarians, placozoans and one annelid species. Within demosponges, group I and group II introns are present in six families. Based on different insertion sites within the cox1 gene and secondary structures, four types of group I and two types of group II introns are known, which can harbor up to three encoding homing endonuclease genes (HEG) of the LAGLIDADG family (group I) and/or reverse transcriptase (group II). However, little is known about sponge intron mobility, transmission, and origin due to the lack of a comprehensive dataset. We analyzed the largest dataset on sponge mitochondrial group I introns to date: 95 specimens from 11 different sponge genera, which provided novel insights into the evolution of group I introns. For the first time, group I introns were detected in four genera of the sponge family Scleritodermidae (Scleritoderma, Microscleroderma, Aciculites, Setidium). We demonstrated that group I introns in sponges aggregate in the most conserved regions of cox1. We showed that co-occurrence of two introns in cox1 is unique among metazoans, but not uncommon in sponges. However, this combination always associates an active intron with a degenerating one. Earlier hypotheses of HGT were confirmed, and for the first time VGT and secondary losses of introns were conclusively demonstrated. This study validates the subclass Spirophorina (Tetractinellida) as an intron hotspot in sponges. Our analyses confirm that most sponge group I introns probably originated from fungi. DNA barcoding is discussed and the application of alternative primers suggested.

Intron self-complementarity enforces exon inclusion in a yeast pre-mRNA

PubMed Central

Howe, Kenneth James; Ares, Manuel

1997-01-01

Skipping of internal exons during removal of introns from pre-mRNA must be avoided for proper expression of most eukaryotic genes. Despite significant understanding of the mechanics of intron removal, mechanisms that ensure inclusion of internal exons in multi-intron pre-mRNAs remain mysterious. Using a natural two-intron yeast gene, we have identified distinct RNA–RNA complementarities within each intron that prevent exon skipping and ensure inclusion of internal exons. We show that these complementarities are positioned to act as intron identity elements, bringing together only the appropriate 5′ splice sites and branchpoints. Destroying either intron self-complementarity allows exon skipping to occur, and restoring the complementarity using compensatory mutations rescues exon inclusion, indicating that the elements act through formation of RNA secondary structure. Introducing new pairing potential between regions near the 5′ splice site of intron 1 and the branchpoint of intron 2 dramatically enhances exon skipping. Similar elements identified in single intron yeast genes contribute to splicing efficiency. Our results illustrate how intron secondary structure serves to coordinate splice site pairing and enforce exon inclusion. We suggest that similar elements in vertebrate genes could assist in the splicing of very large introns and in the evolution of alternative splicing. PMID:9356473
RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease.

PubMed

Xiong, Hui Y; Alipanahi, Babak; Lee, Leo J; Bretschneider, Hannes; Merico, Daniele; Yuen, Ryan K C; Hua, Yimin; Gueroussov, Serge; Najafabadi, Hamed S; Hughes, Timothy R; Morris, Quaid; Barash, Yoseph; Krainer, Adrian R; Jojic, Nebojsa; Scherer, Stephen W; Blencowe, Benjamin J; Frey, Brendan J

2015-01-09

To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine. Copyright © 2015, American Association for the Advancement of Science.
Phylogenetics and Gene Structure Dynamics of Polygalacturonase Genes in Aspergillus and Neurospora crassa

PubMed Central

Hong, Jin-Sung; Ryu, Ki-Hyun; Kwon, Soon-Jae; Kim, Jin-Won; Kim, Kwang-Soo; Park, Kyong-Cheul

2013-01-01

Polygalacturonase (PG) genes form a typical gene family present in eukaryotes. Forty-nine PGs were mined from the genomes of Neurospora crassa and five Aspergillus species. The PGs were classified into 3 clades (clade 1 for rhamno-PGs, clade 2 for exo-PGs and clade 3 for exo- and endo-PGs), which were further grouped into 13 sub-clades based on polypeptide sequence similarity. In gene structure analysis, a total of 124 introns were present in 44 genes and five genes lacked introns, giving an average of 2.5 introns per gene. Intron phase distribution was 64.5% for phase 0, 21.8% for phase 1, and 13.7% for phase 2. The introns varied in their sequences and their lengths ranged from 20 bp to 424 bp with an average of 65.9 bp, which is approximately half the size of introns in other fungal genes. There were 29 homologous intron blocks and 26 of those were sub-clade specific. Intron losses were counted in 18 introns, for which no obvious phase preference for intron loss was observed. Eighteen introns were placed at novel positions, which is considerably higher than in plant PGs. In an evolutionary sense, both intron loss and gain must have taken place in shaping the current PGs in these fungi. Together with the small intron size, low conservation of homologous intron blocks and higher number of novel introns, PGs of fungal species seem to have recently undergone highly dynamic evolution. PMID:25288950
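The summary figures in the record above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the gene and intron totals come from the abstract, but the per-phase intron counts (80, 27, 17) are an assumption back-calculated from the reported percentages, and `intron_phase` is a hypothetical helper that applies the standard definition of intron phase (position of the intron within a codon).

```python
def intron_phase(cds_offset: int) -> int:
    """Phase of an intron that follows `cds_offset` coding nucleotides.

    Phase 0: intron lies between two codons.
    Phase 1: intron follows the first nucleotide of a codon.
    Phase 2: intron follows the second nucleotide of a codon.
    """
    return cds_offset % 3


# Totals taken from the abstract.
total_introns = 124
genes_with_introns = 44
intronless_genes = 5
total_genes = genes_with_introns + intronless_genes

mean_introns_per_gene = total_introns / total_genes

# ASSUMPTION: per-phase counts are not given in the abstract; these values
# were back-calculated from the reported percentages.
phase_counts = {0: 80, 1: 27, 2: 17}
phase_percent = {
    phase: round(100 * count / total_introns, 1)
    for phase, count in phase_counts.items()
}

if __name__ == "__main__":
    print(f"genes: {total_genes}")
    print(f"mean introns per gene: {mean_introns_per_gene:.1f}")
    print(f"phase distribution (%): {phase_percent}")
```

The average is taken over all 49 genes, including the five intronless ones, which is how the abstract arrives at 2.5 rather than 124/44.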
RPG: the Ribosomal Protein Gene database

PubMed Central

Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya

2004-01-01

RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes. PMID:14681386
Phylogenetic Distribution of Intron Positions in Alpha-Amylase Genes of Bilateria Suggests Numerous Gains and Losses

PubMed Central

Da Lage, Jean-Luc; Maczkowiak, Frédérique; Cariou, Marie-Louise

2011-01-01

Most eukaryotes have at least some genes interrupted by introns. While it is well accepted that introns were already present at moderate density in the last eukaryote common ancestor, the conspicuous diversity of intron density among genomes suggests a complex evolutionary history, with marked differences between phyla. The question of the rates of intron gains and loss in the course of evolution and factors influencing them remains controversial. We have investigated a single gene family, alpha-amylase, in 55 species covering a variety of animal phyla. Comparison of intron positions across phyla suggests a complex history, with a likely ancestral intronless gene undergoing frequent intron loss and gain, leading to extant intron/exon structures that are highly variable, even among species from the same phylum. Because introns are known to play no regulatory role in this gene and there is no alternative splicing, the structural differences may be interpreted more easily: intron positions, sizes, losses or gains may be more likely related to factors linked to splicing mechanisms and requirements, and to recognition of introns and exons, or to more extrinsic factors, such as life cycle and population size. We have shown that intron losses outnumbered gains in recent periods, but that “resets” of intron positions occurred at the origin of several phyla, including vertebrates. Rates of gain and loss appear to be positively correlated. No phase preference was found. We also found evidence for parallel gains and for intron sliding. Presence of introns at given positions was correlated to a strong protosplice consensus sequence AG/G, which was much weaker in the absence of intron. In contrast, recent intron insertions were not associated with a specific sequence. In animal Amy genes, population size and generation time seem to have played only minor roles in shaping gene structures. PMID:21611157
De novo insertion of an intron into the mammalian sex determining gene, SRY

PubMed Central

O’Neill, Rachel J. Waugh; Brennan, Francine E.; Delbridge, Margaret L.; Crozier, Ross H.; Graves, Jennifer A. Marshall

1998-01-01

Two theories have been proposed to explain the evolution of introns within eukaryotic genes. The introns early theory, or “exon theory of genes,” proposes that introns are ancient and that recombination within introns provided new exon structure, and thus new genes. The introns late theory, or “insertional theory of introns,” proposes that ancient genes existed as uninterrupted exons and that introns have been introduced during the course of evolution. There is still controversy as to how intron–exon structure evolved and whether the majority of introns are ancient or novel. Although there is extensive evidence in support of the introns early theory, phylogenetic comparisons of several genes indicate recent gain and loss of introns within these genes. However, no example has been shown of a protein coding gene, intronless in its ancestral form, which has acquired an intron in a derived form. The mammalian sex determining gene, SRY, is intronless in all mammals studied to date, as is the gene from which it recently evolved. However, we report here comparisons of genomic and cDNA sequences that now provide evidence of a de novo insertion of an intron into the SRY gene of dasyurid marsupials. This recently (approximately 45 million years ago) inserted sequence is not homologous with known transposable elements. Our data demonstrate that introns may be inserted as spliced units within a developmentally crucial gene without disrupting its function. PMID:9465071
Influence of intron length on interaction characters between post-spliced intron and its CDS in ribosomal protein genes

NASA Astrophysics Data System (ADS)

Zhao, Xiaoqing; Li, Hong; Bao, Tonglaga; Ying, Zhiqiang

2012-09-01

Much experimental evidence has shown that the sequence structures of introns and intron loss/gain can influence gene expression, but current mechanisms do not refer directly to the functions of post-spliced introns. We propose that post-spliced introns play their functions in gene expression by interacting with their mRNA sequences, and that the interaction is characterized by the matched segments between introns and their CDS. In this study, we investigated the interaction characters with length series using improved Smith-Waterman local alignment software for the ribosomal protein genes in C. elegans and D. melanogaster. Our results showed that RF values of five intron groups are significantly high in the central non-conserved region and very low in the 5'-end and 3'-end splicing regions. Interestingly, the number of optimal matched regions gradually increases with intron length. Distributions of the optimal matched regions differ among the five intron groups. Our study revealed that there are more interaction regions between longer introns and their CDS than between shorter introns and their CDS, and it provides a positive pattern for regulating gene expression.
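For readers unfamiliar with the alignment method named in the record above, the following is a minimal sketch of the textbook Smith-Waterman local alignment score. It is not the authors' "improved" software, whose modifications the abstract does not describe; the function name and the scoring parameters (match +2, mismatch -1, linear gap -1) are assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -1) -> int:
    """Return the best local alignment score between sequences a and b.

    Textbook Smith-Waterman with a linear gap penalty. Only two rows of the
    dynamic-programming matrix are kept, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]  # column 0 is always 0 for local alignment
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    intron = "TTACGTTT"
    cds = "GGACGGG"
    print(smith_waterman_score(intron, cds))
```

Flooring each cell at zero is what makes the alignment local: a poorly matching prefix cannot drag down the score of a well-matched segment further along, which is the property needed when searching for short matched segments between an intron and its CDS.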
SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.

2002-01-01

Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs, is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse and human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.
An intronic mutation c.6430-3C>G in the F8 gene causes splicing efficiency and premature termination in hemophilia A.

PubMed

Xia, Zunjing; Lin, Jie; Lu, Lingping; Kim, Chol; Yu, Ping; Qi, Ming

2018-06-01

Hemophilia A is a bleeding disorder caused by coagulation factor VIII protein deficiency or dysfunction, which is classified as severe, moderate, or mild according to factor clotting activity. An overwhelming majority of missense and nonsense mutations occur in exons of the F8 gene, whereas mutations in introns can also be pathogenic. This study aimed to investigate the effect of an intronic mutation, c.6430-3C>G (IVS22-3C>G), on pre-mRNA splicing of the F8 gene. We applied DNA and cDNA sequencing in a Chinese boy with hemophilia A to search for pathogenic mutations in the F8 gene. Functional analysis was performed to investigate the effect of an intronic mutation at the transcriptional level. Human Splicing Finder and PyMol were also used to predict its effect. We found the mutation c.6430-3C>G (IVS22-3C>G) in the F8 gene in the affected boy, with his mother being a carrier. cDNA from the mother and a pSPL3 splicing assay showed that the mutation IVS22-3C>G results in a two-nucleotide AG inclusion at the 3' end of intron 22 and leads to a truncated coagulation factor VIII protein, with partial loss of the C1 domain and complete loss of the C2 domain. The in-silico tool predicted that the mutation induces altered pre-mRNA splicing by using a cryptic acceptor site in intron 22. The IVS22-3C>G mutation was confirmed to affect pre-mRNA splicing and produce a truncated protein, which reduces the stability of binding between the F8 protein and von Willebrand factor carrier protein due to the loss of an interaction domain.
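The variant name in the record above follows HGVS coding-DNA notation, in which c.6430-3 denotes the intronic position three nucleotides upstream of coding nucleotide 6430 (the first nucleotide of the following exon). As a rough illustration of how such a name decomposes, the sketch below parses simple intronic substitutions. The function name, the returned field names and the regular expression are assumptions for this example; it handles only the `c.<pos><+/-offset><ref>><alt>` form and is not a general HGVS parser.

```python
import re

# Matches simple intronic substitutions such as "c.6430-3C>G" or "c.88+5G>A".
_INTRONIC_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-]\d+)([ACGT])>([ACGT])$")


def parse_intronic_substitution(hgvs: str) -> dict:
    """Split a simple intronic HGVS c. substitution into its parts.

    Returns the anchoring exonic nucleotide, the signed intronic offset,
    the reference and alternate bases, and which splice site is nearer.
    """
    m = _INTRONIC_SUBSTITUTION.match(hgvs)
    if m is None:
        raise ValueError(f"not a simple intronic substitution: {hgvs!r}")
    anchor, offset, ref, alt = m.groups()
    offset = int(offset)
    return {
        "anchor": int(anchor),  # nearest coding nucleotide
        "offset": offset,       # negative: upstream of the anchor
        "ref": ref,
        "alt": alt,
        # A negative offset is counted back from the start of the next exon,
        # so the variant sits near the acceptor (3') splice site of the intron.
        "splice_site": "acceptor" if offset < 0 else "donor",
    }


if __name__ == "__main__":
    print(parse_intronic_substitution("c.6430-3C>G"))
```

For the F8 variant this yields an offset of -3 on the acceptor side, consistent with the abstract's description of altered acceptor-site usage at the 3' end of intron 22.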
Intron-loss evolution of hatching enzyme genes in Teleostei

PubMed Central

2010-01-01

Background Hatching enzyme, belonging to the astacin metallo-protease family, digests the egg envelope at embryo hatching. Orthologous genes of the enzyme are found in all vertebrate genomes. Recently, we found that exon-intron structures of the genes were conserved among tetrapods, while the genes of teleosts frequently lost their introns. Occurrence of such intron losses in teleostean hatching enzyme genes is an uncommon evolutionary event, as most eukaryotic genes are generally known to be interrupted by introns and the intron insertion sites are conserved from species to species. Here, we report on extensive studies of the exon-intron structures of teleostean hatching enzyme genes for insight into how and why introns were lost during evolution. Results We investigated the evolutionary pathway of intron losses in hatching enzyme genes of 27 species of Teleostei. Hatching enzyme genes of basal teleosts are of only one type, which conserves the 9-exon-8-intron structure of an assumed ancestor. On the other hand, otocephalans and euteleosts possess two types of hatching enzyme genes, suggesting a gene duplication event in the common ancestor of otocephalans and euteleosts. The duplicated genes were classified into two clades, clades I and II, based on phylogenetic analysis. In otocephalans and euteleosts, clade I genes developed a phylogeny-specific structure, such as an 8-exon-7-intron, 5-exon-4-intron, 4-exon-3-intron or intron-less structure. In contrast to the clade I genes, the structures of clade II genes were relatively stable in their configuration, and were similar to that of the ancestral genes. Expression analyses revealed that hatching enzyme genes were high-expression genes when compared with housekeeping genes. When expression levels were compared between clade I and II genes, clade I genes tended to be expressed more highly than clade II genes.
Conclusions Hatching enzyme genes evolved to lose their introns, and the intron-loss events occurred at the specific points of teleostean phylogeny. We propose that the high-expression hatching enzyme genes frequently lost their introns during the evolution of teleosts, while the low-expression genes maintained the exon-intron structure of the ancestral gene. PMID:20796321
Introns in Cryptococcus.

PubMed

Janbon, Guilhem

2018-01-01

In Cryptococcus neoformans, nearly all genes are interrupted by small introns. In recent years, genome annotation and genetic analysis have illuminated the major roles these introns play in the biology of this pathogenic yeast. Introns are necessary for gene expression and alternative splicing can regulate gene expression in response to environmental cues. In addition, recent studies have revealed that C. neoformans introns help to prevent transposon dissemination and protect genome integrity. These characteristics of cryptococcal introns are probably not unique to Cryptococcus, and this yeast likely can be considered as a model for intron-related studies in fungi.
DNA double-strand break in vivo at the 3' extremity of exons located upstream of group II introns. Senescence and circular DNA introns in Podospora mitochondria.

PubMed

Sainsard-Chanet, A; Begel, O; Belcour, L

1994-10-07

In the filamentous fungus Podospora anserina, the unavoidable phenomenon of senescence is associated with the amplification of the first intron of the mitochondrial cox1 that accumulates as circular DNA molecules consisting of tandem repeats. This group II intron (cox1-i1 or alpha) is able to transpose and contains an open reading frame with significant amino acid similarity with reverse transcriptases. The generation of these intronic circular DNA molecules, their amplification and their involvement in the senescence process are unresolved questions. We demonstrate here that: (1) another group II intron, the fourth intron of gene cox1, cox1-i4, is also able to give precise DNA end to end junctions; (2) this intronic sequence can be found amplified during senescence, although to a lesser extent than cox1-i1; (3) the amplification of the DNA multimeric cox1-i1 molecules likely does not proceed by autonomous replication; (4) the generation of the DNA intronic circles does not require efficient intron splicing; (5) a DNA double-strand break occurs in vivo at the 3' extremity of the cox1-e1 and cox1-e4 exons preceding the group II introns that form circular DNAs. On the whole, these results show that the ability to form DNA circular molecules is a property of some group II introns and they demonstrate the occurrence of a specific DNA cleavage at or near the integration site of these group II introns. The results strongly suggest that this cleavage is involved in the formation of the group II intronic DNA circles and could also be involved in the phenomenon of group II intron homing.
Detection of human microRNAs across miRNA Array and Next Generation DNA Sequencing Platforms

EPA Science Inventory

microRNA (miRNAs) are non-coding RNA molecules between 19 and 30 nucleotides in length that are believed to regulate approximately 30 per cent of all human genes. They act as negative regulators of their gene targets in many biological processes. Recent developments in microar...
Diversity in mRNA expression of the serine-type carboxypeptidase ocpG in Aspergillus oryzae through intron retention.

PubMed

Ishida, Ken; Kuboshima, Megumi; Morita, Hiroto; Maeda, Hiroshi; Okamoto, Ayako; Takeuchi, Michio; Yamagata, Youhei

2014-01-01

Alternative splicing is thought to be a means for diversification of products by mRNA modification. Although some intron retentions are predicted by transcriptome analysis in Aspergillus oryzae, their physiological significance remains unknown. We found that intron retention occurred occasionally in the serine-type carboxypeptidase gene, ocpG. Analysis under various culture conditions revealed that extracellular nitrogen conditions influence splicing patterns; this suggested that there might be a correlation between splicing efficiency and the necessity of OcpG activity for obtaining a nitrogen source. Since further analysis showed that splicing occurred independently in each intron, we constructed an ocpG intron-exchanged strain by interchanging the positions of intron-1 and intron-2. The splicing pattern indicated the probability that ocpG intron retention was affected by the secondary structures of intronic mRNA.
Long noncoding RNA linc00617 exhibits oncogenic activity in breast cancer.

PubMed

Li, Hengyu; Zhu, Li; Xu, Lu; Qin, Keyu; Liu, Chaoqian; Yu, Yue; Su, Dongwei; Wu, Kainan; Sheng, Yuan

2017-01-01

Protein-coding genes account for only 2% of the human genome, whereas the vast majority of transcripts are noncoding RNAs including long noncoding RNAs. LncRNAs are involved in the regulation of a diverse array of biological processes, including cancer progression. An evolutionarily conserved lncRNA, TUNA, was found to be required for pluripotency of mouse embryonic stem cells. In this study, we found the human ortholog of TUNA, linc00617, was upregulated in breast cancer samples. Linc00617 promoted motility and invasion of breast cancer cells and induced epithelial-mesenchymal-transition (EMT), which was accompanied by generation of stem cell properties. Moreover, knockdown of linc00617 repressed lung metastasis in vivo. We demonstrated that linc00617 upregulated the expression of stemness factor Sox2 in breast cancer cells, which was shown to promote the oncogenic activity of breast cancer cells by stimulating epithelial-to-mesenchymal transition and enhancing the tumor-initiating capacity. Thus, our data indicate that linc00617 functions as an important regulator of EMT and promotes breast cancer progression and metastasis via activating the transcription of Sox2. Together, these findings suggest that linc00617 may be a potential therapeutic target for aggressive breast cancer. © 2015 Wiley Periodicals, Inc.
The origins and evolutionary history of human non-coding RNA regulatory networks.

PubMed

Sherafatian, Masih; Mowla, Seyed Javad

2017-04-01

The evolutionary history and origin of the regulatory function of animal non-coding RNAs are not well understood. Lack of conservation of long non-coding RNAs and small sizes of microRNAs have been major obstacles in their phylogenetic analysis. In this study, we tried to shed more light on the evolution of ncRNA regulatory networks by changing our phylogenetic strategy to focus on the evolutionary pattern of their protein coding targets. We used available target databases of miRNAs and lncRNAs to find their protein coding targets in humans. We were able to recognize evolutionary hallmarks of ncRNA targets by phylostratigraphic analysis. We found the conventional 3'-UTR and lesser known 5'-UTR targets of miRNAs to be enriched at three consecutive phylostrata. Firstly, in eukaryata phylostratum corresponding to the emergence of miRNAs, our study revealed that miRNA targets function primarily in cell cycle processes. Moreover, the same overrepresentation of the targets observed in the next two consecutive phylostrata, opisthokonta and eumetazoa, corresponded to the expansion periods of miRNAs in animal evolution. Coding sequence targets of miRNAs showed a delayed rise at opisthokonta phylostratum, compared to the 3' and 5' UTR targets of miRNAs. LncRNA regulatory network was the latest to evolve at eumetazoa.
The interplay between noncoding RNAs and insulin in diabetes.

PubMed

Tian, Yan; Xu, Jia; Du, Xiao; Fu, Xianghui

2018-04-10

Noncoding RNAs (ncRNAs), including microRNAs, long noncoding RNAs and circular RNAs, regulate various biological processes and are involved in the initiation and progression of human diseases. Insulin, a predominant hormone secreted from pancreatic β cells, is an essential factor in regulation of systemic metabolism through multifunctional insulin signaling. Insulin production and action are tightly controlled. Dysregulations of insulin production and action can impair metabolic homeostasis, and eventually lead to the development of multiple metabolic diseases, especially diabetes. Accumulating data indicates that ncRNAs modulate β cell mass, insulin synthesis, secretion and signaling, and their role in diabetes is dramatically emerging. This review summarizes our current knowledge of ncRNAs as regulators of insulin, with particular emphasis on the implications of this interplay in the development of diabetes. We outline the role of ncRNAs in pancreatic β cell mass and function, which is critical for insulin production and secretion. We also highlight the involvement of ncRNAs in insulin signaling in peripheral tissues including liver, muscle and adipose, and discuss ncRNA-mediated inter-organ crosstalk under diabetic conditions. A more in-depth understanding of the interplay between ncRNAs and insulin may afford valuable insights and novel therapeutic strategies for treatment of diabetes, as well as other human diseases. Copyright © 2018 Elsevier B.V. All rights reserved.
GENCODE: the reference human genome annotation for The ENCODE Project.

PubMed

Harrow, Jennifer; Frankish, Adam; Gonzalez, Jose M; Tapanari, Electra; Diekhans, Mark; Kokocinski, Felix; Aken, Bronwen L; Barrell, Daniel; Zadissa, Amonida; Searle, Stephen; Barnes, If; Bignell, Alexandra; Boychenko, Veronika; Hunt, Toby; Kay, Mike; Mukherjee, Gaurab; Rajan, Jeena; Despacio-Reyes, Gloria; Saunders, Gary; Steward, Charles; Harte, Rachel; Lin, Michael; Howald, Cédric; Tanzer, Andrea; Derrien, Thomas; Chrast, Jacqueline; Walters, Nathalie; Balasubramanian, Suganthi; Pei, Baikang; Tress, Michael; Rodriguez, Jose Manuel; Ezkurdia, Iakes; van Baren, Jeltje; Brent, Michael; Haussler, David; Kellis, Manolis; Valencia, Alfonso; Reymond, Alexandre; Gerstein, Mark; Guigó, Roderic; Hubbard, Tim J

2012-09-01

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.
A Catalogue of Putative cis-Regulatory Interactions Between Long Non-coding RNAs and Proximal Coding Genes Based on Correlative Analysis Across Diverse Human Tumors.

PubMed

Basu, Swaraj; Larsson, Erik

2018-05-31

Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.

Cytoplasmic long noncoding RNAs are frequently bound to and degraded at ribosomes in human cells

PubMed Central

Carlevaro-Fita, Joana; Rahim, Anisa; Guigó, Roderic; Vardy, Leah A.; Johnson, Rory

2016-01-01

Recent footprinting studies have made the surprising observation that long noncoding RNAs (lncRNAs) physically interact with ribosomes. However, these findings remain controversial, and the overall proportion of cytoplasmic lncRNAs involved is unknown. Here we make a global, absolute estimate of the cytoplasmic and ribosome-associated population of stringently filtered lncRNAs in a human cell line using polysome profiling coupled to spike-in normalized microarray analysis. Fifty-four percent of expressed lncRNAs are detected in the cytoplasm. The majority of these (70%) have >50% of their cytoplasmic copies associated with polysomal fractions. These interactions are lost upon disruption of ribosomes by puromycin. Polysomal lncRNAs are distinguished by a number of 5′ mRNA-like features, including capping and 5′UTR length. On the other hand, nonpolysomal “free cytoplasmic” lncRNAs have more conserved promoters and a wider range of expression across cell types. Exons of polysomal lncRNAs are depleted of endogenous retroviral insertions, suggesting a role for repetitive elements in lncRNA localization. Finally, we show that blocking of ribosomal elongation results in stabilization of many associated lncRNAs. Together these findings suggest that the ribosome is the default destination for the majority of cytoplasmic long noncoding RNAs and may play a role in their degradation. PMID:27090285
Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs

PubMed Central

Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.

2013-01-01

Cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite the wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle against non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175
A mixed group II/group III twintron in the Euglena gracilis chloroplast ribosomal protein S3 gene: evidence for intron insertion during gene evolution.

PubMed Central

Copertino, D W; Christopher, D A; Hallick, R B

1991-01-01

The splicing of a 409 nucleotide intron from the Euglena gracilis chloroplast ribosomal protein S3 gene (rps3) was examined by cDNA cloning and sequencing, and northern hybridization. Based on the characterization of a partially spliced pre-mRNA, the intron was characterized as a 'mixed' twintron, composed of a 311 nucleotide group II intron internal to a 98 nucleotide group III intron. Twintron excision is via a 2-step sequential splicing pathway, with removal of the internal group II intron preceding excision of the external group III intron. Based on secondary structural analysis of the twintron, we propose that group III introns may represent highly degenerate versions of group II introns. The existence of twintrons is interpreted as evidence that group II introns were inserted during the evolution of Euglena chloroplast genes from a common ancestor with eubacteria, archaebacteria, cyanobacteria, and other chloroplasts. PMID:1721702
Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

PubMed

Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

2014-09-10

Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. Copyright © 2014 Elsevier B.V. All rights reserved.
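The definition of orthologous introns used in the record above (identical positions relative to the coding sequence in orthologous genes) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the study: the function `group_orthologous_introns`, the tuple layout, the species labels, and the assumption that intron positions have already been mapped onto a shared alignment coordinate for each ortholog group.

```python
from collections import defaultdict

def group_orthologous_introns(introns):
    """Group introns by (ortholog group, aligned CDS position, phase).

    `introns` is an iterable of (ortholog_group, species, aligned_pos, phase)
    tuples. This record layout is a hypothetical one chosen for illustration;
    positions are assumed to be expressed in shared alignment coordinates.
    """
    groups = defaultdict(set)
    for ortholog_group, species, aligned_pos, phase in introns:
        groups[(ortholog_group, aligned_pos, phase)].add(species)
    # Keep only intron positions shared by at least two species.
    return {key: sorted(sp) for key, sp in groups.items() if len(sp) >= 2}

# Hypothetical toy input: three species share one intron position in OG1.
example = [
    ("OG1", "arabidopsis", 120, 0),
    ("OG1", "rice", 120, 0),
    ("OG1", "moss", 120, 0),
    ("OG1", "rice", 345, 2),         # found in a single species, so dropped
    ("OG2", "arabidopsis", 60, 1),
    ("OG2", "rice", 61, 1),          # shifted by one base, so not grouped
]
shared = group_orthologous_introns(example)
```

In this toy input only the OG1 intron at aligned position 120 is reported as an orthologous group; a real analysis would also have to decide how to treat near-identical positions such as the OG2 pair.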
Recent mobility of plastid encoded group II introns and twintrons in five strains of the unicellular red alga Porphyridium

PubMed Central

Perrineau, Marie-Mathilde; Price, Dana C.; Mohr, Georg

2015-01-01

Group II introns are closely linked to eukaryote evolution because nuclear spliceosomal introns and the small RNAs associated with the spliceosome are thought to trace their ancient origins to these mobile elements. Therefore, elucidating how group II introns move, and how they lose mobility can potentially shed light on fundamental aspects of eukaryote biology. To this end, we studied five strains of the unicellular red alga Porphyridium purpureum that surprisingly contain 42 group II introns in their plastid genomes. We focused on a subset of these introns that encode mobility-conferring intron-encoded proteins (IEPs) and found them to be distributed among the strains in a lineage-specific manner. The reverse transcriptase and maturase domains were present in all lineages but the DNA endonuclease domain was deleted in vertically inherited introns, demonstrating a key step in the loss of mobility. P. purpureum plastid intron RNAs had a classic group IIB secondary structure despite variability in the DIII and DVI domains. We report for the first time the presence of twintrons (introns-within-introns, derived from the same mobile element) in Rhodophyta. The P. purpureum IEPs and their mobile introns provide a valuable model for the study of mobile retroelements in eukaryotes and offer promise for biotechnological applications. PMID:26157604
Genetic Manipulation of Lactococcus lactis by Using Targeted Group II Introns: Generation of Stable Insertions without Selection

PubMed Central

Frazier, Courtney L.; San Filippo, Joseph; Lambowitz, Alan M.; Mills, David A.

2003-01-01

Despite their commercial importance, there are relatively few facile methods for genomic manipulation of the lactic acid bacteria. Here, the lactococcal group II intron, Ll.ltrB, was targeted to insert efficiently into genes encoding malate decarboxylase (mleS) and tetracycline resistance (tetM) within the Lactococcus lactis genome. Integrants were readily identified and maintained in the absence of a selectable marker. Since splicing of the Ll.ltrB intron depends on the intron-encoded protein, targeted invasion with an intron lacking the intron open reading frame disrupted TetM and MleS function, and MleS activity could be partially restored by expressing the intron-encoded protein in trans. Restoration of splicing from intron variants lacking the intron-encoded protein illustrates how targeted group II introns could be used for conditional expression of any gene. Furthermore, the modified Ll.ltrB intron was used to separately deliver a phage resistance gene (abiD) and a tetracycline resistance marker (tetM) into mleS, without the need for selection to drive the integration or to maintain the integrant. Our findings demonstrate the utility of targeted group II introns as a potential food-grade mechanism for delivery of industrially important traits into the genomes of lactococci. PMID:12571038
Evolution of introns in the archaeal world.

PubMed

Tocchini-Valentini, Giuseppe D; Fruscoloni, Paolo; Tocchini-Valentini, Glauco P

2011-03-22

The self-splicing group I introns are removed by an autocatalytic mechanism that involves a series of transesterification reactions. They require RNA binding proteins to act as chaperones to correctly fold the RNA into an active intermediate structure in vivo. Pre-tRNA introns in Bacteria and in higher eukaryote plastids are typical examples of self-splicing group I introns. By contrast, two striking features characterize RNA splicing in the archaeal world. First, self-splicing group I introns cannot be found, to this date, in that kingdom. Second, the RNA splicing scenario in Archaea is uniform: All introns, whether in pre-tRNA or elsewhere, are removed by tRNA splicing endonucleases. We suggest that in Archaea, the protein recruited for splicing is the preexisting tRNA splicing endonuclease and that this enzyme, together with the ligase, takes over the task of intron removal in a more efficient fashion than the ribozyme. The extinction of group I introns in Archaea would then be a consequence of recruitment of the tRNA splicing endonuclease. We deal here with comparative genome analysis, focusing specifically on the integration of introns into genes coding for 23S rRNA molecules, and how this newly acquired intron has to be removed to regenerate a functional RNA molecule. We show that all known oligomeric structures of the endonuclease can recognize and cleave a ribosomal intron, even when the endonuclease derives from a strain lacking rRNA introns. The persistence of group I introns in mitochondria and chloroplasts would be explained by the inaccessibility of these introns to the endonuclease.
Hypervariable and highly divergent intron-exon organizations in the chordate Oikopleura dioica.

PubMed

Edvardsen, Rolf B; Lerat, Emmanuelle; Maeland, Anne Dorthea; Flåt, Mette; Tewari, Rita; Jensen, Marit F; Lehrach, Hans; Reinhardt, Richard; Seo, Hee-Chan; Chourrout, Daniel

2004-10-01

Oikopleura dioica is a pelagic tunicate with a very small genome and a very short life cycle. In order to investigate the intron-exon organizations in Oikopleura, we have isolated and characterized ribosomal protein EF-1alpha, Hox, and alpha-tubulin genes. Their intron positions have been compared with those of the same genes from various invertebrates and vertebrates, including four species with entirely sequenced genomes. Oikopleura genes, like Caenorhabditis genes, have introns at a large number of nonconserved positions, which must originate from late insertions or intron sliding of ancient insertions. Both species exhibit hypervariable intron-exon organization within their alpha-tubulin gene family. This is due to localization of most nonconserved intron positions in single members of this gene family. The hypervariability and divergence of intron positions in Oikopleura and Caenorhabditis may be related to the predominance of short introns, the processing of which is not very dependent upon the exonic environment compared to large introns. Also, both species have an undermethylated genome, and the control of methylation-induced point mutations imposes a control on exon size, at least in vertebrate genes. That introns placed at such variable positions in Oikopleura or C. elegans may serve a specific purpose is not easy to infer from our current knowledge and hypotheses on intron functions. We propose that new introns are retained in species with very short life cycles, because illegitimate exchanges including gene conversion are repressed. We also speculate that introns placed at gene-specific positions may contribute to suppressing these exchanges and thereby favor their own persistence.
The Mitochondrial Genome of the Prasinophyte Prasinoderma coloniale Reveals Two Trans-Spliced Group I Introns in the Large Subunit rRNA Gene

PubMed Central

Pombert, Jean-François; Otis, Christian; Turmel, Monique; Lemieux, Claude

2013-01-01

Organelle genes are often interrupted by group I and/or group II introns. Splicing of these mobile genetic elements occurs at the RNA level via serial transesterification steps catalyzed by the introns' own tertiary structures and, sometimes, with the help of external factors. These catalytic ribozymes can be found in cis or trans configuration, and although trans-arrayed group II introns have been known for decades, trans-spliced group I introns have been reported only recently. In the course of sequencing the complete mitochondrial genome of the prasinophyte picoplanktonic green alga Prasinoderma coloniale CCMP 1220 (Prasinococcales, clade VI), we uncovered two additional cases of trans-spliced group I introns. Here, we describe these introns and compare the 54,546 bp-long mitochondrial genome of Prasinoderma with those of four other prasinophytes (clades II, III and V). This comparison underscores the highly variable mitochondrial genome architecture in these ancient chlorophyte lineages. Both Prasinoderma trans-spliced introns reside within the large subunit rRNA gene (rnl) at positions where cis-spliced relatives, often containing homing endonuclease genes, have been found in other organelles. In contrast, all previously reported trans-spliced group I introns occur in different mitochondrial genes (rns or coxI). Each Prasinoderma intron is fragmented into two pieces, forming at the RNA level a secondary structure that resembles those of its cis-spliced counterparts. As observed for other trans-spliced group I introns, the breakpoint of the first intron maps to the variable loop L8, whereas that of the second is uniquely located downstream of P9.1. The breakpoint in each Prasinoderma intron corresponds to the same region where the open reading frame (ORF) occurs when present in cis-spliced orthologs. This correlation between the intron breakpoint and the ORF location in cis-spliced orthologs also holds for other trans-spliced introns; we discuss the possible implications of this interesting observation for trans-splicing of group I introns. PMID:24386369
Evolutionary and biogeographical implications of degraded LAGLIDADG endonuclease functionality and group I intron occurrence in stony corals (Scleractinia) and mushroom corals (Corallimorpharia).

PubMed

Celis, Juan Sebastián; Edgell, David R; Stelbrink, Björn; Wibberg, Daniel; Hauffe, Torsten; Blom, Jochen; Kalinowski, Jörn; Wilke, Thomas

2017-01-01

Group I introns and homing endonuclease genes (HEGs) are mobile genetic elements, capable of invading target sequences in intron-less genomes. LAGLIDADG HEGs are the largest family of endonucleases, playing a key role in the mobility of group I introns in a process known as 'homing'. Group I introns and HEGs are rare in metazoans, and can be mainly found inserted in the COXI gene of some sponges and cnidarians, including stony corals (Scleractinia) and mushroom corals (Corallimorpharia). Vertical and horizontal intron transfer mechanisms have been proposed as explanations for intron occurrence in cnidarians. However, the central role of LAGLIDADG motifs in intron mobility mechanisms remains poorly understood. To resolve questions regarding the evolutionary origin and distribution of group I introns and HEGs in Scleractinia and Corallimorpharia, we examined intron/HEGs sequences within a comprehensive phylogenetic framework. Analyses of LAGLIDADG motif conservation showed a high degree of degradation in complex Scleractinia and Corallimorpharia. Moreover, the two motifs lack the respective acidic residues necessary for metal-ion binding and catalysis, potentially impairing horizontal intron mobility. In contrast, both motifs are highly conserved within robust Scleractinia, indicating a fully functional endonuclease capable of promoting horizontal intron transference. A higher rate of non-synonymous substitutions (Ka) detected in the HEGs of complex Scleractinia and Corallimorpharia suggests degradation of the HEG, whereas lower Ka rates in robust Scleractinia are consistent with a scenario of purifying selection. Molecular-clock analyses and ancestral inference of intron type indicated an earlier intron insertion in complex Scleractinia and Corallimorpharia in comparison to robust Scleractinia. These findings suggest that the lack of horizontal intron transfers in the former two groups is related to an age-dependent degradation of the endonuclease activity. Moreover, they also explain the peculiar geographical patterns of introns in stony and mushroom corals.
Understanding Neurodevelopmental Disorders: The Promise of Regulatory Variation in the 3'UTRome.

PubMed

Wanke, Kai A; Devanna, Paolo; Vernes, Sonja C

2018-04-01

Neurodevelopmental disorders have a strong genetic component, but despite widespread efforts, the specific genetic factors underlying these disorders remain undefined for a large proportion of affected individuals. Given the accessibility of exome sequencing, this problem has thus far been addressed from a protein-centric standpoint; however, protein-coding regions only make up ∼1% to 2% of the human genome. With the advent of whole genome sequencing we are in the midst of a paradigm shift as it is now possible to interrogate the entire sequence of the human genome (coding and noncoding) to fill in the missing heritability of complex disorders. These new technologies bring new challenges, as the number of noncoding variants identified per individual can be overwhelming, making it prudent to focus on noncoding regions of known function, for which the effects of variation can be predicted and directly tested to assess pathogenicity. The 3'UTRome is a region of the noncoding genome that perfectly fulfills these criteria and is of high interest when searching for pathogenic variation related to complex neurodevelopmental disorders. Herein, we review the regulatory roles of the 3'UTRome as binding sites for microRNAs or RNA binding proteins, or during alternative polyadenylation. We detail existing evidence that these regions contribute to neurodevelopmental disorders and outline strategies for identification and validation of novel putatively pathogenic variation in these regions. This evidence suggests that studying the 3'UTRome will lead to the identification of new risk factors, new candidate disease genes, and a better understanding of the molecular mechanisms contributing to neurodevelopmental disorders. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Genomic organization of the human gene (CA5) and pseudogene for mitochondrial carbonic anhydrase V and their localization to chromosomes 16q and 16p

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nagao, Yoshiro; Sly, W.S.; Batanian, J.R.

1995-08-10

Carbonic anhydrase V (CA V) is expressed in mitochondrial matrix in liver and several other tissues. It is of interest for its putative roles in providing bicarbonate to carbamoyl phosphate synthetase for ureagenesis and to pyruvate carboxylase for gluconeogenesis and its possible importance in explaining certain inherited metabolic disorders with hyperammonemia and hypoglycemia. Following the recent characterization of the cDNA for human CA V, we report the isolation of the human gene from two λ genomic libraries and its characterization. The CA V gene (CA5) is approximately 50 kb long and contains 7 exons and 6 introns. The exon-intron boundaries are found in positions identical to those determined for the previously described CA II, CA III, and CA VII genes. Like the CA VII gene, CA5 does not contain typical TATA and CAAT promoter elements in the 5′ flanking region but does contain a TTTAA sequence 147 nucleotides upstream of the initiation codon. CA5 also contains a 12-bp GT-rich segment beginning 13 bp downstream of the polyadenylation signal in the 3′ untranslated region of exon 7. FISH analysis allowed CA5 to be assigned to chromosome 16q24.3. An unprocessed pseudogene containing sequence homologous to exons 3-7 and introns 3-6 was also isolated and was assigned by FISH analysis to chromosome 16p11.2-p12. 22 refs., 4 figs., 1 tab.
Bacterial group II introns: not just splicing.

PubMed

Toro, Nicolás; Jiménez-Zurdo, José Ignacio; García-Rodríguez, Fernando Manuel

2007-04-01

Group II introns are both catalytic RNAs (ribozymes) and mobile retroelements that were discovered almost 14 years ago. It has been suggested that eukaryotic mRNA introns might have originated from the group II introns present in the alphaproteobacterial progenitor of the mitochondria. Bacterial group II introns are of considerable interest not only because of their evolutionary significance, but also because they could potentially be used as tools for genetic manipulation in biotechnology and for gene therapy. This review summarizes what is known about the splicing mechanisms and mobility of bacterial group II introns, and describes the recent development of group II intron-based gene-targetting methods. Bacterial group II intron diversity, evolutionary relationships, and behaviour in bacteria are also discussed.
Structure of a group II intron in complex with its reverse transcriptase.

PubMed

Qu, Guosheng; Kaushal, Prem Singh; Wang, Jia; Shigematsu, Hideki; Piazza, Carol Lyn; Agrawal, Rajendra Kumar; Belfort, Marlene; Wang, Hong-Wei

2016-06-01

Bacterial group II introns are large catalytic RNAs related to nuclear spliceosomal introns and eukaryotic retrotransposons. They self-splice, yielding mature RNA, and integrate into DNA as retroelements. A fully active group II intron forms a ribonucleoprotein complex comprising the intron ribozyme and an intron-encoded protein that performs multiple activities including reverse transcription, in which intron RNA is copied into the DNA target. Here we report cryo-EM structures of an endogenously spliced Lactococcus lactis group IIA intron in its ribonucleoprotein complex form at 3.8-Å resolution and in its protein-depleted form at 4.5-Å resolution, revealing functional coordination of the intron RNA with the protein. Remarkably, the protein structure reveals a close relationship between the reverse transcriptase catalytic domain and telomerase, whereas the active splicing center resembles the spliceosomal Prp8 protein. These extraordinary similarities hint at intricate ancestral relationships and provide new insights into splicing and retromobility.
Ancient nature of alternative splicing and functions of introns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Kemin; Salamov, Asaf; Kuo, Alan

Using four genomes, Chlamydomonas reinhardtii, Agaricus bisporus, Aspergillus carbonarius, and Sporotrichum thermophile, with EST coverage of 2.9x, 8.9x, 29.5x, and 46.3x respectively, we identified 11 alternative splicing (AS) types that were dominated by intron retention (RI; biased toward short introns) and found 15, 35, 52, and 63 percent AS of multiexon genes respectively. Genes with AS were more ancient, and the number of AS events correlated with the number of exons, expression level, and maximum intron length of the gene. Introns with a tendency to be retained had either stop codons or a length of 3n+1 or 3n+2, presumably triggering nonsense-mediated mRNA decay (NMD), but introns retained in major isoforms (0.2-6 percent of all introns) were biased toward 3n length and were stop codon free. Stopless introns were biased toward phase 0, but 3n introns favored phase 1, which introduced more flexible and hydrophilic amino acids on both ends of introns and would be less disruptive to protein structure. We proposed a model in which a minor RI intron could evolve into a major RI that could facilitate intron loss through exonization.
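The length and stop-codon criteria described in this abstract can be illustrated with a small sketch. This is not the authors' code: the function name `retained_intron_effect` and the simplification of ignoring the codon that straddles the exon-intron boundary are assumptions made only for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def retained_intron_effect(intron: str, phase: int = 0) -> str:
    """Classify what retaining an intron would do to the reading frame.

    phase is the number of nucleotides of the interrupted codon that lie
    upstream of the intron (0, 1 or 2). Hypothetical helper for illustration.
    """
    seq = intron.upper()
    # Lengths of 3n+1 or 3n+2 shift the downstream reading frame.
    if len(seq) % 3 != 0:
        return "frameshift"
    # Skip the bases that complete the codon started in the upstream exon,
    # then read whole codons that lie entirely inside the intron.
    start = (3 - phase) % 3
    codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    if any(codon in STOP_CODONS for codon in codons):
        return "in-frame stop"
    return "frame-preserving"

print(retained_intron_effect("GTATAGGCA"))        # 3n length, contains TAG in frame
print(retained_intron_effect("GTAAGTA"))          # 3n+1 length
print(retained_intron_effect("GTTAAG", phase=1))  # stop appears only in the phase-1 frame
```

Under this sketch, only introns reported as "frame-preserving" would match the 3n, stop-free profile that the abstract associates with retention in major isoforms.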
Putative cross-kingdom horizontal gene transfer in sponge (Porifera) mitochondria.

PubMed

Rot, Chagai; Goldfarb, Itay; Ilan, Micha; Huchon, Dorothée

2006-09-14

The mitochondrial genome of Metazoa is usually a compact molecule without introns. Exceptions to this rule have been reported only in corals and sea anemones (Cnidaria), in which group I introns have been discovered in the cox1 and nad5 genes. Here we show several lines of evidence demonstrating that introns can also be found in the mitochondria of sponges (Porifera). A 2,349 bp fragment of the mitochondrial cox1 gene was sequenced from the sponge Tetilla sp. (Spirophorida). This fragment suggests the presence of a 1143 bp intron. Similar to all the cnidarian mitochondrial introns, the putative intron has group I intron characteristics. The intron is present in the cox1 gene and encodes a putative homing endonuclease. In order to establish the distribution of this intron in sponges, the cox1 gene was sequenced from several representatives of the demosponge diversity. The intron was found only in the sponge order Spirophorida. A phylogenetic analysis of the COI protein sequence and of the intron open reading frame suggests that the intron may have been transmitted horizontally from a fungus donor. Little is known about sponge-associated fungi, although in the last few years the latter have been frequently isolated from sponges. We suggest that the horizontal gene transfer of a mitochondrial intron was facilitated by a symbiotic relationship between fungus and sponge. Ecological relationships are known to have implications at the genomic level. Here, an ecological relationship between sponge and fungus is suggested based on the genomic analysis.
Spliceosomal Intron Insertions in Genome Compacted Ray-Finned Fishes as Evident from Phylogeny of MC Receptors, Also Supported by a Few Other GPCRs

PubMed Central

Sinha, Rahul; Goyal, Pankaj; Grapputo, Alessandro

2011-01-01

Background Insertions of spliceosomal introns are very rare events during evolution of vertebrates and the mechanisms governing creation of novel intron(s) remain obscure. Largely, gene structures of melanocortin (MC) receptors are characterized by intron-less architecture. However, recently a few exceptions have been reported in some fishes. This warrants a systematic survey of MC receptors for understanding intron insertion events during vertebrate evolution. Methodology/Principal Findings We have compiled an extended list of MC receptors from different vertebrate genomes with variations in fishes. Notably, the closely linked MC2Rs and MC5Rs from a group of ray-finned fishes have three and one intron insertion(s), respectively, with conserved positions and intron phase. In both genes, one novel insertion was in the highly conserved DRY motif at the end of helix TM3. Further, the proto-splice site MAG↑R is maintained at intron insertion sites in these two genes. However, the orthologs of these receptors from zebrafish and tetrapods are intron-less, suggesting these introns are simultaneously created in selected fishes. Surprisingly, these novel introns are traceable only in four fish genomes. We found that these fish genomes are severely compacted after the separation from zebrafish. Furthermore, we also report novel intron insertions in P2Y receptors and in CHRM3. Finally, we report ultrasmall introns in MC2R genes from selected fishes. Conclusions/Significance The current repository of MC receptors illustrates that fishes have no MC3R ortholog. MC2R, MC5R, P2Y receptors and CHRM3 have novel intron insertions only in ray-finned fishes that underwent genome compaction. These receptors share one intron at an identical position suggestive of being inserted contemporaneously. 
In addition to repetitive elements, genome compaction is now believed to be a new hallmark that promotes intron insertions, as it requires rapid DNA breakage and subsequent repair processes to gain back normal functionality. PMID:21850219
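As a rough illustration of the proto-splice site MAG↑R mentioned in this abstract, the sketch below scans a coding sequence for that motif using the standard IUPAC codes (M = A or C, R = A or G). The function name `proto_splice_sites` and the choice to report 0-based coordinates are assumptions for this example, not part of the study.

```python
import re

# IUPAC codes: M = A or C, R = A or G.
# A lookahead is used so that overlapping motifs are all reported.
PROTO_SPLICE = re.compile(r"(?=[AC]AG[AG])")

def proto_splice_sites(cds: str) -> list[int]:
    """Return 0-based indices of the base that follows each MAG^R insertion point."""
    return [m.start() + 3 for m in PROTO_SPLICE.finditer(cds.upper())]

# The potential insertion point lies between the G of MAG and the following R.
print(proto_splice_sites("TTCAGGTT"))
print(proto_splice_sites("AAGAAGA"))
```

A motif match only marks a sequence-compatible position; whether an intron was actually inserted there has to come from comparing orthologous gene structures, as the abstract describes.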
NONCODE v2.0: decoding the non-coding.

PubMed

He, Shunmin; Liu, Changning; Skogerbø, Geir; Zhao, Haitao; Wang, Jie; Liu, Tao; Bai, Baoyan; Zhao, Yi; Chen, Runsheng

2008-01-01

The NONCODE database is an integrated knowledge database designed for the analysis of non-coding RNAs (ncRNAs). Since NONCODE was first released 3 years ago, the number of known ncRNAs has grown rapidly, and there is growing recognition that ncRNAs play important regulatory roles in most organisms. In the updated version of NONCODE (NONCODE v2.0), the number of collected ncRNAs has reached 206 226, including a wide range of microRNAs, Piwi-interacting RNAs and mRNA-like ncRNAs. The improvements brought to the database include not only new and updated ncRNA data sets, but also an incorporation of BLAST alignment search service and access through our custom UCSC Genome Browser. NONCODE can be found under http://www.noncode.org or http://noncode.bioinfo.org.cn.
Conservation/Mutation in the Splice Sites of Mitochondrial Solute Carrier Genes of Vertebrates.

PubMed

Calvello, Rosa; Panaro, Maria A; Salvatore, Rosaria; Mitolo, Vincenzo; Cianciulli, Antonia

2016-10-01

The "canonical" introns begin with the dinucleotide GT and end with the dinucleotide AG. GT, together with a few downstream nucleotides, and AG, with a few of the immediately preceding nucleotides, are thought to be the strongest splicing signals (5'ss and 3'ss, respectively). We examined the composition of the intronic initial and terminal hexanucleotides of the mitochondrial solute carrier genes (SLC25A's) of zebrafish, chicken, mouse, and human. These genes are orthologous and we selected the transcripts in which the arrangement of exons and introns was superimposable in the species considered. Both 5'ss and 3'ss were highly polymorphic, with 104 and 126 different configurations, respectively, in our sample. In the line of evolution from zebrafish to chicken, as well as in that from zebrafish to mammals, the average nucleotide conservation in the four variable nucleotides was about 50% at 5' and 40% at 3'. In the divergent evolution of mouse and human, the conservation was about 80% at 5' and 70% at 3'. Despite these changes, the splicing signals remain strong enough to operate at the same site. At both 5' and 3', the frequency of a nucleotide at a given position in the zebrafish sequence is positively correlated with its conservation in chicken and mammals, suggesting that selection continued to operate in birds and mammals along similar lines.
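The hexanucleotide comparison described above can be sketched as follows. The helper names `splice_site_hexamers` and `variable_conservation` are hypothetical, and the conservation measure (fraction of identical bases among the four positions outside GT or AG) is a simplified reading of the abstract rather than the authors' exact procedure.

```python
def splice_site_hexamers(intron: str) -> tuple[str, str, bool]:
    """Return the initial hexamer, terminal hexamer, and whether the intron is GT...AG."""
    seq = intron.upper()
    if len(seq) < 12:
        raise ValueError("intron too short to hold two non-overlapping hexamers")
    five_ss, three_ss = seq[:6], seq[-6:]
    canonical = five_ss.startswith("GT") and three_ss.endswith("AG")
    return five_ss, three_ss, canonical

def variable_conservation(hex_a: str, hex_b: str, site: str) -> float:
    """Fraction of the four variable positions shared by two orthologous hexamers.

    site is "5" (variable bases follow GT) or "3" (variable bases precede AG).
    """
    if site == "5":
        var_a, var_b = hex_a[2:], hex_b[2:]
    else:
        var_a, var_b = hex_a[:4], hex_b[:4]
    return sum(x == y for x, y in zip(var_a, var_b)) / 4

print(splice_site_hexamers("GTAAGTccccccTTTCAG"))
print(variable_conservation("GTAAGT", "GTGAGT", "5"))
```

Averaging `variable_conservation` over all orthologous intron pairs between two species would give a figure comparable in spirit to the percentages quoted in the abstract.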
Biochemical and proteomic analysis of spliceosome factors interacting with intron-1 of human papillomavirus type-16.

PubMed

Martínez-Salazar, Martha; López-Urrutia, Eduardo; Arechaga-Ocampo, Elena; Bonilla-Moreno, Raul; Martínez-Castillo, Macario; Díaz-Hernández, Job; Del Moral-Hernández, Oscar; Cedillo-Barrón, Leticia; Martines-Juarez, Víctor; De Nova-Ocampo, Monica; Valdes, Jesús; Berumen, Jaime; Villegas-Sepúlveda, Nicolás

2014-12-05

The human papillomavirus type 16 (HPV-16) E6/E7 spliced transcripts are heterogeneously expressed in cervical carcinoma. The heterogeneity of the E6/E7 splicing profile might be in part due to the intrinsic variation of splicing factors in tumor cells. However, the splicing factors that bind the E6/E7 intron 1 (In-1) have not been defined. Therefore, we aimed to identify these factors; we used HeLa nuclear extracts (NE) for in vitro spliceosome assembly. The proteins were allowed to bind to an RNA/DNA hybrid formed by the In-1 transcript and a 5'-biotinylated DNA oligonucleotide complementary to the upstream exon sequence, which prevented interference in protein binding to the intron. The hybrid probes bound with the nuclear proteins were coupled to streptavidin magnetic beads for chromatography affinity purification. Proteins were eluted and identified by mass spectrometry (MS). Approximately 170 proteins were identified by MS, 80% of which were RNA binding proteins, including canonical spliceosome core components, helicases and regulatory splicing factors. The canonical factors were identified as components of the spliceosomal B-complex. Although 35-40 of the identified factors were cognate splicing factors or helicases, they have not been previously detected in spliceosome complexes that were assembled using in vivo or in vitro models. Copyright © 2014 Elsevier B.V. All rights reserved.

Identification of antisense long noncoding RNAs that function as SINEUPs in human cells.

PubMed

Schein, Aleks; Zucchelli, Silvia; Kauppinen, Sakari; Gustincich, Stefano; Carninci, Piero

2016-09-20

Mammalian genomes encode numerous natural antisense long noncoding RNAs (lncRNAs) that regulate gene expression. Recently, an antisense lncRNA to mouse Ubiquitin carboxyl-terminal hydrolase L1 (Uchl1) was reported to increase UCHL1 protein synthesis, representing a new functional class of lncRNAs, designated as SINEUPs, for SINE element-containing translation UP-regulators. Here, we show that an antisense lncRNA to the human protein phosphatase 1 regulatory subunit 12A (PPP1R12A), named as R12A-AS1, which overlaps with the 5' UTR and first coding exon of the PPP1R12A mRNA, functions as a SINEUP, increasing PPP1R12A protein translation in human cells. The SINEUP activity depends on the aforementioned sense-antisense interaction and a free right Alu monomer repeat element at the 3' end of R12A-AS1. In addition, we identify another human antisense lncRNA with SINEUP activity. Our results demonstrate for the first time that human natural antisense lncRNAs can up-regulate protein translation, suggesting that endogenous SINEUPs may be widespread and present in many mammalian species.
Genomic deletion of a long-range bone enhancer misregulates sclerostin in Van Buchem disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loots, Gabriela G.; Kneissel, Michaela; Keller, Hansjoerg

2005-04-15

Mutations in distant regulatory elements can negatively impact human development and health, yet due to the difficulty of detecting these critical sequences we predominantly focus on coding sequences for diagnostic purposes. We have undertaken a comparative sequence-based approach to characterize a large noncoding region deleted in patients affected by Van Buchem disease (VB), a severe sclerosing bone dysplasia. Using BAC recombination and transgenesis we characterized the expression of human sclerostin (sost) from normal (hSOSTwt) or Van Buchem (hSOSTvb D) alleles. Only the hSOSTwt allele faithfully expressed high levels of human sost in the adult bone and impacted bone metabolism, consistent with the model that the VB noncoding deletion removes a sost specific regulatory element. By exploiting cross-species sequence comparisons with in vitro and in vivo enhancer assays we were able to identify a candidate enhancer element that drives human sost expression in osteoblast-like cell lines in vitro and in the skeletal anlage of the E14.5 mouse embryo, and discovered a novel function for sclerostin during limb development. Our approach represents a framework for characterizing distant regulatory elements associated with abnormal human phenotypes.
Rejuvenation of Gene Expression Pattern of Aged Human Skin by Broadband Light Treatment: A Pilot Study

PubMed Central

Chang, Anne Lynn S; Bitter, Patrick H; Qu, Kun; Lin, Meihong; Rapicavoli, Nicole A; Chang, Howard Y

2013-01-01

Studies in model organisms suggest that aged cells can be functionally rejuvenated, but whether this concept applies to human skin is unclear. Here we apply 3′-end sequencing for expression quantification (“3-seq”) to discover the gene expression program associated with human photoaging and intrinsic skin aging (collectively termed “skin aging”), and the impact of broadband light (BBL) treatment. We find that skin aging was associated with a significantly altered expression level of 2,265 coding and noncoding RNAs, of which 1,293 became “rejuvenated” after BBL treatment; i.e., they became more similar to their expression level in youthful skin. Rejuvenated genes (RGs) included several known key regulators of organismal longevity and their proximal long noncoding RNAs. Skin aging is not associated with systematic changes in 3′-end mRNA processing. Hence, BBL treatment can restore gene expression pattern of photoaged and intrinsically aged human skin to resemble young skin. In addition, our data reveal, to our knowledge, a previously unreported set of targets that may lead to new insights into the human skin aging process. PMID:22931923
Low-dose exposure to bisphenols A, F and S of human primary adipocyte impacts coding and non-coding RNA profiles

PubMed Central

Leloire, Audrey; Dhennin, Véronique; Coumoul, Xavier; Yengo, Loïc; Froguel, Philippe

2017-01-01

Bisphenol A (BPA) exposure has been suspected to be associated with deleterious effects on health including obesity and metabolically-linked diseases. Although bisphenols F (BPF) and S (BPS) are BPA structural analogs commonly used in many marketed products as a replacement for BPA, only sparse toxicological data are available so far. Our objective was to comprehensively characterize bisphenol gene targets in a human primary adipocyte model, in order to determine whether they may induce cellular dysfunction, using chronic exposure at two concentrations: a “low-dose” similar to the dose usually encountered in human biological fluids and a higher dose. Therefore, BPA, BPF and BPS were added at 10 nM or 10 μM during the differentiation of human primary adipocytes from subcutaneous fat of three non-diabetic Caucasian female patients. Gene expression (mRNA/lncRNA) arrays and microRNA arrays were used to assess coding and non-coding RNA changes. We detected significantly deregulated mRNA/lncRNA and miRNA at low and high doses. Enrichment in “cancer” and “organismal injury and abnormalities” related pathways was found in response to the three products. Some long intergenic non-coding RNAs and small nucleolar RNAs were differentially expressed, suggesting that bisphenols may also activate multiple cellular processes and epigenetic modifications. The analysis of upstream regulators of deregulated genes highlighted hormones or hormone-like chemicals, suggesting that BPS and BPF can be suspected to interfere, just like BPA, with hormonal regulation and have to be considered as endocrine disruptors. All these results suggest that, like BPA, its substitutes BPS and BPF should be used with the same restrictions. PMID:28628672
Multiple recent horizontal transfers of the cox1 intron in Solanaceae and extended co-conversion of flanking exons

PubMed Central

2011-01-01

Background The most frequent case of horizontal transfer in plants involves a group I intron in the mitochondrial gene cox1, which has been acquired via some 80 separate plant-to-plant transfer events among 833 diverse angiosperms examined. This homing intron encodes an endonuclease thought to promote the intron's promiscuous behavior. A promising experimental approach to study endonuclease activity and intron transmission involves somatic cell hybridization, which in plants leads to mitochondrial fusion and genome recombination. However, the cox1 intron has not yet been found in the ideal group for plant somatic genetics - the Solanaceae. We therefore undertook an extensive survey of this family to find members with the intron and to learn more about the evolutionary history of this exceptionally mobile genetic element. Results Although 409 of the 426 species of Solanaceae examined lack the cox1 intron, it is uniformly present in three phylogenetically disjunct clades. Despite strong overall incongruence of cox1 intron phylogeny with angiosperm phylogeny, two of these clades possess nearly identical intron sequences and are monophyletic in intron phylogeny. These two clades, and possibly the third also, contain a co-conversion tract (CCT) downstream of the intron that is extended relative to all previously recognized CCTs in angiosperm cox1. Re-examination of all published cox1 genes uncovered additional cases of extended co-conversion and identified a rare case of putative intron loss, accompanied by full retention of the CCT. Conclusions We infer that the cox1 intron was separately and recently acquired by at least three different lineages of Solanaceae. The striking identity of the intron and CCT from two of these lineages suggests that one of these three intron captures may have occurred by a within-family transfer event. This is consistent with previous evidence that horizontal transfer in plants is biased towards phylogenetically local events. 
The discovery of extended co-conversion suggests that other cox1 conversions may be longer than realized but obscured by the exceptional conservation of plant mitochondrial sequences. Our findings provide further support for the rampant-transfer model of cox1 intron evolution and recommend the Solanaceae as a model system for the experimental analysis of cox1 intron transfer in plants. PMID:21943226
Discovery of functional non-coding conserved regions in the α-synuclein gene locus

PubMed Central

Sterling, Lori; Walter, Michael; Ting, Dennis; Schüle, Birgitt

2014-01-01

Several single nucleotide polymorphisms (SNPs) and the Rep-1 microsatellite marker of the α-synuclein (SNCA) gene have consistently been shown to be associated with Parkinson’s disease, but the functional relevance is unclear. Based on these findings we hypothesized that conserved cis-regulatory elements in the SNCA genomic region regulate expression of SNCA, and that SNPs in these regions could be functionally modulating the expression of SNCA, thus contributing to neuronal demise and predisposing to Parkinson’s disease. In a pair-wise comparison of a 206 kb genomic region encompassing the SNCA gene, we revealed 34 evolutionarily conserved DNA sequences between human and mouse. All elements were cloned into reporter vectors and assessed for expression modulation in dual luciferase reporter assays. We found that 12 out of 34 elements exhibited either an enhancement or reduction of the expression of the reporter gene. Three elements upstream of the SNCA gene displayed an approximately 1.5 fold (p<0.009) increase in expression. Of the intronic regions, three showed a 1.5 fold increase and two others indicated a 2 and 2.5 fold increase in expression (p<0.002). Three elements downstream of the SNCA gene showed 1.5 fold and 2.5 fold increase (p<0.0009). One element downstream of SNCA had a reduced expression of the reporter gene of 0.35 fold (p<0.0009) of normal activity. Our results demonstrate that the SNCA gene contains cis-regulatory regions that might regulate the transcription and expression of SNCA. Further studies in disease-relevant tissue types will be important to understand the functional impact of regulatory regions and specific Parkinson’s disease-associated SNPs and their function in the disease process. PMID:25566351
Genome-scale characterization of RNA tertiary structures and their functional impact by RNA solvent accessibility prediction.

PubMed

Yang, Yuedong; Li, Xiaomei; Zhao, Huiying; Zhan, Jian; Wang, Jihua; Zhou, Yaoqi

2017-01-01

As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step to characterize potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed its positive correlation to mRNA accessibility by dimethyl sulphate (DMS) experimentally measured in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found strong association across coding and noncoding regions between predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, 5' cap of 5' and 3' cap of 3' untranslated regions, are more solvent accessible, consistent with their respective functional roles. These results support conformational selections as the mechanism for the formation of RNA-protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org. © 2016 Yang et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

PubMed

Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed

2016-01-01

The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode for proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which make their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the genome reference using the TopHat and Cufflinks software. Transcripts shorter than 200 nt, with more than 83-100 amino acids ORF, or with significant homologies to the NCBI nr-protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates the lncRNA rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
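The length and ORF filters described in this abstract can be sketched in a few lines. This is only an approximation under stated assumptions: the function names are hypothetical, ORFs are searched on the forward strand only, a single 100-amino-acid cutoff is picked from the 83-100 range given in the text, and the homology and protein-coding-score steps of the actual pipeline are not reproduced.

```python
import re

# An ORF here is ATG, then whole codons, up to the first in-frame stop codon.
# The lookahead lets the search start at every ATG, in all three frames.
ORF = re.compile(r"(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))")

def longest_orf_aa(seq: str) -> int:
    """Length in amino acids (stop codon excluded) of the longest forward-strand ORF."""
    best = 0
    for match in ORF.finditer(seq.upper()):
        best = max(best, len(match.group(1)) // 3 - 1)
    return best

def is_candidate_lncrna(seq: str, min_len: int = 200, max_orf_aa: int = 100) -> bool:
    """Keep transcripts that are long enough and lack a long ORF (assumed cutoffs)."""
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa

coding_like = "ATG" + "AAA" * 150 + "TAG"   # 456 nt with a 151-aa ORF
noncoding_like = "C" * 250                  # 250 nt with no ORF
print(is_candidate_lncrna(coding_like), is_candidate_lncrna(noncoding_like))
```

Transcripts passing this sketch would still need the homology search and coding-potential scoring mentioned in the abstract before being called lncRNAs.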
SNPnexus: assessing the functional relevance of genetic variation to facilitate the promise of precision medicine.

PubMed

Dayem Ullah, Abu Z; Oscanoa, Jorge; Wang, Jun; Nagano, Ai; Lemoine, Nicholas R; Chelala, Claude

2018-05-11

Broader functional annotation of genetic variation is a valuable means for prioritising phenotypically-important variants in further disease studies and large-scale genotyping projects. We developed SNPnexus to meet this need by assessing the potential significance of known and novel SNPs on the major transcriptome, proteome, regulatory and structural variation models. Since its previous release in 2012, we have made significant improvements to the annotation categories and updated the query and data viewing systems. The most notable changes include broader functional annotation of noncoding variants and expanding annotations to the most recent human genome assembly GRCh38/hg38. SNPnexus has now integrated rich resources from ENCODE and Roadmap Epigenomics Consortium to map and annotate the noncoding variants onto different classes of regulatory regions and noncoding RNAs as well as providing their predicted functional impact from eight popular non-coding variant scoring algorithms and computational methods. A novel functionality offered now is the support for neo-epitope predictions from leading tools to facilitate its use in immunotherapeutic applications. These updates to SNPnexus are in preparation for its future expansion towards a fully comprehensive computational workflow for disease-associated variant prioritization from sequencing data, placing its users at the forefront of translational research. SNPnexus is freely available at http://www.snp-nexus.org.
Insights into the strategies used by related group II introns to adapt successfully for the colonisation of a bacterial genome

PubMed Central

Martínez-Rodríguez, Laura; García-Rodríguez, Fernando M; Molina-Sánchez, María Dolores; Toro, Nicolás; Martínez-Abarca, Francisco

2014-01-01

Group II introns are self-splicing RNAs and site-specific mobile retroelements found in bacterial and organellar genomes. The group II intron RmInt1 is present at high copy number in Sinorhizobium meliloti species, and has a multifunctional intron-encoded protein (IEP) with reverse transcriptase/maturase activities, but lacking the DNA-binding and endonuclease domains. We characterized two RmInt1-related group II introns RmInt2 from S. meliloti strain GR4 and Sr.md.I1 from S. medicae strain WSM419 in terms of splicing and mobility activities. We used both wild-type and engineered intron-donor constructs based on ribozyme ΔORF-coding sequence derivatives, and we determined the DNA target requirements for RmInt2, the element most distantly related to RmInt1. The excision and mobility patterns of intron-donor constructs expressing different combinations of IEP and intron RNA provided experimental evidence for the co-operation of IEPs and intron RNAs from related elements in intron splicing and, in some cases, in intron homing. We were also able to identify the DNA target regions recognized by these IEPs lacking the DNA endonuclease domain. Our results provide new insight into the versatility of related group II introns and the possible co-operation between these elements to facilitate the colonization of bacterial genomes. PMID:25482895
A pipeline of programs for collecting and analyzing group II intron retroelement sequences from GenBank

PubMed Central

2013-01-01

Background Accurate and complete identification of mobile elements is a challenging task in the current era of sequencing, given their large numbers and frequent truncations. Group II intron retroelements, which consist of a ribozyme and an intron-encoded protein (IEP), are usually identified in bacterial genomes through their IEP; however, the RNA component that defines the intron boundaries is often difficult to identify because of a lack of strong sequence conservation corresponding to the RNA structure. Compounding the problem of boundary definition is the fact that a majority of group II intron copies in bacteria are truncated. Results Here we present a pipeline of 11 programs that collect and analyze group II intron sequences from GenBank. The pipeline begins with a BLAST search of GenBank using a set of representative group II IEPs as queries. Subsequent steps download the corresponding genomic sequences and flanks, filter out non-group II introns, assign introns to phylogenetic subclasses, filter out incomplete and/or non-functional introns, and assign IEP sequences and RNA boundaries to the full-length introns. In the final step, the redundancy in the data set is reduced by grouping introns into sets of ≥95% identity, with one example sequence chosen to be the representative. Conclusions These programs should be useful for comprehensive identification of group II introns in sequence databases as data continue to rapidly accumulate. PMID:24359548
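The final redundancy-reduction step (grouping introns into sets of ≥95% identity with one representative each) can be illustrated with a greedy clustering sketch. This is not the published pipeline: `difflib.SequenceMatcher.ratio()` is used as a stand-in similarity score rather than a true alignment-based percent identity, and choosing the longest sequence as representative is an assumption made for this example.

```python
from difflib import SequenceMatcher

def identity(a: str, b: str) -> float:
    """Approximate similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()

def cluster_by_identity(seqs: dict[str, str], threshold: float = 0.95) -> dict[str, list[str]]:
    """Greedy clustering: each sequence joins the first representative it matches."""
    clusters: dict[str, list[str]] = {}
    # Visit longest sequences first so each representative is the longest member.
    for name in sorted(seqs, key=lambda n: len(seqs[n]), reverse=True):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters

introns = {
    "a": "ACGT" * 10,
    "b": "ACGT" * 9 + "ACGA",  # one substitution relative to "a"
    "c": "T" * 40,
}
print(cluster_by_identity(introns))
```

Greedy clustering depends on visiting order, so a different ordering rule could yield different representatives; for real data a dedicated alignment or clustering tool would be the more reliable choice.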
Localized Retroprocessing as a Model of Intron Loss in the Plant Mitochondrial Genome

PubMed Central

Cuenca, Argelia; Ross, T. Gregory; Graham, Sean W.; Barrett, Craig F.; Davis, Jerrold I.; Seberg, Ole; Petersen, Gitte

2016-01-01

Loss of introns in plant mitochondrial genes is commonly explained by retroprocessing. Under this model, an mRNA is reverse transcribed and integrated back into the genome, simultaneously affecting the contents of introns and edited sites. To evaluate the extent to which retroprocessing explains intron loss, we analyzed patterns of intron content and predicted RNA editing for whole mitochondrial genomes of 30 species in the monocot order Alismatales. In this group, we found an unusually high degree of variation in the intron content, even expanding the hitherto known variation among angiosperms. Some species have lost some two-thirds of the cis-spliced introns. We found a strong correlation between intron content and editing frequency, and detected 27 events in which intron loss is consistent with the presence of nucleotides in an edited state, supporting retroprocessing. However, we also detected seven cases of intron loss not readily explained by retroprocessing. Our analyses are also not consistent with the entire length of a fully processed cDNA copy being integrated into the genome, but instead indicate that retroprocessing usually occurs for only part of the gene. In some cases, several rounds of retroprocessing may explain intron loss in genes completely devoid of introns. In a number of taxa, retroprocessing seems to be very common and possibly an ongoing process that affects the entire mitochondrial genome. PMID:27435795
Euglena gracilis chloroplast DNA: analysis of a 1.6 kb intron of the psb C gene containing an open reading frame of 458 codons.

PubMed

Montandon, P E; Vasserot, A; Stutz, E

1986-01-01

We retrieved a 1.6 kbp intron separating two exons of the psb C gene which codes for the 44 kDa reaction center protein of photosystem II. This intron is 3 to 4 times the size of all previously sequenced Euglena gracilis chloroplast introns. It contains an open reading frame of 458 codons potentially coding for a basic protein of 54 kDa of yet unknown function. The intron boundaries follow consensus sequences established for chloroplast introns related to class II and nuclear pre-mRNA introns. Its 3'-terminal segment has structural features similar to class II mitochondrial introns with an invariant base A as possible branch point for lariat formation.
Pan-cancer transcriptomic analysis associates long non-coding RNAs with key mutational driver events

PubMed Central

Ashouri, Arghavan; Sayin, Volkan I.; Van den Eynden, Jimmy; Singh, Simranjit X.; Papagiannakopoulos, Thales; Larsson, Erik

2016-01-01

Thousands of long non-coding RNAs (lncRNAs) lie interspersed with coding genes across the genome, and a small subset has been implicated as downstream effectors in oncogenic pathways. Here we make use of transcriptome and exome sequencing data from thousands of tumours across 19 cancer types, to identify lncRNAs that are induced or repressed in relation to somatic mutations in key oncogenic driver genes. Our screen confirms known coding and non-coding effectors and also associates many new lncRNAs to relevant pathways. The associations are often highly reproducible across cancer types, and while many lncRNAs are co-expressed with their protein-coding hosts or neighbours, some are intergenic and independent. We highlight lncRNAs with possible functions downstream of the tumour suppressor TP53 and the master antioxidant transcription factor NFE2L2. Our study provides a comprehensive overview of lncRNA transcriptional alterations in relation to key driver mutational events in human cancers. PMID:28959951
Androgen receptor and monoamine oxidase polymorphism in wild bonobos

PubMed Central

Garai, Cintia; Furuichi, Takeshi; Kawamoto, Yoshi; Ryu, Heungjin; Inoue-Murayama, Miho

2014-01-01

Androgen receptor gene (AR), monoamine oxidase A gene (MAOA) and monoamine oxidase B gene (MAOB) have been found to have associations with behavioral traits, such as aggressiveness, and disorders in humans. However, the extent to which similar genetic effects might influence the behavior of wild apes is unclear. We examined the loci AR glutamine repeat (ARQ), AR glycine repeat (ARG), MAOA intron 2 dinucleotide repeat (MAin2) and MAOB intron 2 dinucleotide repeat (MBin2) in 32 wild bonobos, Pan paniscus, and compared them with those of chimpanzees, Pan troglodytes, and humans. We found that bonobos were polymorphic on the four loci examined. Both loci MAin2 and MBin2 in bonobos showed a higher diversity than in chimpanzees. Because monoamine oxidase influences aggressiveness, the differences between the polymorphisms of MAin2 and MBin2 in bonobos and chimpanzees may be associated with the differences in aggression between the two species. In order to understand the evolution of these loci and AR, MAOA and MAOB in humans and non-human primates, it would be useful to conduct future studies focusing on the potential association between aggressiveness, and other personality traits, and polymorphisms documented in bonobos. PMID:25606465
Putative cross-kingdom horizontal gene transfer in sponge (Porifera) mitochondria

PubMed Central

Rot, Chagai; Goldfarb, Itay; Ilan, Micha; Huchon, Dorothée

2006-01-01

Background The mitochondrial genome of Metazoa is usually a compact molecule without introns. Exceptions to this rule have been reported only in corals and sea anemones (Cnidaria), in which group I introns have been discovered in the cox1 and nad5 genes. Here we show several lines of evidence demonstrating that introns can also be found in the mitochondria of sponges (Porifera). Results A 2,349 bp fragment of the mitochondrial cox1 gene was sequenced from the sponge Tetilla sp. (Spirophorida). This fragment suggests the presence of a 1,143 bp intron. Similar to all the cnidarian mitochondrial introns, the putative intron has group I intron characteristics. The intron is present in the cox1 gene and encodes a putative homing endonuclease. In order to establish the distribution of this intron in sponges, the cox1 gene was sequenced from several representatives of the demosponge diversity. The intron was found only in the sponge order Spirophorida. A phylogenetic analysis of the COI protein sequence and of the intron open reading frame suggests that the intron may have been transmitted horizontally from a fungus donor. Conclusion Little is known about sponge-associated fungi, although in the last few years the latter have been frequently isolated from sponges. We suggest that the horizontal gene transfer of a mitochondrial intron was facilitated by a symbiotic relationship between fungus and sponge. Ecological relationships are known to have implications at the genomic level. Here, an ecological relationship between sponge and fungus is suggested based on the genomic analysis. PMID:16972986
Exon definition as a potential negative force against intron losses in evolution.

PubMed

Niu, Deng-Ke

2008-11-13

Previous studies have indicated that the wide variation in intron density (the number of introns per gene) among different eukaryotes largely reflects varying degrees of intron loss during evolution. The most popular model, which suggests that organisms lose introns through a mechanism in which reverse-transcribed cDNA recombines with the genomic DNA, concerns only one mutational force. Using exons as the units of splicing-site recognition, exon definition constrains the length of exons. An intron-loss event results in fusion of flanking exons and thus a larger exon. The large size of the newborn exon may cause splicing errors, i.e., exon skipping, if the splicing of pre-mRNAs is initiated by exon definition. By contrast, if the splicing of pre-mRNAs is initiated by intron definition, intron loss does not matter. Exon definition may thus be a selective force against intron loss. An organism with a high frequency of exon definition is expected to experience a low rate of intron loss throughout evolution and have a high density of spliceosomal introns. The majority of spliceosomal introns in vertebrates may be maintained during evolution not because of potential functions, but because of their splicing mechanism (i.e., exon definition). Further research is required to determine whether exon definition is a negative force in maintaining the high intron density of vertebrates. This article was reviewed by Dr. Scott W. Roy (nominated by Dr. John Logsdon), Dr. Eugene V. Koonin, and Dr. Igor B. Rogozin (nominated by Dr. Mikhail Gelfand). For the full reviews, please go to the Reviewers' comments section.
Structure and polymorphism of the mouse prion protein gene.

PubMed Central

Westaway, D; Cooper, C; Turner, S; Da Costa, M; Carlson, G A; Prusiner, S B

1994-01-01

Missense mutations in the prion protein (PrP) gene, overexpression of the cellular isoform of PrP (PrPC), and infection with prions containing the scrapie isoform of PrP (PrPSc) all cause neurodegenerative disease. To understand better the physiology and expression of PrPC, we retrieved mouse PrP gene (Prn-p) yeast artificial chromosome (YAC), cosmid, phage, and cDNA clones. Physical mapping positions Prn-p approximately 300 kb from ecotropic virus integration site number 4 (Evi-4), compatible with failure to detect recombination between Prn-p and Evi-4 in genetic crosses. The Prn-pa allele encompasses three exons, with exons 1 and 2 encoding the mRNA 5' untranslated region. Exon 2 has no equivalent in the Syrian hamster and human PrP genes. The Prn-pb gene shares this intron/exon structure but harbors an approximately 6-kb deletion within intron 2. While the Prn-pb open reading frame encodes two amino acid substitutions linked to prolonged scrapie incubation periods, a deletion of intron 2 sequences also characterizes inbred strains such as RIII/S and MOLF/Ei with shorter incubation periods, making a relationship between intron 2 size and scrapie pathogenesis unlikely. The promoter regions of a and b Prn-p alleles include consensus Sp1 and AP-1 sites, as well as other conserved motifs which may represent binding sites for as yet unidentified transcription factors. PMID:7912827
Effects of GWAS-Associated Genetic Variants on lncRNAs within IBD and T1D Candidate Loci

PubMed Central

Brorsson, Caroline A.; Pociot, Flemming

2014-01-01

Long non-coding RNAs are a new class of non-coding RNAs that are at the crosshairs in many human diseases such as cancers, cardiovascular disorders, and inflammatory and autoimmune diseases such as Inflammatory Bowel Disease (IBD) and Type 1 Diabetes (T1D). Nearly 90% of the phenotype-associated single-nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) lie outside of the protein coding regions, and map to the non-coding intervals. However, the relationship between phenotype-associated loci and the non-coding regions including the long non-coding RNAs (lncRNAs) is poorly understood. Here, we systematically identified all annotated IBD and T1D loci-associated lncRNAs, and mapped nominally significant GWAS/ImmunoChip SNPs for IBD and T1D within these lncRNAs. Additionally, we identified tissue-specific cis-eQTLs, and strong linkage disequilibrium (LD) signals associated with these SNPs. We explored sequence and structure based attributes of these lncRNAs, and also predicted the structural effects of mapped SNPs within them. We also identified lncRNAs in IBD and T1D that are under recent positive selection. Our analysis identified putative lncRNA secondary structure-disruptive SNPs within and in close proximity (+/−5 kb flanking regions) of IBD and T1D loci-associated candidate genes, suggesting that these RNA conformation-altering polymorphisms might be associated with the disease phenotype. Disruption of lncRNA secondary structure due to presence of GWAS SNPs provides valuable information that could be potentially useful for future structure-function studies on lncRNAs. PMID:25144376

The emergence of noncoding RNAs as Heracles in autophagy.

PubMed

Zhang, Jian; Wang, Peiyuan; Wan, Lin; Xu, Shouping; Pang, Da

2017-06-03

Macroautophagy/autophagy is a catabolic process that is widely found in nature. Over the past few decades, mounting evidence has indicated that noncoding RNAs, ranging from small noncoding RNAs to long noncoding RNAs (lncRNAs) and even circular RNAs (circRNAs), mediate the transcriptional and post-transcriptional regulation of autophagy-related genes by participating in autophagy regulatory networks. The differential expression of noncoding RNAs affects autophagy levels at different physiological and pathological stages, including embryonic proliferation and differentiation, cellular senescence, and even diseases such as cancer. We summarize the current knowledge regarding noncoding RNA dysregulation in autophagy and investigate the molecular regulatory mechanisms underlying noncoding RNA involvement in autophagy regulatory networks. Then, we integrate public resources to predict autophagy-related noncoding RNAs across species and discuss strategies for and the challenges of identifying autophagy-related noncoding RNAs. This article will deepen our understanding of the relationship between noncoding RNAs and autophagy, and provide new insights to specifically target noncoding RNAs in autophagy-associated therapeutic strategies.
Identification and Functional Characterization of Hypoxia-Induced Endoplasmic Reticulum Stress Regulating lncRNA (HypERlnc) in Pericytes.

PubMed

Bischoff, Florian C; Werner, Astrid; John, David; Boeckel, Jes-Niels; Melissari, Maria-Theodora; Grote, Phillip; Glaser, Simone F; Demolli, Shemsi; Uchida, Shizuka; Michalik, Katharina M; Meder, Benjamin; Katus, Hugo A; Haas, Jan; Chen, Wei; Pullamsetti, Soni S; Seeger, Werner; Zeiher, Andreas M; Dimmeler, Stefanie; Zehendner, Christoph M

2017-08-04

Pericytes are essential for vessel maturation and endothelial barrier function. Long noncoding RNAs regulate many cellular functions, but their role in pericyte biology remains unexplored. Here, we investigate the effect of hypoxia-induced endoplasmic reticulum stress regulating long noncoding RNAs (HypERlnc, also known as ENSG00000262454) on pericyte function in vitro and its regulation in human heart failure and idiopathic pulmonary arterial hypertension. RNA sequencing in human primary pericytes identified hypoxia-regulated long noncoding RNAs, including HypERlnc. Silencing of HypERlnc decreased cell viability and proliferation and resulted in pericyte dedifferentiation, which went along with increased endothelial permeability in cocultures consisting of human primary pericyte and human coronary microvascular endothelial cells. Consistently, Cas9-based transcriptional activation of HypERlnc was associated with increased expression of pericyte marker genes. Moreover, HypERlnc knockdown reduced endothelial-pericyte recruitment in Matrigel assays (P<0.05). Mechanistically, transcription factor reporter arrays demonstrated that endoplasmic reticulum stress-related transcription factors were prominently activated by HypERlnc knockdown, which was confirmed via immunoblotting for the endoplasmic reticulum stress markers IRE1α (P<0.001), ATF6 (P<0.01), and soluble BiP (P<0.001). Kyoto encyclopedia of genes and gene ontology pathway analyses of RNA sequencing experiments after HypERlnc knockdown indicate a role in cardiovascular disease states. Indeed, HypERlnc expression was significantly reduced in human cardiac tissue from patients with heart failure (P<0.05; n=19) compared with controls. In addition, HypERlnc expression significantly correlated with pericyte markers in human lungs derived from patients diagnosed with idiopathic pulmonary arterial hypertension and from donor lungs (n=14).
Here, we show that HypERlnc regulates human pericyte function and the endoplasmic reticulum stress response. In addition, RNA sequencing analyses in conjunction with reduced expression of HypERlnc in heart failure and correlation with pericyte markers in idiopathic pulmonary arterial hypertension indicate a role of HypERlnc in human cardiopulmonary disease. © 2017 American Heart Association, Inc.
Group II intron inhibits conjugative relaxase expression in bacteria by mRNA targeting

PubMed Central

Piazza, Carol Lyn; Smith, Dorie

2018-01-01

Group II introns are mobile ribozymes that are rare in bacterial genomes, often cohabiting with various mobile elements, and seldom interrupting housekeeping genes. What accounts for this distribution has not been well understood. Here, we demonstrate that Ll.LtrB, the group II intron residing in a relaxase gene on a conjugative plasmid from Lactococcus lactis, inhibits its host gene expression and restrains the naturally cohabiting mobile element from conjugative horizontal transfer. We show that reduction in gene expression is mainly at the mRNA level, and results from the interaction between exon-binding sequences (EBSs) in the intron and intron-binding sequences (IBSs) in the mRNA. The spliced intron targets the relaxase mRNA and reopens ligated exons, causing major mRNA loss. Taken together, this study provides an explanation for the distribution and paucity of group II introns in bacteria, and suggests a potential force for those introns to evolve into spliceosomal introns. PMID:29905149
Group II intron inhibits conjugative relaxase expression in bacteria by mRNA targeting.

PubMed

Qu, Guosheng; Piazza, Carol Lyn; Smith, Dorie; Belfort, Marlene

2018-06-15

Group II introns are mobile ribozymes that are rare in bacterial genomes, often cohabiting with various mobile elements, and seldom interrupting housekeeping genes. What accounts for this distribution has not been well understood. Here, we demonstrate that Ll.LtrB, the group II intron residing in a relaxase gene on a conjugative plasmid from Lactococcus lactis, inhibits its host gene expression and restrains the naturally cohabiting mobile element from conjugative horizontal transfer. We show that reduction in gene expression is mainly at the mRNA level, and results from the interaction between exon-binding sequences (EBSs) in the intron and intron-binding sequences (IBSs) in the mRNA. The spliced intron targets the relaxase mRNA and reopens ligated exons, causing major mRNA loss. Taken together, this study provides an explanation for the distribution and paucity of group II introns in bacteria, and suggests a potential force for those introns to evolve into spliceosomal introns. © 2018, Qu et al.
Targeted CRISPR disruption reveals a role for RNase MRP RNA in human preribosomal RNA processing

PubMed Central

Goldfarb, Katherine C.; Cech, Thomas R.

2017-01-01

MRP RNA is an abundant, essential noncoding RNA whose functions have been proposed in yeast but are incompletely understood in humans. Mutations in the genomic locus for MRP RNA cause pleiotropic human diseases, including cartilage hair hypoplasia (CHH). Here we applied CRISPR–Cas9 genome editing to disrupt the endogenous human MRP RNA locus, thereby attaining what has eluded RNAi and RNase H experiments: elimination of MRP RNA in the majority of cells. The resulting accumulation of ribosomal RNA (rRNA) precursor—analyzed by RNA fluorescent in situ hybridization (FISH), Northern blots, and RNA sequencing—implicates MRP RNA in pre-rRNA processing. Amelioration of pre-rRNA imbalance is achieved through rescue of MRP RNA levels by ectopic expression. Furthermore, affinity-purified MRP ribonucleoprotein (RNP) from HeLa cells cleaves the human pre-rRNA in vitro at at least one site used in cells, while RNP isolated from cells with CRISPR-edited MRP loci loses this activity, and ectopic MRP RNA expression restores cleavage activity. Thus, a role for RNase MRP in human pre-rRNA processing is established. As demonstrated here, targeted CRISPR disruption is a valuable tool for functional studies of essential noncoding RNAs that are resistant to RNAi and RNase H-based degradation. PMID:28115465
Cardiovascular RNA interference therapy: the broadening tool and target spectrum.

PubMed

Poller, Wolfgang; Tank, Juliane; Skurk, Carsten; Gast, Martina

2013-08-16

Understanding of the roles of noncoding RNAs (ncRNAs) within complex organisms has fundamentally changed. It is increasingly possible to use ncRNAs as diagnostic and therapeutic tools in medicine. Regarding disease pathogenesis, it has become evident that confinement to the analysis of protein-coding regions of the human genome is insufficient because ncRNA variants have been associated with important human diseases. Thus, inclusion of noncoding genomic elements in pathogenetic studies and their consideration as therapeutic targets is warranted. We consider aspects of the evolutionary and discovery history of ncRNAs, as far as they are relevant for the identification and selection of ncRNAs with likely therapeutic potential. Novel therapeutic strategies are based on ncRNAs, and we discuss here RNA interference as a highly versatile tool for gene silencing. RNA interference-mediating RNAs are small, but only parts of a far larger spectrum encompassing ncRNAs up to many kilobasepairs in size. We discuss therapeutic options in cardiovascular medicine offered by ncRNAs and key issues to be solved before clinical translation. Convergence of multiple technical advances is highlighted as a prerequisite for the translational progress achieved in recent years. Regarding safety, we review properties of RNA therapeutics, which may immunologically distinguish them from their endogenous counterparts, all of which underwent sophisticated evolutionary adaptation to specific biological contexts. Although our understanding of the noncoding human genome is only fragmentary to date, it is already feasible to develop RNA interference against a rapidly broadening spectrum of therapeutic targets and to translate this to the clinical setting under certain restrictions.
Isolation, structural determination, synthesis and quantitative determination of impurities in Intron-A, leached from a silicone tubing.

PubMed

Chan, Tze-Ming; Pramanik, Birendra; Aslanian, Robert; Gullo, Vincent; Patel, Mahesh; Cronin, Bart; Boyce, Chris; McCormick, Kevin; Berlin, Mike; Zhu, Xiaohong; Buevich, Alexei; Heimark, Larry; Bartner, Peter; Chen, Guodong; Pu, Haiyan; Hegde, Vinod

2009-02-20

Investigation of unexpected levels of impurities in Intron product has revealed the presence of low levels of impurities leached from the silicone tubing (Rehau RAU-SIK) on the Bosch filling line. In order to investigate the effect of these compounds (1a, 1b and 2) on humans, they were isolated, identified and synthesized. They were extracted from the tubing by stirring in Intron placebo at room temperature for 72 h, enriched on a reverse phase CHP-20P column eluted with a gradient of aqueous ACN, and separated by HPLC. Structural elucidation of 1a, 1b and 2 by MS and NMR studies demonstrated them to be halogenated biphenyl carboxylic acids. The structures were confirmed by independent synthesis. Levels of extractable impurities in first filled vials of actual production are estimated to be in the range of 0.01-0.55 μg/vial for each leached impurity. Potential toxicity of these extractables does not represent a risk for patients under the conditions of clinical use.
Nucleotide sequence of the ribosomal RNA gene of Physarum polycephalum: intron 2 and its flanking regions of the 26S rRNA gene.

PubMed Central

Nomiyama, H; Kuhara, S; Kukita, T; Otsuka, T; Sakaki, Y

1981-01-01

The 26S ribosomal RNA gene of Physarum polycephalum is interrupted by two introns, and we have previously determined the sequence of one of them (intron 1) (Nomiyama et al. Proc. Natl. Acad. Sci. USA 78, 1376-1380, 1981). In this study we sequenced the second intron (intron 2) of about 0.5 kb length and its flanking regions, and found that one nucleotide at each junction is identical in intron 1 and intron 2, though the junction regions share no other sequence homology. Comparison of the flanking exon sequences to E. coli 23S rRNA sequences shows that conserved sequences are interspersed with tracts having little homology. In particular, the region encompassing the intron 2 interruption site is highly conserved. The E. coli ribosomal protein L1 binding region is also conserved. PMID:6171776
Exon–intron organization of genes in the slime mold Physarum polycephalum

PubMed Central

Trzcinska-Danielewicz, Joanna; Fronk, Jan

2000-01-01

The slime mold Physarum polycephalum is a morphologically simple organism with a large and complex genome. The exon–intron organization of its genes exhibits features typical for protists and fungi as well as those characteristic for the evolutionarily more advanced species. This indicates that both the taxonomic position as well as the size of the genome shape the exon–intron organization of an organism. The average gene has 3.7 introns which are on average 138 bp, with a rather narrow size distribution. Introns are enriched in AT base pairs by 13% relative to exons. The consensus sequences at exon–intron boundaries resemble those found for other species, with minor differences between short and long introns. A unique feature of P. polycephalum introns is the strong preference for pyrimidines in the coding strand throughout their length, without a particular enrichment at the 3′-ends. PMID:10982858
SURVEY AND SUMMARY: exon-intron organization of genes in the slime mold Physarum polycephalum.

PubMed

Trzcinska-Danielewicz, J; Fronk, J

2000-09-15

The slime mold Physarum polycephalum is a morphologically simple organism with a large and complex genome. The exon-intron organization of its genes exhibits features typical for protists and fungi as well as those characteristic for the evolutionarily more advanced species. This indicates that both the taxonomic position as well as the size of the genome shape the exon-intron organization of an organism. The average gene has 3.7 introns which are on average 138 bp, with a rather narrow size distribution. Introns are enriched in AT base pairs by 13% relative to exons. The consensus sequences at exon-intron boundaries resemble those found for other species, with minor differences between short and long introns. A unique feature of P. polycephalum introns is the strong preference for pyrimidines in the coding strand throughout their length, without a particular enrichment at the 3'-ends.
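The composition statistics quoted in this abstract (AT content of introns relative to exons, and pyrimidine preference on the coding strand) are simple base-fraction calculations. The following is a minimal sketch of how such figures could be computed; the function names and the two toy sequences are hypothetical illustrations, not data or code from the paper, and the abstract does not say whether its 13% is an absolute or a relative difference.

```python
def base_fraction(seq, bases):
    """Fraction of unambiguous bases (A, C, G, T) in seq that belong to `bases`."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in bases for b in counted) / len(counted)


def at_fraction(seq):
    """Fraction of A or T bases."""
    return base_fraction(seq, "AT")


def pyrimidine_fraction(seq):
    """Fraction of C or T bases on the given strand."""
    return base_fraction(seq, "CT")


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    exon = "ATGGCCGAGCTGAAGGGC"
    intron = "GTAAGTTTCTTTCCTTTTCTCTAG"
    diff_points = 100 * (at_fraction(intron) - at_fraction(exon))
    print(f"AT content, exon:   {at_fraction(exon):.3f}")
    print(f"AT content, intron: {at_fraction(intron):.3f}")
    print(f"Intron minus exon:  {diff_points:.1f} percentage points")
    print(f"Intron pyrimidines: {pyrimidine_fraction(intron):.3f}")
```

On real data the same functions would be applied to concatenated exon and intron sequences taken from annotated gene models.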
Long Intergenic Noncoding RNAs Mediate the Human Chondrocyte Inflammatory Response and Are Differentially Expressed in Osteoarthritis Cartilage.

PubMed

Pearson, Mark J; Philp, Ashleigh M; Heward, James A; Roux, Benoit T; Walsh, David A; Davis, Edward T; Lindsay, Mark A; Jones, Simon W

2016-04-01

To identify long noncoding RNAs (lncRNAs), including long intergenic noncoding RNAs (lincRNAs), antisense RNAs, and pseudogenes, associated with the inflammatory response in human primary osteoarthritis (OA) chondrocytes and to explore their expression and function in OA. OA cartilage was obtained from patients with hip or knee OA following joint replacement surgery. Non-OA cartilage was obtained from postmortem donors and patients with fracture of the neck of the femur. Primary OA chondrocytes were isolated by collagenase digestion. LncRNA expression analysis was performed by RNA sequencing (RNAseq) and quantitative reverse transcriptase-polymerase chain reaction. Modulation of lncRNA chondrocyte expression was achieved using LNA longRNA GapmeRs (Exiqon). Cytokine production was measured with Luminex. RNAseq identified 983 lncRNAs in primary human hip OA chondrocytes, 183 of which had not previously been identified. Following interleukin-1β (IL-1β) stimulation, we identified 125 lincRNAs that were differentially expressed. The lincRNA p50-associated cyclooxygenase 2-extragenic RNA (PACER) and 2 novel chondrocyte inflammation-associated lincRNAs (CILinc01 and CILinc02) were differentially expressed in both knee and hip OA cartilage compared to non-OA cartilage. In primary OA chondrocytes, these lincRNAs were rapidly and transiently induced in response to multiple proinflammatory cytokines. Knockdown of CILinc01 and CILinc02 expression in human chondrocytes significantly enhanced the IL-1-stimulated secretion of proinflammatory cytokines. The inflammatory response in human OA chondrocytes is associated with widespread changes in the profile of lncRNAs, including PACER, CILinc01, and CILinc02. 
Differential expression of CILinc01 and CILinc02 in hip and knee OA cartilage, and their role in modulating cytokine production during the chondrocyte inflammatory response, suggest that they may play an important role in mediating inflammation-driven cartilage degeneration in OA. © 2016 The Authors. Arthritis & Rheumatology published by Wiley Periodicals, Inc. on behalf of the American College of Rheumatology.
A comprehensive catalogue of the coding and non-coding transcripts of the human inner ear

PubMed Central

Corneveaux, Jason J.; Ohmen, Jeffrey; White, Cory; Allen, April N.; Lusis, Aldons J.; Van Camp, Guy; Huentelman, Matthew J.; Friedman, Rick A.

2015-01-01

The mammalian inner ear consists of the cochlea and the vestibular labyrinth (utricle, saccule, and semicircular canals), which participate in both hearing and balance. Proper development and life-long function of these structures involves a highly complex coordinated system of spatial and temporal gene expression. The characterization of the inner ear transcriptome is likely important for the functional study of auditory and vestibular components, yet, primarily due to tissue unavailability, detailed expression catalogues of the human inner ear remain largely incomplete. We report here, for the first time, comprehensive transcriptome characterization of the adult human cochlea, ampulla, saccule and utricle of the vestibule obtained from patients without hearing abnormalities. Using RNA-Seq, we measured the expression of >50,000 predicted genes corresponding to approximately 200,000 transcripts, in the adult inner ear and compared it to 32 other human tissues. First, we identified genes preferentially expressed in the inner ear, and unique either to the vestibule or cochlea. Next, we examined expression levels of specific groups of potentially interesting RNAs, such as genes implicated in hearing loss, long non-coding RNAs, pseudogenes and transcripts subject to nonsense mediated decay (NMD). We uncover the spatial specificity of expression of these RNAs in the hearing/balance system, and reveal evidence of tissue specific NMD. Lastly, we investigated the non-syndromic deafness loci to which no gene has been mapped, and narrow the list of potential candidates for each locus. These data represent the first high-resolution transcriptome catalogue of the adult human inner ear. A comprehensive identification of coding and non-coding RNAs in the inner ear will enable pathways of auditory and vestibular function to be further defined in the study of hearing and balance. 
Expression data are freely accessible at https://www.tgen.org/home/research/research-divisions/neurogenomics/supplementary-data/inner-ear-transcriptome.aspx PMID:26341477
Genomic organization and mutational analysis of the human UCP2 gene, a prime candidate gene for human obesity.

PubMed

Lentes, K U; Tu, N; Chen, H; Winnikes, U; Reinert, I; Marmann, G; Pirke, K M

1999-01-01

Uncoupling proteins (UCPs) are mitochondrial membrane transporters which are involved in dissipating the proton electrochemical gradient, thereby releasing stored energy as heat. This implies a major role of UCPs in energy metabolism and thermogenesis, which when deregulated are key risk factors for the development of obesity and other eating disorders. Recent studies have shown that the sympathetic nervous system, via norepinephrine (beta-adrenoceptors) and cAMP, as well as thyroid hormones and PPAR gamma ligands, seem to be major regulators of UCP expression. Of the three different UCPs identified so far by gene cloning, UCP1 is expressed exclusively in brown adipocytes while UCP2 is widely expressed. The third analogue, UCP3, is expressed predominantly in human skeletal muscle and was found to exist in a long and a short form. At the amino acid level UCP2 has about 59% homology to UCP1 while UCP3 is 73% identical to UCP2. Both UCP2 and UCP3 were mapped in close proximity (75-150 kb) to regions of human chromosome 11 (11q13) that have been linked to obesity and hyper-insulinaemia. Furthermore, there is strong evidence that UCP2, by virtue of its ubiquitous expression, may be important for determining basal metabolic rate. Based on the published full-length cDNA sequence we have deduced the genomic structure of the human UCP2 (hUCP2) gene by PCR and direct sequence analysis. The hUCP2 gene spans over 8.4 kb distributed on 8 exons. The localization of the exon/intron boundaries within the coding region matches precisely the one found in the human UCP1 gene and is almost conserved in the recently discovered UCP3 gene as well. However, the size of each of the introns in the hUCP2 gene differs from its UCP1 and UCP3 counterparts. It varies from 81 bp (intron 5) to about 3 kb (intron 2).
The high degree of homology at the nucleotide level and the conservation of the exon/intron boundaries among the three UCP genes suggest that they may have evolved from a common ancestor or are the result of gene duplication events. Mutational analysis of the hUCP2 gene in a cohort of 25 children of Caucasian origin (aged 7-13) characterized by low BMR values revealed a point mutation in exon 4 (C to T transition at position 164 of the corresponding cDNA resulting in the substitution of an alanine residue by a valine at codon 55) and an insertion polymorphism in exon 8. The insertion polymorphism consists of a 45 bp repeat located 150 bp downstream of the stop codon in the 3'-UTR. The allele frequencies were 0.61 and 0.39 for the alanine and valine encoded alleles, respectively, and 0.71 versus 0.29 for the insertion polymorphism. Expression studies of the wildtype and mutant forms of UCP2 should clarify the functional consequences these mutations may have on energy metabolism and body weight regulation. In addition, mapping of the promoter region and the identification of putative promoter regulatory sequences should give insight into the transcriptional regulation of UCP2 expression--in particular by any one of the above-mentioned factors--in vitro and in vivo.
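The reported substitution can be checked with simple codon arithmetic. The sketch below assumes that position 164 is counted 1-based from the first base of the start codon (the abstract does not state the numbering convention) and uses the standard genetic code; the function names are hypothetical.

```python
def codon_of(coding_pos):
    """Map a 1-based coding-sequence position to (codon number, base within codon), both 1-based."""
    if coding_pos < 1:
        raise ValueError("position must be >= 1")
    return (coding_pos - 1) // 3 + 1, (coding_pos - 1) % 3 + 1


# Standard genetic code: alanine is GCN, valine is GTN.
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def c_to_t_at_second_base(codon):
    """Apply a C-to-T transition at the middle base of a codon."""
    if codon[1] != "C":
        raise ValueError("middle base is not C")
    return codon[0] + "T" + codon[2]


if __name__ == "__main__":
    codon, offset = codon_of(164)
    print(codon, offset)  # 55 2
    for ala in sorted(ALA_CODONS):
        print(ala, "->", c_to_t_at_second_base(ala))
```

Under that numbering assumption, position 164 is the middle base of codon 55, and a C to T transition at the middle base of any alanine codon (GCN) yields a valine codon (GTN), which is consistent with the Ala55Val change described.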
Assessing information content and interactive relationships of subgenomic DNA sequences of the MHC using complexity theory approaches based on the non-extensive statistical mechanics

NASA Astrophysics Data System (ADS)

Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.

2018-09-01

This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.
Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

PubMed

Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

2009-03-31

The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. 
Therefore, our molecular trees support Ohwi's original treatment of Megaleranthis saniculifolia as Trollius chosenensis Ohwi.
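The region lengths quoted in the abstract can be checked against the reported genome size with a short calculation. This is only an arithmetic sketch assuming the usual quadripartite layout (one LSC, one SSC, two IR copies); the variable names are illustrative.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 88_326   # large single-copy region
SSC = 18_382   # small single-copy region
IR = 26_608    # one inverted-repeat copy; the genome carries two

# Quadripartite chloroplast genome: LSC + SSC + 2 x IR.
total = LSC + SSC + 2 * IR
ir_share = 2 * IR / total

print(total)               # 159924, matching the reported genome length
print(f"{ir_share:.1%}")   # share of the genome occupied by the two IR copies
```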
Evolution of Mhc-DRB introns: implications for the origin of primates.

PubMed

Kupfermann, H; Satta, Y; Takahata, N; Tichy, H; Klein, J

1999-06-01

Introns are generally believed to evolve too rapidly and too erratically to be of much use in phylogenetic reconstructions. Few phylogenetically informative intron sequences are available, however, to ascertain the validity of this supposition. In the present study the supposition was tested on the example of the mammalian class II major histocompatibility complex (Mhc) genes of the DRB family. Since the Mhc genes evolve under balancing selection and are believed to recombine or rearrange frequently, the evolution of their introns could be expected to be particularly rapid and subject to scrambling. Sequences of intron 4 and 5 DRB genes were obtained from polymerase chain reaction-amplified fragments of genomic DNA from representatives of six eutherian orders-Primates, Scandentia, Chiroptera, Dermoptera, Lagomorpha, and Insectivora. Although short stretches of the introns have indeed proved to be unalignable, the bulk of the intron sequences from all six orders, spanning >85 million years (my) of evolution, could be aligned and used in a study of the tempo and mode of intron evolution. The analysis has revealed the Mhc introns to evolve at a rate similar to that of other genes and of synonymous sites of non-Mhc genes. No evidence of homogenization or large-scale scrambling of the intron sequences could be found. The Mhc introns apparently evolve largely by point mutations and insertions/deletions. The phylogenetic signals contained in the intron sequences could be used to identify Scandentia as the sister group of Primates, to support the existence of the Archonta superorder, and to confirm the monophyly of the Chiroptera.
Molecular gene organisation and secondary structure of the mitochondrial large subunit ribosomal RNA from the cultivated Basidiomycota Agrocybe aegerita: a 13 kb gene possessing six unusual nucleotide extensions and eight introns.

PubMed

Gonzalez, P; Barroso, G; Labarère, J

1999-04-01

The complete gene sequence and secondary structure of the mitochondrial LSU rRNA from the cultivated Basidiomycota Agrocybe aegerita was derived by chromosome walking. The A.aegerita LSU rRNA gene (13 526 nt) represents, to date, the longest described, due to the highest number of introns (eight) and the occurrence of six long nucleotidic extensions. Seven introns belong to group I, while the intronic sequence i5 constitutes the first typical group II intron reported in a fungal mitochondrial LSU rDNA. As with most fungal LSU rDNA introns reported to date, four introns (i5-i8) are distributed in domain V associated with the peptidyl-transferase activity. One intron (i1) is located in domain I, and three (i2-i4) in domain II. The introns i2-i8 possess homologies with other fungal, algal or protozoan introns located at the same position in LSU rDNAs. One of them (i6) is located at the same insertion site as most Ascomycota or algae LSU introns, suggesting a possible inheritance from a common ancestor. On the contrary, intron i1 is located at a so-far unreported insertion site. Among the six unusual nucleotide extensions, five are located in domain I and one in domain V. This is the first report of a mitochondrial LSU rRNA gene sequence and secondary structure for the whole Basidiomycota division.
Two Virus-Induced MicroRNAs Known Only from Teleost Fishes Are Orthologues of MicroRNAs Involved in Cell Cycle Control in Humans

PubMed Central

Schyth, Brian Dall; Bela-ong, Dennis Berbulla; Jalali, Seyed Amir Hossein; Kristensen, Lasse Bøgelund Juel; Einer-Jensen, Katja; Pedersen, Finn Skou; Lorenzen, Niels

2015-01-01

MicroRNAs (miRNAs) are ~22 base pair-long non-coding RNAs which regulate gene expression in the cytoplasm of eukaryotic cells by binding to specific target regions in mRNAs to mediate transcriptional blocking or mRNA cleavage. Through their fundamental roles in cellular pathways, gene regulation mediated by miRNAs has been shown to be involved in almost all biological phenomena, including development, metabolism, cell cycle, tumor formation, and host-pathogen interactions. To address the latter in a primitive vertebrate host, we here used an array platform to analyze the miRNA response in rainbow trout (Oncorhynchus mykiss) following inoculation with the virulent fish rhabdovirus Viral hemorrhagic septicaemia virus. Two clustered miRNAs, miR-462 and miR-731 (herein referred to as miR-462 cluster), described only in teleost fishes, were found to be strongly upregulated, indicating their involvement in fish-virus interactions. We searched for homologues of the two teleost miRNAs in other vertebrate species and investigated whether findings related to ours have been reported for these homologues. Gene synteny analysis along with gene sequence conservation suggested that the teleost fish miR-462 and miR-731 had evolved from the ancestral miR-191 and miR-425 (herein called miR-191 cluster), respectively. Whereas the miR-462 cluster locus is found between two protein-coding genes (intergenic) in teleost fish genomes, the miR-191 cluster locus is found within an intron of a protein-coding gene (intragenic) in the human genome. Interferon (IFN)-inducible and immune-related promoter elements found upstream of the teleost miR-462 cluster locus suggested roles in immune responses to viral pathogens in fish, while in humans, the miR-191 cluster is functionally associated with cell cycle regulation. 
Stimulation of fish cell cultures with the IFN inducer poly I:C accordingly upregulated the expression of miR-462 and miR-731, while no stimulatory effect on miR-191 and miR-425 expression was observed in human cell lines. Despite high sequence conservation, evolution has thus resulted in different regulation and presumably also different functional roles of these orthologous miRNA clusters in different vertebrate lineages. PMID:26207374
Functions of the RNA Editing Enzyme ADAR1 and Their Relevance to Human Diseases.

PubMed

Song, Chunzi; Sakurai, Masayuki; Shiromoto, Yusuke; Nishikura, Kazuko

2016-12-17

Adenosine deaminases acting on RNA (ADARs) convert adenosine to inosine in double-stranded RNA (dsRNA). Among the three types of mammalian ADARs, ADAR1 has long been recognized as an essential enzyme for normal development. The interferon-inducible ADAR1p150 is involved in immune responses to both exogenous and endogenous triggers, whereas the functions of the constitutively expressed ADAR1p110 are variable. Recent findings that ADAR1 is involved in the recognition of self versus non-self dsRNA provide potential explanations for its links to hematopoiesis, type I interferonopathies, and viral infections. Editing in both coding and noncoding sequences results in diseases ranging from cancers to neurological abnormalities. Furthermore, editing of noncoding sequences, like microRNAs, can regulate protein expression, while editing of Alu sequences can affect translational efficiency and editing of proximal sequences. Novel identifications of long noncoding RNA and retrotransposons as editing targets further expand the effects of A-to-I editing. Besides editing, ADAR1 also interacts with other dsRNA-binding proteins in editing-independent manners. Elucidating the disease-specific patterns of editing and/or ADAR1 expression may be useful in making diagnoses and prognoses. In this review, we relate the mechanisms of ADAR1's actions to its pathological implications, and suggest possible mechanisms for the unexplained associations between ADAR1 and human diseases.

Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening

PubMed Central

Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride

2009-01-01

To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368
Non-coding RNA in cystic fibrosis.

PubMed

Glasgow, Arlene M A; De Santi, Chiara; Greene, Catherine M

2018-05-09

Non-coding RNAs (ncRNAs) are an abundant class of RNAs that include small ncRNAs, long non-coding RNAs (lncRNA) and pseudogenes. The human ncRNA atlas includes thousands of these specialised RNA molecules that are further subcategorised based on their size or function. Two of the more well-known and widely studied ncRNA species are microRNAs (miRNAs) and lncRNAs. These are regulatory RNAs and their altered expression has been implicated in the pathogenesis of a variety of human diseases. Failure to express a functional cystic fibrosis (CF) transmembrane conductance regulator (CFTR) chloride ion channel in epithelial cells underpins CF. Secondary to the CFTR defect, it is known that other pathways can be altered and these may contribute to the pathophysiology of CF lung disease in particular. For example, quantitative alterations in expression of some ncRNAs are associated with CF. In recent years, there has been a series of published studies exploring ncRNA expression and function in CF. The majority have focussed principally on miRNAs, with just a handful of reports to date on lncRNAs. The present study reviews what is currently known about ncRNA expression and function in CF, and discusses the possibility of applying this knowledge to the clinical management of CF in the near future. © 2018 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
Forks in the tracks: Group II introns, spliceosomes, telomeres and beyond.

PubMed

Agrawal, Rajendra Kumar; Wang, Hong-Wei; Belfort, Marlene

2016-12-01

Group II introns are large catalytic RNAs that form a ribonucleoprotein (RNP) complex by binding to an intron-encoded protein (IEP). The IEP, which facilitates both RNA splicing and intron mobility, has multiple activities including reverse transcriptase. Recent structures of a group II intron RNP complex and of IEPs from diverse bacteria fuel arguments that group II introns are ancestrally related to eukaryotic spliceosomes as well as to telomerase and viruses. Furthermore, recent structural studies of various functional states of the spliceosome allow us to draw parallels between the group II intron RNP and the spliceosome. Here we present an overview of these studies, with an emphasis on the structure of the IEPs in their isolated and RNA-bound states and on their evolutionary relatedness. In addition, we address the conundrum of the free, albeit truncated IEPs forming dimers, whereas the IEP bound to the intron ribozyme is a monomer in the mature RNP. Future studies needed to resolve some of the outstanding issues related to group II intron RNP function and dynamics are also discussed.
The human serotonin 5-HT2C receptor: Complete cDNA, genomic structure, and alternatively spliced variant

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Enzhong; Zhu, Lingyu; Zhao, Lingyun

1996-08-01

The complete 4775-nt cDNA encoding the human serotonin 5-HT2C receptor (5-HT2CR), a G-protein-coupled receptor, has been isolated. It contains a 1377-nt coding region flanked by a 728-nt 5′-untranslated region and a 2670-nt 3′-untranslated region. By using the cloned 5-HT2CR cDNA probe, the complete human gene for this receptor has been isolated and shown to contain six exons and five introns spanning at least 230 kb of DNA. The coding region of the human 5-HT2CR gene is interrupted by three introns, and the positions of the intron/exon junctions are conserved between the human and the rodent genes. In addition, an alternatively spliced 5-HT2CR RNA that contains a 95-nt deletion in the region coding for the second intracellular loop and the fourth transmembrane domain of the receptor has been identified. This deletion leads to a frameshift and premature termination so that the short isoform RNA encodes a putative protein of 248 amino acids. The ratio of the short isoform to the 5-HT2CR RNA was found to be higher in choroid plexus tumor than in normal brain tissue, suggesting the possibility of differential regulation of the 5-HT2CR gene in different neural tissues or during tumorigenesis. Transcription of the human 5-HT2CR gene was found to be initiated at multiple sites. No classical TATA-box sequence was found at the appropriate location, and the 5′-flanking sequence contains many potential transcription factor-binding sites. A 7.3-kb 5′-flanking 5-HT2CR DNA directed the efficient expression of a luciferase reporter gene in SK-N-SH and IMR32 neuroblastoma cells, indicating that it contains a functional promoter. 69 refs., 8 figs., 1 tab.
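The lengths given in this abstract are internally consistent, and the frameshift follows directly from the size of the deletion. The sketch below only redoes that arithmetic; it assumes the 1377-nt coding region includes the stop codon, which the abstract does not say explicitly, and the variable names are illustrative.

```python
# Lengths in nucleotides, as reported in the abstract.
FIVE_PRIME_UTR = 728
CODING_REGION = 1377
THREE_PRIME_UTR = 2670
DELETION = 95  # size of the deletion in the short splice variant

# The three segments should add up to the full cDNA length.
cdna_length = FIVE_PRIME_UTR + CODING_REGION + THREE_PRIME_UTR

# Number of codons in the coding region (assumed to include the stop codon).
codons = CODING_REGION // 3

# A deletion shifts the reading frame when its length is not a multiple of 3.
causes_frameshift = DELETION % 3 != 0

print(cdna_length)        # 4775
print(codons)             # 459
print(causes_frameshift)  # True, since 95 = 3 * 31 + 2
```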
Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease.

PubMed

Carss, Keren J; Arno, Gavin; Erwood, Marie; Stephens, Jonathan; Sanchis-Juan, Alba; Hull, Sarah; Megy, Karyn; Grozeva, Detelina; Dewhurst, Eleanor; Malka, Samantha; Plagnol, Vincent; Penkett, Christopher; Stirrups, Kathleen; Rizzo, Roberta; Wright, Genevieve; Josifova, Dragana; Bitner-Glindzicz, Maria; Scott, Richard H; Clement, Emma; Allen, Louise; Armstrong, Ruth; Brady, Angela F; Carmichael, Jenny; Chitre, Manali; Henderson, Robert H H; Hurst, Jane; MacLaren, Robert E; Murphy, Elaine; Paterson, Joan; Rosser, Elisabeth; Thompson, Dorothy A; Wakeling, Emma; Ouwehand, Willem H; Michaelides, Michel; Moore, Anthony T; Webster, Andrew R; Raymond, F Lucy

2017-01-05

Inherited retinal disease is a common cause of visual impairment and represents a highly heterogeneous group of conditions. Here, we present findings from a cohort of 722 individuals with inherited retinal disease, who have had whole-genome sequencing (n = 605), whole-exome sequencing (n = 72), or both (n = 45) performed, as part of the NIHR-BioResource Rare Diseases research study. We identified pathogenic variants (single-nucleotide variants, indels, or structural variants) for 404/722 (56%) individuals. Whole-genome sequencing gives unprecedented power to detect three categories of pathogenic variants in particular: structural variants, variants in GC-rich regions, which have significantly improved coverage compared to whole-exome sequencing, and variants in non-coding regulatory regions. In addition to previously reported pathogenic regulatory variants, we have identified a previously unreported pathogenic intronic variant in CHM in two males with choroideremia. We have also identified 19 genes not previously known to be associated with inherited retinal disease, which harbor biallelic predicted protein-truncating variants in unsolved cases. Whole-genome sequencing is an increasingly important comprehensive method with which to investigate the genetic causes of inherited retinal disease. Copyright © 2017. Published by Elsevier Inc.
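The cohort figures in the abstract can be cross-checked with a few lines of arithmetic. This is a sketch using only the numbers quoted above; the variable names are illustrative.

```python
# Cohort breakdown by sequencing method, as reported in the abstract.
wgs_only = 605
wes_only = 72
both = 45
solved = 404  # individuals with an identified pathogenic variant

cohort = wgs_only + wes_only + both   # total individuals
any_wgs = wgs_only + both             # individuals with whole-genome data
diagnostic_yield = solved / cohort

print(cohort)                      # 722
print(f"{diagnostic_yield:.0%}")   # 56%
```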
A computational search for box C/D snoRNA genes in the Drosophila melanogaster genome.

PubMed

Accardo, M C; Giordano, E; Riccardo, S; Digilio, F A; Iazzetti, G; Calogero, R A; Furia, M

2004-12-12

In eukaryotes, the family of non-coding RNA genes includes a number of genes encoding small nucleolar RNAs (mainly C/D and H/ACA snoRNAs), which act as guides in the maturation or post-transcriptional modification of target RNA molecules. Since in Drosophila melanogaster (Dm) only a few examples of snoRNAs have been identified so far by cDNA library screening, integration of the molecular data with in silico identification of these types of genes could throw light on their organization in the Dm genome. We have performed a computational screening of the Dm genome for C/D snoRNA genes, followed by experimental validation of the putative candidates. Few of the 26 confirmed snoRNAs had been recognized by cDNA library analysis. Organization of the Dm genome was also found to be more variegated than previously suspected, with snoRNA genes nested in both the introns and exons of protein-coding genes. This finding suggests the presence of additional mechanisms of snoRNA biogenesis based on the alternative production of overlapping mRNA/snoRNA molecules. Additional information is available at http://www.bioinformatica.unito.it/bioinformatics/snoRNAs.
Connecting the dots: chromatin and alternative splicing in EMT

PubMed Central

Warns, Jessica A.; Davie, James R.; Dhasarathy, Archana

2015-01-01

Nature has devised sophisticated cellular machinery to process mRNA transcripts produced by RNA Polymerase II, removing intronic regions and connecting exons together, to produce mature RNAs. This process, known as splicing, is very closely linked to transcription. Alternative splicing, or the ability to produce different combinations of exons that are spliced together from the same genomic template, is a fundamental means of regulating protein complexity. Similar to transcription, both constitutive and alternative splicing can be regulated by chromatin and its associated factors in response to various signal transduction pathways activated by external stimuli. This regulation can vary between different cell types, and interference with these pathways can lead to changes in splicing, often resulting in aberrant cellular states and disease. The epithelial to mesenchymal transition (EMT), which leads to cancer metastasis, is influenced by alternative splicing events of chromatin remodelers and epigenetic factors such as DNA methylation and non-coding RNAs. In this review, we will discuss the role of epigenetic factors including chromatin, chromatin remodelers, DNA methyltransferases and microRNAs in the context of alternative splicing, and discuss their potential involvement in alternative splicing during the EMT process. PMID:26291837
Function and regulation of AUTS2, a gene implicated in autism and human evolution.

PubMed

Oksenberg, Nir; Stevison, Laurie; Wall, Jeffrey D; Ahituv, Nadav

2013-01-01

Nucleotide changes in the AUTS2 locus, some of which affect only noncoding regions, are associated with autism and other neurological disorders, including attention deficit hyperactivity disorder, epilepsy, dyslexia, motor delay, language delay, visual impairment, microcephaly, and alcohol consumption. In addition, AUTS2 contains the most significantly accelerated genomic region differentiating humans from Neanderthals, which is primarily composed of noncoding variants. However, the function and regulation of this gene remain largely unknown. To characterize auts2 function, we knocked it down in zebrafish, leading to a smaller head size, neuronal reduction, and decreased mobility. To characterize AUTS2 regulatory elements, we tested sequences for enhancer activity in zebrafish and mice. We identified 23 functional zebrafish enhancers, 10 of which were active in the brain. Our mouse enhancer assays characterized three mouse brain enhancers that overlap an ASD-associated deletion and four mouse enhancers that reside in regions implicated in human evolution, two of which are active in the brain. Combined, our results show that AUTS2 is important for neurodevelopment and expose candidate enhancer sequences in which nucleotide variation could lead to neurological disease and human-specific traits.
Mitochondrial genes in the colourless alga Prototheca wickerhamii resemble plant genes in their exons but fungal genes in their introns.

PubMed Central

Wolff, G; Burger, G; Lang, B F; Kück, U

1993-01-01

The mitochondrial DNA from the colourless alga Prototheca wickerhamii contains two mosaic genes, as was revealed by complete sequencing of the circular extranuclear genome. The genes for the large subunit of the ribosomal RNA (LSUrRNA) as well as for subunit I of the cytochrome oxidase (coxI) carry two and three intronic sequences, respectively. On the basis of their canonical nucleotide sequences they can be classified as group I introns. Phylogenetic comparisons of the coxI protein sequences allow us to conclude that the P. wickerhamii mtDNA is much more closely related to higher plant mtDNAs than to those of the chlorophyte alga C. reinhardtii. The comparison of the intron sequences revealed several unusual features: (1) The P. wickerhamii introns are structurally related to mitochondrial introns from various ascomycetous fungi. (2) Phylogenetic analyses indicate a close relationship between fungal and algal intronic sequences. (3) The P. wickerhamii introns are located at positions within the structural genes which can be considered as preferred intron insertion sites in homologous mitochondrial genes from fungi or liverwort. In all cases, the sequences adjacent to the insertion sites are very well conserved over large evolutionary distances. Our finding of highly similar introns in fungi and algae is consistent with the idea that introns were already present in the bacterial ancestors of present-day mitochondria and evolved concomitantly with the organelles. PMID:7680126
Mutation in a primate-conserved retrotransposon reveals a noncoding RNA as a mediator of infantile encephalopathy

PubMed Central

Cartault, François; Munier, Patrick; Benko, Edgar; Desguerre, Isabelle; Hanein, Sylvain; Boddaert, Nathalie; Bandiera, Simonetta; Vellayoudom, Jeanine; Krejbich-Trotot, Pascale; Bintner, Marc; Hoarau, Jean-Jacques; Girard, Muriel; Génin, Emmanuelle; de Lonlay, Pascale; Fourmaintraux, Alain; Naville, Magali; Rodriguez, Diana; Feingold, Josué; Renouil, Michel; Munnich, Arnold; Westhof, Eric; Fähling, Michael; Lyonnet, Stanislas; Henrion-Caude, Alexandra

2012-01-01

The human genome is densely populated with transposons and transposon-like repetitive elements. Although the impact of these transposons and elements on human genome evolution is recognized, the significance of subtle variations in their sequence remains mostly unexplored. Here we report homozygosity mapping of an infantile neurodegenerative disease locus in a genetic isolate. Complete DNA sequencing of the 400-kb linkage locus revealed a point mutation in a primate-specific retrotransposon that was transcribed as part of a unique noncoding RNA, which was expressed in the brain. In vitro knockdown of this RNA increased neuronal apoptosis, consistent with the inappropriate dosage of this RNA in vivo and with the phenotype. Moreover, structural analysis of the sequence revealed a small RNA-like hairpin that was consistent with the putative gain of a functional site when mutated. We show here that a mutation in a unique transposable element-containing RNA is associated with lethal encephalopathy, and we suggest that RNAs that harbor evolutionarily recent repetitive elements may play important roles in human brain development. PMID:22411793
Genetic Variation among Major Human Geographic Groups Supports a Peculiar Evolutionary Trend in PAX9

PubMed Central

Paixão-Côrtes, Vanessa R.; Meyer, Diogo; Pereira, Tiago V.; Mazières, Stéphane; Elion, Jacques; Krishnamoorthy, Rajagopal; Zago, Marco A.; Silva, Wilson A.; Salzano, Francisco M.; Bortolini, Maria Cátira

2011-01-01

A total of 172 persons from nine South Amerindian, three African and one Eskimo populations were studied in relation to the Paired box gene 9 (PAX9) exon 3 (138 base pairs) as well as its 5′ and 3′ flanking intronic segments (232 bp and 220 bp, respectively) and integrated with the information available for the same genetic region from individuals of different geographical origins. Nine mutations were scored in exon 3 and six in its flanking regions; four of them are new South American tribe-specific singletons. Exon 3 nucleotide diversity is several orders of magnitude higher than that of its intronic regions. Additionally, a set of variants in PAX9 and 101 other genes related to dentition can define at least some dental morphological differences between Sub-Saharan Africans and non-Africans, probably associated with adaptations after the modern human exodus from Africa. Exon 3 of PAX9 could be a good molecular example of how evolvability works. PMID:21298044
Computational Identification and Functional Predictions of Long Noncoding RNA in Zea mays

PubMed Central

Boerner, Susan; McGinnis, Karen M.

2012-01-01

Background Computational analysis of cDNA sequences from multiple organisms suggests that a large portion of transcribed DNA does not code for a functional protein. In mammals, noncoding transcription is abundant, and often results in functional RNA molecules that do not appear to encode proteins. Many long noncoding RNAs (lncRNAs) appear to have epigenetic regulatory function in humans, including HOTAIR and XIST. While epigenetic gene regulation is clearly an essential mechanism in plants, relatively little is known about the presence or function of lncRNAs in plants. Methodology/Principal Findings To explore the connection between lncRNA and epigenetic regulation of gene expression in plants, a computational pipeline using the programming language Python has been developed and applied to maize full length cDNA sequences to identify, classify, and localize potential lncRNAs. The pipeline was used in parallel with an SVM tool for identifying ncRNAs to identify the maximal number of ncRNAs in the dataset. Although the available library of sequences was small and potentially biased toward protein coding transcripts, 15% of the sequences were predicted to be noncoding. Approximately 60% of these sequences appear to act as precursors for small RNA molecules and may function to regulate gene expression via a small RNA dependent mechanism. ncRNAs were predicted to originate from both genic and intergenic loci. Of the lncRNAs that originated from genic loci, ∼20% were antisense to the host gene loci. Conclusions/Significance Consistent with similar studies in other organisms, noncoding transcription appears to be widespread in the maize genome. Computational predictions indicate that maize lncRNAs may function to regulate expression of other genes through multiple RNA mediated mechanisms. PMID:22916204
Behind the curtain of non-coding RNAs; long non-coding RNAs regulating hepatocarcinogenesis

PubMed Central

El Khodiry, Aya; Afify, Menna; El Tayebi, Hend M

2018-01-01

Hepatocellular carcinoma (HCC) is one of the most common and aggressive cancers worldwide. HCC is the fifth most common malignancy in the world and the second leading cause of cancer death in Asia. Long non-coding RNAs (lncRNAs) are RNAs with a length greater than 200 nucleotides that do not encode proteins. lncRNAs can regulate gene expression and protein synthesis in several ways by interacting with DNA, RNA and proteins in a sequence-specific manner. They could regulate cellular and developmental processes through either gene inhibition or gene activation. Many studies have shown that dysregulation of lncRNAs is related to many human diseases such as cardiovascular diseases, genetic disorders, neurological diseases, immune-mediated disorders and cancers. However, the study of lncRNAs is challenging because they are poorly conserved between species, their expression levels are not as high as those of mRNAs, and they show great interpatient variation. The study of lncRNA expression in cancers has been a breakthrough as it unveils potential biomarkers and drug targets for cancer therapy and helps to explain the mechanism of pathogenesis. This review discusses many long non-coding RNAs and their contribution to HCC, their role in the development, metastasis, and prognosis of HCC, and how to regulate and target these lncRNAs as a therapeutic tool in future HCC treatment. PMID:29434445
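The working definition above (longer than 200 nucleotides, not protein-coding) can be turned into a rough first-pass filter. The sketch below is only an illustration: the 200-nucleotide cutoff comes from the text, while the 100-codon ORF limit, the function names, and the forward-strand-only ORF scan are assumptions made here and are not taken from any of the studies listed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq):
    """Longest ATG-to-stop ORF on the forward strand, in codons (stop excluded).

    ORFs that run off the end of the sequence without a stop are ignored.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        run = None  # codons counted since the current ATG; None when outside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if run is None:
                if codon == "ATG":
                    run = 1
            elif codon in STOP_CODONS:
                best = max(best, run)
                run = None
            else:
                run += 1
    return best


def looks_like_lncrna(seq, min_length_nt=200, max_orf_codons=100):
    """Heuristic: longer than min_length_nt and no ORF above max_orf_codons (assumed limit)."""
    return len(seq) > min_length_nt and longest_orf_codons(seq) <= max_orf_codons
```

A real classifier would also consider the reverse strand, sequence conservation, and coding-potential scores; this sketch only mirrors the two criteria named in the definition.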
Transcription profiling suggests that mitochondrial topoisomerase IB acts as a topological barrier and regulator of mitochondrial DNA transcription.

PubMed

Dalla Rosa, Ilaria; Zhang, Hongliang; Khiati, Salim; Wu, Xiaolin; Pommier, Yves

2017-12-08

Mitochondrial DNA (mtDNA) is essential for cell viability because it encodes subunits of the respiratory chain complexes. Mitochondrial topoisomerase IB (TOP1MT) facilitates mtDNA replication by removing DNA topological tensions produced during mtDNA transcription, but it appears to be dispensable. To test whether cells lacking TOP1MT have aberrant mtDNA transcription, we performed mitochondrial transcriptome profiling. To that end, we designed and implemented a customized tiling array, which enabled genome-wide, strand-specific, and simultaneous detection of all mitochondrial transcripts. Our technique revealed that Top1mt KO mouse cells process the mitochondrial transcripts normally but that protein-coding mitochondrial transcripts are elevated. Moreover, we found discrete long noncoding RNAs produced by H-strand transcription and encompassing the noncoding regulatory region of mtDNA in human and murine cells and tissues. Of note, these noncoding RNAs were strongly up-regulated in the absence of TOP1MT. In contrast, 7S DNA, produced by mtDNA replication, was reduced in the Top1mt KO cells. We propose that the long noncoding RNA species in the D-loop region are generated by the extension of H-strand transcripts beyond their canonical stop site and that TOP1MT acts as a topological barrier and regulator for mtDNA transcription and D-loop formation.
Retroviral vectors encoding ADA regulatory locus control region provide enhanced T-cell-specific transgene expression.

PubMed

Trinh, Alice T; Ball, Bret G; Weber, Erin; Gallaher, Timothy K; Gluzman-Poltorak, Zoya; Anderson, French; Basile, Lena A

2009-12-30

Murine retroviral vectors have been used in several hundred gene therapy clinical trials, but have fallen out of favor for a number of reasons. One issue is that gene expression from viral or internal promoters is highly variable and essentially unregulated. Moreover, with retroviral vectors, gene expression is usually silenced over time. Mammalian genes, in contrast, are characterized by highly regulated, precise levels of expression in both a temporal and a cell-specific manner. To ascertain whether recapitulation of endogenous adenosine deaminase (ADA) expression can be achieved in a vector construct, we created a new series of Moloney murine leukemia virus (MuLV) based retroviral vectors that carry human regulatory elements including combinations of the ADA promoter, the ADA locus control region (LCR), ADA introns and human polyadenylation sequences in a self-inactivating vector backbone. A MuLV-based retroviral vector with a self-inactivating (SIN) backbone, the phosphoglycerate kinase promoter (PGK) and the enhanced green fluorescent protein (eGFP) as a reporter gene was generated. Subsequent vectors were constructed from this basic vector by deletion or addition of certain elements. The added elements that were assessed are the human ADA promoter, human ADA locus control region (LCR), introns 7, 8, and 11 from the human ADA gene, and human growth hormone polyadenylation signal. Retroviral vector particles were produced by transient three-plasmid transfection of 293T cells. Retroviral vectors encoding eGFP were titered by transducing 293A cells, and then the proportion of GFP-positive cells was determined using fluorescence-activated cell sorting (FACS). Non-T-cell and T-cell lines were transduced at a multiplicity of infection (MOI) of 0.1 and the yield of eGFP transgene expression was evaluated by FACS analysis using mean fluorescence intensity (MFI) detection. Vectors that contained the ADA LCR were preferentially expressed in T-cell lines.
Further improvements in T-cell specific gene expression were observed with the incorporation of additional cis-regulatory elements, such as a human polyadenylation signal and intron 7 from the human ADA gene. These studies suggest that the combination of an authentically regulated ADA gene in a murine retroviral vector, together with additional locus-specific regulatory refinements, will yield a vector with a safer profile and greater efficacy in terms of high-level, therapeutic, regulated gene expression for the treatment of ADA-deficient severe combined immunodeficiency.
Parallel Loss of Plastid Introns and Their Maturase in the Genus Cuscuta

PubMed Central

McNeal, Joel R.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Leebens-Mack, Jim; dePamphilis, Claude W.

2009-01-01

Plastid genome content and arrangement are highly conserved across most land plants and their closest relatives, streptophyte algae, with nearly all plastid introns having invaded the genome in their common ancestor at least 450 million years ago. One such intron, within the transfer RNA trnK-UUU, contains a large open reading frame that encodes a presumed intron maturase, matK. This gene is missing from the plastid genomes of two species in the parasitic plant genus Cuscuta but is found in all other published land plant and streptophyte algal plastid genomes, including that of the nonphotosynthetic angiosperm Epifagus virginiana and two other species of Cuscuta. By examining matK and plastid intron distribution in Cuscuta, we add support to the hypothesis that its normal role is in splicing seven of the eight group IIA introns in the genome. We also analyze matK nucleotide sequences from Cuscuta species and relatives that retain matK to test whether changes in selective pressure in the maturase are associated with intron deletion. Stepwise loss of most group IIA introns from the plastid genome results in substantial change in selective pressure within the hypothetical RNA-binding domain of matK in both Cuscuta and Epifagus, either through evolution from a generalist to a specialist intron splicer or due to loss of a particular intron responsible for most of the constraint on the binding region. The possibility of intron-specific specialization in the X-domain is implicated by evidence of positive selection on the lineage leading to C. nitida in association with the loss of six of seven introns putatively spliced by matK. Moreover, transfer RNA gene deletion facilitated by parasitism combined with an unusually high rate of intron loss from remaining functional plastid genes created a unique circumstance on the lineage leading to Cuscuta subgenus Grammica that allowed elimination of matK in the most species-rich lineage of Cuscuta. PMID:19543388
Parallel loss of plastid introns and their maturase in the genus Cuscuta.

PubMed

McNeal, Joel R; Kuehl, Jennifer V; Boore, Jeffrey L; Leebens-Mack, Jim; dePamphilis, Claude W

2009-06-19

Plastid genome content and arrangement are highly conserved across most land plants and their closest relatives, streptophyte algae, with nearly all plastid introns having invaded the genome in their common ancestor at least 450 million years ago. One such intron, within the transfer RNA trnK-UUU, contains a large open reading frame that encodes a presumed intron maturase, matK. This gene is missing from the plastid genomes of two species in the parasitic plant genus Cuscuta but is found in all other published land plant and streptophyte algal plastid genomes, including that of the nonphotosynthetic angiosperm Epifagus virginiana and two other species of Cuscuta. By examining matK and plastid intron distribution in Cuscuta, we add support to the hypothesis that its normal role is in splicing seven of the eight group IIA introns in the genome. We also analyze matK nucleotide sequences from Cuscuta species and relatives that retain matK to test whether changes in selective pressure in the maturase are associated with intron deletion. Stepwise loss of most group IIA introns from the plastid genome results in substantial change in selective pressure within the hypothetical RNA-binding domain of matK in both Cuscuta and Epifagus, either through evolution from a generalist to a specialist intron splicer or due to loss of a particular intron responsible for most of the constraint on the binding region. The possibility of intron-specific specialization in the X-domain is implicated by evidence of positive selection on the lineage leading to C. nitida in association with the loss of six of seven introns putatively spliced by matK. Moreover, transfer RNA gene deletion facilitated by parasitism combined with an unusually high rate of intron loss from remaining functional plastid genes created a unique circumstance on the lineage leading to Cuscuta subgenus Grammica that allowed elimination of matK in the most species-rich lineage of Cuscuta.
Fungal origin by horizontal transfer of a plant mitochondrial group I intron in the chimeric CoxI gene of Peperomia.

PubMed

Vaughn, J C; Mason, M T; Sper-Whitis, G L; Kuhlman, P; Palmer, J D

1995-11-01

We present phylogenetic evidence that a group I intron in an angiosperm mitochondrial gene arose recently by horizontal transfer from a fungal donor species. A 1,716-bp fragment of the mitochondrial coxI gene from the angiosperm Peperomia polybotrya was amplified via the polymerase chain reaction and sequenced. Comparison to other coxI genes revealed a 966-bp group I intron, which, based on homology with the related yeast coxI intron aI4, potentially encodes a 279-amino-acid site-specific DNA endonuclease. This intron, which is believed to function as a ribozyme during its own splicing, is not present in any of 19 coxI genes examined from other diverse vascular plant species. Phylogenetic analysis of intron origin was carried out using three different tree-generating algorithms, and on a variety of nucleotide and amino acid data sets from the intron and its flanking exon sequences. These analyses show that the Peperomia coxI gene intron and exon sequences are of fundamentally different evolutionary origin. The Peperomia intron is more closely related to several fungal mitochondrial introns, two of which are located at identical positions in coxI, than to identically located coxI introns from the land plant Marchantia and the green alga Prototheca. Conversely, the exon sequence of this gene is, as expected, most closely related to other angiosperm coxI genes. These results, together with evidence suggestive of co-conversion of exonic markers immediately flanking the intron insertion site, lead us to conclude that the Peperomia coxI intron probably arose by horizontal transfer from a fungal donor, using the double-strand-break repair pathway. The donor species may have been one of the symbiotic mycorrhizal fungi that live in close obligate association with most plants.
[Detection of factor VIII intron 1 inversion in severe haemophilia A].

PubMed

Liang, Yan; Yan, Zhen-yu; Yan, Mei; Hua, Bao-lai; Xiao, Bai; Zhao, Yong-qiang; Liu, Jing-zhong

2009-06-01

To screen for the intron 1 inversion of factor VIII (FVIII) in patients with severe haemophilia A (HA) in China and to perform carrier detection and prenatal diagnosis. LD-PCR was used to detect intron 22 inversions, and two-tube multiplex PCR was used to detect intron 1 inversions, in severe HA patients. Carrier detection and prenatal diagnosis were performed in affected families. Linkage analysis and DNA sequencing were used to verify these tests. Of 247 severe HA patients, 118 were diagnosed with intron 22 inversions and 7 with intron 1 inversions. The prevalence of the intron 1 inversion in Chinese severe haemophilia A patients was 2.8% (7/247). Six women from family A and 2 from family B were diagnosed as carriers. One fetus from family A was affected. The intron 1 inversion could be detected directly by two-tube multiplex PCR. This method improves the strategy for carrier detection and prenatal diagnosis of haemophilia A.
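The reported prevalence follows directly from the counts in the abstract (7 of 247). The sketch below reproduces that arithmetic and, as an addition that is not part of the abstract, shows how a Wilson score interval could be attached to such a proportion. The function names are hypothetical, and the interval method is an assumption chosen here for illustration.

```python
import math


def prevalence_percent(cases, total):
    """Prevalence as a percentage of the screened group."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * cases / total


def wilson_interval(cases, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = cases / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: 7 intron 1 inversions among 247 patients.
intron1_prevalence = prevalence_percent(7, 247)
intron1_low, intron1_high = wilson_interval(7, 247)
```

Rounded to one decimal place, `intron1_prevalence` matches the 2.8% stated in the abstract.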
Splicing of a group II intron involved in the conjugative transfer of pRS01 in lactococci.

PubMed

Mills, D A; McKay, L L; Dunny, G M

1996-06-01

Analysis of a region involved in the conjugative transfer of the lactococcal conjugative element pRS01 has revealed a bacterial group II intron. Splicing of this lactococcal intron (designated Ll.ltrB) in vivo resulted in the ligation of two exon messages (ltrBE1 and ltrBE2) which encoded a putative conjugative relaxase essential for the transfer of pRS01. Like many group II introns, the Ll.ltrB intron possessed an open reading frame (ltrA) with homology to reverse transcriptases. Remarkably, sequence analysis of ltrA suggested a greater similarity to open reading frames encoded by eukaryotic mitochondrial group II introns than to those identified to date from other bacteria. Several insertional mutations within ltrA resulted in plasmids exhibiting a conjugative transfer-deficient phenotype. These results provide the first direct evidence for splicing of a prokaryotic group II intron in vivo and suggest that conjugative transfer is a mechanism for group II intron dissemination in bacteria.

Transposition of an intron in yeast mitochondria requires a protein encoded by that intron.

PubMed

Macreadie, I G; Scott, R M; Zinn, A R; Butow, R A

1985-06-01

The optional 1143 bp intron in the yeast mitochondrial 21S rRNA gene (omega +) is nearly quantitatively inserted in genetic crosses into 21S rRNA alleles that lack it (omega -). The intron contains an open reading frame that can encode a protein of 235 amino acids, but no function has been ascribed to this sequence. We previously found an in vivo double-strand break in omega - DNA at or close to the intron insertion site only in zygotes of omega + X omega - crosses that appears with the same kinetics as intron insertion. We now show that mutations in the intron open reading frame that would alter the translation product simultaneously inhibit nonreciprocal omega recombination and the in vivo double-strand break in omega - DNA. These results provide evidence that the open reading frame encodes a protein required for intron transposition and support the role of the double-strand break in the process.
Smooth, an hnRNP-L Homolog, Might Decrease Mitochondrial Metabolism by Post-Transcriptional Regulation of Isocitrate Dehydrogenase (Idh) and Other Metabolic Genes in the Sub-Acute Phase of Traumatic Brain Injury.

PubMed

Sen, Arko; Gurdziel, Katherine; Liu, Jenney; Qu, Wen; Nuga, Oluwademi O; Burl, Rayanne B; Hüttemann, Maik; Pique-Regi, Roger; Ruden, Douglas M

2017-01-01

Traumatic brain injury (TBI) can cause persistent pathological alteration of neurons. This may lead to cognitive dysfunction, depression and increased susceptibility to life threatening diseases, such as epilepsy and Alzheimer's disease. To investigate the underlying genetic and molecular basis of TBI, we subjected w1118 Drosophila melanogaster to mild closed head trauma and found that mitochondrial activity is reduced in the brains of these flies 24 h after inflicting trauma. To determine the transcriptomic changes after mild TBI, we collected fly heads 24 h after inflicting trauma, and performed RNA-seq analyses. Classification of alternative splicing changes showed selective retention (RI) of long introns (>81 bps), with a mean size of ~3,000 nucleotides. Some of the genes containing RI showed a significant reduction in transcript abundance and are involved in mitochondrial metabolism such as Isocitrate dehydrogenase (Idh), which makes α-KG, a co-factor needed for both DNA and histone demethylase enzymes. The long introns are enriched in CA-rich motifs known to bind to Smooth (Sm), a heterogeneous nuclear ribonucleoprotein L (hnRNP-L) class of splicing factor, which has been shown to interact with the H3K36 histone methyltransferase, SET2, and to be involved in intron retention in human cells. H3K36me3 is a histone mark that demarcates exons in genes by interacting with the mRNA splicing machinery. Mutating sm (sm4/Df) resulted in loss of both basal and induced levels of RI in many of the same long-intron containing genes. Reducing the levels of Kdm4A, the H3K36me3 histone demethylase, also resulted in loss of basal levels of RI in many of the same long-intron containing genes. Chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) for H3K36me3 revealed increased levels of this histone modification in retained introns post-trauma at CA-rich motifs.
Based on these results, we propose a model in which TBI temporarily decreases mitochondrial activity in the brain 24 h after inflicting trauma, which decreases α-KG levels, and increases H3K36me3 levels and intron retention of long introns by decreasing Kdm4A activity. The consequent reduction in mature mRNA levels in metabolism genes, such as Idh, further reduces α-KG levels in a negative feedback loop. We further propose that decreasing metabolism after TBI in such a manner is a protective mechanism that gives the brain time to repair cellular damage induced by TBI.
Imprecise intron losses are less frequent than precise intron losses but are not rare in plants.

PubMed

Ma, Ming-Yue; Zhu, Tao; Li, Xue-Nan; Lan, Xin-Ran; Liu, Heng-Yuan; Yang, Yu-Fei; Niu, Deng-Ke

2015-05-27

In this study, we identified 19 intron losses, including 11 precise intron losses (PILs), six imprecise intron losses (IILs), one de-exonization, and one exon deletion in tomato and potato, and 17 IILs in Arabidopsis thaliana. Comparative analysis of related genomes confirmed that all of the IILs have been fixed during evolution. Consistent with previous studies, our results indicate that PILs are a major type of intron loss. However, at least in plants, IILs are unlikely to be as rare as previously reported. This article was reviewed by Jun Yu and Zhang Zhang. For complete reviews, see the Reviewers' Reports section.
Group I introns are inherited through common ancestry in the nuclear-encoded rRNA of Zygnematales (Charophyceae).

PubMed Central

Bhattacharya, D; Surek, B; Rüsing, M; Damberger, S; Melkonian, M

1994-01-01

Group I introns are found in organellar genomes, in the genomes of eubacteria and phages, and in nuclear-encoded rRNAs. The origin and distribution of nuclear-encoded rRNA group I introns are not understood. To elucidate their evolutionary relationships, we analyzed diverse nuclear-encoded small-subunit rRNA group I introns including nine sequences from the green-algal order Zygnematales (Charophyceae). Phylogenetic analyses of group I introns and rRNA coding regions suggest that lateral transfers have occurred in the evolutionary history of group I introns and that, after transfer, some of these elements may form stable components of the host-cell nuclear genomes. The Zygnematales introns, which share a common insertion site (position 1506 relative to the Escherichia coli small-subunit rRNA), form one subfamily of group I introns that has, after its origin, been inherited through common ancestry. Since the first Zygnematales appear in the middle Devonian within the fossil record, the "1506" group I intron presumably has been a stable component of the Zygnematales small-subunit rRNA coding region for 350-400 million years. PMID:7937917
Organellar maturases: A window into the evolution of the spliceosome.

PubMed

Schmitz-Linneweber, Christian; Lampe, Marie-Kristin; Sultan, Laure D; Ostersetzer-Biran, Oren

2015-09-01

During the evolution of eukaryotic genomes, many genes have been interrupted by intervening sequences (introns) that must be removed post-transcriptionally from RNA precursors to form mRNAs ready for translation. The origin of nuclear introns is still under debate, but one hypothesis is that the spliceosome and the intron-exon structure of genes have evolved from bacterial-type group II introns that invaded the eukaryotic genomes. The group II introns were most likely introduced into the eukaryotic genome from an α-proteobacterial predecessor of mitochondria early during the endosymbiosis event. These self-splicing and mobile introns spread through the eukaryotic genome and later degenerated. Pieces of introns became part of the general splicing machinery we know today as the spliceosome. In addition, group II introns likely brought intron maturases with them to the nucleus. Maturases are found in most bacterial introns, where they act as highly specific splicing factors for group II introns. In the spliceosome, the core protein Prp8 shows homology to group II intron-encoded maturases. While maturases are entirely intron specific, their descendant of the spliceosomal machinery, the Prp8 protein, is an extremely versatile splicing factor with multiple interacting proteins and RNAs. How could such a general player in spliceosomal splicing evolve from the monospecific bacterial maturases? Analysis of the organellar splicing machinery in plants may give clues on the evolution of nuclear splicing. Plants encode various proteins which are closely related to bacterial maturases. The organellar genomes contain one maturase each, named MatK in chloroplasts and MatR in mitochondria. In addition, several maturase genes have been found in the nucleus as well, which are acting on mitochondrial pre-RNAs. All plant maturases show sequence deviation from their progenitor bacterial maturases, and interestingly are all acting on multiple organellar group II intron targets. 
Moreover, they seem to function in the splicing of group II introns together with a number of additional nuclear-encoded splicing factors, possibly acting as an organellar proto-spliceosome. Together, this makes them interesting models for the early evolution of nuclear spliceosomal splicing. In this review, we summarize recent advances in our understanding of the role of plant maturases and their accessory factors in plants. This article is part of a Special Issue entitled: Chloroplast Biogenesis. Copyright © 2015 Elsevier B.V. All rights reserved.
Characterization and mapping of the mouse NDP (Norrie disease) locus (Ndp).

PubMed

Battinelli, E M; Boyd, Y; Craig, I W; Breakefield, X O; Chen, Z Y

1996-02-01

Norrie disease is a severe X-linked recessive neurological disorder characterized by congenital blindness with progressive loss of hearing. Over half of Norrie patients also manifest different degrees of mental retardation. The gene for Norrie disease (NDP) has recently been cloned and characterized. With the human NDP cDNA, mouse genomic phage libraries were screened for the homolog of the gene. Comparison between mouse and human genomic DNA blots hybridized with the NDP cDNA, as well as analysis of phage clones, shows that the mouse NDP gene is 29 kb in size (28 kb for the human gene). The organization in the two species is very similar. Both have three exons with similar-sized introns and identical exon-intron boundaries between exon 2 and 3. The mouse open reading frame is 393 bp and, like the human coding sequence, is encoded in exons 2 and 3. The absence of six nucleotides in the second mouse exon results in the encoded protein being two amino acids smaller than its human counterpart. The overall homology between the human and mouse NDP protein is 95% and is particularly high (99%) in exon 3, consistent with the apparent functional importance of this region. Analysis of transcription initiation sites suggests the presence of multiple start sites associated with expression of the mouse NDP gene. Pedigree analysis of an interspecific mouse backcross localizes the mouse NDP gene close to Maoa in the conserved segment, which runs from CYBB to PFC in both human and mouse.
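The size relationship between the mouse and human proteins follows from simple codon arithmetic. The sketch below works through it under stated assumptions: the 393 bp mouse open reading frame and the six missing nucleotides come from the abstract, whereas the human ORF length is inferred here and is not a value reported in the abstract. Whether the 393 bp figure includes the stop codon is also not stated, so the codon counts should not be read as exact protein lengths.

```python
def codons_in_orf(orf_length_bp):
    """Number of codons in an open reading frame of the given length."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // 3


MOUSE_ORF_BP = 393        # stated in the abstract
MISSING_NT_IN_MOUSE = 6   # stated in the abstract

# Inference made for illustration only; not a figure given in the abstract.
inferred_human_orf_bp = MOUSE_ORF_BP + MISSING_NT_IN_MOUSE

codon_difference = codons_in_orf(inferred_human_orf_bp) - codons_in_orf(MOUSE_ORF_BP)
```

Six nucleotides correspond to two codons, which is consistent with the mouse protein being two amino acids smaller than the human one.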
Increased complexity of circRNA expression during species evolution.

PubMed

Dong, Rui; Ma, Xu-Kai; Chen, Ling-Ling; Yang, Li

2017-08-03

Circular RNAs (circRNAs) are broadly identified from precursor mRNA (pre-mRNA) back-splicing across various species. Recent studies have suggested a cell-/tissue- specific manner of circRNA expression. However, the distinct expression pattern of circRNAs among species and its underlying mechanism still remain to be explored. Here, we systematically compared circRNA expression from human and mouse, and found that only a small portion of human circRNAs could be determined in parallel mouse samples. The conserved circRNA expression between human and mouse is correlated with the existence of orientation-opposite complementary sequences in introns that flank back-spliced exons in both species, but not the circRNA sequences themselves. Quantification of RNA pairing capacity of orientation-opposite complementary sequences across circRNA-flanking introns by Complementary Sequence Index (CSI) identifies that among all types of complementary sequences, SINEs, especially Alu elements in human, contribute the most for circRNA formation and that their diverse distribution across species leads to the increased complexity of circRNA expression during species evolution. Together, our integrated and comparative reference catalog of circRNAs in different species reveals a species-specific pattern of circRNA expression and suggests a previously under-appreciated impact of fast-evolved SINEs on the regulation of (circRNA) gene expression.
Long Noncoding RNA LINC00958 Accelerates Gliomagenesis Through Regulating miR-203/CDK2.

PubMed

Guo, Erkun; Liang, Chaohui; He, Xin; Song, Guozhi; Liu, Hongjiang; Lv, Zhongqiang; Guan, Jianchao; Yang, Dezhen; Zheng, Jiapeng

2018-05-01

Increasing evidence has indicated that long noncoding RNAs (lncRNAs) play crucial roles in various biological processes, including glioma. However, the underlying mechanism of lncRNAs in gliomagenesis is still ambiguous. In this study, we aim to investigate the role of long intergenic noncoding RNA 00958 (LINC00958) in the tumorigenesis of glioma. Results revealed that LINC00958 was significantly upregulated in glioma tissues and cell lines compared with adjacent normal brain tissues and normal human astrocytes. Moreover, the ectopic overexpression of LINC00958 was correlated with poor prognosis of glioma patients. Loss-of-function experiments indicated that LINC00958 knockdown suppressed glioma cell proliferation and invasion and induced cell cycle arrest at G0/G1 phase in vitro, and inhibited tumor growth in vivo. Bioinformatics programs and luciferase reporter assay revealed that miR-203 shared complementary binding sites with both 3'-untranslated region of LINC00958 and CDK2. In summary, our study concludes that LINC00958 acts as an oncogenic gene in gliomagenesis through miR-203-CDK2 regulation, providing a novel insight into glioma tumorigenesis.
Polyploidization of murine mesenchymal cells is associated with suppression of the long noncoding RNA H19 and reduced tumorigenicity.

PubMed

Shoshani, Ofer; Massalha, Hassan; Shani, Nir; Kagan, Sivan; Ravid, Orly; Madar, Shalom; Trakhtenbrot, Luba; Leshkowitz, Dena; Rechavi, Gideon; Zipori, Dov

2012-12-15

Mesenchymal stromal cells (MSC) are used extensively in clinical trials; however, the possibility that MSCs have a potential for malignant transformation was raised. We examined the genomic stability versus the tumor-forming capacity of multiple mouse MSCs. Murine MSCs have been shown to be less stable and more prone to malignant transformation than their human counterparts. A large series of independently isolated MSC populations exhibited low tumorigenic potential under syngeneic conditions, which increased in immunocompromised animals. Unexpectedly, higher ploidy correlated with reduced tumor-forming capacity. Furthermore, in both cultured MSCs and primary hepatocytes, polyploidization was associated with a dramatic decrease in the expression of the long noncoding RNA H19. Direct knockdown of H19 expression in diploid cells resulted in acquisition of polyploid cell traits. Moreover, artificial tetraploidization of diploid cancer cells led to a reduction of H19 levels, as well as to an attenuation of the tumorigenic potential. Polyploidy might therefore serve as a protective mechanism aimed at reducing malignant transformation through the involvement of the H19 regulatory long noncoding RNA.
Isolation, expression, and chromosomal localization of the human mitochondrial capsule selenoprotein gene (MCSP)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aho, Hanne; Schwemmer, M.; Tessmann, D.

1996-03-01

The mitochondrial capsule selenoprotein (MCS) (HGMW-approved symbol MCSP) is one of three proteins that are important for the maintenance and stabilization of the crescent structure of the sperm mitochondria. We describe here the isolation of a cDNA, the exon-intron organization, the expression, and the chromosomal localization of the human MCS gene. Nucleotide sequence analysis of the human and mouse MCS cDNAs reveals that the 5'- and 3'-untranslated sequences are more conserved (71%) than the coding sequences (59%). The open reading frame encodes a 116-amino-acid protein and lacks the UGA codons, which have been reported to encode the selenocysteines in the N-terminal of the deduced mouse protein. The deduced human protein shows a low degree of amino acid sequence identity to the mouse protein (39%). The most striking homology lies in the dicysteine motifs. Northern and Southern zooblot analyses reveal that the MCS gene in human, baboon, and bovine is more conserved than its counterparts in mouse and rat. The single intron in the human MCS gene is approximately 6 kb and interrupts the 5'-untranslated region at a position equivalent to that in the mouse and rat genes. Northern blot and in situ hybridization experiments demonstrate that the expression of the human MCS gene is restricted to haploid spermatids. The human gene was assigned to q21 of chromosome 1. 30 refs., 9 figs.
Small non-coding RNA profiling in human biofluids and surrogate tissues from healthy individuals: description of the diverse and most represented species.

PubMed

Ferrero, Giulio; Cordero, Francesca; Tarallo, Sonia; Arigoni, Maddalena; Riccardo, Federica; Gallo, Gaetano; Ronco, Guglielmo; Allasia, Marco; Kulkarni, Neha; Matullo, Giuseppe; Vineis, Paolo; Calogero, Raffaele A; Pardini, Barbara; Naccarati, Alessio

2018-01-09

The role of non-coding RNAs in different biological processes and diseases is continuously expanding. Next-generation sequencing together with the parallel improvement of bioinformatics analyses allows the accurate detection and quantification of an increasing number of RNA species. With the aim of exploring new potential biomarkers for disease classification, a clear overview of the expression levels of common/unique small RNA species among different biospecimens is necessary. However, except for miRNAs in plasma, there are no substantial indications about the pattern of expression of various small RNAs in multiple specimens among healthy humans. By analysing small RNA-sequencing data from 243 samples, we have identified and compared the most abundantly and uniformly expressed miRNAs and non-miRNA species of comparable size with the library preparation in four different specimens (plasma exosomes, stool, urine, and cervical scrapes). Eleven miRNAs were commonly detected among all different specimens while 231 miRNAs were globally unique across them. Classification analysis using these miRNAs provided an accuracy of 99.6% to recognize the sample types. piRNAs and tRNAs were the most represented non-miRNA small RNAs detected in all specimen types that were analysed, particularly in urine samples. With the present data, the most uniformly expressed small RNAs in each sample type were also identified. A signature of small RNAs for each specimen could represent a reference gene set in validation studies by RT-qPCR. Overall, the data reported hereby provide an insight of the constitution of the human miRNome and of other small non-coding RNAs in various specimens of healthy individuals.
Targeted CRISPR disruption reveals a role for RNase MRP RNA in human preribosomal RNA processing.

PubMed

Goldfarb, Katherine C; Cech, Thomas R

2017-01-01

MRP RNA is an abundant, essential noncoding RNA whose functions have been proposed in yeast but are incompletely understood in humans. Mutations in the genomic locus for MRP RNA cause pleiotropic human diseases, including cartilage hair hypoplasia (CHH). Here we applied CRISPR-Cas9 genome editing to disrupt the endogenous human MRP RNA locus, thereby attaining what has eluded RNAi and RNase H experiments: elimination of MRP RNA in the majority of cells. The resulting accumulation of ribosomal RNA (rRNA) precursor, analyzed by RNA fluorescent in situ hybridization (FISH), Northern blots, and RNA sequencing, implicates MRP RNA in pre-rRNA processing. Amelioration of pre-rRNA imbalance is achieved through rescue of MRP RNA levels by ectopic expression. Furthermore, affinity-purified MRP ribonucleoprotein (RNP) from HeLa cells cleaves the human pre-rRNA in vitro at at least one site used in cells, while RNP isolated from cells with CRISPR-edited MRP loci loses this activity, and ectopic MRP RNA expression restores cleavage activity. Thus, a role for RNase MRP in human pre-rRNA processing is established. As demonstrated here, targeted CRISPR disruption is a valuable tool for functional studies of essential noncoding RNAs that are resistant to RNAi and RNase H-based degradation. © 2017 Goldfarb and Cech; Published by Cold Spring Harbor Laboratory Press.
Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fields, C.A.

1996-06-01

The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.
The mitochondrial genome of fission yeast: inability of all introns to splice autocatalytically, and construction and characterization of an intronless genome.

PubMed

Schäfer, B; Merlos-Lange, A M; Anderl, C; Welser, F; Zimmer, M; Wolf, K

1991-01-01

In this paper we report the inability of four group I introns in the gene encoding subunit I of cytochrome c oxidase (cox1) and the group II intron in the apocytochrome b gene (cob) to splice autocatalytically. Furthermore we present the characterization of the first cox1 intron in the mutator strain anar-14 and the construction and characterization of strains with intronless mitochondrial genomes. We provide evidence that removal of introns at the DNA level (termed DNA splicing) is dependent on an active RNA maturase. Finally we demonstrate that the absence of introns does not abolish homologous mitochondrial recombination.
Colonization of heterochromatic genes by transposable elements in Drosophila.

PubMed

Dimitri, Patrizio; Junakovic, Nikolaj; Arcà, Bruno

2003-04-01

As a further step toward understanding transposable element-host genome interactions, we investigated the molecular anatomy of introns from five heterochromatic and 22 euchromatic protein-coding genes of Drosophila melanogaster. A total of 79 kb of intronic sequences from heterochromatic genes and 355 kb of intronic sequences from euchromatic genes have been used in Blast searches against Drosophila transposable elements (TEs). The results show that TE-homologous sequences belonging to 19 different families represent about 50% of intronic DNA from heterochromatic genes. In contrast, only 0.1% of the euchromatic intron DNA exhibits homology to known TEs. Intraspecific and interspecific size polymorphisms of introns were found, which are likely to be associated with changes in TE-related sequences. Together, the enrichment in TEs and the apparent dynamic state of heterochromatic introns suggest that TEs contribute significantly to the evolution of genes located in heterochromatin.
An intron within the 16S ribosomal RNA gene of the archaeon Pyrobaculum aerophilum

NASA Technical Reports Server (NTRS)

Burggraf, S.; Larsen, N.; Woese, C. R.; Stetter, K. O.

1993-01-01

The 16S rRNA genes of Pyrobaculum aerophilum and Pyrobaculum islandicum were amplified by the polymerase chain reaction, and the resulting products were sequenced directly. The two organisms are closely related by this measure (over 98% similar). However, they differ in that the (lone) 16S rRNA gene of Pyrobaculum aerophilum contains a 713-bp intron not seen in the corresponding gene of Pyrobaculum islandicum. To our knowledge, this is the only intron so far reported in the small subunit rRNA gene of a prokaryote. Upon excision the intron is circularized. A secondary structure model of the intron-containing rRNA suggests a splicing mechanism of the same type as that invoked for the tRNA introns of the Archaea and Eucarya and 23S rRNAs of the Archaea. The intron contains an open reading frame whose protein translation shows no certain homology with any known protein sequence.
The group II intron maturase: a reverse transcriptase and splicing factor go hand in hand.

PubMed

Zhao, Chen; Pyle, Anna Marie

2017-12-01

The splicing of group II introns in vivo requires the assistance of a multifunctional intron encoded protein (IEP, or maturase). Each IEP is also a reverse-transcriptase enzyme that enables group II introns to behave as mobile genetic elements. During splicing or retro-transposition, each group II intron forms a tight, specific complex with its own encoded IEP, resulting in a highly reactive holoenzyme. This review focuses on the structural basis for IEP function, as revealed by recent crystal structures of an IEP reverse transcriptase domain and cryo-EM structures of an IEP-intron complex. These structures explain how the same IEP scaffold is utilized for intron recognition, splicing and reverse transcription, while providing a physical basis for understanding the evolutionary transformation of the IEP into the eukaryotic splicing factor Prp8. Copyright © 2017 Elsevier Ltd. All rights reserved.
Physiological role of urothelial cancer-associated one long noncoding RNA in human skeletogenic cell differentiation.

PubMed

Ishikawa, Takanori; Nishida, Takashi; Ono, Mitsuaki; Takarada, Takeshi; Nguyen, Ha Thi; Kurihara, Shinnosuke; Furumatsu, Takayuki; Murase, Yurika; Takigawa, Masaharu; Oohashi, Toshitaka; Kamioka, Hiroshi; Kubota, Satoshi

2018-06-01

A vast number of long noncoding RNAs (lncRNAs) are expressed in human cells and have developed along with human evolution. However, the physiological functions of these lncRNAs remain mostly unknown. In the present study, we show for the first time that one such lncRNA plays a significant role in the differentiation of chondrocytes and, possibly, of osteoblasts differentiated from mesenchymal stem cells, the cells that eventually construct the human skeleton. The urothelial cancer-associated 1 (UCA1) lncRNA is known to be associated with several human malignancies. Firstly, we confirmed that UCA1 was expressed in normal human chondrocytes, as well as in a human chondrocytic cell line; whereas it was not detected in human bone marrow mesenchymal stem cells (hBMSCs). Of note, although UCA1 expression was undetectable in hBMSCs, it was markedly induced along with the differentiation toward chondrocytes, suggesting its critical role in chondrogenesis. Consistent with this finding, silencing of the UCA1 gene significantly repressed the expression of chondrogenic genes in human chondrocytic cells. UCA1 gene silencing and hyper-expression also had a significant impact on the osteoblastic phenotype in a human cell line. Finally, forced expression of UCA1 in a murine chondrocyte precursor, which did not possess a UCA1 gene, overdrove its differentiation into chondrocytes. These results indicate a physiological and important role of this lncRNA in the skeletal development of humans, who require more sustained endochondral ossification and osteogenesis than do smaller vertebrates. © 2017 Wiley Periodicals, Inc.
A CRM domain protein functions dually in group I and group II intron splicing in land plant chloroplasts.

PubMed

Asakura, Yukari; Barkan, Alice

2007-12-01

The CRM domain is a recently recognized RNA binding domain found in three group II intron splicing factors in chloroplasts, in a bacterial protein that associates with ribosome precursors, and in a family of uncharacterized proteins in plants. To elucidate the functional repertoire of proteins with CRM domains, we studied CFM2 (for CRM Family Member 2), which harbors four CRM domains. RNA coimmunoprecipitation assays showed that CFM2 in maize (Zea mays) chloroplasts is associated with the group I intron in pre-trnL-UAA and group II introns in the ndhA and ycf3 pre-mRNAs. T-DNA insertions in the Arabidopsis thaliana ortholog condition a defective-seed phenotype (strong allele) or chlorophyll-deficient seedlings with impaired splicing of the trnL group I intron and the ndhA, ycf3-int1, and clpP-int2 group II introns (weak alleles). CFM2 and two previously described CRM proteins are bound simultaneously to the ndhA and ycf3-int1 introns and act in a nonredundant fashion to promote their splicing. With these findings, CRM domain proteins are implicated in the activities of three classes of catalytic RNA: group I introns, group II introns, and 23S rRNA.
Localization of a bacterial group II intron-encoded protein in eukaryotic nuclear splicing-related cell compartments.

PubMed

Nisa-Martínez, Rafael; Laporte, Philippe; Jiménez-Zurdo, José Ignacio; Frugier, Florian; Crespi, Martin; Toro, Nicolás

2013-01-01

Some bacterial group II introns are widely used for genetic engineering in bacteria, because they can be reprogrammed to insert into the desired DNA target sites. There is considerable interest in developing this group II intron gene targeting technology for use in eukaryotes, but nuclear genomes present several obstacles to the use of this approach. The nuclear genomes of eukaryotes do not contain group II introns, but these introns are thought to have been the progenitors of nuclear spliceosomal introns. We investigated the expression and subcellular localization of the bacterial RmInt1 group II intron-encoded protein (IEP) in Arabidopsis thaliana protoplasts. Following the expression of translational fusions of the wild-type protein and several mutant variants with EGFP, the full-length IEP was found exclusively in the nucleolus, whereas the maturase domain alone targeted EGFP to nuclear speckles. The distribution of the bacterial RmInt1 IEP in plant cell protoplasts suggests that the compartmentalization of eukaryotic cells into nucleus and cytoplasm does not prevent group II introns from invading the host genome. Furthermore, the trafficking of the IEP between the nucleolus and the speckles upon maturase inactivation is consistent with the hypothesis that the spliceosomal machinery evolved from group II introns.
