Sample records for conserved noncoding elements

  1. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influences asthma and atopic phenotypes in diverse populations.
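
    The record above defines its conserved non-coding elements by a length and identity threshold (>100 bp, >70 percent identity in a human-mouse comparison) but does not describe the software used. The following is only a minimal sketch of how such a threshold could be applied to a pairwise alignment, assuming two gap-padded strings of equal length. The function name `conserved_windows`, its parameters, and the sliding-window merging rule are illustrative assumptions, not the authors' method.

```python
def conserved_windows(seq_a, seq_b, window=100, min_identity=0.70):
    """Return merged (start, end) alignment-column intervals covered by
    sliding windows whose identity exceeds `min_identity`.

    Hypothetical helper: `seq_a` and `seq_b` are assumed to be two rows of
    a pairwise alignment, padded with '-' so that they have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(seq_a)
    if n < window:
        return []
    # 1 where a column holds the same base in both rows (gaps never match).
    match = [
        1 if a == b and a != "-" else 0
        for a, b in zip(seq_a.upper(), seq_b.upper())
    ]
    intervals = []
    count = sum(match[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += match[start + window - 1] - match[start - 1]
        if count / window > min_identity:
            end = start + window
            if intervals and start <= intervals[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                intervals[-1] = (intervals[-1][0], end)
            else:
                intervals.append((start, end))
    return intervals


if __name__ == "__main__":
    human = "A" * 200
    mouse = "A" * 100 + "C" * 100
    print(conserved_windows(human, mouse))
```

    Because passing windows are merged, every reported interval spans at least `window` columns, which loosely mirrors the length criterion quoted in the abstract. A real analysis would work from genome alignments and annotation to exclude coding sequence.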

  2. Conserved Noncoding Elements in the Most Distant Genera of Cephalochordates: The Goldilocks Principle

    PubMed Central

    Yue, Jia-Xing; Kozmikova, Iryna; Ono, Hiroki; Nossa, Carlos W.; Kozmik, Zbynek; Putnam, Nicholas H.; Yu, Jr-Kai; Holland, Linda Z.

    2016-01-01

    Cephalochordates, the sister group of vertebrates + tunicates, are evolving particularly slowly. Therefore, genome comparisons between two congeners of Branchiostoma revealed so many conserved noncoding elements (CNEs) that it was not clear how many are functional regulatory elements. To more effectively identify CNEs with potential regulatory functions, we compared noncoding sequences of genomes of the most phylogenetically distant cephalochordate genera, Asymmetron and Branchiostoma, which diverged approximately 120–160 million years ago. We found 113,070 noncoding elements conserved between the two species, amounting to 3.3% of the genome. The genomic distribution, target gene ontology, and enriched motifs of these CNEs all suggest that many of them are probably cis-regulatory elements. More than 90% of previously verified amphioxus regulatory elements were re-captured in this study. A search of the cephalochordate CNEs around 50 developmental genes in several vertebrate genomes revealed eight CNEs conserved between cephalochordates and vertebrates, indicating sequence conservation over >500 million years of divergence. The function of five CNEs was tested in reporter assays in zebrafish, and one was also tested in amphioxus. All five CNEs proved to be tissue-specific enhancers. Taken together, these findings indicate that even though Branchiostoma and Asymmetron are distantly related, as they are evolving slowly, comparisons between them are likely optimal for identifying most of their tissue-specific cis-regulatory elements, laying the foundation for functional characterizations and a better understanding of the evolution of developmental regulation in cephalochordates. PMID:27412606

  3. Highly conserved elements discovered in vertebrates are present in non-syntenic loci of tunicates, act as enhancers and can be transcribed during development

    PubMed Central

    Sanges, Remo; Hadzhiev, Yavor; Gueroult-Bellone, Marion; Roure, Agnes; Ferg, Marco; Meola, Nicola; Amore, Gabriele; Basu, Swaraj; Brown, Euan R.; De Simone, Marco; Petrera, Francesca; Licastro, Danilo; Strähle, Uwe; Banfi, Sandro; Lemaire, Patrick; Birney, Ewan; Müller, Ferenc; Stupka, Elia

    2013-01-01

    Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remain elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis-regulatory modules affirmed during chordate evolution, we have searched for conserved elements in tunicate and vertebrate (Olfactores) genomes. We identified, for the first time, 183 non-coding sequences that are highly conserved between the two groups. Our results show that all but one element are conserved in non-syntenic regions between vertebrate and tunicate genomes, while being syntenic among vertebrates. Nevertheless, in all the groups, they are significantly associated with transcription factors showing specific functions fundamental to animal development, such as multicellular organism development and sequence-specific DNA binding. The majority of these regions map onto ultraconserved elements and we demonstrate that they can act as functional enhancers within the organism of origin, as well as in cross-transgenesis experiments, and that they are transcribed in extant species of Olfactores. We refer to the elements as ‘Olfactores conserved non-coding elements’. PMID:23393190

  4. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

  5. Interpreting Mammalian Evolution using Fugu Genome Comparisons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stubbs, L; Ovcharenko, I; Loots, G G

    2004-04-02

    Comparative sequence analysis of the human and the pufferfish Fugu rubripes (fugu) genomes has revealed several novel functional coding and noncoding regions in the human genome. In particular, the fugu genome has been extremely valuable for identifying transcriptional regulatory elements in human loci harboring unusually high levels of evolutionary conservation to rodent genomes. In such regions, the large evolutionary distance between human and fishes provides an additional filter through which functional noncoding elements can be detected with high efficiency.

  6. Conserved expression of transposon-derived non-coding transcripts in primate stem cells.

    PubMed

    Ramsay, LeeAnn; Marchetto, Maria C; Caron, Maxime; Chen, Shu-Huang; Busche, Stephan; Kwan, Tony; Pastinen, Tomi; Gage, Fred H; Bourque, Guillaume

    2017-02-28

    A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, it has been shown that various long non-coding RNAs (lncRNAs), which come from the human endogenous retrovirus subfamily H (HERVH), are not only expressed but required for pluripotency in human embryonic stem cells (hESCs). To identify additional TE-derived functional non-coding transcripts, we generated RNA-seq data from induced pluripotent stem cells (iPSCs) of four primate species (human, chimpanzee, gorilla, and rhesus) and searched for transcripts whose expression was conserved. We observed that about 30% of TE instances expressed in human iPSCs had orthologous TE instances that were also expressed in chimpanzee and gorilla. Notably, our analysis revealed a number of repeat families with highly conserved expression profiles including HERVH but also MER53, which is known to be the source of a placental-specific family of microRNAs (miRNAs). We also identified a number of repeat families from all classes of TEs, including MLT1-type and Tigger families, that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved. Together, these results describe TE families and TE-derived lncRNAs whose conserved expression patterns can be used to identify what are likely functional TE-derived non-coding transcripts in primate iPSCs.

  7. The Most Deeply Conserved Noncoding Sequences in Plants Serve Similar Functions to Those in Vertebrates Despite Large Differences in Evolutionary Rates

    PubMed Central

    Burgess, Diane; Freeling, Michael

    2014-01-01

    In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619

  8. Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data

    PubMed Central

    Xu, Shuhua

    2015-01-01

    Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection that typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) of Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the natural selection footprints in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint of variations in NCS, while positive selection is more likely to be specific to some particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motif, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants, reported as risk alleles by genome-wide association studies, showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potentials that play important roles in disease etiology and human evolution. PMID:26053627

  9. Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates

    PubMed Central

    McEwen, Gayle K.; Goode, Debbie K.; Parker, Hugo J.; Woolfe, Adam; Callaway, Heather; Elgar, Greg

    2009-01-01

    Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA. PMID:20011110

  10. Genetic evidence for conserved non-coding element function across species–the ears have it

    PubMed Central

    Turner, Eric E.; Cox, Timothy C.

    2014-01-01

    Comparison of genomic sequences from diverse vertebrate species has revealed numerous highly conserved regions that do not appear to encode proteins or functional RNAs. Often these “conserved non-coding elements,” or CNEs, can direct gene expression to specific tissues in transgenic models, demonstrating they have regulatory function. CNEs are frequently found near “developmental” genes, particularly transcription factors, implying that these elements have essential regulatory roles in development. However, actual examples demonstrating CNE regulatory functions across species have been few, and recent loss-of-function studies of several CNEs in mice have shown relatively minor effects. In this Perspectives article, we discuss new findings in “fancy” rats and Highland cattle demonstrating that function of a CNE near the Hmx1 gene is crucial for normal external ear development and when disrupted can mimic loss-of-function Hmx1 coding mutations in mice and humans. These findings provide important support for conserved developmental roles of CNEs in divergent species, and reinforce the concept that CNEs should be examined systematically in the ongoing search for genetic causes of human developmental disorders in the era of genome-scale sequencing. PMID:24478720

  11. RNA Polymerase III promoter screen uncovers a novel noncoding RNA family conserved in Caenorhabditis and other clade V nematodes.

    PubMed

    Gruber, Andreas R

    2014-07-10

    RNA Polymerase III is a highly specialized enzyme complex responsible for the transcription of a very distinct set of housekeeping noncoding RNAs including tRNAs, 7SK snRNA, Y RNAs, U6 snRNA, and the RNA components of RNaseP and RNaseMRP. In this work we have utilized the conserved promoter structure of known RNA Polymerase III transcripts consisting of characteristic sequence elements termed proximal sequence elements (PSE) A and B and a TATA-box to uncover a novel RNA Polymerase III-transcribed, noncoding RNA family found to be conserved in Caenorhabditis as well as other clade V nematode species. Homology search in combination with detailed sequence and secondary structure analysis revealed that members of this novel ncRNA family evolve rapidly, and only maintain a potentially functional small stem structure that links the 5' end to the very 3' end of the transcript and a small hairpin structure at the 3' end. This is most likely required for efficient transcription termination. In addition, our study revealed evidence that canonical C/D box snoRNAs are also transcribed from a PSE A-PSE B-TATA-box promoter in Caenorhabditis elegans. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. RNA expression in a cartilaginous fish cell line reveals ancient 3′ noncoding regions highly conserved in vertebrates

    PubMed Central

    Forest, David; Nishikawa, Ryuhei; Kobayashi, Hiroshi; Parton, Angela; Bayne, Christopher J.; Barnes, David W.

    2007-01-01

    We have established a cartilaginous fish cell line [Squalus acanthias embryo cell line (SAE)], a mesenchymal stem cell line derived from the embryo of an elasmobranch, the spiny dogfish shark S. acanthias. Elasmobranchs (sharks and rays) first appeared >400 million years ago, and existing species provide useful models for comparative vertebrate cell biology, physiology, and genomics. Comparative vertebrate genomics among evolutionarily distant organisms can provide sequence conservation information that facilitates identification of critical coding and noncoding regions. Although these genomic analyses are informative, experimental verification of functions of genomic sequences depends heavily on cell culture approaches. Using ESTs defining mRNAs derived from the SAE cell line, we identified lengthy and highly conserved gene-specific nucleotide sequences in the noncoding 3′ UTRs of eight genes involved in the regulation of cell growth and proliferation. Conserved noncoding 3′ mRNA regions detected by using the shark nucleotide sequences as a starting point were found in a range of other vertebrate orders, including bony fish, birds, amphibians, and mammals. Nucleotide identity of shark and human in these regions was remarkably well conserved. Our results indicate that highly conserved gene sequences dating from the appearance of jawed vertebrates and representing potential cis-regulatory elements can be identified through the use of cartilaginous fish as a baseline. Because the expression of genes in the SAE cell line was prerequisite for their identification, this cartilaginous fish culture system also provides a physiologically valid tool to test functional hypotheses on the role of these ancient conserved sequences in comparative cell biology. PMID:17227856

  13. Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics.

    PubMed

    Edwards, Scott V; Cloutier, Alison; Baker, Allan J

    2017-11-01

    Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600-∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biologists.

  14. Conserved Nonexonic Elements: A Novel Class of Marker for Phylogenomics

    PubMed Central

    Cloutier, Alison; Baker, Allan J.

    2017-01-01

    Noncoding markers have a particular appeal as tools for phylogenomic analysis because, at least in vertebrates, they appear less subject to strong variation in GC content among lineages. Thus far, ultraconserved elements (UCEs) and introns have been the most widely used noncoding markers. Here we analyze and study the evolutionary properties of a new type of noncoding marker, conserved nonexonic elements (CNEEs), which consists of noncoding elements that are estimated to evolve slower than the neutral rate across a set of species. Although they often include UCEs, CNEEs are distinct from UCEs because they are not ultraconserved, and, most importantly, the core region alone is analyzed, rather than both the core and its flanking regions. Using a data set of 16 birds plus an alligator outgroup, and ∼3600–∼3800 loci per marker type, we found that although CNEEs were less variable than bioinformatically derived UCEs or introns and in some cases exhibited a slower approach to branch resolution as determined by phylogenomic subsampling, the quality of CNEE alignments was superior to those of the other markers, with fewer gaps and missing species. Phylogenetic resolution using coalescent approaches was comparable among the three marker types, with most nodes being fully and congruently resolved. Comparison of phylogenetic results across the three marker types indicated that one branch, the sister group to the passerine + falcon clade, was resolved differently and with moderate (>70%) bootstrap support between CNEEs and UCEs or introns. Overall, CNEEs appear to be promising as phylogenomic markers, yielding phylogenetic resolution as high as for UCEs and introns but with fewer gaps, less ambiguity in alignments and with patterns of nucleotide substitution more consistent with the assumptions of commonly used methods of phylogenetic analysis. PMID:28637293

  15. Genetics Home Reference: isolated Pierre Robin sequence

    MedlinePlus

    ... PG, Fitzpatrick DR, Lyonnet S. Highly conserved non-coding elements on either side of SOX9 associated with Pierre ... Citation on PubMed or Free article on PubMed Central Jakobsen LP, Ullmann R, Christensen SB, Jensen KE, ...

  16. Small RNAs, big impact: small RNA pathways in transposon control and their effect on the host stress response.

    PubMed

    Wheeler, Bayly S

    2013-12-01

    Transposons are mobile genetic elements that are a major constituent of most genomes. Organisms regulate transposable element expression, transposition, and insertion site preference, mitigating the genome instability caused by uncontrolled transposition. A recent burst of research has demonstrated the critical role of small non-coding RNAs in regulating transposition in fungi, plants, and animals. While mechanistically distinct, these pathways work through a conserved paradigm. The presence of a transposon is communicated by the presence of its RNA or by its integration into specific genomic loci. These signals are then translated into small non-coding RNAs that guide epigenetic modifications and gene silencing back to the transposon. In addition to being regulated by the host, transposable elements are themselves capable of influencing host gene expression. Transposon expression is responsive to environmental signals, and many transposons are activated by various cellular stresses. TEs can confer local gene regulation by acting as enhancers and can also confer global gene regulation through their non-coding RNAs. Thus, transposable elements can act as stress-responsive regulators that control host gene expression in cis and trans.

  17. Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences.

    PubMed

    Bergman, C M; Kreitman, M

    2001-08-01

    Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
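
    The summary statistics reported above (fraction of noncoding sequence conserved and the length distribution of ungapped conserved blocks) can be illustrated on a toy pairwise alignment. This is a minimal sketch under stated assumptions: a block is taken to be a maximal run of identical, ungapped alignment columns, and the fraction is computed relative to the bases of the first sequence. The record does not state the block definition the authors used, and the function names `ungapped_conserved_blocks` and `summarize` are hypothetical.

```python
from statistics import median


def ungapped_conserved_blocks(seq_a, seq_b, min_len=1):
    """Return (start, end) column intervals of maximal runs in which both
    aligned rows carry the same base and neither has a gap.

    Assumes `seq_a` and `seq_b` are rows of a pairwise alignment padded
    with '-' to equal length; `min_len` drops shorter runs.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    blocks = []
    run_start = None
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        conserved = a == b and a != "-"
        if conserved and run_start is None:
            run_start = i  # a new run begins at this column
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                blocks.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and len(seq_a) - run_start >= min_len:
        blocks.append((run_start, len(seq_a)))
    return blocks


def summarize(seq_a, seq_b, min_len=1):
    """Return (fraction of seq_a bases inside blocks, median block length)."""
    blocks = ungapped_conserved_blocks(seq_a, seq_b, min_len)
    lengths = [end - start for start, end in blocks]
    ref_len = sum(1 for base in seq_a if base != "-")  # ungapped length
    fraction = sum(lengths) / ref_len if ref_len else 0.0
    return fraction, (median(lengths) if lengths else None)


if __name__ == "__main__":
    a = "ACGTACGT-ACG"
    b = "ACGAACGTTACG"
    print(ungapped_conserved_blocks(a, b))  # [(0, 3), (4, 8), (9, 12)]
    print(summarize(a, b))
```

    Raising `min_len` is one simple way to ignore short chance matches. Published analyses typically use more careful alignment and block-calling procedures than this toy run-length scan.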

  18. Theria-Specific Homeodomain and cis-Regulatory Element Evolution of the Dlx3–4 Bigene Cluster in 12 Different Mammalian Species

    PubMed Central

    SUMIYAMA, KENTA; MIYAKE, TSUTOMU; GRIMWOOD, JANE; STUART, ANDREW; DICKSON, MARK; SCHMUTZ, JEREMY; RUDDLE, FRANK H.; MYERS, RICHARD M.; AMEMIYA, CHRIS T.

    2013-01-01

    The mammalian Dlx3 and Dlx4 genes are configured as a bigene cluster, and their respective expression patterns are controlled temporally and spatially by cis-elements that largely reside within the intergenic region of the cluster. Previous work revealed that there are conspicuously conserved elements within the intergenic region of the Dlx3–4 bigene clusters of mouse and human. In this paper we have extended these analyses to include 12 additional mammalian taxa (including a marsupial and a monotreme) in order to better define the nature and molecular evolutionary trends of the coding and non-coding functional elements among morphologically divergent mammals. Dlx3–4 regions were fully sequenced from 12 divergent taxa of interest. We identified three theria-specific amino acid replacements in homeodomain of Dlx4 gene that functions in placenta. Sequence analyses of constrained nucleotide sites in the intergenic non-coding region showed that many of the intergenic conserved elements are highly conserved and have evolved slowly within the mammals. In contrast, a branchial arch/craniofacial enhancer I37-2 exhibited accelerated evolution at the branch between the monotreme and therian common ancestor despite being highly conserved among therian species. Functional analysis of I37-2 in transgenic mice has shown that the equivalent region of the platypus fails to drive transcriptional activity in branchial arches. These observations, taken together with our molecular evolutionary data, suggest that theria-specific episodic changes in the I37-2 element may have contributed to craniofacial innovation at the base of the mammalian lineage. PMID:22951979

  19. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle

    DOE PAGES

    Walworth, Nathan; Pfreundt, Ulrike; Nelson, William C.; ...

    2015-03-23

    Understanding the evolution of the free-living, cyanobacterial, diazotroph Trichodesmium is of great importance because of its critical role in oceanic biogeochemistry and primary production. Unlike the other >150 available genomes of free-living cyanobacteria, only 63.8% of the Trichodesmium erythraeum (strain IMS101) genome is predicted to encode protein, which is 20–25% less than the average for other cyanobacteria and nonpathogenic, free-living bacteria. In this paper, we use distinctive isolates and metagenomic data to show that low coding density observed in IMS101 is a common feature of the Trichodesmium genus, both in culture and in situ. Transcriptome analysis indicates that 86% of the noncoding space is expressed, although the function of these transcripts is unclear. The density of noncoding, possible regulatory elements predicted in Trichodesmium, when normalized per intergenic kilobase, was comparable and twofold higher than that found in the gene-dense genomes of the sympatric cyanobacterial genera Synechococcus and Prochlorococcus, respectively. Conserved Trichodesmium noncoding RNA secondary structures were predicted between most culture and metagenomic sequences, lending support to the structural conservation. Conservation of these intergenic regions in spatiotemporally separated Trichodesmium populations suggests possible genus-wide selection for their maintenance. These large intergenic spacers may have developed during intervals of strong genetic drift caused by periodic blooms of a subset of genotypes, which may have reduced effective population size. Finally, our data suggest that transposition of selfish DNA, low effective population size, and high-fidelity replication allowed the unusual “inflation” of noncoding sequence observed in Trichodesmium despite its oligotrophic lifestyle.

  20. Disruption of long-distance highly conserved noncoding elements in neurocristopathies.

    PubMed

    Amiel, Jeanne; Benko, Sabina; Gordon, Christopher T; Lyonnet, Stanislas

    2010-12-01

    One of the key discoveries of vertebrate genome sequencing projects has been the identification of highly conserved noncoding elements (CNEs). Some characteristics of CNEs include their high frequency in mammalian genomes, their potential regulatory role in gene expression, and their enrichment in gene deserts nearby master developmental genes. The abnormal development of neural crest cells (NCCs) leads to a broad spectrum of congenital malformation(s), termed neurocristopathies, and/or tumor predisposition. Here we review recent findings that disruptions of CNEs, within or at long distance from the coding sequences of key genes involved in NCC development, result in neurocristopathies via the alteration of tissue- or stage-specific long-distance regulation of gene expression. While most studies on human genetic disorders have focused on protein-coding sequences, these examples suggest that investigation of genomic alterations of CNEs will provide a broader understanding of the molecular etiology of both rare and common human congenital malformations. © 2010 New York Academy of Sciences.

  1. Human Variation in Short Regions Predisposed to Deep Evolutionary Conservation

    PubMed Central

    Loots, Gabriela G.; Ovcharenko, Ivan

    2010-01-01

    The landscape of the human genome consists of millions of short islands of conservation that are 100% conserved across multiple vertebrate genomes (termed “bricks”), the majority of which are located in noncoding regions. Several hundred thousand bricks are deeply conserved reaching the genomes of amphibians and fish. Deep phylogenetic conservation of noncoding DNA has been reported to be strongly associated with the presence of gene regulatory elements, introducing bricks as a proxy to the functional noncoding landscape of the human genome. Here, we report a significant overrepresentation of bricks in the promoters of transcription factors and developmental genes, where the high level of phylogenetic conservation correlates with an increase in brick overrepresentation. We also found that the presence of a brick dictates a predisposition to evolutionary constraint, with only 0.7% of the amniota brick central nucleotides being diverged within the primate lineage—an 11-fold reduction in the divergence rate compared with random expectation. Human single-nucleotide polymorphism (SNP) data explains only 3% of primate-specific variation in amniota bricks, thus arguing for a widespread fixation of brick mutations within the primate lineage and prior to human radiation. This variation, in turn, might have been utilized as a driving force for primate- and hominoid-specific adaptation. We also discovered a pronounced deviation from the evolutionary predisposition in the human lineage, with over 20-fold increase in the substitution rate at brick SNP sites over expected values. In addition, contrary to typical brick mutations, brick variation commonly encountered in the human population displays limited, if any, signatures of negative selection as measured by the minor allele frequency and population differentiation (F-statistical measure) measures. 
These observations argue for the plasticity of gene regulatory mechanisms in vertebrates—with evidence of strong purifying selection acting on the gene regulatory landscape of the human genome, where widespread advantageous mutations in putative regulatory elements are likely utilized in functional diversification and adaptation of species. PMID:20093432

  2. A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.

    PubMed

    Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor

    2017-08-30

    Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.

  3. Long non-coding RNA produced by RNA polymerase V determines boundaries of heterochromatin

    PubMed Central

    Böhmdorfer, Gudrun; Sethuraman, Shriya; Rowley, M Jordan; Krzyszton, Michal; Rothi, M Hafiz; Bouzit, Lilia; Wierzbicki, Andrzej T

    2016-01-01

    RNA-mediated transcriptional gene silencing is a conserved process where small RNAs target transposons and other sequences for repression by establishing chromatin modifications. A central element of this process are long non-coding RNAs (lncRNA), which in Arabidopsis thaliana are produced by a specialized RNA polymerase known as Pol V. Here we show that non-coding transcription by Pol V is controlled by preexisting chromatin modifications located within the transcribed regions. Most Pol V transcripts are associated with AGO4 but are not sliced by AGO4. Pol V-dependent DNA methylation is established on both strands of DNA and is tightly restricted to Pol V-transcribed regions. This indicates that chromatin modifications are established in close proximity to Pol V. Finally, Pol V transcription is preferentially enriched on edges of silenced transposable elements, where Pol V transcribes into TEs. We propose that Pol V may play an important role in the determination of heterochromatin boundaries. DOI: http://dx.doi.org/10.7554/eLife.19092.001 PMID:27779094

  4. Molecular Evolution of the Non-Coding Eosinophil Granule Ontogeny Transcript

    PubMed Central

    Rose, Dominic; Stadler, Peter F.

    2011-01-01

    Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs). The evolutionary history of mlncRNAs is still largely uncharted territory. In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT), an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs). EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny, analyze patterns of sequence conservation, and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrate here for the first time that the whole EGOT locus is highly structured, containing several evolutionary conserved, and thermodynamic stable secondary structures. Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element. PMID:22303364

  5. Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules

    PubMed Central

    Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex

    2012-01-01

    Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789

  6. Conserved non-coding regulatory signatures in Arabidopsis co-expressed gene modules.

    PubMed

    Spangler, Jacob B; Ficklin, Stephen P; Luo, Feng; Freeling, Michael; Feltus, F Alex

    2012-01-01

    Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome.

  7. Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine.

    PubMed

    Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

    2002-06-01

    We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5' of Xist that was recently shown to attract histone modification early after the onset of X inactivation.

  8. Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.

    PubMed

    Hezroni, Hadas; Koppstein, David; Schwartz, Matthew G; Avrutin, Alexandra; Bartel, David P; Ulitsky, Igor

    2015-05-19

    The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5'-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  9. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  10. SHOX gene and conserved noncoding element deletions/duplications in Colombian patients with idiopathic short stature.

    PubMed

    Sandoval, Gloria Tatiana Vinasco; Jaimes, Giovanna Carola; Barrios, Mauricio Coll; Cespedes, Camila; Velasco, Harvy Mauricio

    2014-03-01

    SHOX gene mutations or haploinsufficiency cause a wide range of phenotypes such as Leri Weill dyschondrosteosis (LWD), Turner syndrome, and disproportionate short stature (DSS). However, this gene has also been found to be mutated in cases of idiopathic short stature (ISS) with a 3-15% frequency. In this study, the multiplex ligation-dependent probe amplification (MLPA) technique was employed to determine the frequency of SHOX gene mutations and their conserved noncoding elements (CNE) in Colombian patients with ISS. Patients were referred from different centers around the country. From a sample of 62 patients, 8.1% deletions and insertions in the intragenic regions and in the CNE were found. This result is similar to others published in other countries. Moreover, an isolated case of CNE 9 duplication and a new intron 6b deletion in another patient, associated with ISS, are described. This is one of the first studies of a Latin American population in which deletions/duplications of the SHOX gene and its CNE are examined in patients with ISS.

  11. SHOX gene and conserved noncoding element deletions/duplications in Colombian patients with idiopathic short stature

    PubMed Central

    Sandoval, Gloria Tatiana Vinasco; Jaimes, Giovanna Carola; Barrios, Mauricio Coll; Cespedes, Camila; Velasco, Harvy Mauricio

    2014-01-01

    SHOX gene mutations or haploinsufficiency cause a wide range of phenotypes such as Leri Weill dyschondrosteosis (LWD), Turner syndrome, and disproportionate short stature (DSS). However, this gene has also been found to be mutated in cases of idiopathic short stature (ISS) with a 3–15% frequency. In this study, the multiplex ligation-dependent probe amplification (MLPA) technique was employed to determine the frequency of SHOX gene mutations and their conserved noncoding elements (CNE) in Colombian patients with ISS. Patients were referred from different centers around the country. From a sample of 62 patients, 8.1% deletions and insertions in the intragenic regions and in the CNE were found. This result is similar to others published in other countries. Moreover, an isolated case of CNE 9 duplication and a new intron 6b deletion in another patient, associated with ISS, are described. This is one of the first studies of a Latin American population in which deletions/duplications of the SHOX gene and its CNE are examined in patients with ISS. PMID:24689071

  12. An Ultraconserved Brain-specific Enhancer within ADGRL3 (LPHN3) Underpins ADHD Susceptibility

    PubMed Central

    Martinez, Ariel F.; Abe, Yu; Hong, Sungkook; Molyneux, Kevin; Yarnell, David; Löhr, Heiko; Driever, Wolfgang; Acosta, Maria T.; Arcos-Burgos, Mauricio; Muenke, Maximilian

    2016-01-01

    BACKGROUND Genetic factors predispose to attention deficit/hyperactivity disorder (ADHD). Previous studies have reported linkage and association to ADHD of gene variants within ADGRL3. In this study, we functionally analyzed non-coding variants in this gene as likely pathological contributors. METHODS In silico, in vitro and in vivo approaches were used to identify and characterize evolutionary conserved elements within the ADGRL3 linkage region (~207 Kb). Family-based genetic analyses on 838 individuals (372 affected and 466 unaffected) identified ADHD-associated SNPs harbored in some of these conserved elements. Luciferase assays and zebrafish GFP transgenesis tested conserved elements for transcriptional enhancer activity. Electromobility shift assays were used to verify transcription factor binding disruption by ADHD risk alleles. RESULTS An ultraconserved element was discovered (ECR47) that functions as a transcriptional enhancer. A three-variant ADHD risk haplotype in ECR47, formed by rs17226398, rs56038622 and rs2271338, reduced enhancer activity by 40% in neuroblastoma and astrocytoma cells (PBonferroni<0.0001). This enhancer also drove GFP expression in the zebrafish brain in a tissue-specific manner, sharing aspects of endogenous ADGRL3 expression. The rs2271338 risk allele disrupts binding of YY1, an important factor in the development and function of the central nervous system. Expression quantitative trait loci analysis of post-mortem human brain tissues revealed an association between rs2271338 and reduced ADGRL3 expression in the thalamus. CONCLUSIONS These results uncover the first functional evidence of common non-coding variants with potential implications for the pathology of ADHD. PMID:27692237

  13. Potential Novel Mechanism for Axenfeld-Rieger Syndrome: Deletion of a Distant Region Containing Regulatory Elements of PITX2

    PubMed Central

    Volkmann, Bethany A.; Zinkevich, Natalya S.; Mustonen, Aki; Schilter, Kala F.; Bosenko, Dmitry V.; Reis, Linda M.; Broeckel, Ulrich; Link, Brian A.

    2011-01-01

    Purpose. Mutations in PITX2 are associated with Axenfeld-Rieger syndrome (ARS), which involves ocular, dental, and umbilical abnormalities. Identification of cis-regulatory elements of PITX2 is important to better understand the mechanisms of disease. Methods. Conserved noncoding elements surrounding PITX2/pitx2 were identified and examined through transgenic analysis in zebrafish; expression pattern was studied by in situ hybridization. Patient samples were screened for deletion/duplication of the PITX2 upstream region using arrays and probes. Results. Zebrafish pitx2 demonstrates conserved expression during ocular and craniofacial development. Thirteen conserved noncoding sequences positioned within a gene desert as far as 1.1 Mb upstream of the human PITX2 gene were identified; 11 have enhancer activities consistent with pitx2 expression. Ten elements mediated expression in the developing brain, four regions were active during eye formation, and two sequences were associated with craniofacial expression. One region, CE4, located approximately 111 kb upstream of PITX2, directed a complex pattern including expression in the developing eye and craniofacial region, the classic sites affected in ARS. Screening of ARS patients identified an approximately 7600-kb deletion that began 106 to 108 kb upstream of the PITX2 gene, leaving PITX2 intact while removing regulatory elements CE4 to CE13. Conclusions. These data suggest the presence of a complex distant regulatory matrix within the gene desert located upstream of PITX2 with an essential role in its activity and provides a possible mechanism for the previous reports of ARS in patients with balanced translocations involving the 4q25 region upstream of PITX2 and the current patient with an upstream deletion. PMID:20881290

  14. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle

    DOE PAGES

    Walworth, Nathan G.; Pfreundt, Ulrike; Nelson, William C.; ...

    2015-04-07

    Understanding the evolution of the free-living, cyanobacterial, diazotroph Trichodesmium is of great importance due to its critical role in oceanic biogeochemistry and primary production. Unlike the other >150 available genomes of free-living cyanobacteria, only 63.8% of the Trichodesmium erythraeum (strain IMS101) genome is predicted to encode protein, which is 20-25% less than the average for other cyanobacteria and non-pathogenic, free-living bacteria. We use distinctive isolates and metagenomic data to show that low coding density observed in IMS101 is a common feature of the Trichodesmium genus both in culture and in situ. Transcriptome analysis indicates that 86% of the non-coding space is expressed, although the function of these transcripts is unclear. The density of noncoding, possible regulatory elements predicted in Trichodesmium, when normalized per intergenic kilobase, was comparable and two fold higher than that found in the gene dense genomes of the sympatric cyanobacterial genera Synechococcus and Prochlorococcus, respectively. Conserved Trichodesmium ncRNA secondary structures were predicted between most culture and metagenomic sequences lending support to the structural conservation. Conservation of these intergenic regions in spatiotemporally separated Trichodesmium populations suggests possible genus-wide selection for their maintenance. These large intergenic spacers may have developed during intervals of strong genetic drift caused by periodic blooms of a subset of genotypes, which may have reduced effective population size. Our data suggest that transposition of selfish DNA, low effective population size, and high fidelity replication allowed the unusual ‘inflation’ of noncoding sequence observed in Trichodesmium despite its oligotrophic lifestyle.

  15. Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs

    PubMed Central

    2014-01-01

    Background The genome is pervasively transcribed but most transcripts do not code for proteins, constituting non-protein-coding RNAs. Despite increasing numbers of functional reports of individual long non-coding RNAs (lncRNAs), assessing the extent of functionality among the non-coding transcriptional output of mammalian cells remains intricate. In the protein-coding world, transcripts differentially expressed in the context of processes essential for the survival of multicellular organisms have been instrumental in the discovery of functionally relevant proteins and their deregulation is frequently associated with diseases. We therefore systematically identified lncRNAs expressed differentially in response to oncologically relevant processes and cell-cycle, p53 and STAT3 pathways, using tiling arrays. Results We found that up to 80% of the pathway-triggered transcriptional responses are non-coding. Among these we identified very large macroRNAs with pathway-specific expression patterns and demonstrated that these are likely continuous transcripts. MacroRNAs contain elements conserved in mammals and sauropsids, which in part exhibit conserved RNA secondary structure. Comparing evolutionary rates of a macroRNA to adjacent protein-coding genes suggests a local action of the transcript. Finally, in different grades of astrocytoma, a tumor disease unrelated to the initially used cell lines, macroRNAs are differentially expressed. Conclusions It has been shown previously that the majority of expressed non-ribosomal transcripts are non-coding. We now conclude that differential expression triggered by signaling pathways gives rise to a similar abundance of non-coding content. It is thus unlikely that the prevalence of non-coding transcripts in the cell is a trivial consequence of leaky or random transcription events. PMID:24594072

  16. Divergent evolutionary rates in vertebrate and mammalian specific conserved non-coding elements (CNEs) in echolocating mammals.

    PubMed

    Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J

    2014-12-19

    The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. 
Specifically, within one family of echolocating bats that utilise frequency-modulated echolocation calls varying widely in frequency and intensity, high levels of sequence divergence were found. Levels of selective constraint acting on CNEs differed both across genomic locations and taxa, with observed variation in substitution rates of CNEs among bat species. More work is needed to determine whether this variation can be linked to echolocation, and wider taxonomic sampling is necessary to fully document levels of conservation in CNEs across diverse taxa.

  17. Mutation in a primate-conserved retrotransposon reveals a noncoding RNA as a mediator of infantile encephalopathy

    PubMed Central

    Cartault, François; Munier, Patrick; Benko, Edgar; Desguerre, Isabelle; Hanein, Sylvain; Boddaert, Nathalie; Bandiera, Simonetta; Vellayoudom, Jeanine; Krejbich-Trotot, Pascale; Bintner, Marc; Hoarau, Jean-Jacques; Girard, Muriel; Génin, Emmanuelle; de Lonlay, Pascale; Fourmaintraux, Alain; Naville, Magali; Rodriguez, Diana; Feingold, Josué; Renouil, Michel; Munnich, Arnold; Westhof, Eric; Fähling, Michael; Lyonnet, Stanislas; Henrion-Caude, Alexandra

    2012-01-01

    The human genome is densely populated with transposons and transposon-like repetitive elements. Although the impact of these transposons and elements on human genome evolution is recognized, the significance of subtle variations in their sequence remains mostly unexplored. Here we report homozygosity mapping of an infantile neurodegenerative disease locus in a genetic isolate. Complete DNA sequencing of the 400-kb linkage locus revealed a point mutation in a primate-specific retrotransposon that was transcribed as part of a unique noncoding RNA, which was expressed in the brain. In vitro knockdown of this RNA increased neuronal apoptosis, consistent with the inappropriate dosage of this RNA in vivo and with the phenotype. Moreover, structural analysis of the sequence revealed a small RNA-like hairpin that was consistent with the putative gain of a functional site when mutated. We show here that a mutation in a unique transposable element-containing RNA is associated with lethal encephalopathy, and we suggest that RNAs that harbor evolutionarily recent repetitive elements may play important roles in human brain development. PMID:22411793

  18. Silencing Effect of Hominoid Highly Conserved Noncoding Sequences on Embryonic Brain Development

    PubMed Central

    Mahmoudi Saber, Morteza

    2017-01-01

    Superfamily Hominoidea, which consists of Hominidae (humans and great apes) and Hylobatidae (gibbons), is well-known for sharing human-like characteristics; however, the genomic origins of these shared unique phenotypes have mainly remained elusive. To decipher the underlying genomic basis of Hominoidea-restricted phenotypes, we identified and characterized Hominoidea-restricted highly conserved noncoding sequences (HCNSs) that are a class of potential regulatory elements which may be involved in evolution of lineage-specific phenotypes. We discovered 679 such HCNSs from human, chimpanzee, gorilla, orangutan and gibbon genomes. These HCNSs were demonstrated to be under purifying selection but with lineage-restricted characteristics different from old CNSs. A significant proportion of their ancestral sequences had accelerated rates of nucleotide substitutions, insertions and deletions during the evolution of the common ancestor of Hominoidea, suggesting the intervention of positive Darwinian selection for creating those HCNSs. In contrast to enhancer elements and similar to silencer sequences, these Hominoidea-restricted HCNSs are located in close proximity of transcription start sites. Their target genes are enriched in the nervous system, development and transcription, and they tend to be remotely located from the nearest coding gene. ChIP-seq signals and gene expression patterns suggest that Hominoidea-restricted HCNSs are likely to be functional regulatory elements by imposing silencing effects on their target genes in a tissue-restricted manner during fetal brain development. These HCNSs, which emerged through adaptive evolution and were conserved through purifying selection, represent a set of promising targets for future functional studies of the evolution of Hominoidea-restricted phenotypes. PMID:28633494

  19. Parallel evolution of chordate cis-regulatory code for development.

    PubMed

    Doglio, Laura; Goode, Debbie K; Pelleri, Maria C; Pauls, Stefan; Frabetti, Flavia; Shimeld, Sebastian M; Vavouri, Tanya; Elgar, Greg

    2013-11-01

    Urochordates are the closest relatives of vertebrates and at the larval stage, possess a characteristic bilateral chordate body plan. In vertebrates, the genes that orchestrate embryonic patterning are in part regulated by highly conserved non-coding elements (CNEs), yet these elements have not been identified in urochordate genomes. Consequently the evolution of the cis-regulatory code for urochordate development remains largely uncharacterised. Here, we use genome-wide comparisons between C. intestinalis and C. savignyi to identify putative urochordate cis-regulatory sequences. Ciona conserved non-coding elements (ciCNEs) are associated with largely the same key regulatory genes as vertebrate CNEs. Furthermore, some of the tested ciCNEs are able to activate reporter gene expression in both zebrafish and Ciona embryos, in a pattern that at least partially overlaps that of the gene they associate with, despite the absence of sequence identity. We also show that the ability of a ciCNE to up-regulate gene expression in vertebrate embryos can in some cases be localised to short sub-sequences, suggesting that functional cross-talk may be defined by small regions of ancestral regulatory logic, although functional sub-sequences may also be dispersed across the whole element. We conclude that the structure and organisation of cis-regulatory modules is very different between vertebrates and urochordates, reflecting their separate evolutionary histories. However, functional cross-talk still exists because the same repertoire of transcription factors has likely guided their parallel evolution, exploiting similar sets of binding sites but in different combinations.

  20. Inverted repeat Alu elements in the human lincRNA-p21 adopt a conserved secondary structure that regulates RNA function

    PubMed Central

    Chillón, Isabel; Pyle, Anna M.

    2016-01-01

    LincRNA-p21 is a long intergenic non-coding RNA (lincRNA) involved in the p53-mediated stress response. We sequenced the human lincRNA-p21 (hLincRNA-p21) and found that it has a single exon that includes inverted repeat Alu elements (IRAlus). Sense and antisense Alu elements fold independently of one another into a secondary structure that is conserved in lincRNA-p21 among primates. Moreover, the structures formed by IRAlus are involved in the localization of hLincRNA-p21 in the nucleus, where hLincRNA-p21 colocalizes with paraspeckles. Our results underscore the importance of IRAlus structures for the function of hLincRNA-p21 during the stress response. PMID:27378782

  1. Comparative analysis of human protein-coding and noncoding RNAs between brain and 10 mixed cell lines by RNA-Seq.

    PubMed

    Chen, Geng; Yin, Kangping; Shi, Leming; Fang, Yuanzhang; Qi, Ya; Li, Peng; Luo, Jian; He, Bing; Liu, Mingyao; Shi, Tieliu

    2011-01-01

    In their expression process, different genes can generate diverse functional products, including various protein-coding or noncoding RNAs. Here, we investigated the protein-coding capacities and the expression levels of isoforms of known human genes, as well as the conservation and disease association of long noncoding RNAs (ncRNAs), using two transcriptome sequencing datasets from human brain tissues and 10 mixed cell lines. Comparative analysis revealed that about two-thirds of the genes expressed between brain and cell lines are the same, but less than one-third of their isoforms are identical. Besides those genes specially expressed in brain and cell lines, about 66% of genes expressed in common encoded different isoforms. Moreover, most genes dominantly expressed one isoform and some genes only generated protein-coding (or noncoding) RNAs in one sample but not in another. We found 282 human genes could encode both protein-coding and noncoding RNAs through alternative splicing in the two samples. We also identified more than 1,000 long ncRNAs, and most of those long ncRNAs contain conserved elements across either 46 vertebrates or 33 placental mammals or 10 primates. Further analysis showed that some long ncRNAs were differentially expressed in human breast cancer or lung cancer, and several of those differentially expressed long ncRNAs were validated by RT-PCR. In addition, those validated differentially expressed long ncRNAs were found to be significantly correlated with certain breast cancer or lung cancer related genes, indicating the important biological relevance between long ncRNAs and human cancers. Our findings reveal that differences in gene expression profiles between samples mainly result from the expressed gene isoforms, and highlight the importance of studying genes at the isoform level for completely illustrating the intricate transcriptome.

  2. Small Open Reading Frames, Non-Coding RNAs and Repetitive Elements in Bradyrhizobium japonicum USDA 110

    PubMed Central

    Hahn, Julia; Tsoy, Olga V.; Thalmann, Sebastian; Čuklina, Jelena; Gelfand, Mikhail S.

    2016-01-01

    Small open reading frames (sORFs) and genes for non-coding RNAs are poorly investigated components of most genomes. Our analysis of 1391 ORFs recently annotated in the soybean symbiont Bradyrhizobium japonicum USDA 110 revealed that 78% of them contain less than 80 codons. Twenty-one of these sORFs are conserved in or outside Alphaproteobacteria and most of them are similar to genes found in transposable elements, in line with their broad distribution. Stabilizing selection was demonstrated for sORFs with proteomic evidence and bll1319_ISGA which is conserved at the nucleotide level in 16 alphaproteobacterial species, 79 species from other taxa and 49 other Proteobacteria. Further we used Northern blot hybridization to validate ten small RNAs (BjsR1 to BjsR10) belonging to new RNA families. We found that BjsR1 and BjsR3 have homologs outside the genus Bradyrhizobium, and BjsR5, BjsR6, BjsR7, and BjsR10 have up to four imperfect copies in Bradyrhizobium genomes. BjsR8, BjsR9, and BjsR10 are present exclusively in nodules, while the other sRNAs are also expressed in liquid cultures. We also found that the level of BjsR4 decreases after exposure to tellurite and iron, and this down-regulation contributes to survival under high iron conditions. Analysis of additional small RNAs overlapping with 3’-UTRs revealed two new repetitive elements named Br-REP1 and Br-REP2. These REP elements may play roles in the genomic plasticity and gene regulation and could be useful for strain identification by PCR-fingerprinting. Furthermore, we studied two potential toxin genes in the symbiotic island and confirmed toxicity of the yhaV homolog bll1687 but not of the newly annotated higB homolog blr0229_ISGA in E. coli. Finally, we revealed transcription interference resulting in an antisense RNA complementary to blr1853, a gene induced in symbiosis. The presented results expand our knowledge on sORFs, non-coding RNAs and repetitive elements in B. japonicum and related bacteria. PMID:27788207

  3. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

    PubMed

    2004-12-09

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

  4. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    PubMed Central

    Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

    2011-01-01

    Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358

  5. Comparative Sequence Analysis of the X-Inactivation Center Region in Mouse, Human, and Bovine

    PubMed Central

    Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

    2002-01-01

    We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5′ of Xist that was recently shown to attract histone modification early after the onset of X inactivation. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AJ421478, AJ421479, AJ421480, and AJ421481. Online supplemental data are available at http://pbil.univ-lyon1.fr/datasets/Xic2002/data.html and www.genome.org.] PMID:12045143

  6. Primate-specific evolution of noncoding element insertion into PLA2G4C and human preterm birth

    PubMed Central

    2010-01-01

    Background: The onset of birth in humans, like other apes, differs from non-primate mammals in its endocrine physiology. We hypothesize that higher primate-specific gene evolution may lead to these differences and target genes involved in human preterm birth, an area of global health significance. Methods: We performed a comparative genomics screen of highly conserved noncoding elements and identified PLA2G4C, a phospholipase A isoform involved in prostaglandin biosynthesis as human accelerated. To examine whether this gene demonstrating primate-specific evolution was associated with birth timing, we genotyped and analyzed 8 common single nucleotide polymorphisms (SNPs) in PLA2G4C in US Hispanic (n = 73 preterm, 292 control), US White (n = 147 preterm, 157 control) and US Black (n = 79 preterm, 166 control) mothers. Results: Detailed structural and phylogenic analysis of PLA2G4C suggested a short genomic element within the gene duplicated from a paralogous highly conserved element on chromosome 1 specifically in primates. SNPs rs8110925 and rs2307276 in US Hispanics and rs11564620 in US Whites were significant after correcting for multiple tests (p < 0.006). Additionally, rs11564620 (Thr360Pro) was associated with increased metabolite levels of the prostaglandin thromboxane in healthy individuals (p = 0.02), suggesting this variant may affect PLA2G4C activity. Conclusions: Our findings suggest that variation in PLA2G4C may influence preterm birth risk by increasing levels of prostaglandins, which are known to regulate labor. PMID:21184677
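
    The corrected significance threshold quoted in the record above (p < 0.006 for 8 SNPs) is consistent with a Bonferroni correction, although the abstract does not name the method. The sketch below is only an illustration under that assumption; the family-wise error rate of 0.05 and the function name bonferroni_threshold are hypothetical and not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction.

    alpha   -- family-wise error rate (0.05 below is an assumption)
    n_tests -- number of independent tests (here, SNPs genotyped)
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# The abstract reports 8 common SNPs; alpha = 0.05 is assumed, not stated.
threshold = bonferroni_threshold(0.05, 8)
print(f"per-SNP threshold: {threshold:.5f}")
```

    Under these assumptions the per-SNP threshold is 0.05 / 8 = 0.00625, which rounds to the reported cutoff of 0.006.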

  7. A screen for nuclear transcripts identifies two linked noncoding RNAs associated with SC35 splicing domains

    PubMed Central

    Hutchinson, John N; Ensminger, Alexander W; Clemson, Christine M; Lynch, Christopher R; Lawrence, Jeanne B; Chess, Andrew

    2007-01-01

    Background: Noncoding RNA species play a diverse set of roles in the eukaryotic cell. While much recent attention has focused on smaller RNA species, larger noncoding transcripts are also thought to be highly abundant in mammalian cells. To search for large noncoding RNAs that might control gene expression or mRNA metabolism, we used Affymetrix expression arrays to identify polyadenylated RNA transcripts displaying nuclear enrichment. Results: This screen identified no more than three transcripts: XIST and two unique noncoding nuclear enriched abundant transcripts (NEAT) RNAs strikingly located less than 70 kb apart on human chromosome 11: NEAT1, a noncoding RNA from the locus encoding for TncRNA, and NEAT2 (also known as MALAT-1). While the two NEAT transcripts share no significant homology with each other, each is conserved within the mammalian lineage, suggesting significant function for these noncoding RNAs. NEAT2 is extraordinarily well conserved for a noncoding RNA, more so than even XIST. Bioinformatic analyses of publicly available mouse transcriptome data support our findings from human cells as they confirm that the murine homologs of these noncoding RNAs are also nuclear enriched. RNA FISH analyses suggest that these noncoding RNAs function in mRNA metabolism as they demonstrate an intimate association of these RNA species with SC35 nuclear speckles in both human and mouse cells. These studies show that one of these transcripts, NEAT1, localizes to the periphery of such domains, whereas the neighboring transcript, NEAT2, is part of the long-sought polyadenylated component of nuclear speckles. Conclusion: Our genome-wide screens in two mammalian species reveal no more than three abundant large non-coding polyadenylated RNAs in the nucleus: the canonical large noncoding RNA XIST, NEAT1 and NEAT2. The function of these noncoding RNAs in mRNA metabolism is suggested by their high levels of conservation and their intimate association with SC35 splicing domains in multiple mammalian species. PMID:17270048

  8. Transcriptional Regulation in Ebola Virus: Effects of Gene Border Structure and Regulatory Elements on Gene Expression and Polymerase Scanning Behavior

    PubMed Central

    Brauburger, Kristina; Boehmann, Yannik; Krähling, Verena

    2015-01-01

    The highly pathogenic Ebola virus (EBOV) has a nonsegmented negative-strand (NNS) RNA genome containing seven genes. The viral genes either are separated by intergenic regions (IRs) of variable length or overlap. The structure of the EBOV gene overlaps is conserved throughout all filovirus genomes and is distinct from that of the overlaps found in other NNS RNA viruses. Here, we analyzed how diverse gene borders and noncoding regions surrounding the gene borders influence transcript levels and govern polymerase behavior during viral transcription. Transcription of overlapping genes in EBOV bicistronic minigenomes followed the stop-start mechanism, similar to that followed by IR-containing gene borders. When the gene overlaps were extended, the EBOV polymerase was able to scan the template in an upstream direction. This polymerase feature seems to be generally conserved among NNS RNA virus polymerases. Analysis of IR-containing gene borders showed that the IR sequence plays only a minor role in transcription regulation. Changes in IR length were generally well tolerated, but specific IR lengths led to a strong decrease in downstream gene expression. Correlation analysis revealed that these effects were largely independent of the surrounding gene borders. Each EBOV gene contains exceptionally long untranslated regions (UTRs) flanking the open reading frame. Our data suggest that the UTRs adjacent to the gene borders are the main regulators of transcript levels. A highly complex interplay between the different cis-acting elements to modulate transcription was revealed for specific combinations of IRs and UTRs, emphasizing the importance of the noncoding regions in EBOV gene expression control. Importance: Our data extend those from previous analyses investigating the implication of noncoding regions at the EBOV gene borders for gene expression control. We show that EBOV transcription is regulated in a highly complex yet not easily predictable manner by a set of interacting cis-active elements. These findings are important not only for the design of recombinant filoviruses but also for the design of other replicon systems widely used as surrogate systems to study the filovirus replication cycle under low biosafety levels. Insights into the complex regulation of EBOV transcription conveyed by noncoding sequences will also help to interpret the importance of mutations that have been detected within these regions, including in isolates of the current outbreak. PMID:26656691

  9. Transcriptional Regulation in Ebola Virus: Effects of Gene Border Structure and Regulatory Elements on Gene Expression and Polymerase Scanning Behavior.

    PubMed

    Brauburger, Kristina; Boehmann, Yannik; Krähling, Verena; Mühlberger, Elke

    2016-02-15

    The highly pathogenic Ebola virus (EBOV) has a nonsegmented negative-strand (NNS) RNA genome containing seven genes. The viral genes either are separated by intergenic regions (IRs) of variable length or overlap. The structure of the EBOV gene overlaps is conserved throughout all filovirus genomes and is distinct from that of the overlaps found in other NNS RNA viruses. Here, we analyzed how diverse gene borders and noncoding regions surrounding the gene borders influence transcript levels and govern polymerase behavior during viral transcription. Transcription of overlapping genes in EBOV bicistronic minigenomes followed the stop-start mechanism, similar to that followed by IR-containing gene borders. When the gene overlaps were extended, the EBOV polymerase was able to scan the template in an upstream direction. This polymerase feature seems to be generally conserved among NNS RNA virus polymerases. Analysis of IR-containing gene borders showed that the IR sequence plays only a minor role in transcription regulation. Changes in IR length were generally well tolerated, but specific IR lengths led to a strong decrease in downstream gene expression. Correlation analysis revealed that these effects were largely independent of the surrounding gene borders. Each EBOV gene contains exceptionally long untranslated regions (UTRs) flanking the open reading frame. Our data suggest that the UTRs adjacent to the gene borders are the main regulators of transcript levels. A highly complex interplay between the different cis-acting elements to modulate transcription was revealed for specific combinations of IRs and UTRs, emphasizing the importance of the noncoding regions in EBOV gene expression control. Our data extend those from previous analyses investigating the implication of noncoding regions at the EBOV gene borders for gene expression control. We show that EBOV transcription is regulated in a highly complex yet not easily predictable manner by a set of interacting cis-active elements. These findings are important not only for the design of recombinant filoviruses but also for the design of other replicon systems widely used as surrogate systems to study the filovirus replication cycle under low biosafety levels. Insights into the complex regulation of EBOV transcription conveyed by noncoding sequences will also help to interpret the importance of mutations that have been detected within these regions, including in isolates of the current outbreak. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  10. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
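
    The statement above that short motifs cannot be distinguished from background similarity can be illustrated with a rough expectation calculation. This is a minimal sketch, not the authors' analysis: it assumes a random sequence with equal base frequencies, counts both strands, ignores the double counting of reverse-palindromic motifs, and uses hypothetical lengths (a 1,000-bp regulatory region and 6-bp or 12-bp motifs). The function name expected_chance_matches is likewise made up for illustration.

```python
def expected_chance_matches(seq_len: int, motif_len: int,
                            both_strands: bool = True) -> float:
    """Expected number of exact matches to one fixed motif in random DNA.

    Assumes independent positions and equal base frequencies (0.25 each),
    which is a simplification of any real genome.
    """
    if motif_len < 1:
        raise ValueError("motif_len must be at least 1")
    if seq_len < motif_len:
        return 0.0
    positions = seq_len - motif_len + 1       # possible start positions
    strands = 2 if both_strands else 1        # orientation-independent search
    return strands * positions / 4 ** motif_len


# Hypothetical example: a 1,000-bp region scanned for a 6-bp and a 12-bp motif.
for k in (6, 12):
    print(k, expected_chance_matches(1000, k))
```

    Under these assumptions a 6-bp motif is expected by chance about once in every two such regions, whereas a 12-bp motif is expected far less than once, which is why only longer conserved blocks stand out from background.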

  11. Conserved noncoding sequences (CNSs) in higher plants.

    PubMed

    Freeling, Michael; Subramaniam, Shabarinath

    2009-04-01

    Plant conserved noncoding sequences (CNSs)--a specific category of phylogenetic footprint--have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and to occupy a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs.

  12. Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

    PubMed Central

    Hall, L; Laird, J E; Craig, R K

    1984-01-01

    Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375

  13. Variations in the non-coding transcriptome as a driver of inter-strain divergence and physiological adaptation in bacteria.

    PubMed

    Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R; Voß, Björn

    2015-04-22

    In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5'UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5'UTR. Such an sRNA/mRNA structure, which we name 'actuaton', represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation.

  14. Transposon-driven transcription is a conserved feature of vertebrate spermatogenesis and transcript evolution.

    PubMed

    Davis, Matthew P; Carrieri, Claudia; Saini, Harpreet K; van Dongen, Stijn; Leonardi, Tommaso; Bussotti, Giovanni; Monahan, Jack M; Auchynnikava, Tania; Bitetti, Angelo; Rappsilber, Juri; Allshire, Robin C; Shkumatava, Alena; O'Carroll, Dónal; Enright, Anton J

    2017-07-01

    Spermatogenesis is associated with major and unique changes to chromosomes and chromatin. Here, we sought to understand the impact of these changes on spermatogenic transcriptomes. We show that long terminal repeats (LTRs) of specific mouse endogenous retroviruses (ERVs) drive the expression of many long non-coding transcripts (lncRNA). This process occurs post-mitotically predominantly in spermatocytes and round spermatids. We demonstrate that this transposon-driven lncRNA expression is a conserved feature of vertebrate spermatogenesis. We propose that transposon promoters are a mechanism by which the genome can explore novel transcriptional substrates, increasing evolutionary plasticity and allowing for the genesis of novel coding and non-coding genes. Accordingly, we show that a small fraction of these novel ERV-driven transcripts encode short open reading frames that produce detectable peptides. Finally, we find that distinct ERV elements from the same subfamilies act as differentially activated promoters in a tissue-specific context. In summary, we demonstrate that LTRs can act as tissue-specific promoters and contribute to post-mitotic spermatogenic transcriptome diversity. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.

  15. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis.

    PubMed

    Spangler, Jacob B; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression.

  16. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis

    PubMed Central

    Spangler, Jacob B.; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression. PMID:23675377

  17. Long-Range Control of Gene Expression: Emerging Mechanisms and Disruption in Disease

    PubMed Central

    Kleinjan, Dirk A.; van Heyningen, Veronica

    2005-01-01

    Transcriptional control is a major mechanism for regulating gene expression. The complex machinery required to effect this control is still emerging from functional and evolutionary analysis of genomic architecture. In addition to the promoter, many other regulatory elements are required for spatiotemporally and quantitatively correct gene expression. Enhancer and repressor elements may reside in introns or up- and downstream of the transcription unit. For some genes with highly complex expression patterns—often those that function as key developmental control genes—the cis-regulatory domain can extend long distances outside the transcription unit. Some of the earliest hints of this came from disease-associated chromosomal breaks positioned well outside the relevant gene. With the availability of wide-ranging genome sequence comparisons, strong conservation of many noncoding regions became obvious. Functional studies have shown many of these conserved sites to be transcriptional regulatory elements that sometimes reside inside unrelated neighboring genes. Such sequence-conserved elements generally harbor sites for tissue-specific DNA-binding proteins. Developmentally variable chromatin conformation can control protein access to these sites and can regulate transcription. Disruption of these finely tuned mechanisms can cause disease. Some regulatory element mutations will be associated with phenotypes distinct from any identified for coding-region mutations. PMID:15549674

  18. Long-range comparison of human and mouse Sprr loci to identify conserved noncoding sequences involved in coordinate regulation

    PubMed Central

    Martin, Natalia; Patel, Satyakam; Segre, Julia A.

    2004-01-01

    Mammalian epidermis provides a permeability barrier between an organism and its environment. Under homeostatic conditions, epidermal cells produce structural proteins, which are cross-linked in an orderly fashion to form a cornified envelope (CE). However, under genetic or environmental stress, specific genes are induced to rapidly build a temporary barrier. Small proline-rich (SPRR) proteins are the primary constituents of the CE. Under stress the entire family of 14 Sprr genes is upregulated. The Sprr genes are clustered within the larger epidermal differentiation complex on mouse chromosome 3, human chromosome 1q21. The clustering of the Sprr genes and their upregulation under stress suggest that these genes may be coordinately regulated. To identify enhancer elements that regulate this stress response activation of the Sprr locus, we utilized bioinformatic tools and classical biochemical dissection. Long-range comparative sequence analysis identified conserved noncoding sequences (CNSs). Clusters of epidermal-specific DNaseI-hypersensitive sites (HSs) mapped to specific CNSs. Increased prevalence of these HSs in barrier-deficient epidermis provides in vivo evidence of the regulation of the Sprr locus by these conserved sequences. Individual components of these HSs were cloned, and one was shown to have strong enhancer activity specific to conditions when the Sprr genes are coordinately upregulated. PMID:15574822

  19. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata.

    PubMed

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-07

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem-loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping-pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration.

  20. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata

    PubMed Central

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-01

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem–loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping–pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration. PMID:23166307

  1. Identification of a Conserved Non-Protein-Coding Genomic Element that Plays an Essential Role in Alphabaculovirus Pathogenesis

    PubMed Central

    Kikhno, Irina

    2014-01-01

    Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153

  2. Genomic Identification and Analysis of Shared Cis-regulator Elements in a Developmentally Critical homeobox Cluster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chris Amemiya

    2003-04-01

    The goals of this project were to isolate, characterize, and sequence the Dlx3/Dlx7 bigene cluster from twelve different species of mammals. The Dlx3 and Dlx7 genes are known to encode homeobox transcription factors involved in patterning of structures in the vertebrate jaw as well as vertebrate limbs. Genomic sequences from the respective taxa will subsequently be compared in order to identify conserved non-coding sequences that are potential cis-regulatory elements. Based on the comparisons they will fashion transgenic mouse experiments to functionally test the strength of the potential cis-regulatory elements. A goal of the project is to attempt to identify those elements that may function in coordinately regulating both Dlx3 and Dlx7 functions.

  3. Characterization of noncoding regulatory DNA in the human genome.

    PubMed

    Elkon, Ran; Agami, Reuven

    2017-08-08

    Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.

  4. Variations in the non-coding transcriptome as a driver of inter-strain divergence and physiological adaptation in bacteria

    PubMed Central

    Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R.; Voß, Björn

    2015-01-01

    In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5′UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5′UTR. Such an sRNA/mRNA structure, which we name ‘actuaton’, represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation. PMID:25902393

  5. G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch.

    PubMed

    Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman

    2016-11-02

    Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with little established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Spontaneous and engineered deletions in the 3' noncoding region of tick-borne encephalitis virus: construction of highly attenuated mutants of a flavivirus.

    PubMed

    Mandl, C W; Holzmann, H; Meixner, T; Rauscher, S; Stadler, P F; Allison, S L; Heinz, F X

    1998-03-01

    The flavivirus genome is a positive-strand RNA molecule containing a single long open reading frame flanked by noncoding regions (NCR) that mediate crucial processes of the viral life cycle. The 3' NCR of tick-borne encephalitis (TBE) virus can be divided into a variable region that is highly heterogeneous in length among strains of TBE virus and in certain cases includes an internal poly(A) tract and a 3'-terminal conserved core element that is believed to fold as a whole into a well-defined secondary structure. We have now investigated the genetic stability of the TBE virus 3' NCR and its influence on viral growth properties and virulence. We observed spontaneous deletions in the variable region during growth of TBE virus in cell culture and in mice. These deletions varied in size and location but always included the internal poly(A) element of the TBE virus 3' NCR and never extended into the conserved 3'-terminal core element. Subsequently, we constructed specific deletion mutants by using infectious cDNA clones with the entire variable region and increasing segments of the core element removed. A virus mutant lacking the entire variable region was indistinguishable from wild-type virus with respect to cell culture growth properties and virulence in the mouse model. In contrast, even small extensions of the deletion into the core element led to significant biological effects. Deletions extending to nucleotides 10826, 10847, and 10870 caused distinct attenuation in mice without measurable reduction of cell culture growth properties, which, however, were significantly restricted when the deletion was extended to nucleotide 10919. An even larger deletion (to nucleotide 10994) abolished viral viability. In spite of their high degree of attenuation, these mutants efficiently induced protective immune responses even at low inoculation doses. 
    Thus, 3'-NCR deletions represent a useful technique for achieving stable attenuation of flaviviruses that can be included in the rational design of novel flavivirus live vaccines.

  7. ICAM-1-related long non-coding RNA: promoter analysis and expression in human retinal endothelial cells.

    PubMed

    Lumsden, Amanda L; Ma, Yuefang; Ashander, Liam M; Stempel, Andrew J; Keating, Damien J; Smith, Justine R; Appukuttan, Binoy

    2018-05-09

    Regulation of intercellular adhesion molecule (ICAM)-1 in retinal endothelial cells is a promising druggable target for retinal vascular diseases. The ICAM-1-related (ICR) long non-coding RNA stabilizes ICAM-1 transcript, increasing protein expression. However, studies of ICR involvement in disease have been limited as the promoter is uncharacterized. To address this issue, we undertook a comprehensive in silico analysis of the human ICR gene promoter region. We used genomic evolutionary rate profiling to identify a 115 base pair (bp) sequence within 500 bp upstream of the transcription start site of the annotated human ICR gene that was conserved across 25 eutherian genomes. A second constrained sequence upstream of the orthologous mouse gene (68 bp; conserved across 27 Eutherian genomes including human) was also discovered. Searching these elements identified 33 matrices predictive of binding sites for transcription factors known to be responsive to a broad range of pathological stimuli, including hypoxia, and metabolic and inflammatory proteins. Five phenotype-associated single nucleotide polymorphisms (SNPs) in the immediate vicinity of these elements included four SNPs (i.e. rs2569693, rs281439, rs281440 and rs11575074) predicted to impact binding motifs of transcription factors, and thus the expression of ICR and ICAM-1 genes, with potential to influence disease susceptibility. We verified that human retinal endothelial cells expressed ICR, and observed induction of expression by tumor necrosis factor-α.

  8. GATA: A graphic alignment tool for comparative sequence analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nix, David A.; Eisen, Michael B.

    2005-01-01

    Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dot plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.

  9. Separation of the PROX1 gene from upstream conserved elements in a complex inversion/translocation patient with hypoplastic left heart

    PubMed Central

    Gill, Harinder K; Parsons, Sian R; Spalluto, Cosma; Davies, Angela F; Knorz, Victoria J; Burlinson, Clare EG; Ng, Bee Ling; Carter, Nigel P; Ogilvie, Caroline Mackie; Wilson, David I; Roberts, Roland G

    2009-01-01

    Hypoplastic left heart (HLH) occurs in at least 1 in 10 000 live births but may be more common in utero. Its causes are poorly understood but a number of affected cases are associated with chromosomal abnormalities. We set out to localize the breakpoints in a patient with sporadic HLH and a de novo translocation. Initial studies showed that the apparently simple 1q41;3q27.1 translocation was actually combined with a 4-Mb inversion, also de novo, of material within 1q41. We therefore localized all four breakpoints and found that no known transcription units were disrupted. However we present a case, based on functional considerations, synteny and position of highly conserved non-coding sequence elements, and the heterozygous Prox1+/− mouse phenotype (ventricular hypoplasia), for the involvement of dysregulation of the PROX1 gene in the aetiology of HLH in this case. Accordingly, we show that the spatial expression pattern of PROX1 in the developing human heart is consistent with a role in cardiac development. We suggest that dysregulation of PROX1 gene expression due to separation from its conserved upstream elements is likely to have caused the heart defects observed in this patient, and that PROX1 should be considered as a potential candidate gene for other cases of HLH. The relevance of another breakpoint separating the cardiac gene ESRRG from a conserved downstream element is also discussed. PMID:19471316

  10. Structural architecture of the human long non-coding RNA, steroid receptor RNA activator

    PubMed Central

    Novikova, Irina V.; Hennelly, Scott P.; Sanbonmatsu, Karissa Y.

    2012-01-01

    While functional roles of several long non-coding RNAs (lncRNAs) have been determined, the molecular mechanisms are not well understood. Here, we report the first experimentally derived secondary structure of a human lncRNA, the steroid receptor RNA activator (SRA), 0.87 kB in size. The SRA RNA is a non-coding RNA that coactivates several human sex hormone receptors and is strongly associated with breast cancer. Coding isoforms of SRA are also expressed to produce proteins, making the SRA gene a unique bifunctional system. Our experimental findings (SHAPE, in-line, DMS and RNase V1 probing) reveal that this lncRNA has a complex structural organization, consisting of four domains, with a variety of secondary structure elements. We examine the coevolution of the SRA gene at the RNA structure and protein structure levels using comparative sequence analysis across vertebrates. Rapid evolutionary stabilization of RNA structure, combined with frame-disrupting mutations in conserved regions, suggests that evolutionary pressure preserves the RNA structural core rather than its translational product. We perform similar experiments on alternatively spliced SRA isoforms to assess their structural features. PMID:22362738

  11. Three-Dimensional RNA Structure of the Major HIV-1 Packaging Signal Region

    PubMed Central

    Stephenson, James D.; Li, Haitao; Kenyon, Julia C.; Symmons, Martyn; Klenerman, Dave; Lever, Andrew M.L.

    2013-01-01

    HIV-1 genomic RNA has a noncoding 5′ region containing sequential conserved structural motifs that control many parts of the life cycle. Very limited data exist on their three-dimensional (3D) conformation and, hence, how they work structurally. To assemble a working model, we experimentally reassessed secondary structure elements of a 240-nt region and used single-molecule distances, derived from fluorescence resonance energy transfer, between defined locations in these elements as restraints to drive folding of the secondary structure into a 3D model with an estimated resolution below 10 Å. The folded 3D model satisfying the data is consensual with short nuclear-magnetic-resonance-solved regions and reveals previously unpredicted motifs, offering insight into earlier functional assays. It is a 3D representation of this entire region, with implications for RNA dimerization and protein binding during regulatory steps. The structural information of this highly conserved region of the virus has the potential to reveal promising therapeutic targets. PMID:23685210

  12. Genomic assessment of the evolution of the prion protein gene family in vertebrates.

    PubMed

    Harrison, Paul M; Khachane, Amit; Kumar, Manish

    2010-05-01

    Prion diseases are devastating neurological disorders caused by the propagation of particles containing an alternative beta-sheet-rich form of the prion protein (PrP). Genes paralogous to PrP, called Doppel and Shadoo, have been identified, that also have neuropathological relevance. To aid in the further functional characterization of PrP and its relatives, we annotated completely the PrP gene family (PrP-GF), in the genomes of 42 vertebrates, through combined strategic application of gene prediction programs and advanced remote homology detection techniques (such as HMMs, PSI-TBLASTN and pGenThreader). We have uncovered several previously undescribed paralogous genes and pseudogenes. We find that current high-quality genomic evidence indicates that the PrP relative Doppel, was likely present in the last common ancestor of present-day Tetrapoda, but was lost in the bird lineage, since its divergence from reptiles. Using the new gene annotations, we have defined the consensus of structural features that are characteristic of the PrP and Doppel structures, across diverse Tetrapoda clades. Furthermore, we describe in detail a transcribed pseudogene derived from Shadoo that is conserved across primates, and that overlaps the meiosis gene, SYCE1, thus possibly regulating its expression. In addition, we analysed the locus of PRNP/PRND for significant conservation across the genomic DNA of eleven mammals, and determined the phylogenetic penetration of non-coding exons. The genomic evidence indicates that the second PRNP non-coding exon found in even-toed ungulates and rodents, is conserved in all high-coverage genome assemblies of primates (human, chimp, orang utan and macaque), and is, at least, likely to have fallen out of use during primate speciation. 
    Furthermore, we have demonstrated that the PRNT gene (at the PRNP human locus) is conserved across at least sixteen mammals, and evolves like a long non-coding RNA, fashioned from fragments of ancient, long, interspersed elements. These annotations and evolutionary analyses will be of further use for functional characterisation of the PrP-GF, and will be updatable in a semi-automated fashion as more genomes accumulate. Copyright 2010 Elsevier Inc. All rights reserved.

  13. Social insect genomes exhibit dramatic evolution in gene composition and regulation while preserving regulatory features linked to sociality

    PubMed Central

    Simola, Daniel F.; Wissler, Lothar; Donahue, Greg; Waterhouse, Robert M.; Helmkampf, Martin; Roux, Julien; Nygaard, Sanne; Glastad, Karl M.; Hagen, Darren E.; Viljakainen, Lumi; Reese, Justin T.; Hunt, Brendan G.; Graur, Dan; Elhaik, Eran; Kriventseva, Evgenia V.; Wen, Jiayu; Parker, Brian J.; Cash, Elizabeth; Privman, Eyal; Childers, Christopher P.; Muñoz-Torres, Monica C.; Boomsma, Jacobus J.; Bornberg-Bauer, Erich; Currie, Cameron R.; Elsik, Christine G.; Suen, Garret; Goodisman, Michael A.D.; Keller, Laurent; Liebig, Jürgen; Rawls, Alan; Reinberg, Danny; Smith, Chris D.; Smith, Chris R.; Tsutsui, Neil; Wurm, Yannick; Zdobnov, Evgeny M.; Berger, Shelley L.; Gadau, Jürgen

    2013-01-01

    Genomes of eusocial insects code for dramatic examples of phenotypic plasticity and social organization. We compared the genomes of seven ants, the honeybee, and various solitary insects to examine whether eusocial lineages share distinct features of genomic organization. Each ant lineage contains ∼4000 novel genes, but only 64 of these genes are conserved among all seven ants. Many gene families have been expanded in ants, notably those involved in chemical communication (e.g., desaturases and odorant receptors). Alignment of the ant genomes revealed reduced purifying selection compared with Drosophila without significantly reduced synteny. Correspondingly, ant genomes exhibit dramatic divergence of noncoding regulatory elements; however, extant conserved regions are enriched for novel noncoding RNAs and transcription factor–binding sites. Comparison of orthologous gene promoters between eusocial and solitary species revealed significant regulatory evolution in both cis (e.g., Creb) and trans (e.g., fork head) for nearly 2000 genes, many of which exhibit phenotypic plasticity. Our results emphasize that genomic changes can occur remarkably fast in ants, because two recently diverged leaf-cutter ant species exhibit faster accumulation of species-specific genes and greater divergence in regulatory elements compared with other ants or Drosophila. Thus, while the “socio-genomes” of ants and the honeybee are broadly characterized by a pervasive pattern of divergence in gene composition and regulation, they preserve lineage-specific regulatory features linked to eusociality. We propose that changes in gene regulation played a key role in the origins of insect eusociality, whereas changes in gene composition were more relevant for lineage-specific eusocial adaptations. PMID:23636946

  14. Molecular evolution of the HoxA cluster in the three major gnathostome lineages

    PubMed Central

    Chiu, Chi-hua; Amemiya, Chris; Dewar, Ken; Kim, Chang-Bae; Ruddle, Frank H.; Wagner, Günter P.

    2002-01-01

    The duplication of Hox clusters and their maintenance in a lineage has a prominent but little understood role in chordate evolution. Here we examined how Hox cluster duplication may influence changes in cluster architecture and patterns of noncoding sequence evolution. We sequenced the entire duplicated HoxAa and HoxAb clusters of zebrafish (Danio rerio) and extended the 5′ (posterior) part of the HoxM (HoxA-like) cluster of horn shark (Heterodontus francisci) containing the hoxa11 and hoxa13 orthologs as well as intergenic and flanking noncoding sequences. The duplicated HoxA clusters in zebrafish each house considerably fewer genes and are dramatically shorter than the single HoxA clusters of human and horn shark. We compared the intergenic sequences of the HoxA clusters of human, horn shark, zebrafish (Aa, Ab), and striped bass and found extensive conservation of noncoding sequence motifs, i.e., phylogenetic footprints, between the human and horn shark, representing two of the three gnathostome lineages. These are putative cis-regulatory elements that may play a role in the regulation of the ancestral HoxA cluster. In contrast, homologous regions of the duplicated HoxAa and HoxAb clusters of zebrafish and the HoxA cluster of striped bass revealed a striking loss of conservation of these putative cis-regulatory sequences in the 3′ (anterior) segment of the cluster, where zebrafish only retains single representatives of group 1, 3, 4, and 5 (HoxAa) and group 2 (HoxAb) genes and in the 5′ part of the clusters, where zebrafish retains two copies of the group 13, 11, and 9 genes, i.e., AbdB-like genes. In analyzing patterns of cis-sequence evolution in the 5′ part of the clusters, we explicitly looked for evidence of complementary loss of conserved noncoding sequences, as predicted by the duplication-degeneration-complementation model in which genetic redundancy after gene duplication is resolved because of the fixation of complementary degenerative mutations. 
Our data did not yield evidence supporting this prediction. We conclude that changes in the pattern of cis-sequence conservation after Hox cluster duplication are more consistent with being the outcome of adaptive modification rather than passive mechanisms that erode redundancy created by the duplication event. These results support the view that genome duplications may provide a mechanism whereby master control genes undergo radical modifications conducive to major alterations in body plan. Such genomic revolutions may contribute significantly to the evolutionary process. PMID:11943847

  15. Noncoding RNAs of the Ultrabithorax Domain of the Drosophila Bithorax Complex

    PubMed Central

    Pease, Benjamin; Borges, Ana C.; Bender, Welcome

    2013-01-01

    RNA transcripts without obvious coding potential are widespread in many creatures, including the fruit fly, Drosophila melanogaster. Several noncoding RNAs have been identified within the Drosophila bithorax complex. These first appear in blastoderm stage embryos, and their expression patterns indicate that they are transcribed only from active domains of the bithorax complex. It has been suggested that these noncoding RNAs have a role in establishing active domains, perhaps by setting the state of Polycomb Response Elements. A comprehensive survey across the proximal half of the bithorax complex has now revealed nine distinct noncoding RNA transcripts, including four within the Ultrabithorax transcription unit. At the blastoderm stage, the noncoding transcripts collectively span ∼75% of the 135 kb surveyed. Recombination-mediated cassette exchange was used to invert the promoter of one of the noncoding RNAs, a 23-kb transcript from the bxd domain of the bithorax complex. The resulting animals fail to make the normal bxd noncoding RNA and show no transcription across the bxd Polycomb Response Element in early embryos. The mutant flies look normal; the regulation of the bxd domain appears unaffected. Thus, the bxd noncoding RNA has no apparent function. PMID:24077301

  16. SET1A/COMPASS and shadow enhancers in the regulation of homeotic gene expression

    PubMed Central

    Cao, Kaixiang; Collings, Clayton K.; Marshall, Stacy A.; Morgan, Marc A.; Rendleman, Emily J.; Wang, Lu; Sze, Christie C.; Sun, Tianjiao; Bartom, Elizabeth T.; Shilatifard, Ali

    2017-01-01

    The homeotic (Hox) genes are highly conserved in metazoans, where they are required for various processes in development, and misregulation of their expression is associated with human cancer. In the developing embryo, Hox genes are activated sequentially in time and space according to their genomic position within Hox gene clusters. Accumulating evidence implicates both enhancer elements and noncoding RNAs in controlling this spatiotemporal expression of Hox genes, but disentangling their relative contributions is challenging. Here, we identify two cis-regulatory elements (E1 and E2) functioning as shadow enhancers to regulate the early expression of the HoxA genes. Simultaneous deletion of these shadow enhancers in embryonic stem cells leads to impaired activation of HoxA genes upon differentiation, while knockdown of a long noncoding RNA overlapping E1 has no detectable effect on their expression. Although MLL/COMPASS (complex of proteins associated with Set1) family of histone methyltransferases is known to activate transcription of Hox genes in other contexts, we found that individual inactivation of the MLL1-4/COMPASS family members has little effect on early Hox gene activation. Instead, we demonstrate that SET1A/COMPASS is required for full transcriptional activation of multiple Hox genes but functions independently of the E1 and E2 cis-regulatory elements. Our results reveal multiple regulatory layers for Hox genes to fine-tune transcriptional programs essential for development. PMID:28487406

  17. Sost, independent of the non-coding enhancer ECR5, is required for bone mechanoadaptation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robling, Alexander G.; Kang, Kyung Shin; Bullock, Whitney A.

    Here, sclerostin (Sost) is a negative regulator of bone formation that acts upon the Wnt signaling pathway. Sost is mechanically regulated at both mRNA and protein level such that loading represses and unloading enhances Sost expression, in osteocytes and in circulation. The non-coding evolutionarily conserved enhancer ECR5 has been previously reported as a transcriptional regulatory element required for modulating Sost expression in osteocytes. Here we explored the mechanisms by which ECR5, or several other putative transcriptional enhancers regulate Sost expression, in response to mechanical stimulation. We found that in vivo ulna loading is equally osteoanabolic in wildtype and Sost–/– mice, although Sost is required for proper distribution of load-induced bone formation to regions of high strain. Using Luciferase reporters carrying the ECR5 non-coding enhancer and heterologous or homologous hSOST promoters, we found that ECR5 is mechanosensitive in vitro and that ECR5-driven Luciferase activity decreases in osteoblasts exposed to oscillatory fluid flow. Yet, ECR5–/– mice showed similar magnitude of load-induced bone formation and similar periosteal distribution of bone formation to high-strain regions compared to wildtype mice. Further, we found that in contrast to Sost–/– mice, which are resistant to disuse-induced bone loss, ECR5–/– mice lose bone upon unloading to a degree similar to wildtype control mice. ECR5 deletion did not abrogate positive effects of unloading on Sost, suggesting that additional transcriptional regulators and regulatory elements contribute to load-induced regulation of Sost.

  18. Sost, independent of the non-coding enhancer ECR5, is required for bone mechanoadaptation

    DOE PAGES

    Robling, Alexander G.; Kang, Kyung Shin; Bullock, Whitney A.; ...

    2016-09-04

    Here, sclerostin (Sost) is a negative regulator of bone formation that acts upon the Wnt signaling pathway. Sost is mechanically regulated at both mRNA and protein level such that loading represses and unloading enhances Sost expression, in osteocytes and in circulation. The non-coding evolutionarily conserved enhancer ECR5 has been previously reported as a transcriptional regulatory element required for modulating Sost expression in osteocytes. Here we explored the mechanisms by which ECR5, or several other putative transcriptional enhancers regulate Sost expression, in response to mechanical stimulation. We found that in vivo ulna loading is equally osteoanabolic in wildtype and Sost–/– mice, although Sost is required for proper distribution of load-induced bone formation to regions of high strain. Using Luciferase reporters carrying the ECR5 non-coding enhancer and heterologous or homologous hSOST promoters, we found that ECR5 is mechanosensitive in vitro and that ECR5-driven Luciferase activity decreases in osteoblasts exposed to oscillatory fluid flow. Yet, ECR5–/– mice showed similar magnitude of load-induced bone formation and similar periosteal distribution of bone formation to high-strain regions compared to wildtype mice. Further, we found that in contrast to Sost–/– mice, which are resistant to disuse-induced bone loss, ECR5–/– mice lose bone upon unloading to a degree similar to wildtype control mice. ECR5 deletion did not abrogate positive effects of unloading on Sost, suggesting that additional transcriptional regulators and regulatory elements contribute to load-induced regulation of Sost.

  19. Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term.

    PubMed

    Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard

    2014-09-01

    To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.

  20. Genome-scale deletion screening of human long non-coding RNAs using a paired-guide RNA CRISPR library

    PubMed Central

    Zhu, Shiyou; Li, Wei; Liu, Jingze; Chen, Chen-Hao; Liao, Qi; Xu, Ping; Xu, Han; Xiao, Tengfei; Cao, Zhongzheng; Peng, Jingyu; Yuan, Pengfei; Brown, Myles; Liu, Xiaole Shirley; Wei, Wensheng

    2017-01-01

    CRISPR/Cas9 screens have been widely adopted to analyse coding gene functions, but high throughput screening of non-coding elements using this method is more challenging, because indels caused by a single cut in non-coding regions are unlikely to produce a functional knockout. A high-throughput method to produce deletions of non-coding DNA is needed. Herein, we report a high throughput genomic deletion strategy to screen for functional long non-coding RNAs (lncRNAs) that is based on a lentiviral paired-guide RNA (pgRNA) library. Applying our screening method, we identified 51 lncRNAs that can positively or negatively regulate human cancer cell growth. We individually validated 9 lncRNAs using CRISPR/Cas9-mediated genomic deletion and functional rescue, CRISPR activation or inhibition, and gene expression profiling. Our high-throughput pgRNA genome deletion method should enable rapid identification of functional mammalian non-coding elements. PMID:27798563

  1. Cytoplasmic long noncoding RNAs are frequently bound to and degraded at ribosomes in human cells

    PubMed Central

    Carlevaro-Fita, Joana; Rahim, Anisa; Guigó, Roderic; Vardy, Leah A.; Johnson, Rory

    2016-01-01

    Recent footprinting studies have made the surprising observation that long noncoding RNAs (lncRNAs) physically interact with ribosomes. However, these findings remain controversial, and the overall proportion of cytoplasmic lncRNAs involved is unknown. Here we make a global, absolute estimate of the cytoplasmic and ribosome-associated population of stringently filtered lncRNAs in a human cell line using polysome profiling coupled to spike-in normalized microarray analysis. Fifty-four percent of expressed lncRNAs are detected in the cytoplasm. The majority of these (70%) have >50% of their cytoplasmic copies associated with polysomal fractions. These interactions are lost upon disruption of ribosomes by puromycin. Polysomal lncRNAs are distinguished by a number of 5′ mRNA-like features, including capping and 5′UTR length. On the other hand, nonpolysomal “free cytoplasmic” lncRNAs have more conserved promoters and a wider range of expression across cell types. Exons of polysomal lncRNAs are depleted of endogenous retroviral insertions, suggesting a role for repetitive elements in lncRNA localization. Finally, we show that blocking of ribosomal elongation results in stabilization of many associated lncRNAs. Together these findings suggest that the ribosome is the default destination for the majority of cytoplasmic long noncoding RNAs and may play a role in their degradation. PMID:27090285

  2. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  3. Defining functional DNA elements in the human genome

    PubMed Central

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  4. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison

    PubMed Central

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-01-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464

  5. The Genome of the Western Clawed Frog Xenopus tropicalis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hellsten, Uffe; Harland, Richard M.; Gilchrist, Michael J.

    2009-10-01

    The western clawed frog Xenopus tropicalis is an important model for vertebrate development that combines experimental advantages of the African clawed frog Xenopus laevis with more tractable genetics. Here we present a draft genome sequence assembly of X. tropicalis. This genome encodes over 20,000 protein-coding genes, including orthologs of at least 1,700 human disease genes. Over a million expressed sequence tags validated the annotation. More than one-third of the genome consists of transposable elements, with unusually prevalent DNA transposons. Like other tetrapods, the genome contains gene deserts enriched for conserved non-coding elements. The genome exhibits remarkable shared synteny with human and chicken over major parts of large chromosomes, broken by lineage-specific chromosome fusions and fissions, mainly in the mammalian lineage.

  6. Noncoding sequence classification based on wavelet transform analysis: part I

    NASA Astrophysics Data System (ADS)

    Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

    2017-09-01

    DNA sequences in human genome can be divided into the coding and noncoding ones. Coding sequences are those that are read during the transcription. The identification of coding sequences has been widely reported in literature due to its much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among the cells. However, noncoding sequences do not exhibit periodicities that correlate to their functions. The ENCODE (Encyclopedia of DNA elements) and Epigenomic Roadmap Project projects have cataloged the human noncoding sequences into specific functions. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.

  7. Discovery of functional non-coding conserved regions in the α-synuclein gene locus

    PubMed Central

    Sterling, Lori; Walter, Michael; Ting, Dennis; Schüle, Birgitt

    2014-01-01

    Several single nucleotide polymorphisms (SNPs) and the Rep-1 microsatellite marker of the α-synuclein (SNCA) gene have consistently been shown to be associated with Parkinson’s disease, but the functional relevance is unclear. Based on these findings we hypothesized that conserved cis-regulatory elements in the SNCA genomic region regulate expression of SNCA, and that SNPs in these regions could be functionally modulating the expression of SNCA, thus contributing to neuronal demise and predisposing to Parkinson’s disease. In a pair-wise comparison of a 206kb genomic region encompassing the SNCA gene, we revealed 34 evolutionary conserved DNA sequences between human and mouse. All elements were cloned into reporter vectors and assessed for expression modulation in dual luciferase reporter assays. We found that 12 out of 34 elements exhibited either an enhancement or reduction of the expression of the reporter gene. Three elements upstream of the SNCA gene displayed an approximately 1.5 fold (p<0.009) increase in expression. Of the intronic regions, three showed a 1.5 fold increase and two others indicated a 2 and 2.5 fold increase in expression (p<0.002). Three elements downstream of the SNCA gene showed 1.5 fold and 2.5 fold increase (p<0.0009). One element downstream of SNCA had a reduced expression of the reporter gene of 0.35 fold (p<0.0009) of normal activity. Our results demonstrate that the SNCA gene contains cis-regulatory regions that might regulate the transcription and expression of SNCA. Further studies in disease-relevant tissue types will be important to understand the functional impact of regulatory regions and specific Parkinson’s disease-associated SNPs and its function in the disease process. PMID:25566351

  8. A single nucleotide polymorphism associated with isolated cleft lip and palate, thyroid cancer and hypothyroidism alters the activity of an oral epithelium and thyroid enhancer near FOXE1

    PubMed Central

    Lidral, Andrew C.; Liu, Huan; Bullard, Steven A.; Bonde, Greg; Machida, Junichiro; Visel, Axel; Uribe, Lina M. Moreno; Li, Xiao; Amendt, Brad; Cornell, Robert A.

    2015-01-01

    Three common diseases, isolated cleft lip and cleft palate (CLP), hypothyroidism and thyroid cancer all map to the FOXE1 locus, but causative variants have yet to be identified. In patients with CLP, the frequency of coding mutations in FOXE1 fails to account for the risk attributable to this locus, suggesting that the common risk alleles reside in nearby regulatory elements. Using a combination of zebrafish and mouse transgenesis, we screened 15 conserved non-coding sequences for enhancer activity, identifying three that regulate expression in a tissue specific pattern consistent with endogenous foxe1 expression. These three, located −82.4, −67.7 and +22.6 kb from the FOXE1 start codon, are all active in the oral epithelium or branchial arches. The −67.7 and +22.6 kb elements are also active in the developing heart, and the −67.7 kb element uniquely directs expression in the developing thyroid. Within the −67.7 kb element is the SNP rs7850258 that is associated with all three diseases. Quantitative reporter assays in oral epithelial and thyroid cell lines show that the rs7850258 allele (G) associated with CLP and hypothyroidism has significantly greater enhancer activity than the allele associated with thyroid cancer (A). Moreover, consistent with predicted transcription factor binding differences, the −67.7 kb element containing rs7850258 allele G is significantly more responsive to both MYC and ARNT than allele A. By demonstrating that this common non-coding variant alters FOXE1 expression, we have identified at least in part the functional basis for the genetic risk of these seemingly disparate disorders. PMID:25652407

  9. Analysis of a new homozygous deletion in the tumor suppressor region at 3p12.3 reveals two novel intronic noncoding RNA genes.

    PubMed

    Angeloni, Debora; ter Elst, Arja; Wei, Ming Hui; van der Veen, Anneke Y; Braga, Eleonora A; Klimov, Eugene A; Timmer, Tineke; Korobeinikova, Luba; Lerman, Michael I; Buys, Charles H C M

    2006-07-01

    Homozygous deletions or loss of heterozygosity (LOH) at human chromosome band 3p12 are consistent features of lung and other malignancies, suggesting the presence of a tumor suppressor gene(s) (TSG) at this location. Only one gene has been cloned thus far from the overlapping region deleted in lung and breast cancer cell lines U2020, NCI H2198, and HCC38. It is DUTT1 (Deleted in U Twenty Twenty), also known as ROBO1, FLJ21882, and SAX3, according to HUGO. DUTT1, the human ortholog of the fly gene ROBO, has homology with NCAM proteins. Extensive analyses of DUTT1 in lung cancer have not revealed any mutations, suggesting that another gene(s) at this location could be of importance in lung cancer initiation and progression. Here, we report the discovery of a new, small, homozygous deletion in the small cell lung cancer (SCLC) cell line GLC20, nested in the overlapping, critical region. The deletion was delineated using several polymorphic markers and three overlapping P1 phage clones. Fiber-FISH experiments revealed the deletion was approximately 130 kb. Comparative genomic sequence analysis uncovered short sequence elements highly conserved among mammalian genomes and the chicken genome. The discovery of two EST clusters within the deleted region led to the isolation of two noncoding RNA (ncRNA) genes. These were subsequently found differentially expressed in various tumors when compared to their normal tissues. The ncRNA and other highly conserved sequence elements in the deleted region may represent miRNA targets of importance in cancer initiation or progression. Published 2006 Wiley-Liss, Inc.

  10. Evolution of coding and non-coding genes in HOX clusters of a marsupial.

    PubMed

    Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B

    2012-06-18

    The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predate the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.

  11. Evolution of coding and non-coding genes in HOX clusters of a marsupial

    PubMed Central

    2012-01-01

    Background: The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results: Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions: This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predate the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672

  12. Reptiles and mammals have differentially retained long conserved noncoding sequences from the amniote ancestor.

    PubMed

    Janes, D E; Chapus, C; Gondo, Y; Clayton, D F; Sinha, S; Blatti, C A; Organ, C L; Fujita, M K; Balakrishnan, C N; Edwards, S V

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation.

  13. Reptiles and Mammals Have Differentially Retained Long Conserved Noncoding Sequences from the Amniote Ancestor

    PubMed Central

    Janes, D.E.; Chapus, C.; Gondo, Y.; Clayton, D.F.; Sinha, S.; Blatti, C.A.; Organ, C.L.; Fujita, M.K.; Balakrishnan, C.N.; Edwards, S.V.

    2010-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation. PMID:21183607

  14. Comparative transgenic analysis of enhancers from the human SHOX and mouse Shox2 genomic regions.

    PubMed

    Rosin, Jessica M; Abassah-Oppong, Samuel; Cobb, John

    2013-08-01

    Disruption of presumptive enhancers downstream of the human SHOX gene (hSHOX) is a frequent cause of the zeugopodal limb defects characteristic of Léri-Weill dyschondrosteosis (LWD). The closely related mouse Shox2 gene (mShox2) is also required for limb development, but in the more proximal stylopodium. In this study, we used transgenic mice in a comparative approach to characterize enhancer sequences in the hSHOX and mShox2 genomic regions. Among conserved noncoding elements (CNEs) that function as enhancers in vertebrate genomes, those that are maintained near paralogous genes are of particular interest given their ancient origins. Therefore, we first analyzed the regulatory potential of a genomic region containing one such duplicated CNE (dCNE) downstream of mShox2 and hSHOX. We identified a strong limb enhancer directly adjacent to the mShox2 dCNE that recapitulates the expression pattern of the endogenous gene. Interestingly, this enhancer requires sequences only conserved in the mammalian lineage in order to drive strong limb expression, whereas the more deeply conserved sequences of the dCNE function as a neural enhancer. Similarly, we found that a conserved element downstream of hSHOX (CNE9) also functions as a neural enhancer in transgenic mice. However, when the CNE9 transgenic construct was enlarged to include adjacent, non-conserved sequences frequently deleted in LWD patients, the transgene drove expression in the zeugopodium of the limbs. Therefore, both hSHOX and mShox2 limb enhancers are coupled to distinct neural enhancers. This is the first report demonstrating the activity of cis-regulatory elements from the hSHOX and mShox2 genomic regions in mammalian embryos.

  15. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    PubMed

    Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

    2016-04-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

  16. The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

    PubMed Central

    Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

    2016-01-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095

  17. Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence.

    PubMed

    Benko, Sabina; Fantes, Judy A; Amiel, Jeanne; Kleinjan, Dirk-Jan; Thomas, Sophie; Ramsay, Jacqueline; Jamshidi, Negar; Essafi, Abdelkader; Heaney, Simon; Gordon, Christopher T; McBride, David; Golzio, Christelle; Fisher, Malcolm; Perry, Paul; Abadie, Véronique; Ayuso, Carmen; Holder-Espinasse, Muriel; Kilpatrick, Nicky; Lees, Melissa M; Picard, Arnaud; Temple, I Karen; Thomas, Paul; Vazquez, Marie-Paule; Vekemans, Michel; Roest Crollius, Hugues; Hastie, Nicholas D; Munnich, Arnold; Etchevers, Heather C; Pelet, Anna; Farlie, Peter G; Fitzpatrick, David R; Lyonnet, Stanislas

    2009-03-01

    Pierre Robin sequence (PRS) is an important subgroup of cleft palate. We report several lines of evidence for the existence of a 17q24 locus underlying PRS, including linkage analysis results, a clustering of translocation breakpoints 1.06-1.23 Mb upstream of SOX9, and microdeletions both approximately 1.5 Mb centromeric and approximately 1.5 Mb telomeric of SOX9. We have also identified a heterozygous point mutation in an evolutionarily conserved region of DNA with in vitro and in vivo features of a developmental enhancer. This enhancer is centromeric to the breakpoint cluster and maps within one of the microdeletion regions. The mutation abrogates the in vitro enhancer function and alters binding of the transcription factor MSX1 as compared to the wild-type sequence. In the developing mouse mandible, the 3-Mb region bounded by the microdeletions shows a regionally specific chromatin decompaction in cells expressing Sox9. Some cases of PRS may thus result from developmental misexpression of SOX9 due to disruption of very-long-range cis-regulatory elements.

  18. Comprehensive analysis of single molecule sequencing-derived complete genome and whole transcriptome of Hyposidra talaca nuclear polyhedrosis virus.

    PubMed

    Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar

    2018-06-12

    We sequenced the Hyposidra talaca NPV (HytaNPV) double stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculovirus viruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV that infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motif consistent with the temporal expression of the genes observed in the RNA-seq data.

  19. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    PubMed

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
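
    The UCE criterion mentioned in this abstract (100% identity over 100 bp) amounts to looking for long uninterrupted runs of identical columns in a pairwise alignment. The following is a minimal sketch under that assumption; the function name and the half-open coordinate convention are illustrative choices, and the abstract does not say how the authors actually performed the search.

```python
def exact_match_runs(seq_a: str, seq_b: str, min_len: int = 100):
    """Return (start, end) half-open column ranges where two aligned
    sequences are identical and gap-free for at least min_len columns.

    Assumption: seq_a and seq_b are rows of the same pairwise alignment;
    if their lengths differ, only the shared prefix is scanned.
    """
    runs = []
    start = None  # column where the current identical run began
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        same = a == b and a != "-"
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    n = min(len(seq_a), len(seq_b))
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs
```

    With the default threshold, a single mismatch or gap splits a run, so a 199-column block with one substitution in the middle yields no candidate even though its overall identity is above 99%.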

  20. Identification and Characterization of Small Noncoding RNAs in Genome Sequences of the Edible Fungus Pleurotus ostreatus

    PubMed Central

    Zhao, Mengran; Hsiang, Tom; Feng, Xiaoxing

    2016-01-01

    Noncoding RNAs (ncRNAs) have been identified in many fungi. However, no genome-scale identification of ncRNAs has been inventoried for basidiomycetes. In this research, we detected 254 small noncoding RNAs (sncRNAs) in a genome assembly of an isolate (CCEF00389) of Pleurotus ostreatus, which is a widely cultivated edible basidiomycetous fungus worldwide. The identified sncRNAs include snRNAs, snoRNAs, tRNAs, and miRNAs. SnRNA U1 was not found in CCEF00389 genome assembly and some other basidiomycetous genomes by BLASTn. This implies that if snRNA U1 of basidiomycetes exists, it has a sequence that varies significantly from other organisms. By analyzing the distribution of sncRNA loci, we found that snRNAs and most tRNAs (88.6%) were located in pseudo-UTR regions, while miRNAs are commonly found in introns. To analyze the evolutionary conservation of the sncRNAs in P. ostreatus, we aligned all 254 sncRNAs to the genome assemblies of some other Agaricomycotina fungi. The results suggest that most sncRNAs (77.56%) were highly conserved in P. ostreatus, and 20% were conserved in Agaricomycotina fungi. These findings indicate that most sncRNAs of P. ostreatus were not conserved across Agaricomycotina fungi. PMID:27703969

  1. A liver enhancer in the fibrinogen gene cluster.

    PubMed

    Fort, Alexandre; Fish, Richard J; Attanasio, Catia; Dosch, Roland; Visel, Axel; Neerman-Arbez, Marguerite

    2011-01-06

    The plasma concentration of fibrinogen varies in the healthy human population between 1.5 and 3.5 g/L. Understanding the basis of this variability has clinical importance because elevated fibrinogen levels are associated with increased cardiovascular disease risk. To identify novel regulatory elements involved in the control of fibrinogen expression, we used sequence conservation and in silico-predicted regulatory potential to select 14 conserved noncoding sequences (CNCs) within the conserved block of synteny containing the fibrinogen locus. The regulatory potential of each CNC was tested in vitro using a luciferase reporter gene assay in fibrinogen-expressing hepatoma cell lines (HuH7 and HepG2). 4 potential enhancers were tested for their ability to direct enhanced green fluorescent protein expression in zebrafish embryos. CNC12, a sequence equidistant from the human fibrinogen alpha and beta chain genes, activates strong liver enhanced green fluorescent protein expression in injected embryos and their transgenic progeny. A transgenic assay in embryonic day 14.5 mouse embryos confirmed the ability of CNC12 to activate transcription in the liver. While additional experiments are necessary to prove the role of CNC12 in the regulation of fibrinogen, our study reveals a novel regulatory element in the fibrinogen locus that is active in the liver and may contribute to variable fibrinogen expression in humans.

  2. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

    DOE PAGES

    Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...

    2016-09-20

    There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.

  3. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.

    There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.

  4. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.

  5. A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

    PubMed Central

    Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

    2006-01-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  6. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  7. The Prx1 limb enhancers: targeted gene expression in developing zebrafish pectoral fins.

    PubMed

    Hernández-Vega, Amayra; Minguillón, Carolina

    2011-08-01

    Limbs represent an excellent model to study the induction, growth, and patterning of several organs. A breakthrough to study gene function in various tissues has been the characterization of regulatory elements that allow tissue-specific interference of gene function. The mouse Prx1 promoter has been used to generate limb-specific mutants and overexpress genes in tetrapod limbs. Although zebrafish possess advantages that favor their use to study limb morphogenesis, there is no driver described suitable for specifically interfering with gene function in developing fins. We report the generation of zebrafish lines that express enhanced green fluorescent protein (EGFP) driven by the mouse Prx1 enhancer in developing pectoral fins. We also describe the expression pattern of the zebrafish prrx1 genes and identify three conserved non-coding elements (CNEs) that we use to generate fin-specific EGFP reporter lines. Finally, we show that the mouse and zebrafish regulatory elements may be used to modify gene function in pectoral fins. Copyright © 2011 Wiley-Liss, Inc.

  8. Emergence of the Noncoding Cancer Genome: A Target of Genetic and Epigenetic Alterations.

    PubMed

    Zhou, Stanley; Treloar, Aislinn E; Lupien, Mathieu

    2016-11-01

    The emergence of whole-genome annotation approaches is paving the way for the comprehensive annotation of the human genome across diverse cell and tissue types exposed to various environmental conditions. This has already unmasked the positions of thousands of functional cis-regulatory elements integral to transcriptional regulation, such as enhancers, promoters, and anchors of chromatin interactions that populate the noncoding genome. Recent studies have shown that cis-regulatory elements are commonly the targets of genetic and epigenetic alterations associated with aberrant gene expression in cancer. Here, we review these findings to showcase the contribution of the noncoding genome and its alteration in the development and progression of cancer. We also highlight the opportunities to translate the biological characterization of genetic and epigenetic alterations in the noncoding cancer genome into novel approaches to treat or monitor disease. The majority of genetic and epigenetic alterations accumulate in the noncoding genome throughout oncogenesis. Discriminating driver from passenger events is a challenge that holds great promise to improve our understanding of the etiology of different cancer types. Advancing our understanding of the noncoding cancer genome may thus identify new therapeutic opportunities and accelerate our capacity to find improved biomarkers to monitor various stages of cancer development. Cancer Discov; 6(11); 1215-29. ©2016 AACR. ©2016 American Association for Cancer Research.

  9. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. Copyright 2009 Elsevier Inc. All rights reserved.

  10. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates

    PubMed Central

    Kikuta, Hiroshi; Laplante, Mary; Navratilova, Pavla; Komisarczuk, Anna Z.; Engström, Pär G.; Fredman, David; Akalin, Altuna; Caccamo, Mario; Sealy, Ian; Howe, Kerstin; Ghislain, Julien; Pezeron, Guillaume; Mourrain, Philippe; Ellingsen, Staale; Oates, Andrew C.; Thisse, Christine; Thisse, Bernard; Foucher, Isabelle; Adolf, Birgit; Geling, Andrea; Lenhard, Boris; Becker, Thomas S.

    2007-01-01

    We report evidence for a mechanism for the maintenance of long-range conserved synteny across vertebrate genomes. We found the largest mammal-teleost conserved chromosomal segments to be spanned by highly conserved noncoding elements (HCNEs), their developmental regulatory target genes, and phylogenetically and functionally unrelated “bystander” genes. Bystander genes are not specifically under the control of the regulatory elements that drive the target genes and are expressed in patterns that are different from those of the target genes. Reporter insertions distal to zebrafish developmental regulatory genes pax6.1/2, rx3, id1, and fgf8 and miRNA genes mirn9-1 and mirn9-5 recapitulate the expression patterns of these genes even if located inside or beyond bystander genes, suggesting that the regulatory domain of a developmental regulatory gene can extend into and beyond adjacent transcriptional units. We termed these chromosomal segments genomic regulatory blocks (GRBs). After whole genome duplication in teleosts, GRBs, including HCNEs and target genes, were often maintained in both copies, while bystander genes were typically lost from one GRB, strongly suggesting that evolutionary pressure acts to keep the single-copy GRBs of higher vertebrates intact. We show that loss of bystander genes and other mutational events suffered by duplicated GRBs in teleost genomes permits target gene identification and HCNE/target gene assignment. These findings explain the absence of evolutionary breakpoints from large vertebrate chromosomal segments and will aid in the recognition of position effect mutations within human GRBs. PMID:17387144

  11. The Histone Modification H3K27me3 Is Retained after Gene Duplication and Correlates with Conserved Noncoding Sequences in Arabidopsis

    PubMed Central

    Berke, Lidija; Snel, Berend

    2014-01-01

    The histone modification H3K27me3 is involved in repression of transcription and plays a crucial role in developmental transitions in both animals and plants. It is deposited by PRC2 (Polycomb repressive complex 2), a conserved protein complex. In Arabidopsis thaliana, H3K27me3 is found at 15% of all genes. These tend to encode transcription factors and other regulators important for development. However, it is not known how PRC2 is recruited to target loci nor how this set of target genes arose during Arabidopsis evolution. To resolve the latter, we integrated A. thaliana gene families with five independent genome-wide H3K27me3 data sets. Gene families were either significantly enriched or depleted of H3K27me3, showing a strong impact of shared ancestry to H3K27me3 distribution. To quantify this, we performed ancestral state reconstruction of H3K27me3 on phylogenetic trees of gene families. The set of H3K27me3-marked genes changed less than expected by chance, suggesting that H3K27me3 was retained after gene duplication. This retention suggests that the PRC2-recruiting signal could be encoded in the DNA and also conserved among certain duplicated genes. Indeed, H3K27me3-marked genes were overrepresented among paralogs sharing conserved noncoding sequences (CNSs) that are enriched with transcription factor binding sites. The association of upstream CNSs with H3K27me3-marked genes represents the first genome-wide connection between H3K27me3 and potential regulatory elements in plants. Thus, we propose that CNSs likely function as part of the PRC2 recruitment in plants. PMID:24567304

  12. Translational efficiency of poliovirus mRNA: mapping inhibitory cis-acting elements within the 5' noncoding region.

    PubMed Central

    Pelletier, J; Kaplan, G; Racaniello, V R; Sonenberg, N

    1988-01-01

    Poliovirus mRNA contains a long 5' noncoding region of about 750 nucleotides (the exact number varies among the three virus serotypes), which contains several AUG codons upstream of the major initiator AUG. Unlike most eucaryotic mRNAs, poliovirus does not contain a m7GpppX (where X is any nucleotide) cap structure at its 5' end and is translated by a cap-independent mechanism. To study the manner by which poliovirus mRNA is expressed, we examined the translational efficiencies of a series of deletion mutants within the 5' noncoding region of the mRNA. In this paper we report striking translation system-specific differences in the ability of the altered mRNAs to be translated. The results suggest the existence of an inhibitory cis-acting element(s) within the 5' noncoding region of poliovirus (between nucleotides 70 and 381) which restricts mRNA translation in reticulocyte lysate, wheat germ extract, and Xenopus oocytes, but not in HeLa cell extracts. In addition, we show that HeLa cell extracts contain a trans-acting factor(s) that overcomes this restriction. PMID:2836606

  13. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes

    PubMed Central

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

    2014-01-01

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and the vertebrate evolution. PMID:25523484

  14. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes.

    PubMed

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

    2014-12-19

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and the vertebrate evolution.

  15. Identification of Regulatory Elements That Control PPARγ Expression in Adipocyte Progenitors

    PubMed Central

    Chou, Wen-Ling; Galmozzi, Andrea; Partida, David; Kwan, Kevin; Yeung, Hui; Su, Andrew I.; Saez, Enrique

    2013-01-01

    Adipose tissue renewal and obesity-driven expansion of fat cell number are dependent on proliferation and differentiation of adipose progenitors that reside in the vasculature that develops in coordination with adipose depots. The transcriptional events that regulate commitment of progenitors to the adipose lineage are poorly understood. Because expression of the nuclear receptor PPARγ defines the adipose lineage, isolation of elements that control PPARγ expression in adipose precursors may lead to discovery of transcriptional regulators of early adipocyte determination. Here, we describe the identification and validation in transgenic mice of 5 highly conserved non-coding sequences from the PPARγ locus that can drive expression of a reporter gene in a manner that recapitulates the tissue-specific pattern of PPARγ expression. Surprisingly, these 5 elements appear to control PPARγ expression in adipocyte precursors that are associated with the vasculature of adipose depots, but not in mature adipocytes. Characterization of these five PPARγ regulatory sequences may enable isolation of the transcription factors that bind these cis elements and provide insight into the molecular regulation of adipose tissue expansion in normal and pathological states. PMID:24009687

  16. Highly tissue specific expression of Sphinx supports its male courtship related role in Drosophila melanogaster.

    PubMed

    Chen, Ying; Dai, Hongzheng; Chen, Sidi; Zhang, Luoying; Long, Manyuan

    2011-04-26

    Sphinx is a lineage-specific non-coding RNA gene involved in regulating courtship behavior in Drosophila melanogaster. The 5' flanking region of the gene is conserved across Drosophila species, with the proximal 300 bp being conserved out to D. virilis and a further 600 bp region being conserved amongst the melanogaster subgroup (D. melanogaster, D. simulans, D. sechellia, D. yakuba, and D. erecta). Using a green fluorescence protein transformation system, we demonstrated that a 253 bp region of the highly conserved segment was sufficient to drive sphinx expression in male accessory gland. GFP signals were also observed in brain, wing hairs and leg bristles. An additional ∼800 bp upstream region was able to enhance expression specifically in proboscis, suggesting the existence of enhancer elements. Using anti-GFP staining, we identified putative sphinx expression signal in the brain antennal lobe and inner antennocerebral tract, suggesting that sphinx might be involved in olfactory neuron mediated regulation of male courtship behavior. Whole genome expression profiling of the sphinx knockout mutation identified significant up-regulated gene categories related to accessory gland protein function and odor perception, suggesting sphinx might be a negative regulator of its target genes.

  17. Highly Tissue Specific Expression of Sphinx Supports Its Male Courtship Related Role in Drosophila melanogaster

    PubMed Central

    Chen, Sidi; Zhang, Luoying; Long, Manyuan

    2011-01-01

    Sphinx is a lineage-specific non-coding RNA gene involved in regulating courtship behavior in Drosophila melanogaster. The 5′ flanking region of the gene is conserved across Drosophila species, with the proximal 300 bp being conserved out to D. virilis and a further 600 bp region being conserved amongst the melanogaster subgroup (D. melanogaster, D. simulans, D. sechellia, D. yakuba, and D. erecta). Using a green fluorescence protein transformation system, we demonstrated that a 253 bp region of the highly conserved segment was sufficient to drive sphinx expression in male accessory gland. GFP signals were also observed in brain, wing hairs and leg bristles. An additional ∼800 bp upstream region was able to enhance expression specifically in proboscis, suggesting the existence of enhancer elements. Using anti-GFP staining, we identified putative sphinx expression signal in the brain antennal lobe and inner antennocerebral tract, suggesting that sphinx might be involved in olfactory neuron mediated regulation of male courtship behavior. Whole genome expression profiling of the sphinx knockout mutation identified significant up-regulated gene categories related to accessory gland protein function and odor perception, suggesting sphinx might be a negative regulator of its target genes. PMID:21541324

  18. DIANA-LncBase v2: indexing microRNA targets on non-coding transcripts

    PubMed Central

    Paraskevopoulou, Maria D.; Vlachos, Ioannis S.; Karagkouni, Dimitra; Georgakilas, Georgios; Kanellos, Ilias; Vergoulis, Thanasis; Zagganas, Konstantinos; Tsanakas, Panayiotis; Floros, Evangelos; Dalamagas, Theodore; Hatzigeorgiou, Artemis G.

    2016-01-01

    microRNAs (miRNAs) are short non-coding RNAs (ncRNAs) that act as post-transcriptional regulators of coding gene expression. Long non-coding RNAs (lncRNAs) have been recently reported to interact with miRNAs. The sponge-like function of lncRNAs introduces an extra layer of complexity in the miRNA interactome. DIANA-LncBase v1 provided a database of experimentally supported and in silico predicted miRNA Recognition Elements (MREs) on lncRNAs. The second version of LncBase (www.microrna.gr/LncBase) presents an extensive collection of miRNA:lncRNA interactions. The significantly enhanced database includes more than 70 000 low and high-throughput, (in)direct miRNA:lncRNA experimentally supported interactions, derived from manually curated publications and the analysis of 153 AGO CLIP-Seq libraries. The new experimental module presents a 14-fold increase compared to the previous release. LncBase v2 hosts in silico predicted miRNA targets on lncRNAs, identified with the DIANA-microT algorithm. The relevant module provides millions of predicted miRNA binding sites, accompanied with detailed metadata and MRE conservation metrics. LncBase v2 caters information regarding cell type specific miRNA:lncRNA regulation and enables users to easily identify interactions in 66 different cell types, spanning 36 tissues for human and mouse. Database entries are also supported by accurate lncRNA expression information, derived from the analysis of more than 6 billion RNA-Seq reads. PMID:26612864

  19. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

    PubMed

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-07-20

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Detection of non-coding RNA in bacteria and archaea using the DETR'PROK Galaxy pipeline.

    PubMed

    Toffano-Nioche, Claire; Luo, Yufei; Kuchly, Claire; Wallon, Claire; Steinbach, Delphine; Zytnicki, Matthias; Jacq, Annick; Gautheret, Daniel

    2013-09-01

    RNA-seq experiments are now routinely used for the large scale sequencing of transcripts. In bacteria or archaea, such deep sequencing experiments typically produce 10-50 million fragments that cover most of the genome, including intergenic regions. In this context, the precise delineation of the non-coding elements is challenging. Non-coding elements include untranslated regions (UTRs) of mRNAs, independent small RNA genes (sRNAs) and transcripts produced from the antisense strand of genes (asRNA). Here we present a computational pipeline (DETR'PROK: detection of ncRNAs in prokaryotes) based on the Galaxy framework that takes as input a mapping of deep sequencing reads and performs successive steps of clustering, comparison with existing annotation and identification of transcribed non-coding fragments classified into putative 5' UTRs, sRNAs and asRNAs. We provide a step-by-step description of the protocol using real-life example data sets from Vibrio splendidus and Escherichia coli. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
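
    The successive steps named in this abstract (clustering of mapped reads, comparison with existing annotation, classification into putative 5' UTRs, sRNAs and asRNAs) can be sketched as follows. This is a toy illustration and not the DETR'PROK code: the `Interval` type, the function names, the `max_gap` and `utr_window` parameters, and the decision rules are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    start: int   # 1-based, inclusive
    end: int     # inclusive
    strand: str  # "+" or "-"


def cluster_reads(reads: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge same-strand reads that overlap or are at most max_gap bases apart."""
    clusters: List[Interval] = []
    for strand in ("+", "-"):
        current = None
        for read in sorted(r for r in reads if r.strand == strand):
            if current is not None and read.start <= current.end + max_gap + 1:
                current = Interval(current.start, max(current.end, read.end), strand)
            else:
                if current is not None:
                    clusters.append(current)
                current = read
        if current is not None:
            clusters.append(current)
    return clusters


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one position (strand ignored)."""
    return a.start <= b.end and b.start <= a.end


def classify(cluster: Interval, genes: List[Interval], utr_window: int = 50) -> str:
    """Label a cluster by comparing it with annotated genes (assumed rules)."""
    same = [g for g in genes if g.strand == cluster.strand]
    anti = [g for g in genes if g.strand != cluster.strand]
    if any(overlaps(cluster, g) for g in same):
        return "annotated"  # already explained by a same-strand gene
    if any(overlaps(cluster, g) for g in anti):
        return "asRNA"      # transcribed opposite an annotated gene
    for g in same:
        # Distance from the cluster to the 5' end of a same-strand gene.
        if cluster.strand == "+":
            gap = g.start - cluster.end - 1
        else:
            gap = cluster.start - g.end - 1
        if 0 <= gap <= utr_window:
            return "5'UTR"  # lies just upstream of a gene start
    return "sRNA"           # intergenic and not attached to any gene
```

    For example, a "+" strand cluster ending a few bases before an annotated "+" strand gene would be labeled "5'UTR" under these rules, while an isolated intergenic cluster would be labeled "sRNA".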

  1. Control of seed dormancy in Arabidopsis by a cis-acting noncoding antisense transcript.

    PubMed

    Fedak, Halina; Palusinska, Malgorzata; Krzyczmonik, Katarzyna; Brzezniak, Lien; Yatusevich, Ruslan; Pietras, Zbigniew; Kaczanowski, Szymon; Swiezewski, Szymon

    2016-11-29

    Seed dormancy is one of the most crucial process transitions in a plant's life cycle. Its timing is tightly controlled by the expression level of the Delay of Germination 1 gene (DOG1). DOG1 is the major quantitative trait locus for seed dormancy in Arabidopsis and has been shown to control dormancy in many other plant species. This is reflected by the evolutionary conservation of the functional short alternatively polyadenylated form of the DOG1 mRNA. Notably, the 3' region of DOG1, including the last exon that is not included in this transcript isoform, shows a high level of conservation at the DNA level, but the encoded polypeptide is poorly conserved. Here, we demonstrate that this region of DOG1 contains a promoter for the transcription of a noncoding antisense RNA, asDOG1, that is 5' capped, polyadenylated, and relatively stable. This promoter is autonomous and asDOG1 has an expression profile that is different from known DOG1 transcripts. Using several approaches we show that asDOG1 strongly suppresses DOG1 expression during seed maturation in cis, but is unable to do so in trans. Therefore, the negative regulation of seed dormancy by asDOG1 in cis results in allele-specific suppression of DOG1 expression and promotes germination. Given the evolutionary conservation of the asDOG1 promoter, we propose that this cis-constrained noncoding RNA-mediated mechanism limiting the duration of seed dormancy functions across the Brassicaceae.

  2. Functional noncoding sequences derived from SINEs in the mammalian genome.

    PubMed

    Nishihara, Hidenori; Smit, Arian F A; Okada, Norihiro

    2006-07-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

  3. Cap-independent translation of poliovirus mRNA is conferred by sequence elements within the 5' noncoding region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pelletier, J.; Kaplan, G.; Racaniello, V.R.

    1988-03-01

    Poliovirus polysomal RNA is naturally uncapped, and as such, its translation must bypass any 5' cap-dependent ribosome recognition event. To elucidate the manner by which poliovirus mRNA is translated, the authors determined the translational efficiencies of a series of deletion mutants within the 5' noncoding region of the mRNA. They found striking differences in translatability among the altered mRNAs when assayed in mock-infected and poliovirus-infected HeLa cell extracts. The results identify a functional cis-acting element within the 5' noncoding region of the poliovirus mRNA which enables it to translate in a cap-independent fashion. The major determinant of this element maps between nucleotides 320 and 631 of the 5' end of the poliovirus mRNA. They also show that this region (320 to 631), when fused to a heterologous mRNA, can function in cis to render the mRNA cap independent in translation.

  4. Regulatory activities of transposable elements: from conflicts to benefits

    PubMed Central

    Chuong, Edward B.; Elde, Nels C.; Feschotte, Cédric

    2017-01-01

    Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor binding sites and non-coding RNAs. A wealth of recent studies reinvigorates the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalyzed the evolution of gene regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic impact of regulatory activities encoded by TEs in health and disease. PMID:27867194

  5. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  6. A 3' UTR-Derived Small RNA Provides the Regulatory Noncoding Arm of the Inner Membrane Stress Response.

    PubMed

    Chao, Yanjie; Vogel, Jörg

    2016-02-04

    Small RNAs (sRNAs) from conserved noncoding genes are crucial regulators in bacterial signaling pathways but have remained elusive in the Cpx response to inner membrane stress. Here we report that an alternative biogenesis pathway releasing the conserved mRNA 3' UTR of stress chaperone CpxP as an ∼60-nt sRNA provides the noncoding arm of the Cpx response. This so-called CpxQ sRNA, generated by general mRNA decay through RNase E, acts as an Hfq-dependent repressor of multiple mRNAs encoding extracytoplasmic proteins. Both CpxQ and the Cpx pathway are required for cell survival under conditions of dissipation of membrane potential. Our discovery of CpxQ illustrates how the conversion of a transcribed 3' UTR into an sRNA doubles the output of a single mRNA to produce two factors with spatially segregated functions during inner membrane stress: a chaperone that targets problematic proteins in the periplasm and a regulatory RNA that dampens their synthesis in the cytosol. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. The Hippo pathway in hepatocellular carcinoma: Non-coding RNAs in action.

    PubMed

    Shi, Xuan; Zhu, Hai-Rong; Liu, Tao-Tao; Shen, Xi-Zhong; Zhu, Ji-Min

    2017-08-01

    Hepatocellular carcinoma (HCC) is the sixth most common cancer and the third leading cause of cancer-related death worldwide. However, current strategies for curing HCC are far from satisfactory. The Hippo pathway is an evolutionarily conserved tumor suppressive pathway that plays crucial roles in organ size control and tissue homeostasis. Its dysregulation is commonly observed in various types of cancer including HCC. Recently, the prominent role of non-coding RNAs in the Hippo pathway during normal development and neoplastic progression has also been emerging in the liver. Thus, further investigation into the regulatory network between non-coding RNAs and the Hippo pathway and their connections with HCC may provide new therapeutic avenues towards developing an effective preventative or perhaps curative treatment for HCC. Herein we summarize the role of non-coding RNAs in the Hippo pathway, with an emphasis on their contribution to carcinogenesis, diagnosis, treatment and prognosis of HCC. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Comparative evolutionary genomics of the HADH2 gene encoding Aβ-binding alcohol dehydrogenase/17β-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10)

    PubMed Central

    Marques, Alexandra T; Antunes, Agostinho; Fernandes, Pedro A; Ramos, Maria J

    2006-01-01

    Background: The Aβ-binding alcohol dehydrogenase/17β-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10) is an enzyme involved in pivotal metabolic processes and in the mitochondrial dysfunction seen in Alzheimer's disease. Here we use comparative genomic analyses to study the evolution of the HADH2 gene encoding ABAD/HSD10 across several eukaryotic species. Results: Both vertebrate and nematode HADH2 genes showed a six-exon/five-intron organization while those of the insects had a reduced and varied number of exons (two to three). Eutherian mammal HADH2 genes revealed some highly conserved noncoding regions, which may indicate the presence of functional elements, namely in the upstream region about 1 kb of the transcription start site and in the first part of intron 1. These regions were also conserved between Tetraodon and Fugu fishes. We identified a conserved alternative splicing event between human and dog, which have a nine amino acid deletion, causing the removal of the strand βF. This strand is one of the seven strands that compose the core β-sheet of the Rossmann fold dinucleotide-binding motif characteristic of the short chain dehydrogenase/reductase (SDR) family members. However, the fact that the substrate binding cleft residues are retained and the existence of a shared variant between human and dog suggest that it might be functional. Molecular adaptation analyses across eutherian mammal orthologues revealed the existence of sites under positive selection, some of which are localized in the substrate-binding cleft and in the insertion 1 region on loop D (an important region for the Aβ-binding to the enzyme). Interestingly, a higher than expected number of nonsynonymous substitutions were observed between human/chimpanzee and orangutan, with six out of the seven amino acid replacements being under molecular adaptation (including three in loop D and one in the substrate binding loop). Conclusion: Our study revealed that HADH2 genes maintained a reasonably conserved organization across a large evolutionary distance. The conserved noncoding regions identified among mammals and between pufferfishes, the evidence of an alternative splicing variant conserved between human and dog, and the detection of positive selection across eutherian mammals, may be of importance for further research on ABAD/HSD10 function and its implication in Alzheimer's disease. PMID:16899120

  9. DIANA-LncBase v2: indexing microRNA targets on non-coding transcripts.

    PubMed

    Paraskevopoulou, Maria D; Vlachos, Ioannis S; Karagkouni, Dimitra; Georgakilas, Georgios; Kanellos, Ilias; Vergoulis, Thanasis; Zagganas, Konstantinos; Tsanakas, Panayiotis; Floros, Evangelos; Dalamagas, Theodore; Hatzigeorgiou, Artemis G

    2016-01-04

    microRNAs (miRNAs) are short non-coding RNAs (ncRNAs) that act as post-transcriptional regulators of coding gene expression. Long non-coding RNAs (lncRNAs) have been recently reported to interact with miRNAs. The sponge-like function of lncRNAs introduces an extra layer of complexity in the miRNA interactome. DIANA-LncBase v1 provided a database of experimentally supported and in silico predicted miRNA Recognition Elements (MREs) on lncRNAs. The second version of LncBase (www.microrna.gr/LncBase) presents an extensive collection of miRNA:lncRNA interactions. The significantly enhanced database includes more than 70 000 low and high-throughput, (in)direct miRNA:lncRNA experimentally supported interactions, derived from manually curated publications and the analysis of 153 AGO CLIP-Seq libraries. The new experimental module presents a 14-fold increase compared to the previous release. LncBase v2 hosts in silico predicted miRNA targets on lncRNAs, identified with the DIANA-microT algorithm. The relevant module provides millions of predicted miRNA binding sites, accompanied with detailed metadata and MRE conservation metrics. LncBase v2 caters information regarding cell type specific miRNA:lncRNA regulation and enables users to easily identify interactions in 66 different cell types, spanning 36 tissues for human and mouse. Database entries are also supported by accurate lncRNA expression information, derived from the analysis of more than 6 billion RNA-Seq reads. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Arabidopsis intragenomic conserved noncoding sequence

    PubMed Central

    Thomas, Brian C.; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Freeling, Michael

    2007-01-01

    After the most recent tetraploidy in the Arabidopsis lineage, most gene pairs lost one, but not both, of their duplicates. We manually inspected the 3,179 retained gene pairs and their surrounding gene space still present in the genome using a custom-made viewer application. The display of these pairs allowed us to define intragenic conserved noncoding sequences (CNSs), identify exon annotation errors, and discover potentially new genes. Using a strict algorithm to sort high-scoring pair sequences from the bl2seq data, we created a database of 14,944 intragenomic Arabidopsis CNSs. The mean CNS length is 31 bp, ranging from 15 to 285 bp. There are ≈1.7 CNSs associated with a typical gene, and Arabidopsis CNSs are found in all areas around exons, most frequently in the 5′ upstream region. Gene ontology classifications related to transcription, regulation, or “response to …” external or endogenous stimuli, especially hormones, tend to be significantly overrepresented among genes containing a large number of CNSs, whereas protein localization, transport, and metabolism are common among genes with no CNSs. There is a 1.5% overlap between these CNSs and the 218,982 putative RNAs in the Arabidopsis Small RNA Project database, allowing for two mismatches. These CNSs provide a unique set of noncoding sequences enriched for function. CNS function is implied by evolutionary conservation and independently supported because CNS-richness predicts regulatory gene ontology categories. PMID:17301222

  11. The development of non-coding RNA ontology.

    PubMed

    Huang, Jingshan; Eilbeck, Karen; Smith, Barry; Blake, Judith A; Dou, Dejing; Huang, Weili; Natale, Darren A; Ruttenberg, Alan; Huan, Jun; Zimmermann, Michael T; Jiang, Guoqian; Lin, Yu; Wu, Bin; Strachan, Harrison J; de Silva, Nisansa; Kasukurthi, Mohan Vamsi; Jha, Vikash Kumar; He, Yongqun; Zhang, Shaojie; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming

    2016-01-01

    Identification of non-coding RNAs (ncRNAs) has been significantly improved over the past decade. On the other hand, semantic annotation of ncRNA data is facing critical challenges due to the lack of a comprehensive ontology to serve as common data elements and data exchange standards in the field. We developed the Non-Coding RNA Ontology (NCRO) to handle this situation. By providing a formally defined ncRNA controlled vocabulary, the NCRO aims to fill a specific and highly needed niche in semantic annotation of large amounts of ncRNA biological and clinical data.

  12. Hundreds of conserved non-coding genomic regions are independently lost in mammals

    PubMed Central

    Hiller, Michael; Schaar, Bruce T.; Bejerano, Gill

    2012-01-01

    Conserved non-protein-coding DNA elements (CNEs) often encode cis-regulatory elements and are rarely lost during evolution. However, CNE losses that do occur can be associated with phenotypic changes, exemplified by pelvic spine loss in sticklebacks. Using a computational strategy to detect complete loss of CNEs in mammalian genomes while strictly controlling for artifacts, we find >600 CNEs that are independently lost in at least two mammalian lineages, including a spinal cord enhancer near GDF11. We observed several genomic regions where multiple independent CNE loss events happened; the most extreme is the DIAPH2 locus. We show that CNE losses often involve deletions and that CNE loss frequencies are non-uniform. Similar to less pleiotropic enhancers, we find that independently lost CNEs are shorter, slightly less constrained and evolutionarily younger than CNEs without detected losses. This suggests that independently lost CNEs are less pleiotropic and that pleiotropic constraints contribute to non-uniform CNE loss frequencies. We also detected 35 CNEs that are independently lost in the human lineage and in other mammals. Our study uncovers an interesting aspect of the evolution of functional DNA in mammalian genomes. Experiments are necessary to test if these independently lost CNEs are associated with parallel phenotype changes in mammals. PMID:23042682

  13. Exploring the read-write genome: mobile DNA and mammalian adaptation.

    PubMed

    Shapiro, James A

    2017-02-01

    The read-write genome idea predicts that mobile DNA elements will act in evolution to generate adaptive changes in organismal DNA. This prediction was examined in the context of mammalian adaptations involving regulatory non-coding RNAs, viviparous reproduction, early embryonic and stem cell development, the nervous system, and innate immunity. The evidence shows that mobile elements have played specific and sometimes major roles in mammalian adaptive evolution by generating regulatory sites in the DNA and providing interaction motifs in non-coding RNA. Endogenous retroviruses and retrotransposons have been the predominant mobile elements in mammalian adaptive evolution, with the notable exception of bats, where DNA transposons are the major agents of RW genome inscriptions. A few examples of independent but convergent exaptation of mobile DNA elements for similar regulatory rewiring functions are noted.

  14. The identification and functional annotation of RNA structures conserved in vertebrates

    PubMed Central

    Seemann, Stefan E.; Mirza, Aashiq H.; Hansen, Claus; Bang-Berthelsen, Claus H.; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T.; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L.; Gorodkin, Jan

    2017-01-01

    Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human–mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3′ ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. PMID:28487280

  15. A fast-evolving human NPAS3 enhancer gained reporter expression in the developing forebrain of transgenic mice

    PubMed Central

    Kamm, Gretel B.; López-Leal, Rodrigo; Lorenzo, Juan R.; Franchini, Lucía F.

    2013-01-01

    The developmental brain gene NPAS3 stands out as a hot spot in human evolution because it contains the largest number of human-specific, fast-evolving, conserved, non-coding elements. In this paper we studied 2xHAR142, one of these elements that is located in the fifth intron of NPAS3. Using transgenic mice, we show that the mouse and chimp 2xHAR142 orthologues behave as transcriptional enhancers driving expression of the reporter gene lacZ to a similar NPAS3 expression subdomain in the mouse central nervous system. Interestingly, the human 2xHAR142 orthologue drives lacZ expression to an extended expression pattern in the nervous system. Thus, molecular evolution of 2xHAR142 provides the first documented example of human-specific heterotopy in the forebrain promoted by a transcriptional enhancer and suggests that it may have contributed to assemble the unique properties of the human brain. PMID:24218632

  16. Conservation of gene linkage in dispersed vertebrate NK homeobox clusters.

    PubMed

    Wotton, Karl R; Weierud, Frida K; Juárez-Morales, José L; Alvares, Lúcia E; Dietrich, Susanne; Lewis, Katharine E

    2009-10-01

    Nk homeobox genes are important regulators of many different developmental processes including muscle, heart, central nervous system and sensory organ development. They are thought to have arisen as part of the ANTP megacluster, which also gave rise to Hox and ParaHox genes, and at least some NK genes remain tightly linked in all animals examined so far. The protostome-deuterostome ancestor probably contained a cluster of nine Nk genes: (Msx)-(Nk4/tinman)-(Nk3/bagpipe)-(Lbx/ladybird)-(Tlx/c15)-(Nk7)-(Nk6/hgtx)-(Nk1/slouch)-(Nk5/Hmx). Of these genes, only NKX2.6-NKX3.1, LBX1-TLX1 and LBX2-TLX2 remain tightly linked in humans. However, it is currently unclear whether this is unique to the human genome as we do not know which of these Nk genes are clustered in other vertebrates. This makes it difficult to assess whether the remaining linkages are due to selective pressures or because chance rearrangements have "missed" certain genes. In this paper, we identify all of the paralogs of these ancestrally clustered NK genes in several distinct vertebrates. We demonstrate that tight linkages of Lbx1-Tlx1, Lbx2-Tlx2 and Nkx3.1-Nkx2.6 have been widely maintained in both the ray-finned and lobe-finned fish lineages. Moreover, the recently duplicated Hmx2-Hmx3 genes are also tightly linked. Finally, we show that Lbx1-Tlx1 and Hmx2-Hmx3 are flanked by highly conserved noncoding elements, suggesting that shared regulatory regions may have resulted in evolutionary pressure to maintain these linkages. Consistent with this, these pairs of genes have overlapping expression domains. In contrast, Lbx2-Tlx2 and Nkx3.1-Nkx2.6, which do not seem to be coexpressed, are also not associated with conserved noncoding sequences, suggesting that an alternative mechanism may be responsible for the continued clustering of these genes.

  17. Regulatory elements of Caenorhabditis elegans ribosomal protein genes

    PubMed Central

    2012-01-01

    Background: Ribosomal protein genes (RPGs) are essential, tightly regulated, and highly expressed during embryonic development and cell growth. Even though their protein sequences are strongly conserved, their mechanism of regulation is not conserved across yeast, Drosophila, and vertebrates. A recent investigation of genomic sequences conserved across both nematode species and associated with different gene groups indicated the existence of several elements in the upstream regions of C. elegans RPGs, providing a new insight regarding the regulation of these genes in C. elegans. Results: In this study, we performed an in-depth examination of C. elegans RPG regulation and found nine highly conserved motifs in the upstream regions of C. elegans RPGs using the motif discovery algorithm DME. Four motifs were partially similar to transcription factor binding sites from C. elegans, Drosophila, yeast, and human. One pair of these motifs was found to co-occur in the upstream regions of 250 transcripts including 22 RPGs. The distance between the two motifs displayed a complex frequency pattern that was related to their relative orientation. We tested the impact of three of these motifs on the expression of rpl-2 using a series of reporter gene constructs and showed that all three motifs are necessary to maintain the high natural expression level of this gene. One of the motifs was similar to the binding site of an orthologue of POP-1, and we showed that RNAi knockdown of pop-1 impacts the expression of rpl-2. We further determined the transcription start site of rpl-2 by 5’ RACE and found that the motifs lie 40–90 bases upstream of the start site. We also found evidence that a noncoding RNA, contained within the outron of rpl-2, is co-transcribed with rpl-2 and cleaved during trans-splicing. Conclusions: Our results indicate that C. elegans RPGs are regulated by a complex novel series of regulatory elements that is evolutionarily distinct from those of all other species examined up until now. PMID:22928635

  18. Functional noncoding sequences derived from SINEs in the mammalian genome

    PubMed Central

    Nishihara, Hidenori; Smit, Arian F.A.; Okada, Norihiro

    2006-01-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the ∼1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality. PMID:16717141

  19. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  20. Antisense transcription is pervasive but rarely conserved in enteric bacteria.

    PubMed

    Raghavan, Rahul; Sloan, Daniel B; Ochman, Howard

    2012-01-01

    Noncoding RNAs, including antisense RNAs (asRNAs) that originate from the complementary strand of protein-coding genes, are involved in the regulation of gene expression in all domains of life. Recent application of deep-sequencing technologies has revealed that the transcription of asRNAs occurs genome-wide in bacteria. Although the role of the vast majority of asRNAs remains unknown, it is often assumed that their presence implies important regulatory functions, similar to those of other noncoding RNAs. Alternatively, many antisense transcripts may be produced by chance transcription events from promoter-like sequences that result from the degenerate nature of bacterial transcription factor binding sites. To investigate the biological relevance of antisense transcripts, we compared genome-wide patterns of asRNA expression in closely related enteric bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, by performing strand-specific transcriptome sequencing. Although antisense transcripts are abundant in both species, less than 3% of asRNAs are expressed at high levels in both species, and only about 14% appear to be conserved among species. And unlike the promoters of protein-coding genes, asRNA promoters show no evidence of sequence conservation between, or even within, species. Our findings suggest that many or even most bacterial asRNAs are nonadaptive by-products of the cell's transcription machinery. IMPORTANCE Application of high-throughput methods has revealed the expression throughout bacterial genomes of transcripts encoded on the strand complementary to protein-coding genes. Because transcription is costly, it is usually assumed that these transcripts, termed antisense RNAs (asRNAs), serve some function; however, the role of most asRNAs is unclear, raising questions about their relevance in cellular processes. 
    Because natural selection conserves functional elements, comparisons between related species provide a method for assessing functionality genome-wide. Applying such an approach, we assayed all transcripts in two closely related bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, and demonstrate that, although the levels of genome-wide antisense transcription are similarly high in both bacteria, only a small fraction of asRNAs are shared across species. Moreover, the promoters associated with asRNAs show no evidence of sequence conservation between, or even within, species. These findings indicate that despite the genome-wide transcription of asRNAs, many of these transcripts are likely nonfunctional.

  1. GenomeVista

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poliakov, Alexander; Couronne, Olivier

    2002-11-04

    Aligning large vertebrate genomes that are structurally complex poses a variety of problems not encountered on smaller scales. Such genomes are rich in repetitive elements and contain multiple segmental duplications, which increases the difficulty of identifying true orthologous DNA segments in alignments. The sizes of the sequences make many alignment algorithms designed for comparing single proteins extremely inefficient when processing large genomic intervals. We integrated both local and global alignment tools and developed a suite of programs for automatically aligning large vertebrate genomes and identifying conserved non-coding regions in the alignments. Our method uses the BLAT local alignment program to find anchors on the base genome to identify regions of possible homology for a query sequence. These regions are postprocessed to find the best candidates, which are then globally aligned using the AVID global alignment program. In the last step, conserved non-coding segments are identified using VISTA. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90% of known coding exons in the human genome. The GenomeVISTA software is a suite of Perl programs that is built on a MySQL database platform. The scheduler gets control data from the database, builds a queue of jobs, and dispatches them to a PC cluster for execution. The main program, running on each node of the cluster, processes individual sequences. A Perl library acts as an interface between the database and the above programs. The use of a separate library allows the programs to function independently of the database schema. The library also improves on the standard Perl MySQL database interface package by providing auto-reconnect functionality and improved error handling.
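
    The last step of the pipeline described in this record, flagging conserved segments in a finished pairwise alignment, can be illustrated with a sliding-window identity scan. This is a minimal sketch and not the VISTA implementation: the function name `conserved_windows`, the 100-column window, and the 70% identity cutoff are assumptions chosen for illustration, and the input is assumed to be two already-aligned, equal-length strings that use `-` for gaps.

```python
def conserved_windows(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged half-open (start, end) column ranges whose windows
    reach the identity cutoff. Hypothetical helper, not part of VISTA."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if n < window:
        return []

    # 1 where the column is an identical, non-gap match, else 0.
    match = [int(a == b and a != "-")
             for a, b in zip(aln_a.upper(), aln_b.upper())]

    regions = []
    count = sum(match[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one column: add the new column, drop the old one.
            count += match[start + window - 1] - match[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

    Coordinates are alignment columns, not genomic positions; mapping them back to either genome and filtering out annotated exons would be separate steps.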

  2. Conservation genetics and geographic patterns of genetic variation of the endangered officinal herb Fritillaria pallidiflora

    Treesearch

    Zhihao Su; Borong Pan; Stewart C. Sanderson; Xiaolong Jiang; Mingli Zhang

    2015-01-01

    Fritillaria pallidiflora is an endangered officinal herb distributed in the Tianshan Mountains of northwestern China. We examined its phylogeography to study evolutionary processes and suggest implications for conservation. Six haplotypes were detected based on three chloroplast non-coding spacers (psbA-trnH, rps16, and trnS-trnG); genetic variation mainly occurred...

  3. Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci.

    PubMed

    Amaral, Paulo P; Leonardi, Tommaso; Han, Namshik; Viré, Emmanuelle; Gascoigne, Dennis K; Arias-Carrasco, Raúl; Büscher, Magdalena; Pandolfini, Luca; Zhang, Anda; Pluchino, Stefano; Maracaja-Coutinho, Vinicius; Nakaya, Helder I; Hemberg, Martin; Shiekhattar, Ramin; Enright, Anton J; Kouzarides, Tony

    2018-03-15

    The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality. We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers. This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.

  4. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

    PubMed

    Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

    2010-07-16

    Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
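
    Divergence dates of this kind are commonly derived from the molecular-clock relation T = Ks / (2r), where Ks is the number of synonymous substitutions per synonymous site between the two sequences and r is the substitution rate per site per year. The sketch below only illustrates that arithmetic. The abstract does not report the Ks value or the rate used, so the function name and both inputs are hypothetical placeholders, not values from the study.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Molecular-clock estimate T = Ks / (2 * r).

    The factor 2 accounts for substitutions accumulating independently
    along both lineages since they split.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ks / (2.0 * rate_per_site_per_year)


# Placeholder inputs chosen only to show the calculation (NOT study values).
example_ks = 0.01      # hypothetical synonymous divergence
example_rate = 5e-9    # hypothetical substitutions per site per year
print(divergence_time_years(example_ks, example_rate))
```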

  5. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. 
    In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, ncCADD and ncGWAVA, and find both scores are significantly predictive of human dosage sensitive genes and appear to carry information beyond conservation, as assessed by ncGERP. These results highlight that the intolerance of noncoding sequence stretches in the human genome can provide a critical complementary tool to other genome annotation approaches to help identify the parts of the human genome increasingly likely to harbor mutations that influence risk of disease. PMID:26332131
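
    The idea of comparing observed with predicted levels of standing variation can be sketched as a regression residual: fit a line predicting each gene's count of common variants from its total variant count, then score each gene by how far it falls below or above that line. The code below is a simplified stand-in under stated assumptions, not the published ncRVIS procedure. The abstract does not give the regression details, so the function name, the choice of predictor, and the scaling of residuals by their population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def residual_scores(total_variants, common_variants):
    """Standardized residuals from a least-squares fit of common ~ total.

    Negative scores mean fewer common variants than predicted, which is
    read here as greater intolerance to variation.
    """
    x, y = list(total_variants), list(common_variants)
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length inputs with at least 3 genes")

    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")

    # Ordinary least-squares slope and intercept.
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sd = pstdev(residuals)
    if sd == 0:
        return [0.0] * len(residuals)
    return [r / sd for r in residuals]
```

    In this toy formulation, ranking genes from the most negative score upward would put the most variation-depleted regulatory regions first.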

  6. Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term

    PubMed Central

    Romero, Roberto; Tarca, Adi; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S.; Kalita, Cynthia A.; Cai, Juan; Yeo, Lami; Lipovich, Leonard

    2014-01-01

    Objective: The mechanisms responsible for normal and abnormal parturition are poorly understood. Myometrial activation leading to regular uterine contractions is a key component of labor. Dysfunctional labor (arrest of dilatation and/or descent) is a leading indication for cesarean delivery. Compelling evidence suggests that most of these disorders are functional in nature, and not the result of cephalopelvic disproportion. The methodology and the datasets afforded by the post-genomic era provide novel opportunities to understand and target gene functions in these disorders. In 2012, the ENCODE Consortium elucidated the extraordinary abundance and functional complexity of long non-coding RNA genes in the human genome. The purpose of the study was to identify differentially expressed long non-coding RNA genes in human myometrium in women in spontaneous labor at term. Materials and Methods: Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n=19) and women in spontaneous labor at term (n=20). RNA was extracted and profiled using an Illumina® microarray platform. The analysis of the protein coding genes from this study has been previously reported. Here, we have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. Results: Upon considering more than 18,498 distinct lncRNA genes compiled nonredundantly from public experimental data sources, and interrogating 2,634 that matched Illumina microarray probes, we identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an independent experimental method. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site that lacked evolutionary conservation beyond primates. Conclusions: We provide for the first time evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known, as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term. PMID:24168098

  7. A conserved predicted pseudoknot in the NS2A-encoding sequence of West Nile and Japanese encephalitis flaviviruses suggests NS1' may derive from ribosomal frameshifting

    PubMed Central

    Firth, Andrew E; Atkins, John F

    2009-01-01

    Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2AN-term-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'. PMID:19196463
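
    The slippery heptanucleotide quoted in this abstract, Y CCU UUU (Y being a pyrimidine, C or U), is simple enough to locate with a regular expression. The sketch below is illustrative only: the function name is hypothetical, it accepts DNA or RNA by converting T to U, and it does not check for the downstream pseudoknot that the authors argue is needed to stimulate frameshifting. The sequences in the usage lines are made-up toy strings, not viral sequence.

```python
import re

# Lookahead so that overlapping occurrences would also be reported.
SLIPPERY_SITE = re.compile(r"(?=([CU]CCUUUU))")


def find_slippery_sites(seq):
    """Return 0-based start positions of Y CCU UUU motifs in a sequence."""
    rna = seq.upper().replace("T", "U")
    return [m.start() for m in SLIPPERY_SITE.finditer(rna)]


# Toy inputs for illustration only.
print(find_slippery_sites("AAGCCCUUUUGG"))
print(find_slippery_sites("aatcctttta"))
```

    A fuller screen would also require the motif to sit in the correct reading frame and to be followed at a short distance by a predicted stable structure.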

  8. Enhancer elements upstream of the SHOX gene are active in the developing limb.

    PubMed

    Durand, Claudia; Bangs, Fiona; Signolet, Jason; Decker, Eva; Tickle, Cheryll; Rappold, Gudrun

    2010-05-01

    Léri-Weill Dyschondrosteosis (LWD) is a dominant skeletal disorder characterized by short stature and distinct bone anomalies. SHOX gene mutations and deletions of regulatory elements downstream of SHOX resulting in haploinsufficiency have been found in patients with LWD. SHOX encodes a homeodomain transcription factor and is known to be expressed in the developing limb. We have now analyzed the regulatory significance of the region upstream of the SHOX gene. By comparative genomic analyses, we identified several conserved non-coding elements, which subsequently were tested in an in ovo enhancer assay in both chicken limb bud and cornea, where SHOX is also expressed. In this assay, we found three enhancers to be active in the developing chicken limb, but none were functional in the developing cornea. A screening of 60 LWD patients with an intact SHOX coding and downstream region did not yield any deletion of the upstream enhancer region. Thus, we speculate that SHOX upstream deletions occur at a lower frequency because of the structural organization of this genomic region and/or that SHOX upstream deletions may cause a phenotype that differs from the one observed in LWD.

  9. Enhancer elements upstream of the SHOX gene are active in the developing limb

    PubMed Central

    Durand, Claudia; Bangs, Fiona; Signolet, Jason; Decker, Eva; Tickle, Cheryll; Rappold, Gudrun

    2010-01-01

    Léri-Weill Dyschondrosteosis (LWD) is a dominant skeletal disorder characterized by short stature and distinct bone anomalies. SHOX gene mutations and deletions of regulatory elements downstream of SHOX resulting in haploinsufficiency have been found in patients with LWD. SHOX encodes a homeodomain transcription factor and is known to be expressed in the developing limb. We have now analyzed the regulatory significance of the region upstream of the SHOX gene. By comparative genomic analyses, we identified several conserved non-coding elements, which subsequently were tested in an in ovo enhancer assay in both chicken limb bud and cornea, where SHOX is also expressed. In this assay, we found three enhancers to be active in the developing chicken limb, but none were functional in the developing cornea. A screening of 60 LWD patients with an intact SHOX coding and downstream region did not yield any deletion of the upstream enhancer region. Thus, we speculate that SHOX upstream deletions occur at a lower frequency because of the structural organization of this genomic region and/or that SHOX upstream deletions may cause a phenotype that differs from the one observed in LWD. PMID:19997128

  10. Probing Xist RNA Structure in Cells Using Targeted Structure-Seq

    PubMed Central

    Rutenberg-Schoenberg, Michael; Simon, Matthew D.

    2015-01-01

    The long non-coding RNA (lncRNA) Xist is a master regulator of X-chromosome inactivation in mammalian cells. Models for how Xist and other lncRNAs function depend on thermodynamically stable secondary and higher-order structures that RNAs can form in the context of a cell. Probing accessible RNA bases can provide data to build models of RNA conformation that provide insight into RNA function, molecular evolution, and modularity. To study the structure of Xist in cells, we built upon recent advances in RNA secondary structure mapping and modeling to develop Targeted Structure-Seq, which combines chemical probing of RNA structure in cells with target-specific massively parallel sequencing. By enriching for signals from the RNA of interest, Targeted Structure-Seq achieves high coverage of the target RNA with relatively few sequencing reads, thus providing a targeted and scalable approach to analyze RNA conformation in cells. We use this approach to probe the full-length Xist lncRNA to develop new models for functional elements within Xist, including the repeat A element in the 5’-end of Xist. This analysis also identified new structural elements in Xist that are evolutionarily conserved, including a new element proximal to the C repeats that is important for Xist function. PMID:26646615

  11. An Enhancer Near ISL1 and an Ultraconserved Exon of PCBP2 are Derived from a Retroposon

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bejerano, Gill; Lowe, Craig; Ahituv, Nadav

    2005-11-27

    Hundreds of highly conserved distal cis-regulatory elements have been characterized to date in vertebrate genomes (ref. 1). Many thousands more are predicted based on comparative genomics (refs. 2, 3). Yet, in stark contrast to the genes they regulate, virtually none of these regions can be traced using sequence similarity in invertebrates, leaving their evolutionary origin obscure. Here we show that a class of conserved, primarily non-coding regions in tetrapods originated from a novel short interspersed repetitive element (SINE) retroposon family that was active in Sarcopterygii (lobe-finned fishes and terrestrial vertebrates) in the Silurian at least 410 Mya (ref. 4), and, remarkably, appears to be recently active in the "living fossil" Indonesian coelacanth, Latimeria menadoensis. We show that one copy is a distal enhancer, located 500 kb from the neuro-developmental gene ISL1. Several others represent new, possibly regulatory, alternatively spliced exons in the middle of pre-existing Sarcopterygian genes. One of these is the >200 bp ultraconserved region (ref. 5), 100 percent identical in mammals, and 80 percent identical to the coelacanth SINE, that contains a 31 aa alternatively spliced exon of the mRNA processing gene PCBP2 (ref. 6). These add to a growing list of examples (ref. 7) in which relics of transposable elements have acquired a function that serves their host, a process termed "exaptation" (ref. 8), and provide an origin for at least some of the highly-conserved vertebrate-specific genomic sequences recently discovered using comparative genomics.

  12. Expression of Antisense Long Noncoding RNAs as Potential Regulators in Rainbow Trout with Different Tolerance to Plant-Based Diets.

    PubMed

    Abernathy, Jason; Overturf, Ken

    2018-01-04

    Reformulation of aquafeeds in salmonid diets to include more plant proteins is critical for sustainable aquaculture. However, increasing plant proteins can lead to stunted growth and enteritis. Toward an understanding of the regulatory mechanisms behind plant protein utilization, directional RNA sequencing of liver tissues from a rainbow trout strain selected for growth on an all plant-protein diet and a control strain, both fed a plant diet for 12 weeks, were utilized to construct long noncoding RNAs. Antisense long noncoding RNAs were selected for differential expression and functional analyses since they have been shown to have regulatory actions within a genome. A total of 142 unique antisense long noncoding RNAs were differentially expressed between strains, 60 of which could be mapped to a gene. Genes underlying these noncoding RNAs are indicated in lipid metabolism and immunity. Six noncoding transcripts were also found to overlap with differentially expressed protein-coding genes, all of which were co-expressed. Associating variation in regulatory elements between rainbow trout strains with differing tolerance to plant-protein diets will assist in future studies toward increased gains throughout carnivorous aquaculture.

  13. Alternative splicing of anciently exonized 5S rRNA regulates plant transcription factor TFIIIA

    PubMed Central

    Fu, Yan; Bannach, Oliver; Chen, Hao; Teune, Jan-Hendrik; Schmitz, Axel; Steger, Gerhard; Xiong, Liming; Barbazuk, W. Brad

    2009-01-01

    Identifying conserved alternative splicing (AS) events among evolutionarily distant species can prioritize AS events for functional characterization and help uncover relevant cis- and trans-regulatory factors. A genome-wide search for conserved cassette exon AS events in higher plants revealed the exonization of 5S ribosomal RNA (5S rRNA) within the gene of its own transcription regulator, TFIIIA (transcription factor for polymerase III A). The 5S rRNA-derived exon in TFIIIA gene exists in all representative land plant species but not in green algae and nonplant species, suggesting it is specific to land plants. TFIIIA is essential for RNA polymerase III-based transcription of 5S rRNA in eukaryotes. Integrating comparative genomics and molecular biology revealed that the conserved cassette exon derived from 5S rRNA is coupled with nonsense-mediated mRNA decay. Utilizing multiple independent Arabidopsis overexpressing TFIIIA transgenic lines under osmotic and salt stress, strong accordance between phenotypic and molecular evidence reveals the biological relevance of AS of the exonized 5S rRNA in quantitative autoregulation of TFIIIA homeostasis. Most significantly, this study provides the first evidence of ancient exaptation of 5S rRNA in plants, suggesting a novel gene regulation model mediated by the AS of an anciently exonized noncoding element. PMID:19211543

  14. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    PubMed

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.

  15. The identification and functional annotation of RNA structures conserved in vertebrates.

    PubMed

    Seemann, Stefan E; Mirza, Aashiq H; Hansen, Claus; Bang-Berthelsen, Claus H; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L; Gorodkin, Jan

    2017-08-01

    Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. © 2017 Seemann et al.; Published by Cold Spring Harbor Laboratory Press.

  16. Conserved noncoding sequences conserve biological networks and influence genome evolution.

    PubMed

    Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang

    2018-05-01

    Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.

  17. Independent evolution of genomic characters during major metazoan transitions.

    PubMed

    Simakov, Oleg; Kawashima, Takeshi

    2017-07-15

    Metazoan evolution encompasses a vast evolutionary time scale spanning over 600 million years. Our ability to infer ancestral metazoan characters, both morphological and functional, is limited by our understanding of the nature and evolutionary dynamics of the underlying regulatory networks. Increasing coverage of metazoan genomes enables us to identify the evolutionary changes of the relevant genomic characters such as the loss or gain of coding sequences, gene duplications, micro- and macro-synteny, and non-coding element evolution in different lineages. In this review we describe recent advances in our understanding of ancestral metazoan coding and non-coding features, as deduced from genomic comparisons. Some genomic changes such as innovations in gene and linkage content occur at different rates across metazoan clades, suggesting some level of independence among genomic characters. While their contribution to biological innovation remains largely unclear, we review recent literature about certain genomic changes that do correlate with changes to specific developmental pathways and metazoan innovations. In particular, we discuss the origins of the recently described pharyngeal cluster which is conserved across deuterostome genomes, and highlight different genomic features that have contributed to the evolution of this group. We also assess our current capacity to infer ancestral metazoan states from gene models and comparative genomics tools and elaborate on the future directions of metazoan comparative genomics relevant to evo-devo studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  18. The developmental transcriptome of Drosophila melanogaster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    University of Connecticut; Graveley, Brenton R.; Brooks, Angela N.

    Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development. Drosophila melanogaster is an important non-mammalian model system that has had a critical role in basic biological discoveries, such as identifying chromosomes as the carriers of genetic information and uncovering the role of genes in development. Because it shares a substantial genic content with humans, Drosophila is increasingly used as a translational model for human development, homeostasis and disease. High-quality maps are needed for all functional genomic elements. Previous studies demonstrated that a rich collection of genes is deployed during the life cycle of the fly. Although expression profiling using microarrays has revealed the expression of ~13,000 annotated genes, it is difficult to map splice junctions and individual base modifications generated by RNA editing using such approaches. Single-base resolution is essential to define precisely the elements that comprise the Drosophila transcriptome. Estimates of the number of transcript isoforms are less accurate than estimates of the number of genes. Whereas ~20% of Drosophila genes are annotated as encoding alternatively spliced pre-mRNAs, splice-junction microarray experiments indicate that this number is at least 40% (ref. 7). Determining the diversity of mRNAs generated by alternative promoters, alternative splicing and RNA editing will substantially increase the inferred protein repertoire. Non-coding RNA genes (ncRNAs) including short interfering RNAs (siRNAs) and microRNAs (miRNAs) (reviewed in ref. 10), and longer ncRNAs such as bxd (ref. 11) and rox (ref. 12), have important roles in gene regulation, whereas others such as small nucleolar RNAs (snoRNAs) and small nuclear RNAs (snRNAs) are important components of macromolecular machines such as the ribosome and spliceosome. The transcription and processing of these ncRNAs must also be fully documented and mapped. As part of the modENCODE project to annotate the functional elements of the D. melanogaster and Caenorhabditis elegans genomes, we used RNA-Seq and tiling microarrays to sample the Drosophila transcriptome at unprecedented depth throughout development from early embryo to ageing male and female adults. We report on a high-resolution view of the discovery, structure and dynamic expression of the D. melanogaster transcriptome.

  19. 5′UTR of the Neurogenic bHLH Nex1/MATH-2/NeuroD6 Gene Is Regulated by Two Distinct Promoters Through CRE and C/EBP Binding Sites

    PubMed Central

    Uittenbogaard, Martine; Martinka, Debra L.; Johnson, Peter F.; Vinson, Charles; Chiaramello, Anne

    2009-01-01

    Expression of the bHLH transcription factor Nex1/MATH-2/NeuroD6, a member of the NeuroD subfamily, parallels overt neuronal differentiation and synaptogenesis during brain development. Our previous studies have shown that Nex1 is a critical effector of the NGF pathway and promotes neuronal differentiation and survival of PC12 cells in the absence of growth factors. In this study, we investigated the transcriptional regulation of the Nex1 gene during NGF-induced neuronal differentiation. We found that Nex1 expression is under the control of two conserved promoters, Nex1-P1 and Nex1-P2, located in two distinct non-coding exons. Both promoters are TATA-less with multiple transcription start sites, and are activated on NGF or cAMP exposure. Luciferase-reporter assays showed that the Nex1-P2 promoter activity is stronger than the Nex1-P1 promoter activity, which supports the previously reported differential expression levels of Nex1 transcripts throughout brain development. Using a combination of DNaseI footprinting, EMSA assays, and site-directed mutagenesis, we identified the essential regulatory elements within the first 2 kb of the Nex1 5′UTR. The Nex1-P1 promoter is mainly regulated by a conserved CRE element, whereas the Nex1-P2 promoter is under the control of a conserved C/EBP binding site. Overexpression of wild-type C/EBPβ resulted in increased Nex1-P2 promoter activity in NGF-differentiated PC12 cells. The fact that Nex1 is a target gene of C/EBPβ provides new insight into the C/EBP transcriptional cascade known to promote neurogenesis, while repressing gliogenesis. PMID:17075921

  20. RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”

    PubMed Central

    Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu

    2012-01-01

    Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113

  1. RNA-seq based transcriptional map of bovine respiratory disease pathogen "Histophilus somni 2336".

    PubMed

    Kumar, Ranjit; Lawrence, Mark L; Watt, James; Cooksey, Amanda M; Burgess, Shane C; Nanduri, Bindu

    2012-01-01

    Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify "novel" genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations.

  2. Comparative Mitogenomics of the Assassin Bug Genus Peirates (Hemiptera: Reduviidae: Peiratinae) Reveal Conserved Mitochondrial Genome Organization of P. atromaculatus, P. fulvescens and P. turpis

    PubMed Central

    Zhao, Guangyu; Li, Hu; Zhao, Ping; Cai, Wanzhi

    2015-01-01

    In this study, we sequenced four new mitochondrial genomes and presented comparative mitogenomic analyses of five species in the genus Peirates (Hemiptera: Reduviidae). Mitochondrial genomes of these five assassin bugs had a typical set of 37 genes and retained the ancestral gene arrangement of insects. The A+T content, AT- and GC-skews were similar to the common base composition biases of insect mtDNA. Genomic size ranges from 15,702 bp to 16,314 bp and most of the size variation was due to length and copy number of the repeat unit in the putative control region. All of the control region sequences included large tandem repeats present in two or more copies. Our result revealed similarity in mitochondrial genomes of P. atromaculatus, P. fulvescens and P. turpis, as well as the highly conserved genomic-level characteristics of these three species, e.g., the same start and stop codons of protein-coding genes, conserved secondary structure of tRNAs, identical location and length of non-coding and overlapping regions, and conservation of structural elements and tandem repeat unit in control region. Phylogenetic analyses also supported a close relationship between P. atromaculatus, P. fulvescens and P. turpis, which might be recently diverged species. The present study indicates that mitochondrial genome has important implications on phylogenetics, population genetics and speciation in the genus Peirates. PMID:25689825

  3. Enhancer Evolution across 20 Mammalian Species

    PubMed Central

    Villar, Diego; Berthelot, Camille; Aldridge, Sarah; Rayner, Tim F.; Lukk, Margus; Pignatelli, Miguel; Park, Thomas J.; Deaville, Robert; Erichsen, Jonathan T.; Jasinska, Anna J.; Turner, James M.A.; Bertelsen, Mads F.; Murchison, Elizabeth P.; Flicek, Paul; Odom, Duncan T.

    2015-01-01

    Summary: The mammalian radiation has corresponded with rapid changes in noncoding regions of the genome, but we lack a comprehensive understanding of regulatory evolution in mammals. Here, we track the evolution of promoters and enhancers active in liver across 20 mammalian species from six diverse orders by profiling genomic enrichment of H3K27 acetylation and H3K4 trimethylation. We report that rapid evolution of enhancers is a universal feature of mammalian genomes. Most of the recently evolved enhancers arise from ancestral DNA exaptation, rather than lineage-specific expansions of repeat elements. In contrast, almost all liver promoters are partially or fully conserved across these species. Our data further reveal that recently evolved enhancers can be associated with genes under positive selection, demonstrating the power of this approach for annotating regulatory adaptations in genomic sequences. These results provide important insight into the functional genetics underpinning mammalian regulatory evolution. PMID:25635462

  4. Long-range evolutionary constraints reveal cis-regulatory interactions on the human X chromosome

    PubMed Central

    Naville, Magali; Ishibashi, Minaka; Ferg, Marco; Bengani, Hemant; Rinkwitz, Silke; Krecsmarik, Monika; Hawkins, Thomas A.; Wilson, Stephen W.; Manning, Elizabeth; Chilamakuri, Chandra S. R.; Wilson, David I.; Louis, Alexandra; Lucy Raymond, F.; Rastegar, Sepand; Strähle, Uwe; Lenhard, Boris; Bally-Cuif, Laure; van Heyningen, Veronica; FitzPatrick, David R.; Becker, Thomas S.; Roest Crollius, Hugues

    2015-01-01

    Enhancers can regulate the transcription of genes over long genomic distances. This is thought to lead to selection against genomic rearrangements within such regions that may disrupt this functional linkage. Here we test this concept experimentally using the human X chromosome. We describe a scoring method to identify evolutionary maintenance of linkage between conserved noncoding elements and neighbouring genes. Chromatin marks associated with enhancer function are strongly correlated with this linkage score. We test >1,000 putative enhancers by transgenesis assays in zebrafish to ascertain the identity of the target gene. The majority of active enhancers drive a transgenic expression in a pattern consistent with the known expression of a linked gene. These results show that evolutionary maintenance of linkage is a reliable predictor of an enhancer's function, and provide new information to discover the genetic basis of diseases caused by the mis-regulation of gene expression. PMID:25908307

  5. Diversity of Antisense and Other Non-Coding RNAs in Archaea Revealed by Comparative Small RNA Sequencing in Four Pyrobaculum Species

    PubMed Central

    Bernick, David L.; Dennis, Patrick P.; Lui, Lauren M.; Lowe, Todd M.

    2012-01-01

    A great diversity of small, non-coding RNA (ncRNA) molecules with roles in gene regulation and RNA processing have been intensely studied in eukaryotic and bacterial model organisms, yet our knowledge of possible parallel roles for small RNAs (sRNA) in archaea is limited. We employed RNA-seq to identify novel sRNA across multiple species of the hyperthermophilic genus Pyrobaculum, known for unusual RNA gene characteristics. By comparing transcriptional data collected in parallel among four species, we were able to identify conserved RNA genes fitting into known and novel families. Among our findings, we highlight three novel cis-antisense sRNAs encoded opposite to key regulatory (ferric uptake regulator), metabolic (triose-phosphate isomerase), and core transcriptional apparatus genes (transcription factor B). We also found a large increase in the number of conserved C/D box sRNA genes over what had been previously recognized; many of these genes are encoded antisense to protein coding genes. The conserved opposition to orthologous genes across the Pyrobaculum genus suggests similarities to other cis-antisense regulatory systems. Furthermore, the genus-specific nature of these sRNAs indicates they are relatively recent, stable adaptations. PMID:22783241

  6. An expanding universe of the non-coding genome in cancer biology.

    PubMed

    Xue, Bin; He, Lin

    2014-06-01

    Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of the cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genomes consist of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  7. Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening

    PubMed Central

    Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride

    2009-01-01

    To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368

  8. Identification of long non-coding RNAs in two anthozoan species and their possible implications for coral bleaching.

    PubMed

    Huang, Chen; Morlighem, Jean-Étienne R L; Cai, Jing; Liao, Qiwen; Perez, Carlos Daniel; Gomes, Paula Braga; Guo, Min; Rádis-Baptista, Gandhi; Lee, Simon Ming-Yuen

    2017-07-13

    Long non-coding RNAs (lncRNAs) have been shown to play regulatory roles in a diverse range of biological processes and are associated with the outcomes of various diseases. The majority of studies about lncRNAs focus on model organisms, with less investigation of non-model organisms to date. Herein, we have undertaken an investigation of lncRNAs in two zoanthids (cnidarians): Protopalythoa variabilis and Palythoa caribaeorum. A total of 11,206 and 13,240 lncRNAs were detected in the P. variabilis and P. caribaeorum transcriptomes, respectively. Comparison using the NONCODE database indicated that the majority of these lncRNAs are taxonomically species-restricted with no identifiable orthologs. Even so, we found cases in which short regions of P. caribaeorum's lncRNAs were similar to vertebrate species' lncRNAs, and could be associated with lncRNA conserved regulatory functions. Consequently, some high-confidence lncRNA-mRNA interactions were predicted based on such conserved regions, therefore revealing possible involvement of lncRNAs in posttranscriptional processing and regulation in anthozoans. Moreover, investigation of differentially expressed lncRNAs, in healthy colonies and colonial individuals undergoing natural bleaching, indicated that some up-regulated lncRNAs in P. caribaeorum could posttranscriptionally regulate the mRNAs encoding proteins of Ras-mediated signal transduction pathway and components of innate immune-system, which could contribute to the molecular response of coral bleaching.

  9. Extensive Evolutionary Changes in Regulatory Element Activity during Human Origins Are Associated with Altered Gene Expression and Positive Selection

    PubMed Central

    Fedrigo, Olivier; Babbitt, Courtney C.; Wortham, Matthew; Tewari, Alok K.; London, Darin; Song, Lingyun; Lee, Bum-Kyu; Iyer, Vishwanath R.; Parker, Stephen C. J.; Margulies, Elliott H.; Wray, Gregory A.; Furey, Terrence S.; Crawford, Gregory E.

    2012-01-01

    Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species. PMID:22761590

  10. Molecular characterization, genomic distribution and evolutionary dynamics of Short INterspersed Elements in the termite genome.

    PubMed

    Luchetti, Andrea; Mantovani, Barbara

    2011-02-01

    Short INterspersed Elements (SINEs) in invertebrates, and especially in animal inbred genomes such as that of termites, are poorly known; in this paper we characterize three new SINE families (Talub, Taluc and Talud) through the analyses of 341 sequences, either isolated from the Reticulitermes lucifugus genome or drawn from the EST GenBank collection. We further add new data to the only isopteran element known so far, Talua. These SINEs are tRNA-derived elements, with an average length ranging from 258 to 372 bp. The tails are made up of poly(A) or microsatellite motifs. Their copy number varies from 7.9 × 10^3 to 10^5 copies, well within the range observed for other metazoan genomes. Species distribution, age and target site duplication analysis indicate Talud as the oldest, possibly inactive SINE, which originated before the onset of Isoptera (~150 Myr ago). Taluc underwent substantial sequence changes throughout the evolution of termites and data suggest it was silenced and then re-activated in the R. lucifugus lineage. Moreover, Taluc shares a conserved sequence block with other unrelated SINEs, as observed for some vertebrate and cephalopod elements. The study of genomic environment showed that insertions are mainly surrounded by microsatellites and other SINEs, indicating a biased accumulation within non-coding regions. The evolutionary dynamics of Talu~ elements is explained through selective mechanisms acting in an inbred genome; in this respect, the study of termites' SINEs activity may provide an interesting framework to address the (co)evolution of mobile elements and the host genome.

  11. Short interspersed element (SINE) depletion and long interspersed element (LINE) abundance are not features universally required for imprinting.

    PubMed

    Cowley, Michael; de Burca, Anna; McCole, Ruth B; Chahal, Mandeep; Saadat, Ghazal; Oakey, Rebecca J; Schulz, Reiner

    2011-04-20

    Genomic imprinting is a form of gene dosage regulation in which a gene is expressed from only one of the alleles, in a manner dependent on the parent of origin. The mechanisms governing imprinted gene expression have been investigated in detail and have greatly contributed to our understanding of genome regulation in general. Both DNA sequence features, such as CpG islands, and epigenetic features, such as DNA methylation and non-coding RNAs, play important roles in achieving imprinted expression. However, the relative importance of these factors varies depending on the locus in question. Defining the minimal features that are absolutely required for imprinting would help us to understand how imprinting has evolved mechanistically. Imprinted retrogenes are a subset of imprinted loci that are relatively simple in their genomic organisation, being distinct from large imprinting clusters, and have the potential to be used as tools to address this question. Here, we compare the repeat element content of imprinted retrogene loci with non-imprinted controls that have a similar locus organisation. We observe no significant differences that are conserved between mouse and human, suggesting that the paucity of SINEs and relative abundance of LINEs at imprinted loci reported by others is not a sequence feature universally required for imprinting.

  12. DOMAINS REARRANGED METHYLTRANSFERASE3 controls DNA methylation and regulates RNA polymerase V transcript abundance in Arabidopsis

    PubMed Central

    Zhong, Xuehua; Hale, Christopher J.; Nguyen, Minh; Ausin, Israel; Groth, Martin; Hetzel, Jonathan; Vashisht, Ajay A.; Henderson, Ian R.; Wohlschlegel, James A.; Jacobsen, Steven E.

    2015-01-01

    DNA methylation is a mechanism of epigenetic gene regulation and genome defense conserved in many eukaryotic organisms. In Arabidopsis, the DNA methyltransferase DOMAINS REARRANGED METHYLASE 2 (DRM2) controls RNA-directed DNA methylation in a pathway that also involves the plant-specific RNA Polymerase V (Pol V). Additionally, the Arabidopsis genome encodes an evolutionarily conserved but catalytically inactive DNA methyltransferase, DRM3. Here, we show that DRM3 has moderate effects on global DNA methylation and small RNA abundance and that DRM3 physically interacts with Pol V. In Arabidopsis drm3 mutants, we observe a lower level of Pol V-dependent noncoding RNA transcripts even though Pol V chromatin occupancy is increased at many sites in the genome. These findings suggest that DRM3 acts to promote Pol V transcriptional elongation or assist in the stabilization of Pol V transcripts. This work sheds further light on the mechanism by which long noncoding RNAs facilitate RNA-directed DNA methylation. PMID:25561521

  13. Divergent genome evolution caused by regional variation in DNA gain and loss between human and mouse

    PubMed Central

    Kortschak, R. Daniel

    2018-01-01

    The forces driving the accumulation and removal of non-coding DNA and ultimately the evolution of genome size in complex organisms are intimately linked to genome structure and organisation. Our analysis provides a novel method for capturing the regional variation of lineage-specific DNA gain and loss events in their respective genomic contexts. To further understand this connection we used comparative genomics to identify genome-wide individual DNA gain and loss events in the human and mouse genomes. Focusing on the distribution of DNA gains and losses, relationships to important structural features and potential impact on biological processes, we found that in autosomes, DNA gains and losses both followed separate lineage-specific accumulation patterns. However, in both species chromosome X was particularly enriched for DNA gain, consistent with its high L1 retrotransposon content required for X inactivation. We found that DNA loss was associated with gene-rich open chromatin regions and DNA gain events with gene-poor closed chromatin regions. Additionally, we found that DNA loss events tended to be smaller than DNA gain events suggesting that they were able to accumulate in gene-rich open chromatin regions due to their reduced capacity to interrupt gene regulatory architecture. GO term enrichment showed that mouse loss hotspots were strongly enriched for terms related to developmental processes. However, these genes were also located in regions with a high density of conserved elements, suggesting that despite high levels of DNA loss, gene regulatory architecture remained conserved. This is consistent with a model in which DNA gain and loss results in turnover or “churning” in regulatory element dense regions of open chromatin, where interruption of regulatory elements is selected against. PMID:29677183

  14. Regulatory variation: an emerging vantage point for cancer biology.

    PubMed

    Li, Luolan; Lorzadeh, Alireza; Hirst, Martin

    2014-01-01

    Transcriptional regulation involves complex and interdependent interactions of noncoding and coding regions of the genome with proteins that interact and modify them. Genetic variation/mutation in coding and noncoding regions of the genome can drive aberrant transcription and disease. In spite of accounting for nearly 98% of the genome comparatively little is known about the contribution of noncoding DNA elements to disease. Genome-wide association studies of complex human diseases including cancer have revealed enrichment for variants in the noncoding genome. A striking finding of recent cancer genome re-sequencing efforts has been the previously underappreciated frequency of mutations in epigenetic modifiers across a wide range of cancer types. Taken together these results point to the importance of dysregulation in transcriptional regulatory control in genesis of cancer. Powered by recent technological advancements in functional genomic profiling, exploration of normal and transformed regulatory networks will provide novel insight into the initiation and progression of cancer and open new windows to future prognostic and diagnostic tools. © 2013 Wiley Periodicals, Inc.

  15. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    PubMed Central

    2010-01-01

    Background: In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results: By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions: Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586

  16. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana)

    PubMed Central

    2010-01-01

    Background: Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results: Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions: A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079

  17. Noncoding origins of anthropoid traits and a new null model of transposon functionalization

    PubMed Central

    del Rosario, Ricardo C.H.; Rayan, Nirmala Arul

    2014-01-01

    Little is known about novel genetic elements that drove the emergence of anthropoid primates. We exploited the sequencing of the marmoset genome to identify 23,849 anthropoid-specific constrained (ASC) regions and confirmed their robust functional signatures. Of the ASC base pairs, 99.7% were noncoding, suggesting that novel anthropoid functional elements were overwhelmingly cis-regulatory. ASCs were highly enriched in loci associated with fetal brain development, motor coordination, neurotransmission, and vision, thus providing a large set of candidate elements for exploring the molecular basis of hallmark primate traits. We validated ASC192 as a primate-specific enhancer in proliferative zones of the developing brain. Unexpectedly, transposable elements (TEs) contributed to >56% of ASCs, and almost all TE families showed functional potential similar to that of nonrepetitive DNA. Three L1PA repeat-derived ASCs displayed coherent eye-enhancer function, thus demonstrating that the “gene-battery” model of TE functionalization applies to enhancers in vivo. Our study provides fundamental insights into genome evolution and the origins of anthropoid phenotypes and supports an elegantly simple new null model of TE exaptation. PMID:25043600

  18. Activation-dependent intrachromosomal interactions formed by the TNF gene promoter and two distal enhancers

    PubMed Central

    Tsytsykova, Alla V.; Rajsbaum, Ricardo; Falvo, James V.; Ligeiro, Filipa; Neely, Simon R.; Goldfeld, Anne E.

    2007-01-01

    Here we provide a mechanism for specific, efficient transcription of the TNF gene and, potentially, other genes residing within multigene loci. We identify and characterize highly conserved noncoding elements flanking the TNF gene, which undergo activation-dependent intrachromosomal interactions. These elements, hypersensitive site (HSS)−9 and HSS+3 (9 kb upstream and 3 kb downstream of the TNF gene, respectively), contain DNase I hypersensitive sites in naive, T helper 1, and T helper 2 primary T cells. Both HSS-9 and HSS+3 inducibly associate with acetylated histones, indicative of chromatin remodeling, bind the transcription factor nuclear factor of activated T cells (NFAT)p in vitro and in vivo, and function as enhancers of NFAT-dependent transactivation mediated by the TNF promoter. Using the chromosome conformation capture assay, we demonstrate that upon T cell activation intrachromosomal looping occurs in the TNF locus. HSS-9 and HSS+3 each associate with the TNF promoter and with each other, circularizing the TNF gene and bringing NFAT-containing nucleoprotein complexes into close proximity. TNF gene regulation thus reveals a mode of intrachromosomal interaction that combines a looped gene topology with interactions between enhancers and a gene promoter. PMID:17940009

  19. Monocyte-specific Accessibility of a Matrix Attachment Region in the Tumor Necrosis Factor Locus*

    PubMed Central

    Biglione, Sebastian; Tsytsykova, Alla V.; Goldfeld, Anne E.

    2011-01-01

    Regulation of TNF gene expression is cell type- and stimulus-specific. We have previously identified highly conserved noncoding regulatory elements within DNase I-hypersensitive sites (HSS) located 9 kb upstream (HSS−9) and 3 kb downstream (HSS+3) of the TNF gene, which play an important role in the transcriptional regulation of TNF in T cells. They act as enhancers and interact with the TNF promoter and with each other, generating a higher order chromatin structure. Here, we report a novel monocyte-specific AT-rich DNase I-hypersensitive element located 7 kb upstream of the TNF gene (HSS−7), which serves as a matrix attachment region in monocytes. We show that HSS−7 associates with topoisomerase IIα (Top2) in vivo and that induction of endogenous TNF mRNA expression is suppressed by etoposide, a Top2 inhibitor. Moreover, Top2 binds to and cleaves HSS−7 in in vitro analysis. Thus, HSS−7, which is selectively accessible in monocytes, can tether the TNF locus to the nuclear matrix via matrix attachment region formation, potentially promoting TNF gene expression by acting as a Top2 substrate. PMID:22027829

  20. Alu-mediated deletion of SOX10 regulatory elements in Waardenburg syndrome type 4

    PubMed Central

    Bondurand, Nadége; Fouquet, Virginie; Baral, Viviane; Lecerf, Laure; Loundon, Natalie; Goossens, Michel; Duriez, Benedicte; Labrune, Philippe; Pingault, Veronique

    2012-01-01

    Waardenburg syndrome type 4 (WS4) is a rare neural crest disorder defined by the combination of Waardenburg syndrome (sensorineural hearing loss and pigmentation defects) and Hirschsprung disease (intestinal aganglionosis). Three genes are known to be involved in this syndrome, that is, EDN3 (endothelin-3), EDNRB (endothelin receptor type B), and SOX10. However, 15–35% of WS4 remains unexplained at the molecular level, suggesting that other genes could be involved and/or that mutations within known genes may have escaped previous screenings. Here, we searched for deletions within recently identified SOX10 regulatory sequences and describe the first characterization of a WS4 patient presenting with a large deletion encompassing three of these enhancers. Analysis of the breakpoint region suggests a complex rearrangement involving three Alu sequences that could be mediated by a FosTes/MMBIR replication mechanism. Taken together with recent reports, our results demonstrate that the disruption of highly conserved non-coding elements located within or at a long distance from the coding sequences of key genes can result in several neurocristopathies. This opens up new routes to the molecular dissection of neural crest disorders. PMID:22378281

  1. Alu-mediated deletion of SOX10 regulatory elements in Waardenburg syndrome type 4.

    PubMed

    Bondurand, Nadége; Fouquet, Virginie; Baral, Viviane; Lecerf, Laure; Loundon, Natalie; Goossens, Michel; Duriez, Benedicte; Labrune, Philippe; Pingault, Veronique

    2012-09-01

    Waardenburg syndrome type 4 (WS4) is a rare neural crest disorder defined by the combination of Waardenburg syndrome (sensorineural hearing loss and pigmentation defects) and Hirschsprung disease (intestinal aganglionosis). Three genes are known to be involved in this syndrome, that is, EDN3 (endothelin-3), EDNRB (endothelin receptor type B), and SOX10. However, 15-35% of WS4 remains unexplained at the molecular level, suggesting that other genes could be involved and/or that mutations within known genes may have escaped previous screenings. Here, we searched for deletions within recently identified SOX10 regulatory sequences and describe the first characterization of a WS4 patient presenting with a large deletion encompassing three of these enhancers. Analysis of the breakpoint region suggests a complex rearrangement involving three Alu sequences that could be mediated by a FosTes/MMBIR replication mechanism. Taken together with recent reports, our results demonstrate that the disruption of highly conserved non-coding elements located within or at a long distance from the coding sequences of key genes can result in several neurocristopathies. This opens up new routes to the molecular dissection of neural crest disorders.

  2. A genome-wide survey of maternal and embryonic transcripts during Xenopus tropicalis development.

    PubMed

    Paranjpe, Sarita S; Jacobi, Ulrike G; van Heeringen, Simon J; Veenstra, Gert Jan C

    2013-11-06

    Dynamics of polyadenylation vs. deadenylation determine the fate of several developmentally regulated genes. Decay of a subset of maternal mRNAs and new transcription define the maternal-to-zygotic transition, but the full complement of polyadenylated and deadenylated coding and non-coding transcripts has not yet been assessed in Xenopus embryos. To analyze the dynamics and diversity of coding and non-coding transcripts during development, both polyadenylated mRNA and ribosomal RNA-depleted total RNA were harvested across six developmental stages and subjected to high throughput sequencing. The maternally loaded transcriptome is highly diverse and consists of both polyadenylated and deadenylated transcripts. Many maternal genes show peak expression in the oocyte and include genes which are known to be the key regulators of events like oocyte maturation and fertilization. Of all the transcripts that increase in abundance between early blastula and larval stages, about 30% of the embryonic genes are induced by fourfold or more by the late blastula stage and another 35% by late gastrulation. Using a gene model validation and discovery pipeline, we identified novel transcripts and putative long non-coding RNAs (lncRNA). These lncRNA transcripts were stringently selected as spliced transcripts generated from independent promoters, with limited coding potential and a codon bias characteristic of noncoding sequences. Many lncRNAs are conserved and expressed in a developmental stage-specific fashion. These data reveal dynamics of transcriptome polyadenylation and abundance and provide a high-confidence catalogue of novel and long non-coding RNAs.

  3. Antisense Transcription Is Pervasive but Rarely Conserved in Enteric Bacteria

    PubMed Central

    Raghavan, Rahul; Sloan, Daniel B.; Ochman, Howard

    2012-01-01

    Noncoding RNAs, including antisense RNAs (asRNAs) that originate from the complementary strand of protein-coding genes, are involved in the regulation of gene expression in all domains of life. Recent application of deep-sequencing technologies has revealed that the transcription of asRNAs occurs genome-wide in bacteria. Although the role of the vast majority of asRNAs remains unknown, it is often assumed that their presence implies important regulatory functions, similar to those of other noncoding RNAs. Alternatively, many antisense transcripts may be produced by chance transcription events from promoter-like sequences that result from the degenerate nature of bacterial transcription factor binding sites. To investigate the biological relevance of antisense transcripts, we compared genome-wide patterns of asRNA expression in closely related enteric bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, by performing strand-specific transcriptome sequencing. Although antisense transcripts are abundant in both species, less than 3% of asRNAs are expressed at high levels in both species, and only about 14% appear to be conserved among species. And unlike the promoters of protein-coding genes, asRNA promoters show no evidence of sequence conservation between, or even within, species. Our findings suggest that many or even most bacterial asRNAs are nonadaptive by-products of the cell’s transcription machinery. PMID:22872780

  4. Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

    PubMed

    Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed

    2016-01-01

    The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which make their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the genome reference using the TopHat and Cufflinks software. Transcripts shorter than 200 nt, with an ORF longer than 83-100 amino acids, or with significant homologies to the NCBI nr-protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates the lncRNAs of the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
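
    The length and ORF criteria described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the forward-strand-only ORF scan, and the default ORF cutoff of 100 amino acids (the abstract gives a range of 83-100) are assumptions made for illustration, and the homology and protein-coding-score steps are omitted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Length, in amino acids, of the longest ATG-to-stop ORF.

    Assumption: only the three forward-strand frames are scanned, and an
    ORF is counted only if it is closed by an in-frame stop codon.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Codons from ATG up to (not including) the stop codon.
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_lncrna_candidate(seq: str, min_len: int = 200, max_orf_aa: int = 100) -> bool:
    """Keep transcripts of at least min_len nt whose longest ORF is short.

    Both thresholds are hypothetical defaults chosen from the figures
    quoted in the abstract.
    """
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa


if __name__ == "__main__":
    short_coding = "ATG" + "GCT" * 10 + "TAA"             # 36 nt, 11-aa ORF
    long_noncoding = "A" * 300                            # 300 nt, no ORF
    long_coding = "ATG" + "GCT" * 150 + "TAA" + "A" * 50  # 151-aa ORF
    for name, s in [("short_coding", short_coding),
                    ("long_noncoding", long_noncoding),
                    ("long_coding", long_coding)]:
        print(name, len(s), longest_orf_aa(s), is_lncrna_candidate(s))
```

    Exposing the cutoffs as parameters mirrors the abstract's remark that the number of lncRNAs recovered depends on the filtering stringency.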

  5. Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics.

    PubMed

    Li, Sanshu; Breaker, Ronald R

    2017-10-13

    With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria, likewise, should be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies. We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements. Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs. Although initial examinations of several motifs provide evidence for their likely functions, other motifs will require more in-depth analysis to reveal their functions.

  6. CRX ChIP-seq reveals the cis-regulatory architecture of mouse photoreceptors

    PubMed Central

    Corbo, Joseph C.; Lawrence, Karen A.; Karlstetter, Marcus; Myers, Connie A.; Abdelaziz, Musa; Dirkes, William; Weigelt, Karin; Seifert, Martin; Benes, Vladimir; Fritsche, Lars G.; Weber, Bernhard H.F.; Langmann, Thomas

    2010-01-01

    Approximately 98% of mammalian DNA is noncoding, yet we understand relatively little about the function of this enigmatic portion of the genome. The cis-regulatory elements that control gene expression reside in noncoding regions and can be identified by mapping the binding sites of tissue-specific transcription factors. Cone-rod homeobox (CRX) is a key transcription factor in photoreceptor differentiation and survival, but its in vivo targets are largely unknown. Here, we used chromatin immunoprecipitation with massively parallel sequencing (ChIP-seq) on CRX to identify thousands of cis-regulatory regions around photoreceptor genes in adult mouse retina. CRX directly regulates downstream photoreceptor transcription factors and their target genes via a network of spatially distributed regulatory elements around each locus. CRX-bound regions act in a synergistic fashion to activate transcription and contain multiple CRX binding sites which interact in a spacing- and orientation-dependent manner to fine-tune transcript levels. CRX ChIP-seq was also performed on Nrl−/− retinas, which represent an enriched source of cone photoreceptors. Comparison with the wild-type ChIP-seq data set identified numerous rod- and cone-specific CRX-bound regions as well as many shared elements. Thus, CRX combinatorially orchestrates the transcriptional networks of both rods and cones by coordinating the expression of photoreceptor genes including most retinal disease genes. In addition, this study pinpoints thousands of noncoding regions of relevance to both Mendelian and complex retinal disease. PMID:20693478

  7. The conservation and signatures of lincRNAs in Marek’s disease of chicken

    USDA-ARS's Scientific Manuscript database

    Long intergenic non-coding RNAs (lincRNAs) associated with a number of cancers and other diseases have been identified in mammals, but comprehensively identifying and characterizing them remains a formidable task. Marek’s disease (MD) is a T cell lymphoma of chickens induced by Marek’s disease virus ...

  8. The conservation and signatures of lincRNAs in Marek’s disease of chicken

    USDA-ARS's Scientific Manuscript database

    Long intergenic non-coding RNAs (lincRNAs) associated with a number of cancers and other diseases have been identified in mammals, but comprehensively identifying and characterizing them in chicken remains a formidable task. Marek’s disease (MD) is a T cell lymphoma of chickens induced by Marek’s dis...

  9. Evolution of developmental regulation in the vertebrate FgfD subfamily.

    PubMed

    Jovelin, Richard; Yan, Yi-Lin; He, Xinjun; Catchen, Julian; Amores, Angel; Canestro, Cristian; Yokoi, Hayato; Postlethwait, John H

    2010-01-15

    Fibroblast growth factors (Fgfs) encode small signaling proteins that help regulate embryo patterning. Fgfs fall into seven families, including FgfD. Nonvertebrate chordates have a single FgfD gene; mammals have three (Fgf8, Fgf17, and Fgf18); and teleosts have six (fgf8a, fgf8b, fgf17, fgf18a, fgf18b, and fgf24). What are the evolutionary processes that led to the structural duplication and functional diversification of FgfD genes during vertebrate phylogeny? To study this question, we investigated conserved syntenies, patterns of gene expression, and the distribution of conserved noncoding elements (CNEs) in FgfD genes of stickleback and zebrafish, and compared them with data from cephalochordates, urochordates, and mammals. Genomic analysis suggests that Fgf8, Fgf17, Fgf18, and Fgf24 arose in two rounds of whole genome duplication at the base of the vertebrate radiation; that fgf8 and fgf18 duplications occurred at the base of the teleost radiation; and that Fgf24 is an ohnolog that was lost in the mammalian lineage. Expression analysis suggests that ancestral subfunctions partitioned between gene duplicates and points to the evolution of novel expression domains. Analysis of CNEs, at least some of which are candidate regulatory elements, suggests that ancestral CNEs partitioned between gene duplicates. These results help explain the evolutionary pathways by which the developmentally important family of FgfD molecules arose and the deduced principles that guided FgfD evolution are likely applicable to the evolution of developmental regulation in many vertebrate multigene families. (c) 2009 Wiley-Liss, Inc.

  10. Widespread long noncoding RNAs as endogenous target mimics for microRNAs in plants.

    PubMed

    Wu, Hua-Jun; Wang, Zhi-Min; Wang, Meng; Wang, Xiu-Jie

    2013-04-01

    Target mimicry is a recently identified regulatory mechanism for microRNA (miRNA) functions in plants in which the decoy RNAs bind to miRNAs via complementary sequences and therefore block the interaction between miRNAs and their authentic targets. Both endogenous decoy RNAs (miRNA target mimics) and engineered artificial RNAs can induce target mimicry effects. Yet until now, only the Induced by Phosphate Starvation1 RNA has been proven to be a functional endogenous microRNA target mimic (eTM). In this work, we developed a computational method and systematically identified intergenic or noncoding gene-originated eTMs for 20 conserved miRNAs in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). The predicted miRNA binding sites were well conserved among eTMs of the same miRNA, whereas sequences outside of the binding sites varied a lot. We proved that the eTMs of miR160 and miR166 are functional target mimics and identified their roles in the regulation of plant development. The effectiveness of eTMs for three other miRNAs was also confirmed by transient agroinfiltration assay.

  11. Genome defense against exogenous nucleic acids in eukaryotes by non-coding DNA occurs through CRISPR-like mechanisms in the cytosol and the bodyguard protection in the nucleus.

    PubMed

    Qiu, Guo-Hua

    2016-01-01

    In this review, the protective function of the abundant non-coding DNA in the eukaryotic genome is discussed from the perspective of genome defense against exogenous nucleic acids. Peripheral non-coding DNA has been proposed to act as a bodyguard that protects the genome and the central protein-coding sequences from ionizing radiation-induced DNA damage. In the proposed mechanism of protection, the radicals generated by water radiolysis in the cytosol and IR energy are absorbed, blocked and/or reduced by peripheral heterochromatin; then, the DNA damage sites in the heterochromatin are removed and expelled from the nucleus to the cytoplasm through nuclear pore complexes, most likely through the formation of extrachromosomal circular DNA. To strengthen this hypothesis, this review summarizes the experimental evidence supporting the protective function of non-coding DNA against exogenous nucleic acids. Based on these data, I hypothesize herein about the presence of an additional line of defense formed by small RNAs in the cytosol in addition to their bodyguard protection mechanism in the nucleus. Therefore, exogenous nucleic acids may be initially inactivated in the cytosol by small RNAs generated from non-coding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. Exogenous nucleic acids may enter the nucleus, where some are absorbed and/or blocked by heterochromatin and others integrate into chromosomes. The integrated fragments and the sites of DNA damage are removed by repetitive non-coding DNA elements in the heterochromatin and excluded from the nucleus. Therefore, the normal eukaryotic genome and the central protein-coding sequences are triply protected by non-coding DNA against invasion by exogenous nucleic acids. This review provides evidence supporting the protective role of non-coding DNA in genome defense. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Systematic molecular analyses of SHOX in Japanese patients with idiopathic short stature and Leri-Weill dyschondrosteosis.

    PubMed

    Shima, Hirohito; Tanaka, Toshiaki; Kamimaki, Tsutomu; Dateki, Sumito; Muroya, Koji; Horikawa, Reiko; Kanno, Junko; Adachi, Masanori; Naiki, Yasuhiro; Tanaka, Hiroyuki; Mabe, Hiroyo; Yagasaki, Hideaki; Kure, Shigeo; Matsubara, Yoichi; Tajima, Toshihiro; Kashimada, Kenichi; Ishii, Tomohiro; Asakura, Yumi; Fujiwara, Ikuma; Soneda, Shun; Nagasaki, Keisuke; Hamajima, Takashi; Kanzaki, Susumu; Jinno, Tomoko; Ogata, Tsutomu; Fukami, Maki

    2016-07-01

    The etiology of idiopathic short stature (ISS) and Leri-Weill dyschondrosteosis (LWD) in European patients is known to include SHOX mutations and copy-number variations (CNVs) involving SHOX and/or the highly evolutionarily conserved non-coding DNA elements (CNEs) flanking the gene. However, the frequency and types of SHOX abnormalities in non-European patients and the clinical importance of mutations in the CNEs remain to be clarified. Here, we performed systematic molecular analyses of SHOX for 328 Japanese patients with ISS or LWD. SHOX abnormalities accounted for 3.8% of ISS and 50% of LWD cases. CNVs around SHOX were identified in 16 cases, although the ~47 kb deletion frequently reported in European patients was absent in our cases. Probably damaging mutations and benign/silent substitutions were detected in four cases, respectively. Although CNE-linked substitutions were detected in 15 cases, most of them affected poorly conserved nucleotides and were shared by unaffected individuals. These results suggest that the frequency and mutation spectrum of SHOX abnormalities are comparable between Asian and European patients, with the exception of a European-specific downstream deletion. Furthermore, this study highlights the clinical importance and genetic heterogeneity of the SHOX-flanking CNVs, and indicates a limited clinical significance of point mutations in the CNEs.

  13. Unit-length line-1 transcripts in human teratocarcinoma cells.

    PubMed Central

    Skowronski, J; Fanning, T G; Singer, M F

    1988-01-01

    We have characterized the approximately 6.5-kilobase cytoplasmic poly(A)+ Line-1 (L1) RNA present in a human teratocarcinoma cell line, NTera2D1, by primer extension and by analysis of cloned cDNAs. The bulk of the RNA begins (5' end) at the residue previously identified as the 5' terminus of the longest known primate genomic L1 elements, presumed to represent "unit" length. Several of the cDNA clones are close to 6 kilobase pairs, that is, close to full length. The partial sequences of 18 cDNA clones and full sequence of one (5,975 base pairs) indicate that many different genomic L1 elements contribute transcripts to the 6.5-kilobase cytoplasmic poly(A)+ RNA in NTera2D1 cells because no 2 of the 19 cDNAs analyzed had identical sequences. The transcribed elements appear to represent a subset of the total genomic L1s, a subset that has a characteristic consensus sequence in the 3' noncoding region and a high degree of sequence conservation throughout. Two open reading frames (ORFs) of 1,122 (ORF1) and 3,852 (ORF2) bases, flanked by about 800 and 200 bases of sequence at the 5' and 3' ends, respectively, can be identified in the cDNAs. Both ORFs are in the same frame, and they are separated by 33 bases bracketed by two conserved in-frame stop codons. ORF 2 is interrupted by at least one randomly positioned stop codon in the majority of the cDNAs. The data support proposals suggesting that the human L1 family includes one or more functional genes as well as an extraordinarily large number of pseudogenes whose ORFs are broken by stop codons. The cDNA structures suggest that both genes and pseudogenes are transcribed. At least one of the cDNAs (cD11), which was sequenced in its entirety, could, in principle, represent an mRNA for production of the ORF1 polypeptide. The similarity of mammalian L1s to several recently described invertebrate movable elements defines a new widely distributed class of elements which we term class II retrotransposons. PMID:2454389

  14. Potential role of small noncoding RNAs in regulating hypovirulence in Rhizoctonia solani anastomosis group 3

    USDA-ARS's Scientific Manuscript database

    Double-stranded RNA (dsRNA) elements are frequently associated with fungi. In Rhizoctonia solani anastomosis group-3 (AG3), the 3.6 kb dsRNA element M2 has been associated with the hypovirulence of Rhs1A1 strain, enabling its use as a biological control agent. Previous studies that examined the rol...

  15. Genomic identification of regulatory elements by evolutionary sequence comparison and functional analysis.

    PubMed

    Loots, Gabriela G

    2008-01-01

    Despite remarkable recent advances in genomics that have enabled us to identify most of the genes in the human genome, comparable efforts to define transcriptional cis-regulatory elements that control gene expression are lagging behind. The difficulty of this task stems from two equally important problems: our knowledge of how regulatory elements are encoded in genomes remains elementary, and there is a vast genomic search space for regulatory elements, since most of mammalian genomes are noncoding. Comparative genomic approaches are having a remarkable impact on the study of transcriptional regulation in eukaryotes and currently represent the most efficient and reliable methods of predicting noncoding sequences likely to control the patterns of gene expression. By subjecting eukaryotic genomic sequences to computational comparisons and subsequent experimentation, we are inching our way toward a more comprehensive catalog of common regulatory motifs that lie behind fundamental biological processes. We are still far from comprehending how the transcriptional regulatory code is encrypted in the human genome and providing an initial global view of regulatory gene networks, but collectively, the continued development of comparative and experimental approaches will rapidly expand our knowledge of the transcriptional regulome.

  16. Noncoding origins of anthropoid traits and a new null model of transposon functionalization.

    PubMed

    del Rosario, Ricardo C H; Rayan, Nirmala Arul; Prabhakar, Shyam

    2014-09-01

    Little is known about novel genetic elements that drove the emergence of anthropoid primates. We exploited the sequencing of the marmoset genome to identify 23,849 anthropoid-specific constrained (ASC) regions and confirmed their robust functional signatures. Of the ASC base pairs, 99.7% were noncoding, suggesting that novel anthropoid functional elements were overwhelmingly cis-regulatory. ASCs were highly enriched in loci associated with fetal brain development, motor coordination, neurotransmission, and vision, thus providing a large set of candidate elements for exploring the molecular basis of hallmark primate traits. We validated ASC192 as a primate-specific enhancer in proliferative zones of the developing brain. Unexpectedly, transposable elements (TEs) contributed to >56% of ASCs, and almost all TE families showed functional potential similar to that of nonrepetitive DNA. Three L1PA repeat-derived ASCs displayed coherent eye-enhancer function, thus demonstrating that the "gene-battery" model of TE functionalization applies to enhancers in vivo. Our study provides fundamental insights into genome evolution and the origins of anthropoid phenotypes and supports an elegantly simple new null model of TE exaptation. © 2014 del Rosario et al.; Published by Cold Spring Harbor Laboratory Press.

  17. Genomic deletion of a long-range bone enhancer misregulates sclerostin in Van Buchem disease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loots, Gabriela G.; Kneissel, Michaela; Keller, Hansjoerg

    2005-04-15

    Mutations in distant regulatory elements can negatively impact human development and health, yet due to the difficulty of detecting these critical sequences we predominantly focus on coding sequences for diagnostic purposes. We have undertaken a comparative sequence-based approach to characterize a large noncoding region deleted in patients affected by Van Buchem disease (VB), a severe sclerosing bone dysplasia. Using BAC recombination and transgenesis we characterized the expression of human sclerostin (sost) from normal (hSOSTwt) or Van Buchem (hSOSTvbΔ) alleles. Only the hSOSTwt allele faithfully expressed high levels of human sost in the adult bone and impacted bone metabolism, consistent with the model that the VB noncoding deletion removes a sost specific regulatory element. By exploiting cross-species sequence comparisons with in vitro and in vivo enhancer assays we were able to identify a candidate enhancer element that drives human sost expression in osteoblast-like cell lines in vitro and in the skeletal anlage of the E14.5 mouse embryo, and discovered a novel function for sclerostin during limb development. Our approach represents a framework for characterizing distant regulatory elements associated with abnormal human phenotypes.

  18. DCODE.ORG Anthology of Comparative Genomic Tools

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  19. Dcode.org anthology of comparative genomic tools.

    PubMed

    Loots, Gabriela G; Ovcharenko, Ivan

    2005-07-01

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. To facilitate the practical application of comparative sequence analysis to genetics and genomics, we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools, zPicture and Mulan; a phylogenetic shadowing tool, eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools, rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, Creme 2.0; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here, we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ website.

  20. Molecular Regulatory Pathways Link Sepsis With Metabolic Syndrome: Non-coding RNA Elements Underlying the Sepsis/Metabolic Cross-Talk.

    PubMed

    Meydan, Chanan; Bekenstein, Uriya; Soreq, Hermona

    2018-01-01

    Sepsis and metabolic syndrome (MetS) are both inflammation-related entities with high impact for human health and the consequences of concussions. Both represent imbalanced parasympathetic/cholinergic response to insulting triggers and variably uncontrolled inflammation that indicates shared upstream regulators, including short microRNAs (miRs) and long non-coding RNAs (lncRNAs). These may cross talk across multiple systems, leading to complex molecular and clinical outcomes. Notably, biomedical and RNA-sequencing based analyses both highlight new links between the acquired and inherited pathogenic, cardiac and inflammatory traits of sepsis/MetS. Those include the HOTAIR and MIAT lncRNAs and their targets, such as miR-122, -150, -155, -182, -197, -375, -608 and HLA-DRA. Implicating non-coding RNA regulators in sepsis and MetS may delineate novel high-value biomarkers and targets for intervention.

  1. Secondary structure of the 3'-noncoding region of flavivirus genomes: comparative analysis of base pairing probabilities.

    PubMed

    Rauscher, S; Flamm, C; Mandl, C W; Heinz, F X; Stadler, P F

    1997-07-01

    The prediction of the complete matrix of base pairing probabilities was applied to the 3' noncoding region (NCR) of flavivirus genomes. This approach identifies not only well-defined secondary structure elements, but also regions of high structural flexibility. Flaviviruses, many of which are important human pathogens, have a common genomic organization, but exhibit a significant degree of RNA sequence diversity in the functionally important 3'-NCR. We demonstrate the presence of secondary structures shared by all flaviviruses, as well as structural features that are characteristic for groups of viruses within the genus reflecting the established classification scheme. The significance of most of the predicted structures is corroborated by compensatory mutations. The availability of infectious clones for several flaviviruses will allow the assessment of these structural elements in processes of the viral life cycle, such as replication and assembly.

  2. RNAcode: Robust discrimination of coding and noncoding regions in comparative sequence data

    PubMed Central

    Washietl, Stefan; Findeiß, Sven; Müller, Stephan A.; Kalkhof, Stefan; von Bergen, Martin; Hofacker, Ivo L.; Stadler, Peter F.; Goldman, Nick

    2011-01-01

    With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied “out of the box,” without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as “noncoding.” RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode. PMID:21357752

  3. RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.

    PubMed

    Washietl, Stefan; Findeiss, Sven; Müller, Stephan A; Kalkhof, Stefan; von Bergen, Martin; Hofacker, Ivo L; Stadler, Peter F; Goldman, Nick

    2011-04-01

    With the availability of genome-wide transcription data and massive comparative sequencing, the discrimination of coding from noncoding RNAs and the assessment of coding potential in evolutionarily conserved regions arose as a core analysis task. Here we present RNAcode, a program to detect coding regions in multiple sequence alignments that is optimized for emerging applications not covered by current protein gene-finding software. Our algorithm combines information from nucleotide substitution and gap patterns in a unified framework and also deals with real-life issues such as alignment and sequencing errors. It uses an explicit statistical model with no machine learning component and can therefore be applied "out of the box," without any training, to data from all domains of life. We describe the RNAcode method and apply it in combination with mass spectrometry experiments to predict and confirm seven novel short peptides in Escherichia coli and to analyze the coding potential of RNAs previously annotated as "noncoding." RNAcode is open source software and available for all major platforms at http://wash.github.com/rnacode.

  4. Non-coding stem-bulge RNAs are required for cell proliferation and embryonic development in C. elegans

    PubMed Central

    Kowalski, Madzia P.; Baylis, Howard A.; Krude, Torsten

    2015-01-01

    ABSTRACT Stem bulge RNAs (sbRNAs) are a family of small non-coding stem-loop RNAs present in Caenorhabditis elegans and other nematodes, the function of which is unknown. Here, we report the first functional characterisation of nematode sbRNAs. We demonstrate that sbRNAs from a range of nematode species are able to reconstitute the initiation of chromosomal DNA replication in the presence of replication proteins in vitro, and that conserved nucleotide sequence motifs are essential for this function. By functionally inactivating sbRNAs with antisense morpholino oligonucleotides, we show that sbRNAs are required for S phase progression, early embryonic development and the viability of C. elegans in vivo. Thus, we demonstrate a new and essential role for sbRNAs during the early development of C. elegans. sbRNAs show limited nucleotide sequence similarity to vertebrate Y RNAs, which are also essential for the initiation of DNA replication. Our results therefore establish that the essential function of small non-coding stem-loop RNAs during DNA replication extends beyond vertebrates. PMID:25908866

  5. Noncoding copy-number variations are associated with congenital limb malformation.

    PubMed

    Flöttmann, Ricarda; Kragesteen, Bjørt K; Geuer, Sinje; Socha, Magdalena; Allou, Lila; Sowińska-Seidler, Anna; Bosquillon de Jarcy, Laure; Wagner, Johannes; Jamsheer, Aleksander; Oehl-Jaschkowitz, Barbara; Wittler, Lars; de Silva, Deepthi; Kurth, Ingo; Maya, Idit; Santos-Simarro, Fernando; Hülsemann, Wiebke; Klopocki, Eva; Mountford, Roger; Fryer, Alan; Borck, Guntram; Horn, Denise; Lapunzina, Pablo; Wilson, Meredith; Mascrez, Bénédicte; Duboule, Denis; Mundlos, Stefan; Spielmann, Malte

    2017-10-12

    Purpose: Copy-number variants (CNVs) are generally interpreted by linking the effects of gene dosage with phenotypes. The clinical interpretation of noncoding CNVs remains challenging. We investigated the percentage of disease-associated CNVs in patients with congenital limb malformations that affect noncoding cis-regulatory sequences versus genes sensitive to gene dosage effects. Methods: We applied high-resolution copy-number analysis to 340 unrelated individuals with isolated limb malformation. To investigate novel candidate CNVs, we re-engineered human CNVs in mice using clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing. Results: Of the individuals studied, 10% harbored CNVs segregating with the phenotype in the affected families. We identified 31 CNVs previously associated with congenital limb malformations and four novel candidate CNVs. Most of the disease-associated CNVs (57%) affected the noncoding cis-regulatory genome, while only 43% included a known disease gene and were likely to result from gene dosage effects. In transgenic mice harboring four novel candidate CNVs, we observed altered gene expression in all cases, indicating that the CNVs had a regulatory effect either by changing the enhancer dosage or altering the topological associating domain architecture of the genome. Conclusion: Our findings suggest that CNVs affecting noncoding regulatory elements are a major cause of congenital limb malformations. Genetics in Medicine advance online publication, 12 October 2017; doi:10.1038/gim.2017.154.

  6. VlincRNAs controlled by retroviral elements are a hallmark of pluripotency and cancer.

    PubMed

    St Laurent, Georges; Shtokalo, Dmitry; Dong, Biao; Tackett, Michael R; Fan, Xiaoxuan; Lazorthes, Sandra; Nicolas, Estelle; Sang, Nianli; Triche, Timothy J; McCaffrey, Timothy A; Xiao, Weidong; Kapranov, Philipp

    2013-07-22

    The function of the non-coding portion of the human genome remains one of the most important questions of our time. Its vast complexity is exemplified by the recent identification of an unusual and notable component of the transcriptome - very long intergenic non-coding RNAs, termed vlincRNAs. Here we identify 2,147 vlincRNAs covering 10 percent of our genome. We show they are present not only in cancerous cells, but also in primary cells and normal human tissues, and are controlled by canonical promoters. Furthermore, vlincRNA promoters frequently originate from within endogenous retroviral sequences. Strikingly, the number of vlincRNAs expressed from endogenous retroviral promoters strongly correlates with pluripotency or the degree of malignant transformation. These results suggest a previously unknown connection between the pluripotent state and cancer via retroviral repeat-driven expression of vlincRNAs. Finally, we show that vlincRNAs can be syntenically conserved in humans and mouse and their depletion using RNAi can cause apoptosis in cancerous cells. These intriguing observations suggest that vlincRNAs could create a framework that combines many existing short ESTs and lincRNAs into a landscape of very long transcripts functioning in the regulation of gene expression in the nucleus. Certain types of vlincRNAs participate at specific stages of normal development and, based on analysis of a limited set of cancerous and primary cell lines, they appear to be co-opted by cancer-associated transcriptional programs. This provides additional understanding of transcriptome regulation during the malignant state, and could lead to additional targets and options for its reversal.

  7. Behind the curtain of non-coding RNAs; long non-coding RNAs regulating hepatocarcinogenesis

    PubMed Central

    El Khodiry, Aya; Afify, Menna; El Tayebi, Hend M

    2018-01-01

    Hepatocellular carcinoma (HCC) is one of the most common and aggressive cancers worldwide. HCC is the fifth most common malignancy in the world and the second leading cause of cancer death in Asia. Long non-coding RNAs (lncRNAs) are RNAs with a length greater than 200 nucleotides that do not encode proteins. lncRNAs can regulate gene expression and protein synthesis in several ways by interacting with DNA, RNA and proteins in a sequence-specific manner. They could regulate cellular and developmental processes through either gene inhibition or gene activation. Many studies have shown that dysregulation of lncRNAs is related to many human diseases such as cardiovascular diseases, genetic disorders, neurological diseases, immune-mediated disorders and cancers. However, the study of lncRNAs is challenging as they are poorly conserved between species, their expression levels are lower than those of mRNAs, and they show great interpatient variation. The study of lncRNA expression in cancers has been a breakthrough, as it unveils potential biomarkers and drug targets for cancer therapy and helps explain the mechanism of pathogenesis. This review discusses many long non-coding RNAs and their contribution to HCC, their role in the development, metastasis, and prognosis of HCC, and how to regulate and target these lncRNAs as a therapeutic tool in HCC treatment in the future. PMID:29434445

  8. Long noncoding RNAs and their proposed functions in fibre development of cotton (Gossypium spp.).

    PubMed

    Wang, Maojun; Yuan, Daojun; Tu, Lili; Gao, Wenhui; He, Yonghui; Hu, Haiyan; Wang, Pengcheng; Liu, Nian; Lindsey, Keith; Zhang, Xianlong

    2015-09-01

    Long noncoding RNAs (lncRNAs) are transcripts of at least 200 bp in length, possess no apparent coding capacity and are involved in various biological regulatory processes. Until now, no systematic identification of lncRNAs has been reported in cotton (Gossypium spp.). Here, we describe the identification of 30 550 long intergenic noncoding RNA (lincRNA) loci (50 566 transcripts) and 4718 long noncoding natural antisense transcript (lncNAT) loci (5826 transcripts). LncRNAs are rich in repetitive sequences and preferentially expressed in a tissue-specific manner. The detection of abundant genome-specific and/or lineage-specific lncRNAs indicated their weak evolutionary conservation. Approximately 76% of homoeologous lncRNAs exhibit biased expression patterns towards the At or Dt subgenomes. Compared with protein-coding genes, lncRNAs showed overall higher methylation levels and their expression was less affected by gene body methylation. Expression validation in different cotton accessions and coexpression network construction helped to identify several functional lncRNA candidates involved in cotton fibre initiation and elongation. Analysis of integrated expression from the subgenomes of lncRNAs generating miR397 and its targets as a result of genome polyploidization indicated their pivotal functions in regulating lignin metabolism in domesticated tetraploid cotton fibres. This study provides the first comprehensive identification of lncRNAs in Gossypium. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  9. Long non-coding RNA discovery across the genus anopheles reveals conserved secondary structures within and beyond the Gambiae complex.

    PubMed

    Jenkins, Adam M; Waterhouse, Robert M; Muskavitch, Marc A T

    2015-04-23

    Long non-coding RNAs (lncRNAs) have been defined as mRNA-like transcripts longer than 200 nucleotides that lack significant protein-coding potential, and many of them constitute scaffolds for ribonucleoprotein complexes with critical roles in epigenetic regulation. Various lncRNAs have been implicated in the modulation of chromatin structure, transcriptional and post-transcriptional gene regulation, and regulation of genomic stability in mammals, Caenorhabditis elegans, and Drosophila melanogaster. The purpose of this study is to identify the lncRNA landscape in the malaria vector An. gambiae and assess the evolutionary conservation of lncRNAs and their secondary structures across the Anopheles genus. Using deep RNA sequencing of multiple Anopheles gambiae life stages, we have identified 2,949 lncRNAs and more than 300 previously unannotated putative protein-coding genes. The lncRNAs exhibit differential expression profiles across life stages and adult genders. We find that across the genus Anopheles, lncRNAs display much lower sequence conservation than protein-coding genes. Additionally, we find that lncRNA secondary structure is highly conserved within the Gambiae complex, but diverges rapidly across the rest of the genus Anopheles. This study offers one of the first lncRNA secondary structure analyses in vector insects. Our description of lncRNAs in An. gambiae offers the most comprehensive genome-wide insights to date into lncRNAs in this vector mosquito, and defines a set of potential targets for the development of vector-based interventions that may further curb the human malaria burden in disease-endemic countries.

  10. A conserved long noncoding RNA affects sleep behavior in Drosophila.

    PubMed

    Soshnev, Alexey A; Ishimoto, Hiroshi; McAllister, Bryant F; Li, Xingguo; Wehling, Misty D; Kitamoto, Toshihiro; Geyer, Pamela K

    2011-10-01

    Metazoan genomes encode an abundant collection of mRNA-like, long noncoding (lnc)RNAs. Although lncRNAs greatly expand the transcriptional repertoire, we have a limited understanding of how these RNAs contribute to developmental regulation. Here, we investigate the function of the Drosophila lncRNA called yellow-achaete intergenic RNA (yar). Comparative sequence analyses show that the yar gene is conserved in Drosophila species representing 40-60 million years of evolution, with one of the conserved sequence motifs encompassing the yar promoter. Further, the timing of yar expression in Drosophila virilis parallels that in D. melanogaster, suggesting that transcriptional regulation of yar is conserved. The function of yar was defined by generating null alleles. Flies lacking yar RNAs are viable and show no overt morphological defects, consistent with maintained transcriptional regulation of the adjacent yellow (y) and achaete (ac) genes. The location of yar within a neural gene cluster led to the investigation of effects of yar in behavioral assays. These studies demonstrated that loss of yar alters sleep regulation in the context of a normal circadian rhythm. Nighttime sleep was reduced and fragmented, with yar mutants displaying diminished sleep rebound following sleep deprivation. Importantly, these defects were rescued by a yar transgene. These data provide the first example of a lncRNA gene involved in Drosophila sleep regulation. We find that yar is a cytoplasmic lncRNA, suggesting that yar may regulate sleep by affecting stabilization or translational regulation of mRNAs. Such functions of lncRNAs may extend to vertebrates, as lncRNAs are abundant in neural tissues.

  11. Transposable elements (TEs) contribute to stress-related long intergenic noncoding RNAs in plants.

    PubMed

    Wang, Dong; Qu, Zhipeng; Yang, Lan; Zhang, Qingzhu; Liu, Zhi-Hong; Do, Trung; Adelson, David L; Wang, Zhen-Yu; Searle, Iain; Zhu, Jian-Kang

    2017-04-01

    Noncoding RNAs have been extensively described in plant and animal transcriptomes by using high-throughput sequencing technology. Of these noncoding RNAs, a growing number of long intergenic noncoding RNAs (lincRNAs) have been described in multicellular organisms; however, the origins and functions of many lincRNAs remain to be explored. In many eukaryotic genomes, transposable elements (TEs) are widely distributed and often account for large fractions of plant and animal genomes, yet the contribution of TEs to lincRNAs is largely unknown. By using strand-specific RNA-sequencing, we profiled the expression patterns of lincRNAs in Arabidopsis, rice and maize, and identified 47, 611 and 398 TE-associated lincRNAs (TE-lincRNAs), respectively. TE-lincRNAs were more often derived from retrotransposons than DNA transposons, and as retrotransposon copy number increased in both rice and maize genomes, so did TE-lincRNAs. We validated the expression of these TE-lincRNAs by strand-specific RT-PCR and also demonstrated tissue-specific transcription and stress-induced TE-lincRNAs after salt, abscisic acid (ABA) or cold treatments. For Arabidopsis TE-lincRNA11195, mutants had reduced sensitivity to ABA, as demonstrated by longer roots and higher shoot biomass when compared to wild-type. Finally, by altering the chromatin state in the Arabidopsis chromatin remodelling mutant ddm1, unique lincRNAs including TE-lincRNAs were generated from the preceding untranscribed regions and, interestingly, were inherited in a wild-type background in subsequent generations. Our findings not only demonstrate that TE-associated lincRNAs play important roles in plant abiotic stress responses but also suggest that lincRNAs and TE-lincRNAs might act as an adaptive reservoir in eukaryotes. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.

  12. Heterogeneous conservation of Dlx paralog co-expression in jawed vertebrates.

    PubMed

    Debiais-Thibaud, Mélanie; Metcalfe, Cushla J; Pollack, Jacob; Germon, Isabelle; Ekker, Marc; Depew, Michael; Laurenti, Patrick; Borday-Birraux, Véronique; Casane, Didier

    2013-01-01

    The Dlx gene family encodes transcription factors involved in the development of a wide variety of morphological innovations that first evolved at the origins of vertebrates or of the jawed vertebrates. This gene family expanded with the two rounds of genome duplications that occurred before jawed vertebrates diversified. It includes at least three bigene pairs sharing conserved regulatory sequences in tetrapods and teleost fish, but has been only partially characterized in chondrichthyans, the third major group of jawed vertebrates. Here we take advantage of developmental and molecular tools applied to the shark Scyliorhinus canicula to fill in the gap and provide an overview of the evolution of the Dlx family in the jawed vertebrates. These results are analyzed in the theoretical framework of the DDC (Duplication-Degeneration-Complementation) model. The genomic organisation of the catshark Dlx genes is similar to that previously described for tetrapods. Conserved non-coding elements identified in bony fish were also identified in catshark Dlx clusters and showed regulatory activity in transgenic zebrafish. Gene expression patterns in the catshark showed that there are some expression sites with high conservation of the expressed paralog(s) and other expression sites with events of paralog sub-functionalization during jawed vertebrate diversification, resulting in a wide variety of evolutionary scenarios within this gene family. Dlx gene expression patterns in the catshark show that there has been little neo-functionalization in Dlx genes over gnathostome evolution. In most cases, one tandem duplication and two rounds of vertebrate genome duplication have led to at least six Dlx coding sequences with redundant expression patterns followed by some instances of paralog sub-functionalization. 
    Regulatory constraints such as shared enhancers, and functional constraints including gene pleiotropy, may have contributed to the evolutionary inertia leading to high redundancy between gene expression patterns.

  13. Hiding in Plain Sight: Rediscovering the Importance of Noncoding RNA in Human Malignancy.

    PubMed

    Feeley, Kyle P; Edmonds, Mick D

    2018-05-01

    At the time of its construction in the 1950s, the central dogma of molecular biology was a useful model that represented the current state of knowledge for the flow of genetic information after a period of prolific scientific discovery. Unknowingly, it also biased many of our assumptions going forward. Whether intentional or not, genomic elements not fitting into this paradigm were deemed unimportant and emphasis on the study of protein-coding genes prevailed for decades. The phrase "Junk DNA," first popularized in the 1960s, is still used with alarming frequency to describe the entirety of noncoding DNA. It has since become apparent that RNA molecules not coding for protein are vitally important in both normal development and human malignancy. Cancer researchers have been pioneers in determining noncoding RNA function and developing new technologies to study these molecules. In this review, we will discuss well known and newly emerging species of noncoding RNAs, their functions in cancer, and new technologies being utilized to understand their mechanisms of action in cancer. Cancer Res; 78(9); 2149-58. ©2018 AACR . ©2018 American Association for Cancer Research.

  14. Identification of coding and non-coding mutational hotspots in cancer genomes.

    PubMed

    Piraino, Scott W; Furney, Simon J

    2017-01-05

    The identification of mutations that play a causal role in tumour development, so called "driver" mutations, is of critical importance for understanding how cancers form and how they might be treated. Several large cancer sequencing projects have identified genes that are recurrently mutated in cancer patients, suggesting a role in tumourigenesis. While the landscape of coding drivers has been extensively studied and many of the most prominent driver genes are well characterised, comparatively less is known about the role of mutations in the non-coding regions of the genome in cancer development. The continuing fall in genome sequencing costs has resulted in a concomitant increase in the number of cancer whole genome sequences being produced, facilitating systematic interrogation of both the coding and non-coding regions of cancer genomes. To examine the mutational landscapes of tumour genomes we have developed a novel method to identify mutational hotspots in tumour genomes using both mutational data and information on evolutionary conservation. We have applied our methodology to over 1300 whole cancer genomes and show that it identifies prominent coding and non-coding regions that are known or highly suspected to play a role in cancer. Importantly, we applied our method to the entire genome, rather than relying on predefined annotations (e.g. promoter regions) and we highlight recurrently mutated regions that may have resulted from increased exposure to mutational processes rather than selection, some of which have been identified previously as targets of selection. Finally, we implicate several pan-cancer and cancer-specific candidate non-coding regions, which could be involved in tumourigenesis. We have developed a framework to identify mutational hotspots in cancer genomes, which is applicable to the entire genome. 
    This framework identifies known and novel coding and non-coding mutational hotspots and can be used to differentiate candidate driver regions from likely passenger regions susceptible to somatic mutation.

  15. The domain structure and distribution of Alu elements in long noncoding RNAs and mRNAs

    PubMed Central

    Kim, Eugene Z.; Wespiser, Adam R.; Caffrey, Daniel R.

    2016-01-01

    Approximately 75% of the human genome is transcribed and many of these spliced transcripts contain primate-specific Alu elements, the most abundant mobile element in the human genome. The majority of exonized Alu elements are located in long noncoding RNAs (lncRNAs) and the untranslated regions of mRNA, with some performing molecular functions. To further assess the potential for Alu elements to be repurposed as functional RNA domains, we investigated the distribution and evolution of Alu elements in spliced transcripts. Our analysis revealed that Alu elements are underrepresented in mRNAs and lncRNAs, suggesting that most exonized Alu elements arising in the population are rare or deleterious to RNA function. When mRNAs and lncRNAs retain exonized Alu elements, they have a clear preference for Alu dimers, left monomers, and right monomers. mRNAs often acquire Alu elements when their genes are duplicated within Alu-rich regions. In lncRNAs, reverse-oriented Alu elements are significantly enriched and are not restricted to the 3′ and 5′ ends. Both lncRNAs and mRNAs primarily contain the Alu J and S subfamilies that were amplified relatively early in primate evolution. Alu J subfamilies are typically overrepresented in lncRNAs, whereas the Alu S dimer is overrepresented in mRNAs. The sequences of Alu dimers tend to be constrained in both lncRNAs and mRNAs, whereas the left and right monomers are constrained within particular Alu subfamilies and classes of RNA. Collectively, these findings suggest that Alu-containing RNAs are capable of forming stable structures and that some of these Alu domains might have novel biological functions. PMID:26654912

  16. The expression pattern of the Picea glauca Defensin 1 promoter is maintained in Arabidopsis thaliana, indicating the conservation of signalling pathways between angiosperms and gymnosperms.

    PubMed

    Germain, Hugo; Lachance, Denis; Pelletier, Gervais; Fossdal, Carl Gunnar; Solheim, Halvor; Séguin, Armand

    2012-01-01

    A 1149 bp genomic fragment corresponding to the 5' non-coding region of the PgD1 (Picea glauca Defensin 1) gene was cloned, characterized, and compared with all Arabidopsis thaliana defensin promoters. The cloned fragment was found to contain several motifs specific to defence or hormonal response, including a motif involved in the methyl jasmonate response, a fungal elicitor responsive element, and a TC-rich repeat cis-acting element involved in defence and stress responsiveness. A functional analysis of the PgD1 promoter was performed using the uidA (GUS) reporter system in stably transformed Arabidopsis and white spruce plants. The PgD1 promoter was responsive to jasmonic acid (JA), to infection by fungus and to wounding. In transgenic spruce embryos, GUS staining was clearly restricted to the shoot apical meristem. In Arabidopsis, faint GUS coloration was observed in leaves and flowers and a strong blue colour was observed in guard cells and trichomes. Transgenic Arabidopsis plants expressing the PgD1::GUS construct were also infiltrated with the hemibiotrophic pathogen Pseudomonas syringae pv. tomato DC3000. It caused a suppression of defensin expression, probably resulting from the antagonistic relationship between the pathogen-stimulated salicylic acid pathway and the jasmonic acid pathway. It is therefore concluded that the cloned PgD1 promoter fragment appears to contain most if not all the elements for proper PgD1 expression and that these elements are also recognized in Arabidopsis despite the phylogenetic and evolutionary differences that separate them.

  17. Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin

    ERIC Educational Resources Information Center

    Offner, Susan

    2010-01-01

    The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.

  18. The PRC2-binding long non-coding RNAs in human and mouse genomes are associated with predictive sequence features

    NASA Astrophysics Data System (ADS)

    Tu, Shiqi; Yuan, Guo-Cheng; Shao, Zhen

    2017-01-01

    Recently, long non-coding RNAs (lncRNAs) have emerged as an important class of molecules involved in many cellular processes. One of their primary functions is to shape the epigenetic landscape through interactions with chromatin-modifying proteins. However, mechanisms contributing to the specificity of such interactions remain poorly understood. Here we took the human and mouse lncRNAs that were experimentally determined to have physical interactions with Polycomb repressive complex 2 (PRC2), and systematically investigated the sequence features of these lncRNAs by developing a new computational pipeline for sequence composition analysis, in which each sequence is considered as a series of transitions between adjacent nucleotides. Through that, PRC2-binding lncRNAs were found to be associated with a set of distinctive and evolutionarily conserved sequence features, which can be utilized to distinguish them from the others with considerable accuracy. We further identified fragments of PRC2-binding lncRNAs that are enriched with these sequence features, and found they show strong PRC2-binding signals and are more highly conserved across species than the other parts, implying their functional importance.
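
    The phrase "a series of transitions between adjacent nucleotides" can be illustrated with a minimal sketch. It assumes the simplest reading, namely relative frequencies of the 16 possible adjacent-nucleotide pairs; the function name transition_frequencies is hypothetical, and the authors' actual pipeline may differ (for example, it may use conditional transition probabilities instead).

```python
from collections import Counter
from itertools import product


def transition_frequencies(seq, alphabet="ACGT"):
    """Relative frequency of each adjacent-nucleotide transition in seq.

    Illustrative assumption only: this is not the published pipeline.
    """
    # Treat RNA input as DNA so that U and T are counted together.
    seq = seq.upper().replace("U", "T")
    # Every overlapping pair of neighbours is one transition; pairs that
    # contain ambiguous symbols (e.g. N) are skipped.
    pairs = [a + b for a, b in zip(seq, seq[1:])
             if a in alphabet and b in alphabet]
    counts = Counter(pairs)
    total = len(pairs)
    # Report all 16 transitions so every sequence gives a same-length vector.
    return {a + b: (counts[a + b] / total if total else 0.0)
            for a, b in product(alphabet, repeat=2)}


features = transition_frequencies("ACGTACGGA")
print(features["AC"], features["GG"])
```

    A fixed-length vector of this kind is what a downstream classifier would take as input when separating PRC2-binding lncRNAs from other transcripts.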

  19. Transcriptomic analysis of the mussel Elliptio complanata identifies candidate stress-response genes and an abundance of novel or noncoding transcripts

    USGS Publications Warehouse

    Cornman, Robert S.; Robertson, Laura S.; Galbraith, Heather S.; Blakeslee, Carrie J.

    2014-01-01

    Mussels are useful indicator species of environmental stress and degradation, and the global decline in freshwater mussel diversity and abundance is of conservation concern. Elliptio complanata is a common freshwater mussel of eastern North America that can serve both as an indicator and as an experimental model for understanding mussel physiology and genetics. To support genetic components of these research goals, we assembled transcriptome contigs from Illumina paired-end reads. Despite efforts to collapse similar contigs, the final assembly was in excess of 136,000 contigs with an N50 of 982 bp. Even so, comparisons to the CEGMA database of conserved eukaryotic genes indicated that ∼20% of genes remain unrepresented. However, numerous candidate stress-response genes were present, and we identified lineage-specific patterns of diversification among molluscs for cytochrome P450 detoxification genes and two saccharide-modifying enzymes: 1,3 beta-galactosyltransferase and fucosyltransferase. Less than a quarter of contigs had protein-level similarity based on modest BLAST and Hmmer3 statistical thresholds. These results add comparative genomic resources for molluscs and suggest a wealth of novel proteins and noncoding transcripts.
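
    The N50 statistic quoted above can be computed as in the following sketch. It uses the common definition (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size); the function name n50 and the example lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Contig length at which the cumulative sum of lengths, taken from
    longest to shortest, first reaches half of the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("n50 requires at least one contig length")


# Hypothetical contig lengths in bp, not data from the assembly above.
example = n50([100, 200, 300, 400, 500])
print(example)
```

    Because N50 depends only on the length distribution, a large number of short contigs lowers it even when many long contigs are present.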

  20. rVISTA 2.0: Evolutionary Analysis of Transcription Factor Binding Sites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loots, G G; Ovcharenko, I

    2004-01-28

    Identifying and characterizing the patterns of DNA cis-regulatory modules represents a challenge that has the potential to reveal the regulatory language the genome uses to dictate transcriptional dynamics. Several studies have demonstrated that regulatory modules are under positive selection and therefore are often conserved between related species. Using this evolutionary principle we have created a comparative tool, rVISTA, for analyzing the regulatory potential of noncoding sequences. The rVISTA tool combines transcription factor binding site (TFBS) predictions, sequence comparisons and cluster analysis to identify noncoding DNA regions that are highly conserved and present in a specific configuration within an alignment. Here we present the newly developed version 2.0 of the rVISTA tool that can process alignments generated by both zPicture and PipMaker alignment programs or use pre-computed pairwise alignments of seven vertebrate genomes available from the ECR Browser. The rVISTA web server is closely interconnected with the TRANSFAC database, allowing users to either search for matrices present in the TRANSFAC library collection or search for user-defined consensus sequences. rVISTA tool is publicly available at http://rvista.dcode.org/.

  1. Widespread Long Noncoding RNAs as Endogenous Target Mimics for MicroRNAs in Plants

    PubMed Central

    Wu, Hua-Jun; Wang, Zhi-Min; Wang, Meng; Wang, Xiu-Jie

    2013-01-01

    Target mimicry is a recently identified regulatory mechanism for microRNA (miRNA) functions in plants in which the decoy RNAs bind to miRNAs via complementary sequences and therefore block the interaction between miRNAs and their authentic targets. Both endogenous decoy RNAs (miRNA target mimics) and engineered artificial RNAs can induce target mimicry effects. Yet until now, only the Induced by Phosphate Starvation1 RNA has been proven to be a functional endogenous microRNA target mimic (eTM). In this work, we developed a computational method and systematically identified intergenic or noncoding gene-originated eTMs for 20 conserved miRNAs in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). The predicted miRNA binding sites were well conserved among eTMs of the same miRNA, whereas sequences outside of the binding sites varied a lot. We proved that the eTMs of miR160 and miR166 are functional target mimics and identified their roles in the regulation of plant development. The effectiveness of eTMs for three other miRNAs was also confirmed by transient agroinfiltration assay. PMID:23429259

  2. Cis-regulatory underpinnings of human GLI3 expression in embryonic craniofacial structures and internal organs.

    PubMed

    Abbasi, Amir A; Minhas, Rashid; Schmidt, Ansgar; Koch, Sabine; Grzeschik, Karl-Heinz

    2013-10-01

    The zinc finger transcription factor Gli3 is an important mediator of Sonic hedgehog (Shh) signaling. During early embryonic development Gli3 participates in patterning and growth of the central nervous system, face, skeleton, limb, tooth and gut. Precise regulation of the temporal and spatial expression of Gli3 is crucial for the proper specification of these structures in mammals and other vertebrates. Previously we reported a set of human intronic cis-regulators controlling almost the entire known repertoire of endogenous Gli3 expression in mouse neural tube and limbs. However, the genetic underpinning of GLI3 expression in other embryonic domains such as craniofacial structures and internal organs remains elusive. Here we demonstrate in a transgenic mouse assay the potential of a subset of human/fish conserved non-coding sequences (CNEs) residing within GLI3 intronic intervals to induce reporter gene expression at known regions of endogenous Gli3 transcription in embryonic domains other than the central nervous system (CNS) and limbs. Highly specific reporter expression was observed in craniofacial structures, eye, gut, and genitourinary system. Moreover, the comparison of expression patterns directed by these intronic cis-acting regulatory elements in mouse and zebrafish embryos suggests that, in accordance with sequence conservation, the target site specificity of a subset of these elements remains preserved between these two lineages. Taken together with our recent investigations, it is proposed here that during vertebrate evolution the Gli3 expression control acquired multiple, independently acting, intronic enhancers for spatiotemporal patterning of CNS, limbs, craniofacial structures and internal organs. © 2013 The Authors Development, Growth & Differentiation © 2013 Japanese Society of Developmental Biologists.

  3. Junk DNA and the long non-coding RNA twist in cancer genetics

    PubMed Central

    Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

    2015-01-01

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNAs in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNAs, often via interaction with proteins, function at specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphisms (SNPs) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839

  4. Complex organisation and structure of the ghrelin antisense strand gene GHRLOS, a candidate non-coding RNA gene

    PubMed Central

    Seim, Inge; Carter, Shea L; Herington, Adrian C; Chopin, Lisa K

    2008-01-01

    Background The peptide hormone ghrelin has many important physiological and pathophysiological roles, including the stimulation of growth hormone (GH) release, appetite regulation, gut motility and proliferation of cancer cells. We previously identified a gene on the opposite strand of the ghrelin gene, ghrelinOS (GHRLOS), which spans the promoter and untranslated regions of the ghrelin gene (GHRL). Here we further characterise GHRLOS. Results We have described GHRLOS mRNA isoforms that extend over 1.4 kb of the promoter region and 106 nucleotides of exon 4 of the ghrelin gene, GHRL. These GHRLOS transcripts initiate 4.8 kb downstream of the terminal exon 4 of GHRL and are present in the 3' untranslated exon of the adjacent gene TATDN2 (TatD DNase domain containing 2). Interestingly, we have also identified a putative non-coding TATDN2-GHRLOS chimaeric transcript, indicating that GHRLOS RNA biogenesis is extremely complex. Moreover, we have discovered that the 3' region of GHRLOS is also antisense, in a tail-to-tail fashion to a novel terminal exon of the neighbouring SEC13 gene, which is important in protein transport. Sequence analyses revealed that GHRLOS is riddled with stop codons, and that there is little nucleotide and amino-acid sequence conservation of the GHRLOS gene between vertebrates. The gene spans 44 kb on 3p25.3, is extensively spliced and harbours multiple variable exons. We have also investigated the expression of GHRLOS and found evidence of differential tissue expression. It is highly expressed in tissues which are emerging as major sites of non-coding RNA expression (the thymus, brain, and testis), as well as in the ovary and uterus. In contrast, very low levels were found in the stomach where sense, GHRL derived RNAs are highly expressed. Conclusion GHRLOS RNA transcripts display several distinctive features of non-coding (ncRNA) genes, including 5' capping, polyadenylation, extensive splicing and short open reading frames. 
    The gene is also non-conserved, with differential and tissue-restricted expression. The overlapping genomic arrangement of GHRLOS with the ghrelin gene indicates that it is likely to have interesting regulatory and functional roles in the ghrelin axis. PMID:18954468

  5. Complex organisation and structure of the ghrelin antisense strand gene GHRLOS, a candidate non-coding RNA gene.

    PubMed

    Seim, Inge; Carter, Shea L; Herington, Adrian C; Chopin, Lisa K

    2008-10-28

    The peptide hormone ghrelin has many important physiological and pathophysiological roles, including the stimulation of growth hormone (GH) release, appetite regulation, gut motility and proliferation of cancer cells. We previously identified a gene on the opposite strand of the ghrelin gene, ghrelinOS (GHRLOS), which spans the promoter and untranslated regions of the ghrelin gene (GHRL). Here we further characterise GHRLOS. We have described GHRLOS mRNA isoforms that extend over 1.4 kb of the promoter region and 106 nucleotides of exon 4 of the ghrelin gene, GHRL. These GHRLOS transcripts initiate 4.8 kb downstream of the terminal exon 4 of GHRL and are present in the 3' untranslated exon of the adjacent gene TATDN2 (TatD DNase domain containing 2). Interestingly, we have also identified a putative non-coding TATDN2-GHRLOS chimaeric transcript, indicating that GHRLOS RNA biogenesis is extremely complex. Moreover, we have discovered that the 3' region of GHRLOS is also antisense, in a tail-to-tail fashion, to a novel terminal exon of the neighbouring SEC13 gene, which is important in protein transport. Sequence analyses revealed that GHRLOS is riddled with stop codons, and that there is little nucleotide and amino-acid sequence conservation of the GHRLOS gene between vertebrates. The gene spans 44 kb on 3p25.3, is extensively spliced and harbours multiple variable exons. We have also investigated the expression of GHRLOS and found evidence of differential tissue expression. It is highly expressed in tissues which are emerging as major sites of non-coding RNA expression (the thymus, brain, and testis), as well as in the ovary and uterus. In contrast, very low levels were found in the stomach, where sense GHRL-derived RNAs are highly expressed. GHRLOS RNA transcripts display several distinctive features of non-coding RNA (ncRNA) genes, including 5' capping, polyadenylation, extensive splicing and short open reading frames. The gene is also non-conserved, with differential and tissue-restricted expression. The overlapping genomic arrangement of GHRLOS with the ghrelin gene indicates that it is likely to have interesting regulatory and functional roles in the ghrelin axis.

  6. Open chromatin reveals the functional maize genome

    USDA-ARS's Scientific Manuscript database

    Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...

  7. Possible involvement of SINEs in mammalian-specific brain formation

    PubMed Central

    Sasaki, Takeshi; Nishihara, Hidenori; Hirakawa, Mika; Fujimura, Koji; Tanaka, Mikiko; Kokubo, Nobuhiro; Kimura-Yoshida, Chiharu; Matsuo, Isao; Sumiyama, Kenta; Saitou, Naruya; Shimogori, Tomomi; Okada, Norihiro

    2008-01-01

    Retroposons, such as short interspersed elements (SINEs) and long interspersed elements (LINEs), are the major constituents of higher vertebrate genomes. Although there are many examples of retroposons' acquiring function, none has been implicated in the morphological innovations specific to a certain taxonomic group. We previously characterized a SINE family, AmnSINE1, members of which constitute a part of conserved noncoding elements (CNEs) in mammalian genomes. We proposed that this family acquired genomic functionality or was exapted after retropositioning in a mammalian ancestor. Here we identified 53 new AmnSINE1 loci and refined 124 total loci, two of which were further analyzed. Using a mouse enhancer assay, we demonstrate that one SINE locus, AS071, 178 kbp from the gene FGF8 (fibroblast growth factor 8), is an enhancer that recapitulates FGF8 expression in two regions of the developing forebrain, namely the diencephalon and the hypothalamus. Our gain-of-function analysis revealed that FGF8 expression in the diencephalon controls patterning of thalamic nuclei, which act as a relay center of the neocortex, suggesting a role for FGF8 in mammalian-specific forebrain patterning. Furthermore, we demonstrated that the locus, AS021, 392 kbp from the gene SATB2, controls gene expression in the lateral telencephalon, which is thought to be a signaling center during development. These results suggest important roles for SINEs in the development of the mammalian neuronal network, a part of which was initiated with the exaptation of AmnSINE1 in a common mammalian ancestor. PMID:18334644

  8. Possible involvement of SINEs in mammalian-specific brain formation.

    PubMed

    Sasaki, Takeshi; Nishihara, Hidenori; Hirakawa, Mika; Fujimura, Koji; Tanaka, Mikiko; Kokubo, Nobuhiro; Kimura-Yoshida, Chiharu; Matsuo, Isao; Sumiyama, Kenta; Saitou, Naruya; Shimogori, Tomomi; Okada, Norihiro

    2008-03-18

    Retroposons, such as short interspersed elements (SINEs) and long interspersed elements (LINEs), are the major constituents of higher vertebrate genomes. Although there are many examples of retroposons' acquiring function, none has been implicated in the morphological innovations specific to a certain taxonomic group. We previously characterized a SINE family, AmnSINE1, members of which constitute a part of conserved noncoding elements (CNEs) in mammalian genomes. We proposed that this family acquired genomic functionality or was exapted after retropositioning in a mammalian ancestor. Here we identified 53 new AmnSINE1 loci and refined 124 total loci, two of which were further analyzed. Using a mouse enhancer assay, we demonstrate that one SINE locus, AS071, 178 kbp from the gene FGF8 (fibroblast growth factor 8), is an enhancer that recapitulates FGF8 expression in two regions of the developing forebrain, namely the diencephalon and the hypothalamus. Our gain-of-function analysis revealed that FGF8 expression in the diencephalon controls patterning of thalamic nuclei, which act as a relay center of the neocortex, suggesting a role for FGF8 in mammalian-specific forebrain patterning. Furthermore, we demonstrated that the locus, AS021, 392 kbp from the gene SATB2, controls gene expression in the lateral telencephalon, which is thought to be a signaling center during development. These results suggest important roles for SINEs in the development of the mammalian neuronal network, a part of which was initiated with the exaptation of AmnSINE1 in a common mammalian ancestor.

  9. WordCluster: detecting clusters of DNA words and genomic elements

    PubMed Central

    2011-01-01

    Background: Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are the genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results: We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method into a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions: WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method into a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981

  10. ALUminating the Path of Atherosclerosis Progression: Chaos Theory Suggests a Role for Alu Repeats in the Development of Atherosclerotic Vascular Disease.

    PubMed

    Hueso, Miguel; Cruzado, Josep M; Torras, Joan; Navarro, Estanislao

    2018-06-12

    Atherosclerosis (ATH) and coronary artery disease (CAD) are chronic inflammatory diseases with an important genetic background; they derive from the cumulative effect of multiple common risk alleles, most of which are located in genomic noncoding regions. These complex diseases behave as nonlinear dynamical systems that show a high dependence on their initial conditions; thus, long-term predictions of disease progression are unreliable. One likely possibility is that the nonlinear nature of ATH could be dependent on nonlinear correlations in the structure of the human genome. In this review, we show how chaos theory analysis has highlighted genomic regions that have shared specific structural constraints, which could have a role in ATH progression. These regions were shown to be enriched with repetitive sequences of the Alu family, genomic parasites that have colonized the human genome, which show a particular secondary structure and are involved in the regulation of gene expression. Here, we show the impact of Alu elements on the mechanisms that regulate gene expression, especially highlighting the molecular mechanisms via which the Alu elements alter the inflammatory response. We devote special attention to their relationship with the long noncoding RNA (lncRNA) antisense noncoding RNA in the INK4 locus (ANRIL), a risk factor for ATH; their role as microRNA (miRNA) sponges; and their ability to interfere with the regulatory circuitry of the nuclear factor kappa B (NF-κB) response. We aim to characterize ATH as a nonlinear dynamic system, in which small initial alterations in the expression of a number of repetitive elements are somehow amplified to reach phenotypic significance.

  11. A 5′ Noncoding Exon Containing Engineered Intron Enhances Transgene Expression from Recombinant AAV Vectors in vivo

    PubMed Central

    Lu, Jiamiao; Williams, James A.; Luke, Jeremy; Zhang, Feijie; Chu, Kirk; Kay, Mark A.

    2017-01-01

    We previously developed a mini-intronic plasmid (MIP) expression system in which the essential bacterial elements for plasmid replication and selection are placed within an engineered intron contained within a universal 5′ UTR noncoding exon. Like minicircle DNA plasmids (devoid of bacterial backbone sequences), MIP plasmids overcome transcriptional silencing of the transgene. In addition, MIP plasmids increase transgene expression to levels 2 and often >10 times higher than minicircle vectors in vivo and in vitro. Based on these findings, we examined the effects of the MIP intronic sequences in a recombinant adeno-associated virus (AAV) vector system. Recombinant AAV vectors containing an intron with a bacterial replication origin and bacterial selectable marker increased transgene expression by 40 to 100 times in vivo when compared with conventional AAV vectors. Therefore, inclusion of this noncoding exon/intron sequence upstream of the coding region can substantially enhance AAV-mediated gene expression in vivo. PMID:27903072

  12. Identification and characterization of a class of MALAT1 -like genomic loci

    DOE PAGES

    Zhang, Bin; Mao, Yuntao S.; Diermeier, Sarah D.; ...

    2017-05-23

    The MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) gene encodes a noncoding RNA that is processed into a long nuclear retained transcript (MALAT1) and a small cytoplasmic tRNA-like transcript (mascRNA). Using an RNA sequence- and structure-based covariance model, we identified more than 130 genomic loci in vertebrate genomes containing the MALAT1 3' end triple-helix structure and its immediate downstream tRNA-like structure, including 44 in the green lizard Anolis carolinensis. Structural and computational analyses revealed a co-occurrence of components of the 3' end module. MALAT1-like genes in Anolis carolinensis are highly expressed in adult testis, thus we named them testis-abundant long noncoding RNAs (tancRNAs). MALAT1-like loci also produce multiple small RNA species, including PIWI-interacting RNAs (piRNAs), from the antisense strand. The 3' ends of tancRNAs serve as potential targets for the PIWI-piRNA complex. Furthermore, we have identified an evolutionarily conserved class of long noncoding RNAs (lncRNAs) with similar structural constraints, post-transcriptional processing, and subcellular localization and a distinct function in spermatocytes.

  13. Identification and characterization of a class of MALAT1 -like genomic loci

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Bin; Mao, Yuntao S.; Diermeier, Sarah D.

    The MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) gene encodes a noncoding RNA that is processed into a long nuclear retained transcript (MALAT1) and a small cytoplasmic tRNA-like transcript (mascRNA). Using an RNA sequence- and structure-based covariance model, we identified more than 130 genomic loci in vertebrate genomes containing the MALAT1 3' end triple-helix structure and its immediate downstream tRNA-like structure, including 44 in the green lizard Anolis carolinensis. Structural and computational analyses revealed a co-occurrence of components of the 3' end module. MALAT1-like genes in Anolis carolinensis are highly expressed in adult testis, thus we named them testis-abundant long noncoding RNAs (tancRNAs). MALAT1-like loci also produce multiple small RNA species, including PIWI-interacting RNAs (piRNAs), from the antisense strand. The 3' ends of tancRNAs serve as potential targets for the PIWI-piRNA complex. Furthermore, we have identified an evolutionarily conserved class of long noncoding RNAs (lncRNAs) with similar structural constraints, post-transcriptional processing, and subcellular localization and a distinct function in spermatocytes.

  14. Dynamic and Widespread lncRNA Expression in a Sponge and the Origin of Animal Complexity

    PubMed Central

    Gaiti, Federico; Fernandez-Valverde, Selene L.; Nakanishi, Nagayasu; Calcino, Andrew D.; Yanai, Itai; Tanurdzic, Milos; Degnan, Bernard M.

    2015-01-01

    Long noncoding RNAs (lncRNAs) are important developmental regulators in bilaterian animals. A correlation has been claimed between the lncRNA repertoire expansion and morphological complexity in vertebrate evolution. However, this claim has not been tested by examining morphologically simple animals. Here, we undertake a systematic investigation of lncRNAs in the demosponge Amphimedon queenslandica, a morphologically simple, early-branching metazoan. We combine RNA-Seq data across multiple developmental stages of Amphimedon with a filtering pipeline to conservatively predict 2,935 lncRNAs. These include intronic overlapping lncRNAs, exonic antisense overlapping lncRNAs, long intergenic nonprotein coding RNAs, and precursors for small RNAs. Sponge lncRNAs are remarkably similar to their bilaterian counterparts in being relatively short with few exons and having low primary sequence conservation relative to protein-coding genes. As in bilaterians, a majority of sponge lncRNAs exhibit typical hallmarks of regulatory molecules, including high temporal specificity and dynamic developmental expression. Specific lncRNA expression profiles correlate tightly with conserved protein-coding genes likely involved in a range of developmental and physiological processes, such as the Wnt signaling pathway. Although the majority of Amphimedon lncRNAs appears to be taxonomically restricted with no identifiable orthologs, we find a few cases of conservation between demosponges in lncRNAs that are antisense to coding sequences. Based on the high similarity in the structure, organization, and dynamic expression of sponge lncRNAs to their bilaterian counterparts, we propose that these noncoding RNAs are an ancient feature of the metazoan genome. These results are consistent with lncRNAs regulating the development of animals, regardless of their level of morphological complexity. PMID:25976353

  15. microRNA Therapeutics in Cancer - An Emerging Concept.

    PubMed

    Shah, Maitri Y; Ferrajoli, Alessandra; Sood, Anil K; Lopez-Berestein, Gabriel; Calin, George A

    2016-10-01

    MicroRNAs (miRNAs) are an evolutionarily conserved class of small, regulatory non-coding RNAs that negatively regulate the expression of protein-coding genes and other non-coding transcripts. miRNAs have been established as master regulators of cellular processes, and they play a vital role in tumor initiation, progression and metastasis. Further, widespread deregulation of microRNAs has been reported in several cancers, with several microRNAs playing oncogenic and tumor suppressive roles. Based on these findings, miRNAs have emerged as promising therapeutic tools for cancer management. In this review, we have focused on the roles of miRNAs in tumorigenesis, the miRNA-based therapeutic strategies currently being evaluated for use in cancer, and the advantages and current challenges to their use in the clinic. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  16. Noncoding somatic and inherited single-nucleotide variants converge to promote ESR1 expression in breast cancer.

    PubMed

    Bailey, Swneke D; Desai, Kinjal; Kron, Ken J; Mazrooei, Parisa; Sinnott-Armstrong, Nicholas A; Treloar, Aislinn E; Dowar, Mark; Thu, Kelsie L; Cescon, David W; Silvester, Jennifer; Yang, S Y Cindy; Wu, Xue; Pezo, Rossanna C; Haibe-Kains, Benjamin; Mak, Tak W; Bedard, Philippe L; Pugh, Trevor J; Sallari, Richard C; Lupien, Mathieu

    2016-10-01

    Sustained expression of the estrogen receptor-α (ESR1) drives two-thirds of breast cancer and defines the ESR1-positive subtype. ESR1 engages enhancers upon estrogen stimulation to establish an oncogenic expression program. Somatic copy number alterations involving the ESR1 gene occur in approximately 1% of ESR1-positive breast cancers, suggesting that other mechanisms underlie the persistent expression of ESR1. We report significant enrichment of somatic mutations within the set of regulatory elements (SRE) regulating ESR1 in 7% of ESR1-positive breast cancers. These mutations regulate ESR1 expression by modulating transcription factor binding to the DNA. The SRE includes a recurrently mutated enhancer whose activity is also affected by rs9383590, a functional inherited single-nucleotide variant (SNV) that accounts for several breast cancer risk-associated loci. Our work highlights the importance of considering the combinatorial activity of regulatory elements as a single unit to delineate the impact of noncoding genetic alterations on single genes in cancer.

  17. A Somatically Acquired Enhancer of the Androgen Receptor Is a Noncoding Driver in Advanced Prostate Cancer.

    PubMed

    Takeda, David Y; Spisák, Sándor; Seo, Ji-Heui; Bell, Connor; O'Connor, Edward; Korthauer, Keegan; Ribli, Dezső; Csabai, István; Solymosi, Norbert; Szállási, Zoltán; Stillman, David R; Cejas, Paloma; Qiu, Xintao; Long, Henry W; Tisza, Viktória; Nuzzo, Pier Vitale; Rohanizadegan, Mersedeh; Pomerantz, Mark M; Hahn, William C; Freedman, Matthew L

    2018-06-09

    Increased androgen receptor (AR) activity drives therapeutic resistance in advanced prostate cancer. The most common resistance mechanism is amplification of this locus presumably targeting the AR gene. Here, we identify and characterize a somatically acquired AR enhancer located 650 kb centromeric to the AR. Systematic perturbation of this enhancer using genome editing decreased proliferation by suppressing AR levels. Insertion of an additional copy of this region sufficed to increase proliferation under low androgen conditions and to decrease sensitivity to enzalutamide. Epigenetic data generated in localized prostate tumors and benign specimens support the notion that this region is a developmental enhancer. Collectively, these observations underscore the importance of epigenomic profiling in primary specimens and the value of deploying genome editing to functionally characterize noncoding elements. More broadly, this work identifies a therapeutic vulnerability for targeting the AR and emphasizes the importance of regulatory elements as highly recurrent oncogenic drivers. Copyright © 2018 Elsevier Inc. All rights reserved.

  18. Long noncoding RNA H19 interacts with polypyrimidine tract-binding protein 1 to reprogram hepatic lipid homeostasis.

    PubMed

    Liu, Chune; Yang, Zhihong; Wu, Jianguo; Zhang, Li; Lee, Sangmin; Shin, Dong-Ju; Tran, Melanie; Wang, Li

    2018-05-01

    H19 is an imprinted long noncoding RNA abundantly expressed in embryonic liver and repressed after birth. We show that H19 serves as a lipid sensor by synergizing with the RNA-binding polypyrimidine tract-binding protein 1 (PTBP1) to modulate hepatic metabolic homeostasis. H19 RNA interacts with PTBP1 to facilitate its association with sterol regulatory element-binding protein 1c mRNA and protein, leading to increased stability and nuclear transcriptional activity. H19 and PTBP1 are up-regulated by fatty acids in hepatocytes and in diet-induced fatty liver, which further augments lipid accumulation. Ectopic expression of H19 induces steatosis and pushes the liver into a "pseudo-fed" state in response to fasting by promoting sterol regulatory element-binding protein 1c protein cleavage and nuclear translocation. Deletion of H19 or knockdown of PTBP1 abolishes high-fat and high-sucrose diet-induced steatosis. Our study unveils an H19/PTBP1/sterol regulatory element-binding protein 1 feedforward amplifying signaling pathway to exacerbate the development of fatty liver. (Hepatology 2018;67:1768-1783). © 2017 by the American Association for the Study of Liver Diseases.

  19. Functional annotation of the vlinc class of non-coding RNAs using systems biology approach

    PubMed Central

    Laurent, Georges St.; Vyatkin, Yuri; Antonets, Denis; Ri, Maxim; Qi, Yao; Saik, Olga; Shtokalo, Dmitry; de Hoon, Michiel J.L.; Kawaji, Hideya; Itoh, Masayoshi; Lassmann, Timo; Arner, Erik; Forrest, Alistair R.R.; Nicolas, Estelle; McCaffrey, Timothy A.; Carninci, Piero; Hayashizaki, Yoshihide; Wahlestedt, Claes; Kapranov, Philipp

    2016-01-01

    Determining the functionality of the non-coding transcripts encoded by the human genome is a coveted goal of modern genomics research. While this commonly relies on the classical methods of forward genetics, integration of different genomics datasets in a global Systems Biology fashion presents a more productive avenue of achieving this very complex aim. Here we report application of a Systems Biology-based approach to dissect functionality of a newly identified vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlincRNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlincRNA–gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlincRNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions also happening at early stages of development such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs. PMID:27001520

  20. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

    PubMed

    Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

    2017-10-03

    Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data to a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements, characterizing the information entropies of the disease-gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease-gene-enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.

  1. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes

    PubMed Central

    Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

    2017-01-01

    Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data to a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements, characterizing the information entropies of the disease-gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease-gene-enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274

  2. Bioinformatics of cardiovascular miRNA biology.

    PubMed

    Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas

    2015-12-01

    MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

    PubMed Central

    Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

    2014-01-01

    Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae)—a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. Sequencing whole plastid genomes to find markers for evolutionary analyses is therefore particularly useful when overall genetic distances are low. PMID:25405773
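
    The p-distance quoted in this record is, in its usual definition, the proportion of aligned sites at which two sequences differ. The sketch below illustrates that definition only; it is not the authors' pipeline. The function name `p_distance`, the toy sequences, and the choice to skip alignment columns containing a gap are assumptions made for illustration (the abstract does not say how indels were treated).

```python
def p_distance(seq_a: str, seq_b: str, gap_chars: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption (not from the abstract): alignment columns where either
    sequence has a gap character are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # columns with a base in both sequences
    differing = 0  # compared columns where the bases differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return differing / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: one mismatch over four ungapped sites.
    print(p_distance("ACGT", "ACGA"))
```

    On whole plastid genomes the same calculation would run over a pairwise alignment of roughly 160 kb, so a value near 0.00145 corresponds to about 1.45 differences per 1,000 compared sites.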

  4. ChIP-seq Identification of Weakly Conserved Heart Enhancers

    PubMed Central

    Blow, Matthew J.; McCulley, David J.; Li, Zirong; Zhang, Tao; Akiyama, Jennifer A.; Holt, Amy; Plajzer-Frick, Ingrid; Shoukry, Malak; Wright, Crystal; Chen, Feng; Afzal, Veena; Bristow, James; Ren, Bing; Black, Brian L.; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

    2011-01-01

    Accurate control of tissue-specific gene expression plays a pivotal role in heart development, but few cardiac transcriptional enhancers have thus far been identified. Extreme non-coding sequence conservation successfully predicts enhancers active in many tissues, but fails to identify substantial numbers of heart enhancers. Here we used ChIP-seq with the enhancer-associated protein p300 from mouse embryonic day 11.5 heart tissue to identify over three thousand candidate heart enhancers genome-wide. Compared to other tissues studied at this time-point, most candidate heart enhancers are less deeply conserved in vertebrate evolution. Nevertheless, the testing of 130 candidate regions in a transgenic mouse assay revealed that most of them reproducibly function as enhancers active in the heart, irrespective of their degree of evolutionary constraint. These results provide evidence for a large population of poorly conserved heart enhancers and suggest that the evolutionary constraint of embryonic enhancers can vary depending on tissue type. PMID:20729851

  5. Validation of Small RNAs In Xylella fastidiosa by qRT-PCR

    USDA-ARS's Scientific Manuscript database

    Xylella fastidiosa causes many economically important crop diseases including almond leaf scorch disease and Pierce's disease of grapevine. Although non-coding small RNAs (sRNAs) are regarded as ubiquitous regulatory elements in bacteria, research attention to sRNAs in X. fastidiosa has been limited...

  6. Identification of novel MITEs (miniature inverted-repeat transposable elements) in Coxiella burnetii: implications for protein and small RNA evolution.

    PubMed

    Wachter, Shaun; Raghavan, Rahul; Wachter, Jenny; Minnick, Michael F

    2018-04-11

    Coxiella burnetii is a Gram-negative gammaproteobacterium and zoonotic agent of Q fever. C. burnetii's genome contains an abundance of pseudogenes and numerous selfish genetic elements. MITEs (miniature inverted-repeat transposable elements) are non-autonomous transposons that occur in all domains of life and are thought to be insertion sequences (ISs) that have lost their transposase function. Like most transposable elements (TEs), MITEs are thought to play an active role in evolution by altering gene function and expression through insertion and deletion activities. However, information regarding bacterial MITEs is limited. We describe two MITE families discovered during research on small non-coding RNAs (sRNAs) of C. burnetii. Two sRNAs, Cbsr3 and Cbsr13, were found to originate from a novel MITE family, termed QMITE1. Another sRNA, CbsR16, was found to originate from a separate and novel MITE family, termed QMITE2. Members of each family occur ~50 times within the strains evaluated. QMITE1 is a typical MITE of 300-400 bp with short (2-3 nt) direct repeats (DRs) of variable sequence and is often found overlapping annotated open reading frames (ORFs). Additionally, QMITE1 elements possess sigma-70 promoters and are transcriptionally active at several loci, potentially influencing expression of nearby genes. QMITE2 is smaller (150-190 bp), but has longer (7-11 nt) DRs of variable sequences and is mainly found in the 3' untranslated region of annotated ORFs and intergenic regions. QMITE2 contains a GTAG repetitive extragenic palindrome (REP) that serves as a target for IS1111 TE insertion. Both QMITE1 and QMITE2 display inter-strain linkage and sequence conservation, suggesting that they are adaptive and existed before divergence of C. burnetii strains. We have discovered two novel MITE families of C. burnetii. Our finding that MITEs serve as a source for sRNAs is novel. QMITE2 has a unique structure and occurs in large or small versions with unique DRs that display linkage and sequence conservation between strains, allowing for tracking of genomic rearrangements. QMITE1 and QMITE2 copies are hypothesized to influence expression of neighboring genes involved in DNA repair and virulence through transcriptional interference and ribonuclease processing.

  7. Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics

    PubMed Central

    Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

    2015-01-01

    The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genomes, with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView), and the multiple genome comparison tools (PhyloView and AlignView) offer three new kinds of representation and a more intuitive menu. These new developments have been implemented for the Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. PMID:25378326

  8. A Primate lncRNA Mediates Notch Signaling During Neuronal Development by Sequestering miRNA

    PubMed Central

    Rani, Neha; Nowakowski, Tomasz J; Zhou, Hongjun; Godshalk, Sirie E.; Lisi, Véronique; Kriegstein, Arnold R.; Kosik, Kenneth S.

    2016-01-01

    Long non-coding RNAs (lncRNAs) are a diverse and poorly conserved category of transcripts that have expanded greatly in primates, particularly in the brain. We identified a lncRNA that has acquired 16 microRNA response elements for miR-143-3p in the Catarrhini branch of primates. This lncRNA, termed LncND (neuro-development), is expressed in neural progenitor cells and then declines in neurons. Binding and release of miR-143-3p by LncND controls the expression of Notch receptors. LncND expression is enriched in radial glia cells (RGCs) in the ventricular and subventricular zones of developing human brain. Down-regulation in neuroblastoma cells reduced cell proliferation and induced neuronal differentiation, an effect phenocopied by miR-143-3p over-expression. Gain-of-function of LncND in developing mouse cortex led to an expansion of PAX6+ RGCs. These findings support a role for LncND in miRNA-mediated regulation of Notch signaling within the neural progenitor pool in primates that may have contributed to the expansion of the cerebral cortex. PMID:27263970

  9. Dual-Targeting Small-Molecule Inhibitors of the Staphylococcus aureus FMN Riboswitch Disrupt Riboflavin Homeostasis in an Infectious Setting.

    PubMed

    Wang, Hao; Mann, Paul A; Xiao, Li; Gill, Charles; Galgoci, Andrew M; Howe, John A; Villafania, Artjohn; Barbieri, Christopher M; Malinverni, Juliana C; Sher, Xinwei; Mayhood, Todd; McCurry, Megan D; Murgolo, Nicholas; Flattery, Amy; Mack, Matthias; Roemer, Terry

    2017-05-18

    Riboswitches are bacterial-specific, broadly conserved, non-coding RNA structural elements that control gene expression of numerous metabolic pathways and transport functions essential for cell growth. As such, riboswitch inhibitors represent a new class of potential antibacterial agents. Recently, we identified ribocil-C, a highly selective inhibitor of the flavin mononucleotide (FMN) riboswitch that controls expression of de novo riboflavin (RF, vitamin B2) biosynthesis in Escherichia coli. Here, we provide a mechanistic characterization of the antibacterial effects of ribocil-C as well as of roseoflavin (RoF), an antimetabolite analog of RF, among medically significant Gram-positive bacteria, including methicillin-resistant Staphylococcus aureus (MRSA) and Enterococcus faecalis. We provide genetic, biophysical, computational, biochemical, and pharmacological evidence that ribocil-C and RoF specifically inhibit dual FMN riboswitches, separately controlling RF biosynthesis and uptake processes essential for MRSA growth and pathogenesis. Such a dual-targeting mechanism is specifically required to develop broad-spectrum Gram-positive antibacterial agents targeting RF metabolism. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Identification of Transposable Elements Contributing to Tissue-Specific Expression of Long Non-Coding RNAs

    PubMed Central

    Chishima, Takafumi; Iwakiri, Junichi

    2018-01-01

    It has been recently suggested that transposable elements (TEs) are re-used as functional elements of long non-coding RNAs (lncRNAs). This is supported by some examples such as the human endogenous retrovirus subfamily H (HERVH) elements contained within lncRNAs and expressed specifically in human embryonic stem cells (hESCs), as required to maintain hESC identity. There are at least two unanswered questions about all lncRNAs. How many TEs are re-used within lncRNAs? Are there any other TEs that affect tissue specificity of lncRNA expression? To answer these questions, we comprehensively identify TEs that are significantly related to tissue-specific expression levels of lncRNAs. We downloaded lncRNA expression data corresponding to normal human tissue from the Expression Atlas and transformed the data into tissue specificity estimates. Then, Fisher’s exact tests were performed to verify whether the presence or absence of TE-derived sequences influences the tissue specificity of lncRNA expression. Many TE–tissue pairs associated with tissue-specific expression of lncRNAs were detected, indicating that multiple TE families can be re-used as functional domains or regulatory sequences of lncRNAs. In particular, we found that the antisense promoter region of L1PA2, a LINE-1 subfamily, appears to act as a promoter for lncRNAs with placenta-specific expression. PMID:29315213
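
    The TE-versus-tissue-specificity test described above can be illustrated with a small one-sided Fisher's exact test on a 2x2 contingency table. This is only a minimal sketch, not the authors' pipeline: the function name `fisher_exact_greater`, the table layout, and the counts are hypothetical, chosen purely to show how the hypergeometric tail probability is computed.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout (an assumption for illustration):
        a = lncRNAs with the TE,    tissue-specific
        b = lncRNAs with the TE,    not tissue-specific
        c = lncRNAs without the TE, tissue-specific
        d = lncRNAs without the TE, not tissue-specific
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b              # all lncRNAs carrying the TE
    col1 = a + c              # all tissue-specific lncRNAs
    n = a + b + c + d         # all lncRNAs
    upper = min(row1, col1)   # largest possible value of the top-left cell
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, row1)


# Hypothetical counts, not taken from the study.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided p = {p_value:.4f}")
```

    A small p-value here would suggest that lncRNAs containing the TE are tissue-specific more often than expected by chance; testing many TE-tissue pairs, as in the study, would additionally call for a multiple-testing correction.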

  11. Evidence of function for conserved noncoding sequences in Arabidopsis thaliana.

    PubMed

    Spangler, Jacob B; Subramaniam, Sabarinath; Freeling, Michael; Feltus, F Alex

    2012-01-01

    • Whole genome duplication events provide a lineage with a large reservoir of genes that can be molded by evolutionary forces into phenotypes that fit alternative environments. A well-studied whole genome duplication, the α-event, occurred in an ancestor of the model plant Arabidopsis thaliana. Retained segments of the α-event have been defined in recent years in the form of duplicate protein coding sequences (α-pairs) and associated conserved noncoding DNA sequences (CNSs). Our aim was to identify any association between CNSs and α-pair co-functionality at the gene expression level. • Here, we tested for correlation between CNS counts and α-pair co-expression and expression intensity across nine expression datasets: aerial tissue, flowers, leaves, roots, rosettes, seedlings, seeds, shoots and whole plants. • We provide evidence for a putative regulatory role of the CNSs. The association of CNSs with α-pair co-expression and expression intensity varied by gene function, subgene position and the presence of transcription factor binding motifs. A range of possible CNS regulatory mechanisms, including intron-mediated enhancement, messenger RNA fold stability and transcriptional regulation, are discussed. • This study provides a framework to understand how CNS motifs are involved in the maintenance of gene expression after a whole genome duplication event. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.

  12. G-Boxes, Bigfoot Genes, and Environmental Response: Characterization of Intragenomic Conserved Noncoding Sequences in Arabidopsis

    PubMed Central

    Freeling, Michael; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Thomas, Brian C.

    2007-01-01

    A tetraploidy left Arabidopsis thaliana with 6358 pairs of homoeologs that, when aligned, generated 14,944 intragenomic conserved noncoding sequences (CNSs). Our previous work assembled these phylogenetic footprints into a database. We show that known transcription factor (TF) binding motifs, including the G-box, are overrepresented in these CNSs. A total of 254 genes spanning long lengths of CNS-rich chromosomes (Bigfoot) dominate this database. Therefore, we made subdatabases: one containing Bigfoot genes and the other containing genes with three to five CNSs (Smallfoot). Bigfoot genes are generally TFs that respond to signals, with their modal CNS positioned 3.1 kb 5′ from the ATG. Smallfoot genes encode components of signal transduction machinery, the cytoskeleton, or involve transcription. We queried each subdatabase with each possible 7-nucleotide sequence. Among hundreds of hits, most were purified from CNSs, and almost all of those significantly enriched in CNSs had no experimental history. The 7-mers in CNSs are not 5′- to 3′-oriented in Bigfoot genes but are often oriented in Smallfoot genes. CNSs with one G-box tend to have two G-boxes. CNSs were shared with the homoeolog only and with no other gene, suggesting that binding site turnover impedes detection. Bigfoot genes may function in adaptation to environmental change. PMID:17496117
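
    Querying a sequence collection with "each possible 7-nucleotide sequence" means enumerating all 4^7 = 16,384 words over A, C, G and T. The following minimal sketch shows one way to do that; the function names `all_kmers` and `count_kmer` and the toy sequence are hypothetical, and the sketch does not reproduce the enrichment statistics used in the study.

```python
from itertools import product


def all_kmers(k: int = 7) -> list[str]:
    """Return every DNA word of length k over the alphabet A, C, G, T."""
    return ["".join(letters) for letters in product("ACGT", repeat=k)]


def count_kmer(seq: str, kmer: str) -> int:
    """Count (possibly overlapping) occurrences of kmer in seq."""
    width = len(kmer)
    return sum(
        1 for i in range(len(seq) - width + 1) if seq[i:i + width] == kmer
    )


kmers = all_kmers(7)
print(len(kmers))  # number of distinct 7-mers

# Toy sequence (hypothetical), used only to exercise the counter.
toy_cns = "TTCACGTGGCACGTGGA"
print(count_kmer(toy_cns, "CACGTGG"))
```

    Comparing such counts inside CNSs with counts in a background set is the step at which an enrichment test would be applied.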

  13. G-boxes, bigfoot genes, and environmental response: characterization of intragenomic conserved noncoding sequences in Arabidopsis.

    PubMed

    Freeling, Michael; Rapaka, Lakshmi; Lyons, Eric; Pedersen, Brent; Thomas, Brian C

    2007-05-01

    A tetraploidy left Arabidopsis thaliana with 6358 pairs of homoeologs that, when aligned, generated 14,944 intragenomic conserved noncoding sequences (CNSs). Our previous work assembled these phylogenetic footprints into a database. We show that known transcription factor (TF) binding motifs, including the G-box, are overrepresented in these CNSs. A total of 254 genes spanning long lengths of CNS-rich chromosomes (Bigfoot) dominate this database. Therefore, we made subdatabases: one containing Bigfoot genes and the other containing genes with three to five CNSs (Smallfoot). Bigfoot genes are generally TFs that respond to signals, with their modal CNS positioned 3.1 kb 5' from the ATG. Smallfoot genes encode components of signal transduction machinery, the cytoskeleton, or involve transcription. We queried each subdatabase with each possible 7-nucleotide sequence. Among hundreds of hits, most were purified from CNSs, and almost all of those significantly enriched in CNSs had no experimental history. The 7-mers in CNSs are not 5'- to 3'-oriented in Bigfoot genes but are often oriented in Smallfoot genes. CNSs with one G-box tend to have two G-boxes. CNSs were shared with the homoeolog only and with no other gene, suggesting that binding site turnover impedes detection. Bigfoot genes may function in adaptation to environmental change.

  14. Comparison of the complete mitochondrial genome of the stonefly Sweltsa longistyla (Plecoptera: Chloroperlidae) with mitogenomes of three other stoneflies.

    PubMed

    Chen, Zhi-Teng; Du, Yu-Zhou

    2015-03-01

    The complete mitochondrial genome of the stonefly, Sweltsa longistyla Wu (Plecoptera: Chloroperlidae), was sequenced in this study. The mitogenome of S. longistyla is 16,151 bp and contains 37 genes including 13 protein-coding genes (PCGs), 22 tRNA genes, two rRNA genes, and a large non-coding region. S. longistyla, Pteronarcys princeps Banks, Kamimuria wangi Du and Cryptoperla stilifera Sivec belong to the Plecoptera, and the gene order and orientation of their mitogenomes were similar. The overall AT content for the four stoneflies was below 72%, and the AT content of tRNA genes was above 69%. The four genomes were compact and contained only 65-127 bp of non-coding intergenic DNAs. Overlapping nucleotides existed in all four genomes and ranged from 24 bp (P. princeps) to 178 bp (K. wangi). There was a 7-bp motif (ATGATAA) of overlapping DNA and an 8-bp motif (AAGCCTTA) conserved in three stonefly species (P. princeps, K. wangi and C. stilifera). The control regions of four stoneflies contained a stem-loop structure. Four conserved sequence blocks (CSBs) were present in the A+T-rich regions of all four stoneflies. Copyright © 2014 Elsevier B.V. All rights reserved.
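
    The AT-content figures quoted above are simple base-composition fractions. As a minimal sketch (the function name `at_content` is hypothetical, and the two inputs are just the short motifs quoted in the abstract, not whole mitogenomes), the statistic can be computed as follows:

```python
def at_content(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(1 for b in bases if b in "AT") / len(bases)


# The two conserved motifs mentioned in the abstract, used as toy input.
for motif in ("ATGATAA", "AAGCCTTA"):
    print(motif, f"{at_content(motif):.3f}")
```

    Ambiguous symbols such as N are skipped rather than counted, which is one common convention; other tools may treat them differently.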

  15. A Surrogate Approach to Study the Evolution of Noncoding DNA Elements That Organize Eukaryotic Genomes

    PubMed Central

    Vermaak, Danielle; Bayes, Joshua J.

    2009-01-01

    Comparative genomics provides a facile way to address issues of evolutionary constraint acting on different elements of the genome. However, several important DNA elements have not reaped the benefits of this new approach. Some have proved intractable to current day sequencing technology. These include centromeric and heterochromatic DNA, which are essential for chromosome segregation as well as gene regulation, but the highly repetitive nature of the DNA sequences in these regions makes them difficult to assemble into longer contigs. Other sequences, like dosage compensation X chromosomal sites, origins of DNA replication, or heterochromatic sequences that encode piwi-associated RNAs, have proved difficult to study because they do not have recognizable DNA features that allow them to be described functionally or computationally. We have employed an alternate approach to the direct study of these DNA elements. By using proteins that specifically bind these noncoding DNAs as surrogates, we can indirectly assay the evolutionary constraints acting on these important DNA elements. We review the impact that such “surrogate strategies” have had on our understanding of the evolutionary constraints shaping centromeres, origins of DNA replication, and dosage compensation X chromosomal sites. These have begun to reveal that in contrast to the view that such structural DNA elements are either highly constrained (under purifying selection) or free to drift (under neutral evolution), some of them may instead be shaped by adaptive evolution and genetic conflicts (these are not mutually exclusive). These insights also help to explain why the same elements (e.g., centromeres and replication origins), which are so complex in some eukaryotic genomes, can be simple and well defined in others where similar conflicts do not exist. PMID:19635763

  16. Functional formation of domain V of the poliovirus noncoding region: significance of unpaired bases.

    PubMed

    Rowe, A; Burlison, J; Macadam, A J; Minor, P D

    2001-10-10

    Previously we have shown that polioviruses with mutations that disrupt the predicted secondary structure of the 5' noncoding region of domain V are temperature sensitive for growth. Non-temperature-sensitive revertant viruses had mutations that re-formed secondary structure by a direct back mutation of changes in the opposite strand. We mutated unpaired regions and selected revertants of viruses with single base deletions, where no obvious back mutation was available in order to gain information on secondary structure. Results indicated that conservation of length of a three base loop between two double-stranded stems was essential for a functional domain V to form. The requirement for the unpaired "hinge" base at 484 which is implicated in the attenuation of Sabin 2 was also confirmed. Results also underline the necessity for functional folding over local secondary structure stability. Copyright 2001 Academic Press.

  17. Novel Approach to Analyzing MFE of Noncoding RNA Sequences

    PubMed Central

    George, Tina P.; Thomas, Tessamma

    2016-01-01

    Genomic studies have become noncoding RNA (ncRNA) centric after the study of different genomes provided enormous information on ncRNA over the past decades. The function of ncRNA is decided by its secondary structure, and across organisms, the secondary structure is more conserved than the sequence itself. In this study, the optimal secondary structure or the minimum free energy (MFE) structure of ncRNA was found based on the thermodynamic nearest neighbor model. MFE of over 2600 ncRNA sequences was analyzed in view of its signal properties. Mathematical models linking MFE to the signal properties were found for each of the four classes of ncRNA analyzed. MFE values computed with the proposed models were in concordance with those obtained with the standard web servers. A total of 95% of the sequences analyzed had deviation of MFE values within ±15% relative to those obtained from standard web servers. PMID:27695341
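
    The agreement figure reported above (95% of sequences within ±15% of the web-server MFE) is a relative-deviation summary. A minimal sketch of how such a figure could be computed is given below; the function name `fraction_within` and the MFE values are hypothetical and are not taken from the study.

```python
def fraction_within(reference: list[float], predicted: list[float],
                    tol: float = 0.15) -> float:
    """Fraction of predictions whose deviation from the reference value
    is at most tol (as a fraction of the reference magnitude)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for ref, pred in zip(reference, predicted)
        if abs(pred - ref) <= tol * abs(ref)
    )
    return hits / len(reference)


# Hypothetical MFE values in kcal/mol: reference server vs. model estimate.
server_mfe = [-10.0, -20.0, -30.0, -40.0]
model_mfe = [-11.0, -24.0, -30.0, -35.0]
print(fraction_within(server_mfe, model_mfe))
```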

  18. Novel Approach to Analyzing MFE of Noncoding RNA Sequences.

    PubMed

    George, Tina P; Thomas, Tessamma

    2016-01-01

    Genomic studies have become noncoding RNA (ncRNA) centric after the study of different genomes provided enormous information on ncRNA over the past decades. The function of ncRNA is decided by its secondary structure, and across organisms, the secondary structure is more conserved than the sequence itself. In this study, the optimal secondary structure or the minimum free energy (MFE) structure of ncRNA was found based on the thermodynamic nearest neighbor model. MFE of over 2600 ncRNA sequences was analyzed in view of its signal properties. Mathematical models linking MFE to the signal properties were found for each of the four classes of ncRNA analyzed. MFE values computed with the proposed models were in concordance with those obtained with the standard web servers. A total of 95% of the sequences analyzed had deviation of MFE values within ±15% relative to those obtained from standard web servers.

  19. A systemic identification approach for primary transcription start site of Arabidopsis miRNAs from multidimensional omics data.

    PubMed

    You, Qi; Yan, Hengyu; Liu, Yue; Yi, Xin; Zhang, Kang; Xu, Wenying; Su, Zhen

    2017-05-01

    The 22-nucleotide non-coding microRNAs (miRNAs) are mostly transcribed by RNA polymerase II and are similar to protein-coding genes. Unlike the clear process from stem-loop precursors to mature miRNAs, the primary transcriptional regulation of miRNA, especially in plants, still needs to be further clarified, including the original transcription start site, functional cis-elements and primary transcript structures. Due to several well-characterized transcription signals in the promoter region, we proposed a systemic approach integrating multidimensional "omics" (including genomics, transcriptomics, and epigenomics) data to improve the genome-wide identification of primary miRNA transcripts. Here, we used the model plant Arabidopsis thaliana to improve the ability to identify candidate promoter locations in intergenic miRNAs and to determine rules for identifying primary transcription start sites of miRNAs by integrating high-throughput omics data, such as the DNase I hypersensitive sites, chromatin immunoprecipitation-sequencing of polymerase II and H3K4me3, as well as high throughput transcriptomic data. As a result, 93% of refined primary transcripts could be confirmed by the primer pairs from a previous study. Cis-element and secondary structure analyses also supported the feasibility of our results. This work will contribute to the primary transcriptional regulatory analysis of miRNAs, and the conserved regulatory pattern may be a suitable miRNA characteristic in other plant species.

  20. Comparative Mitogenomics of Plant Bugs (Hemiptera: Miridae): Identifying the AGG Codon Reassignments between Serine and Lysine

    PubMed Central

    Wang, Pei; Song, Fan; Cai, Wanzhi

    2014-01-01

    Insect mitochondrial genomes are very important for understanding molecular evolution as well as for phylogenetic and phylogeographic studies of insects. The Miridae are the largest family of Heteroptera, encompassing more than 11,000 described species, and are of great economic importance. To better understand the diversity and evolution of plant bugs, we sequenced five new mitochondrial genomes and present the first comparative analysis of the nine mirid mitochondrial genomes available to date. Our results showed that gene content, gene arrangement, base composition and sequences of mitochondrial transcription termination factor were conserved in plant bugs. Intra-genus species shared more conserved genomic characteristics, such as nucleotide and amino acid composition of protein-coding genes, secondary structure and anticodon mutations of tRNAs, and non-coding sequences. The control region possessed several distinct characteristics, including variable size, abundant tandem repetitions, and intra-genus conservation, and was useful in evolutionary and population genetic studies. The AGG codon reassignments were investigated between serine and lysine in the genera Adelphocoris and other cimicomorphans. Our analysis revealed correlated evolution between reassignments of the AGG codon and specific point mutations at the anticodons of tRNALys and tRNASer(AGN). Phylogenetic analysis indicated that mitochondrial genome sequences were useful in resolving family level relationship of Cimicomorpha. Comparative evolutionary analysis of plant bug mitochondrial genomes allowed the identification of previously neglected coding genes or non-coding regions as potential molecular markers. The finding of the AGG codon reassignments between serine and lysine indicated the parallel evolution of the genetic code in Hemiptera mitochondrial genomes. PMID:24988409

  1. Characterization of stress-responsive lncRNAs in Arabidopsis thaliana by integrating expression, epigenetic and structural features.

    PubMed

    Di, Chao; Yuan, Jiapei; Wu, Yue; Li, Jingrui; Lin, Huixin; Hu, Long; Zhang, Ting; Qi, Yijun; Gerstein, Mark B; Guo, Yan; Lu, Zhi John

    2014-12-01

    Recently, in addition to poly(A)+ long non-coding RNAs (lncRNAs), many lncRNAs without poly(A) tails have been characterized in mammals. However, the non-polyA lncRNAs and their conserved motifs, especially those associated with environmental stresses, have not been fully investigated in plant genomes. We performed poly(A)- RNA-seq for seedlings of Arabidopsis thaliana under four stress conditions, and predicted lncRNA transcripts. We classified the lncRNAs into three confidence levels according to their expression patterns, epigenetic signatures and RNA secondary structures. Then, we further classified the lncRNAs into poly(A)+ and poly(A)- transcripts. Compared with poly(A)+ lncRNAs and coding genes, we found that poly(A)- lncRNAs tend to have shorter transcripts and lower expression levels, and they show significant expression specificity in response to stresses. In addition, their differential expression is significantly enriched in drought condition and depleted in heat condition. Overall, we identified 245 poly(A)+ and 58 poly(A)- lncRNAs that are differentially expressed under various stress stimuli. The differential expression was validated by qRT-PCR, and the signaling pathways involved were supported by specific binding of transcription factors (TFs), phytochrome-interacting factor 4 (PIF4) and PIF5. Moreover, we found many conserved sequence and structural motifs of lncRNAs from different functional groups (e.g. a UUC motif responding to salt and an AU-rich stem-loop responding to cold), indicating that the conserved elements might be responsible for the stress-responsive functions of lncRNAs. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  2. Epigenetic Control of Cytokine Gene Expression: Regulation of the TNF/LT Locus and T Helper Cell Differentiation

    PubMed Central

    Falvo, James V.; Jasenosky, Luke D.; Kruidenier, Laurens; Goldfeld, Anne E.

    2014-01-01

    Epigenetics encompasses transient and heritable modifications to DNA and nucleosomes in the native chromatin context. For example, enzymatic addition of chemical moieties to the N-terminal “tails” of histones, particularly acetylation and methylation of lysine residues in the histone tails of H3 and H4, plays a key role in regulation of gene transcription. The modified histones, which are physically associated with gene regulatory regions that typically occur within conserved noncoding sequences, play a functional role in active, poised, or repressed gene transcription. The “histone code” defined by these modifications, along with the chromatin-binding acetylases, deacetylases, methylases, demethylases, and other enzymes that direct modifications resulting in specific patterns of histone modification, shows considerable evolutionary conservation from yeast to humans. Direct modifications at the DNA level, such as cytosine methylation at CpG motifs that represses promoter activity, are another highly conserved epigenetic mechanism of gene regulation. Furthermore, epigenetic modifications at the nucleosome or DNA level can also be coupled with higher-order intra- or interchromosomal interactions that influence the location of regulatory elements and that can place them in an environment of specific nucleoprotein complexes associated with transcription. In the mammalian immune system, epigenetic gene regulation is a crucial mechanism for a range of physiological processes, including the innate host immune response to pathogens and T cell differentiation driven by specific patterns of cytokine gene expression. Here, we will review current findings regarding epigenetic regulation of cytokine genes important in innate and/or adaptive immune responses, with a special focus upon the tumor necrosis factor/lymphotoxin locus and cytokine-driven CD4+ T cell differentiation into the Th1, Th2, and Th17 lineages. PMID:23683942

  3. Functional annotation of the vlinc class of non-coding RNAs using systems biology approach.

    PubMed

    St Laurent, Georges; Vyatkin, Yuri; Antonets, Denis; Ri, Maxim; Qi, Yao; Saik, Olga; Shtokalo, Dmitry; de Hoon, Michiel J L; Kawaji, Hideya; Itoh, Masayoshi; Lassmann, Timo; Arner, Erik; Forrest, Alistair R R; Nicolas, Estelle; McCaffrey, Timothy A; Carninci, Piero; Hayashizaki, Yoshihide; Wahlestedt, Claes; Kapranov, Philipp

    2016-04-20

    Determining the functionality of the non-coding transcripts encoded by the human genome is a coveted goal of modern genomics research. Although this goal is commonly pursued with the classical methods of forward genetics, integration of different genomics datasets in a global Systems Biology fashion presents a more productive avenue for achieving this very complex aim. Here we report application of a Systems Biology-based approach to dissect functionality of a newly identified vast class of very long intergenic non-coding (vlinc) RNAs. Using the highly quantitative FANTOM5 CAGE dataset, we show that these RNAs could be grouped into 1542 novel human genes based on analysis of insulators that we show here indeed function as genomic barrier elements. We show that vlinc RNA genes likely function in cis to activate nearby genes. This effect, while most pronounced in closely spaced vlinc RNA-gene pairs, can be detected over relatively large genomic distances. Furthermore, we identified 101 vlinc RNA genes likely involved in early embryogenesis based on patterns of their expression and regulation. We also found another 109 such genes potentially involved in cellular functions also happening at early stages of development such as proliferation, migration and apoptosis. Overall, we show that Systems Biology-based methods have great promise for functional annotation of non-coding RNAs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Modeling the evolution of regulatory elements by simultaneous detection and alignment with phylogenetic pair HMMs.

    PubMed

    Majoros, William H; Ohler, Uwe

    2010-12-16

    The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.

  5. Long noncoding RNA linc00617 exhibits oncogenic activity in breast cancer.

    PubMed

    Li, Hengyu; Zhu, Li; Xu, Lu; Qin, Keyu; Liu, Chaoqian; Yu, Yue; Su, Dongwei; Wu, Kainan; Sheng, Yuan

    2017-01-01

    Protein-coding genes account for only 2% of the human genome, whereas the vast majority of transcripts are noncoding RNAs including long noncoding RNAs. LncRNAs are involved in the regulation of a diverse array of biological processes, including cancer progression. An evolutionarily conserved lncRNA, TUNA, was found to be required for pluripotency of mouse embryonic stem cells. In this study, we found that the human ortholog of TUNA, linc00617, was upregulated in breast cancer samples. Linc00617 promoted motility and invasion of breast cancer cells and induced epithelial-mesenchymal transition (EMT), which was accompanied by generation of stem cell properties. Moreover, knockdown of linc00617 repressed lung metastasis in vivo. We demonstrated that linc00617 upregulated the expression of stemness factor Sox2 in breast cancer cells, which was shown to promote the oncogenic activity of breast cancer cells by stimulating epithelial-to-mesenchymal transition and enhancing the tumor-initiating capacity. Thus, our data indicate that linc00617 functions as an important regulator of EMT and promotes breast cancer progression and metastasis via activating the transcription of Sox2. Together, these results suggest that linc00617 may be a potential therapeutic target for aggressive breast cancer. © 2015 Wiley Periodicals, Inc.

  6. The origins and evolutionary history of human non-coding RNA regulatory networks.

    PubMed

    Sherafatian, Masih; Mowla, Seyed Javad

    2017-04-01

    The evolutionary history and origin of the regulatory function of animal non-coding RNAs are not well understood. Lack of conservation of long non-coding RNAs and the small sizes of microRNAs have been major obstacles in their phylogenetic analysis. In this study, we tried to shed more light on the evolution of ncRNA regulatory networks by changing our phylogenetic strategy to focus on the evolutionary pattern of their protein coding targets. We used available target databases of miRNAs and lncRNAs to find their protein coding targets in human. We were able to recognize evolutionary hallmarks of ncRNA targets by phylostratigraphic analysis. We found the conventional 3'-UTR and lesser known 5'-UTR targets of miRNAs to be enriched at three consecutive phylostrata. Firstly, in the eukaryata phylostratum corresponding to the emergence of miRNAs, our study revealed that miRNA targets function primarily in cell cycle processes. Moreover, the same overrepresentation of the targets observed in the next two consecutive phylostrata, opisthokonta and eumetazoa, corresponded to the expansion periods of miRNAs in animal evolution. Coding sequence targets of miRNAs showed a delayed rise at the opisthokonta phylostratum, compared to the 3' and 5' UTR targets of miRNAs. The lncRNA regulatory network was the latest to evolve, at eumetazoa.

  7. Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics

    PubMed Central

    del Val, Coral; Rivas, Elena; Torres-Quesada, Omar; Toro, Nicolás; Jiménez-Zurdo, José I

    2007-01-01

    Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related α-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5′-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of α-proteobacteria with their eukaryotic hosts. PMID:17971083

  8. DNA topoisomerase 1α promotes transcriptional silencing of transposable elements through DNA methylation and histone lysine 9 dimethylation in Arabidopsis.

    PubMed

    Dinh, Thanh Theresa; Gao, Lei; Liu, Xigang; Li, Dongming; Li, Shengben; Zhao, Yuanyuan; O'Leary, Michael; Le, Brandon; Schmitz, Robert J; Manavella, Pablo A; Li, Shaofang; Weigel, Detlef; Pontes, Olga; Ecker, Joseph R; Chen, Xuemei

    2014-07-01

    RNA-directed DNA methylation (RdDM) and histone H3 lysine 9 dimethylation (H3K9me2) are related transcriptional silencing mechanisms that target transposable elements (TEs) and repeats to maintain genome stability in plants. RdDM is mediated by small and long noncoding RNAs produced by the plant-specific RNA polymerases Pol IV and Pol V, respectively. Through a chemical genetics screen with a luciferase-based DNA methylation reporter, LUCL, we found that camptothecin, a compound with anti-cancer properties that targets DNA topoisomerase 1α (TOP1α), was able to de-repress LUCL by reducing its DNA methylation and H3K9me2 levels. Further studies with Arabidopsis top1α mutants showed that TOP1α silences endogenous RdDM loci by facilitating the production of Pol V-dependent long non-coding RNAs, AGONAUTE4 recruitment and H3K9me2 deposition at TEs and repeats. This study assigned a new role in epigenetic silencing to an enzyme that affects DNA topology.

  9. Noncoding somatic and inherited single-nucleotide variants converge to promote ESR1 expression in breast cancer

    PubMed Central

    Bailey, Swneke D.; Desai, Kinjal; Kron, Ken J.; Mazrooei, Parisa; Sinnott-Armstrong, Nicholas A.; Treloar, Aislinn E.; Dowar, Mark; Thu, Kelsie L.; Cescon, David W.; Silvester, Jennifer; Yang, S. Y. Cindy; Wu, Xue; Pezo, Rossanna C.; Haibe-Kains, Benjamin; Mak, Tak W.; Bedard, Philippe L.; Pugh, Trevor J.; Sallari, Richard C.; Lupien, Mathieu

    2016-01-01

    Sustained expression of the oestrogen receptor alpha (ESR1) drives two-thirds of breast cancer and defines the ESR1-positive subtype. ESR1 engages enhancers upon oestrogen stimulation to establish an oncogenic expression program. Somatic copy number alterations involving the ESR1 gene occur in approximately 1% of ESR1-positive breast cancers, implying that other mechanisms underlie the persistent expression of ESR1. We report the significant enrichment of somatic mutations within the set of regulatory elements (SRE) regulating ESR1 in 7% of ESR1-positive breast cancers. These mutations regulate ESR1 expression by modulating transcription factor binding to the DNA. The SRE includes a recurrently mutated enhancer whose activity is also affected by a functional inherited single nucleotide variant (SNV) rs9383590 that accounts for several breast cancer risk-loci. Our work highlights the importance of considering the combinatorial activity of regulatory elements as a single unit to delineate the impact of noncoding genetic alterations on single genes in cancer. PMID:27571262

  10. Elevated Rate of Fixation of Endogenous Retroviral Elements in Haplorhini TRIM5 and TRIM22 Genomic Sequences: Impact on Transcriptional Regulation

    PubMed Central

    Diehl, William E.; Johnson, Welkin E.; Hunter, Eric

    2013-01-01

    All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. 
This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions. PMID:23516500

  11. The complete mitochondrial genome of the sandbar shark Carcharhinus plumbeus.

    PubMed

    Blower, Dean C; Ovenden, Jennifer R

    2016-01-01

    The sandbar shark, Carcharhinus plumbeus, a major representative species in shark fisheries worldwide, is now considered vulnerable to overfishing. A pool of 774,234 Roche 454 shotgun sequences from one individual was assembled into a 16,706 bp mitogenome with 33× average coverage depth. It comprised 13 protein coding genes, 22 transfer RNAs, 2 ribosomal genes and 2 non-coding regions, typical of a vertebrate mitogenome. As expected for sharks, an A-T nucleotide bias was evident. This adds to the rapidly growing number of mitogenome assemblies for the economically important Carcharhinidae family. The C. plumbeus mitogenome will assist researchers, fisheries and conservation managers interested in shark molecular systematics, phylogeography, conservation genetics, population and stock structure.

  12. Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in Xanthomonas.

    PubMed

    Lu, Hong; Patil, Prabhu; Van Sluys, Marie-Anne; White, Frank F; Ryan, Robert P; Dow, J Maxwell; Rabinowicz, Pablo; Salzberg, Steven L; Leach, Jan E; Sonti, Ramesh; Brendel, Volker; Bogdanove, Adam J

    2008-01-01

    Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. 
Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small number of genes or in non-coding sequences, and/or differences outside the clusters, potentially among regulatory targets or secretory substrates.

  13. Uncovering drug-responsive regulatory elements

    PubMed Central

    Luizon, Marcelo R; Ahituv, Nadav

    2015-01-01

    Nucleotide changes in gene regulatory elements can have a major effect on interindividual differences in drug response. For example, by reviewing all published pharmacogenomic genome-wide association studies, we show here that 96.4% of the associated single nucleotide polymorphisms reside in noncoding regions. We discuss how sequencing technologies are improving our ability to identify drug response-associated regulatory elements genome-wide and to annotate nucleotide variants within them. We highlight specific examples of how nucleotide changes in these elements can affect drug response and illustrate the techniques used to find them and functionally characterize them. Finally, we also discuss challenges in the field of drug-responsive regulatory elements that need to be considered in order to translate these findings into the clinic. PMID:26555224

  14. Ultra-deep sequencing of ribosome-associated poly-adenylated RNA in early Drosophila embryos reveals hundreds of conserved translated sORFs.

    PubMed

    Li, Hongmei; Hu, Chuansheng; Bai, Ling; Li, Hua; Li, Mingfa; Zhao, Xiaodong; Czajkowsky, Daniel M; Shao, Zhifeng

    2016-12-01

    There is growing recognition that small open reading frames (sORFs) encoding peptides shorter than 100 amino acids are an important class of functional elements in the eukaryotic genome, with several already identified to play critical roles in growth, development, and disease. However, our understanding of their biological importance has been hindered owing to the significant technical challenges limiting their annotation. Here we combined ultra-deep sequencing of ribosome-associated poly-adenylated RNAs with rigorous conservation analysis to identify a comprehensive population of translated sORFs during early Drosophila embryogenesis. In total, we identify 399 sORFs, including those previously annotated but without evidence of translational capacity, those found within transcripts previously classified as non-coding, and those not previously known to be transcribed. Further, we find, for the first time, evidence for translation of many sORFs with different isoforms, suggesting their regulation is as complex as longer ORFs. Furthermore, many sORFs are found not associated with ribosomes in late-stage Drosophila S2 cells, suggesting that many of the translated sORFs may have stage-specific functions during embryogenesis. These results thus provide the first comprehensive annotation of the sORFs present during early Drosophila embryogenesis, a necessary basis for a detailed delineation of their function in embryogenesis and other biological processes. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  15. Distribution of RPTLN Genes Across Reptilia: Hypothesized Role for RPTLN in the Evolution of SVMPs.

    PubMed

    Sanz-Soler, Raquel; Sanz, Libia; Calvete, Juan J

    2016-11-01

    We report the cloning, full-length sequencing, and broad distribution of reptile-specific RPTLN genes across a number of Anapsida (Testudines), Diapsida (Serpentes, Sauria), and Archosauria (Crocodylia) taxa. The remarkable structural conservation of RPTLN genes in species that had a common ancestor more than 250 million years ago, their low transcriptional level, and the lack of evidence for RPTLN translation in any reptile organ investigated, suggest for this ancient gene family a yet elusive function as long noncoding RNAs. The high conservation in extant snake venom metalloproteinases (SVMPs) of the signal peptide sequence coded for by RPTLN genes strongly suggests that this region may have played a key role in the recruitment and restricted expression of SVMP genes in the venom gland of Caenophidian snakes, some 60-50 Mya. More recently, 23-16 Mya, the neofunctionalization of an RPTLN copy in the venom gland of snakes of the genera Macrovipera and Daboia marked the beginning of the evolutionary history of a new family of disintegrins, the α1β1-collagen binding antagonists, short-RTS/KTS disintegrins. This evolutionary scenario predicts that venom gland RPTLN and SVMP genes may share tissue-specific regulatory elements. Future genomic studies should support or refute this hypothesis. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.

  16. Characterization and Analysis of Whole Transcriptome of Giant Panda Spleens: Implying Critical Roles of Long Non-Coding RNAs in Immunity.

    PubMed

    Peng, Rui; Liu, Yuliang; Cai, Zhigang; Shen, Fujun; Chen, Jiasong; Hou, Rong; Zou, Fangdong

    2018-01-01

    Giant pandas, an endangered species, are a powerful symbol of species conservation. Giant pandas may suffer from a variety of diseases. Owing to their highly specialized diet of bamboo, giant pandas are thought to have a relatively weak ability to resist diseases. The spleen is the largest organ in the lymphatic system. However, there is little known about giant panda spleen at a molecular level. Thus, clarifying the regulatory mechanisms of spleen could help us further understand the immune system of the giant panda as well as its conservation. The two giant panda spleens were from two male individuals, one newborn and one an adult, in a non-pathological condition. The whole transcriptomes of mRNA, lncRNA, miRNA, and circRNA in the two spleens were sequenced using the Illumina HiSeq platform. EBseq and IDEG6 were used to observe the differentially expressed genes (DEGs) between these two spleens. Gene Ontology and KEGG analyses were used to annotate the function of DEGs. Furthermore, networks between non-coding RNAs and protein-coding genes were constructed to investigate the relationship between non-coding RNAs and immune-associated genes. By comparative analysis of the whole transcriptomes of these two spleens, we found that one of the major roles of lncRNAs could be involved in the regulation of immune responses of giant panda spleens. In addition, our results also revealed that microRNAs and circRNAs may have evolved to regulate a large set of biological processes of giant panda spleens, and circRNAs may function as miRNA sponges. To our knowledge, this is the first report of lncRNAs and circRNAs in giant panda, which could be a useful resource for further giant panda research. Our study reveals the potential functional roles of miRNAs, lncRNAs, and circRNAs in giant panda spleen. © 2018 The Author(s). Published by S. Karger AG, Basel.

  17. Identification and Characterization of Long Non-Coding RNAs Related to Mouse Embryonic Brain Development from Available Transcriptomic Data

    PubMed Central

    He, Hongjuan; Xiu, Youcheng; Guo, Jing; Liu, Hui; Liu, Qi; Zeng, Tiebo; Chen, Yan; Zhang, Yan; Wu, Qiong

    2013-01-01

    Long non-coding RNAs (lncRNAs), a key group of non-coding RNAs, have gained wide attention. Though lncRNAs have been functionally annotated and systematically explored in higher mammals, few have undergone systematic identification and annotation. Owing to their expression specificity, the number of known lncRNAs expressed in embryonic brain tissues remains limited. Considering that a large number of lncRNAs are transcribed only in brain tissues, studies of lncRNAs in the developing brain are of special interest. Here, publicly available RNA-sequencing (RNA-seq) data from embryonic brain are integrated to identify thousands of embryonic brain lncRNAs using a customized pipeline. A significant proportion of the novel transcripts have not been annotated by available genomic resources. The putative embryonic brain lncRNAs are shorter in length, less spliced and show less conservation than known genes. The expression of putative lncRNAs is on average one tenth that of known coding genes, while comparable with that of known lncRNAs. From chromatin data, putative embryonic brain lncRNAs are associated with active chromatin marks, comparable with known lncRNAs. Embryonic brain expressed lncRNAs are also indicated to have expression though not evident in adult brain. Gene Ontology analysis of putative embryonic brain lncRNAs suggests that they are associated with brain development. The putative lncRNAs are shown to be related to possible cis-regulatory roles in imprinting even themselves are deemed to be imprinted lncRNAs. Re-analysis of one knockdown dataset suggests that four regulators are associated with lncRNAs. Taken together, the identification and systematic analysis of putative lncRNAs would provide novel insights into uncharacterized mouse non-coding regions and their relationships with mammalian embryonic brain development. PMID:23967161

  18. Dose-sensitivity, conserved non-coding sequences, and duplicate gene retention through multiple tetraploidies in the grasses.

    PubMed

    Schnable, James C; Pedersen, Brent S; Subramaniam, Sabarinath; Freeling, Michael

    2011-01-01

    Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose-sensitive protein-protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein-protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved noncoding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose-sensitive protein-DNA interactions between the regulatory regions of CNS-rich genes - nicknamed bigfoot genes - and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize lineage tetraploidy.

  19. A class of circadian long non-coding RNAs mark enhancers modulating long-range circadian gene regulation

    PubMed Central

    Fan, Zenghua; Zhao, Meng; Joshi, Parth D.; Li, Ping; Zhang, Yan; Guo, Weimin; Xu, Yichi; Wang, Haifang; Zhao, Zhihu

    2017-01-01

    Circadian rhythm exerts its influence on animal physiology and behavior by regulating gene expression at various levels. Here we systematically explored circadian long non-coding RNAs (lncRNAs) in mouse liver and examined their circadian regulation. We found that a significant proportion of circadian lncRNAs are expressed at enhancer regions, mostly bound by two key circadian transcription factors, BMAL1 and REV-ERBα. These circadian lncRNAs showed similar circadian phases with their nearby genes. The extent of their nuclear localization is higher than protein coding genes but less than enhancer RNAs. The association between enhancer and circadian lncRNAs is also observed in tissues other than liver. Comparative analysis between mouse and rat circadian liver transcriptomes showed that circadian transcription at lncRNA loci tends to be conserved despite low sequence conservation of lncRNAs. One such circadian lncRNA termed lnc-Crot led us to identify a super-enhancer region interacting with a cluster of genes involved in circadian regulation of metabolism through long-range interactions. Further experiments showed that lnc-Crot locus has enhancer function independent of lnc-Crot's transcription. Our results suggest that the enhancer-associated circadian lncRNAs mark the genomic loci modulating long-range circadian gene regulation and shed new light on the evolutionary origin of lncRNAs. PMID:28335007

  20. Dose–Sensitivity, Conserved Non-Coding Sequences, and Duplicate Gene Retention Through Multiple Tetraploidies in the Grasses

    PubMed Central

    Schnable, James C.; Pedersen, Brent S.; Subramaniam, Sabarinath; Freeling, Michael

    2011-01-01

    Whole genome duplications, or tetraploidies, are an important source of increased gene content. Following whole genome duplication, duplicate copies of many genes are lost from the genome. This loss of genes is biased both in the classes of genes deleted and the subgenome from which they are lost. Many or all classes of genes preferentially retained as duplicate copies are engaged in dose-sensitive protein–protein interactions, such that deletion of any one duplicate upsets the status quo of subunit concentrations, and presumably lowers fitness as a result. Transcription factors are also preferentially retained following every whole genome duplication studied. This has been explained as a consequence of protein–protein interactions, just as for other highly retained classes of genes. We show that the quantity of conserved noncoding sequences (CNSs) associated with genes predicts the likelihood of their retention as duplicate pairs following whole genome duplication. As many CNSs likely represent binding sites for transcriptional regulators, we propose that the likelihood of gene retention following tetraploidy may also be influenced by dose-sensitive protein–DNA interactions between the regulatory regions of CNS-rich genes – nicknamed bigfoot genes – and the proteins that bind to them. Using grass genomes, we show that differential loss of CNSs from one member of a pair following the pre-grass tetraploidy reduces its chance of retention in the subsequent maize lineage tetraploidy. PMID:22645525

  1. Transcriptional dynamics of a conserved gene expression network associated with craniofacial divergence in Arctic charr.

    PubMed

    Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut

    2014-01-01

    Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. 
Among those, several transcriptional targets and upstream regulators showed differential expression between the contrasting morphotypes. Interestingly, although selected network genes showed overlapping expression patterns in situ and no morph differences, Timp2 expression patterns differed between morphs. Our comparative study of transcriptional dynamics in divergent craniofacial morphologies of Arctic charr revealed a conserved network of coexpressed genes sharing functional roles in structural morphogenesis. We also implicate transcriptional regulators of the network as targets for future functional studies.

  2. Origin and evolution of the long non-coding genes in the X-inactivation center.

    PubMed

    Romito, Antonio; Rougeulle, Claire

    2011-11-01

    Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst it provided in eutherian lineage the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The rising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.

  3. Genome-wide identification and functional prediction of nitrogen-responsive intergenic and intronic long non-coding RNAs in maize (Zea mays L.).

    PubMed

    Lv, Yuanda; Liang, Zhikai; Ge, Min; Qi, Weicong; Zhang, Tifu; Lin, Feng; Peng, Zhaohua; Zhao, Han

    2016-05-11

    Nitrogen (N) is an essential and often limiting nutrient to plant growth and development. Previous studies have shown that the mRNA expressions of numerous genes are regulated by nitrogen supplies; however, little is known about the expressed non-coding elements, for example long non-coding RNAs (lncRNAs) that control the response of maize (Zea mays L.) to nitrogen. LncRNAs are a class of non-coding RNAs larger than 200 bp, which have emerged as key regulators in gene expression. In this study, we surveyed the intergenic/intronic lncRNAs in maize B73 leaves at the V7 stage under conditions of N-deficiency and N-sufficiency using ribosomal RNA depletion and ultra-deep total RNA sequencing approaches. By integration with mRNA expression profiles and physiological evaluations, 7245 lncRNAs and 637 nitrogen-responsive lncRNAs were identified that exhibited unique expression patterns. Co-expression network analysis showed that the nitrogen-responsive lncRNAs were enriched mainly in one of the three co-expressed modules. The genes in the enriched module are mainly involved in NADH dehydrogenase activity, oxidative phosphorylation and the nitrogen compounds metabolic process. We identified a large number of lncRNAs in maize and illustrated their potential regulatory roles in response to N stress. The results lay the foundation for further in-depth understanding of the molecular mechanisms of lncRNAs' role in response to nitrogen stresses.

  4. Chromatin Heterogeneity and Distribution of Regulatory Elements in the Late-Replicating Intercalary Heterochromatin Domains of Drosophila melanogaster Chromosomes

    PubMed Central

    Khoroshko, Varvara A.; Levitsky, Viktor G.; Zykova, Tatyana Yu.; Antonenko, Oksana V.; Belyaeva, Elena S.; Zhimulev, Igor F.

    2016-01-01

    Late-replicating domains (intercalary heterochromatin) in the Drosophila genome display a number of features suggesting their organization is quite unique. Typically, they are quite large and encompass clusters of functionally unrelated tissue-specific genes. They correspond to the topologically associating domains and conserved microsynteny blocks. Our study aims at exploring further details of molecular organization of intercalary heterochromatin and has uncovered surprising heterogeneity of chromatin composition in these regions. Using the 4HMM model developed in our group earlier, intercalary heterochromatin regions were found to host chromatin fragments with a particular epigenetic profile. Aquamarine chromatin fragments (spanning 0.67% of late-replicating regions) are characterized as a class of sequences that appear heterogeneous in terms of their decompactization. These fragments are enriched with enhancer sequences and binding sites for insulator proteins. They likely mark the chromatin state that is related to the binding of cis-regulatory proteins. Malachite chromatin fragments (11% of late-replicating regions) appear to function as universal transitional regions between two contrasting chromatin states. Namely, they invariably delimit intercalary heterochromatin regions from the adjacent active chromatin of interbands. Malachite fragments also flank aquamarine fragments embedded in the repressed chromatin of late-replicating regions. Significant enrichment of insulator proteins CP190, SU(HW), and MOD2.2 was observed in malachite chromatin. Neither aquamarine nor malachite chromatin types appear to correlate with the positions of highly conserved non-coding elements (HCNE) that are typically replete in intercalary heterochromatin. Malachite chromatin found on the flanks of intercalary heterochromatin regions tends to replicate earlier than the malachite chromatin embedded in intercalary heterochromatin. In other words, there exists a gradient of replication progressing from the flanks of intercalary heterochromatin regions center-wise. The peculiar organization and features of replication in large late-replicating regions are discussed as possible factors shaping the evolutionary stability of intercalary heterochromatin. PMID:27300486

  5. Identification of MicroRNAs in the Coral Stylophora pistillata

    PubMed Central

    Liew, Yi Jin; Aranda, Manuel; Carr, Adrian; Baumgarten, Sebastian; Zoccola, Didier; Tambutté, Sylvie; Allemand, Denis; Micklem, Gos; Voolstra, Christian R.

    2014-01-01

    Coral reefs are major contributors to marine biodiversity. However, they are in rapid decline due to global environmental changes such as rising sea surface temperatures, ocean acidification, and pollution. Genomic and transcriptomic analyses have broadened our understanding of coral biology, but a study of the microRNA (miRNA) repertoire of corals is missing. miRNAs constitute a class of small non-coding RNAs of ∼22 nt in size that play crucial roles in development, metabolism, and stress response in plants and animals alike. In this study, we examined the coral Stylophora pistillata for the presence of miRNAs and the corresponding core protein machinery required for their processing and function. Based on small RNA sequencing, we present evidence for 31 bona fide microRNAs, 5 of which (miR-100, miR-2022, miR-2023, miR-2030, and miR-2036) are conserved in other metazoans. Homologues of Argonaute, Piwi, Dicer, Drosha, Pasha, and HEN1 were identified in the transcriptome of S. pistillata based on strong sequence conservation with known RNAi proteins, with additional support derived from phylogenetic trees. Examination of putative miRNA gene targets indicates potential roles in development, metabolism, immunity, and biomineralisation for several of the microRNAs. Here, we present first evidence of a functional RNAi machinery and five conserved miRNAs in S. pistillata, implying that miRNAs play a role in organismal biology of scleractinian corals. Analysis of predicted miRNA target genes in S. pistillata suggests potential roles of miRNAs in symbiosis and coral calcification. Given the importance of miRNAs in regulating gene expression in other metazoans, further expression analyses of small non-coding RNAs in transcriptional studies of corals should be informative about miRNA-affected processes and pathways. PMID:24658574

  6. Predicting Gene Structure Changes Resulting from Genetic Variants via Exon Definition Features.

    PubMed

    Majoros, William H; Holt, Carson; Campbell, Michael S; Ware, Doreen; Yandell, Mark; Reddy, Timothy E

    2018-04-25

    Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed, and produce functional proteins. We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and noncoding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or noncoding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products, and we propose that they may commonly act as cryptic factors in disease. The software is available from geneprediction.org/SGRF. Contact: bmajoros@duke.edu. Supplementary information is available at Bioinformatics online.

  7. LncRNA, a new component of expanding RNA-protein regulatory network important for animal sperm development.

    PubMed

    Zhang, Chenwang; Gao, Liuze; Xu, Eugene Yujun

    2016-11-01

    Spermatogenesis is one of the fundamental processes of sexual reproduction, present in almost all metazoan animals. Like many other reproductive traits, developmental features and traits of spermatogenesis are under strong selective pressure to change, both at morphological and underlying molecular levels. Yet evidence suggests that some fundamental features of spermatogenesis may be ancient and conserved among metazoan species. Identifying the underlying conserved molecular mechanisms could reveal core components of metazoan spermatogenic machinery and provide novel insight into causes of human infertility. Conserved RNA-binding proteins and their interacting RNA network are emerging as a common theme important for animal sperm development. We review research on a recent addition to the RNA family, long non-coding RNA (lncRNA), and its roles in spermatogenesis in the context of the expanding RNA-protein network. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Massive Gene Transfer and Extensive RNA Editing of a Symbiotic Dinoflagellate Plastid Genome

    PubMed Central

    Mungpakdee, Sutada; Shinzato, Chuya; Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Hisata, Kanako; Tanaka, Makiko; Goto, Hiroki; Fujie, Manabu; Lin, Senjie; Satoh, Nori; Shoguchi, Eiichi

    2014-01-01

    Genome sequencing of Symbiodinium minutum revealed that 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication. Only 14 genes remain in plastids and occur as DNA minicircles. Each minicircle (1.8–3.3 kb) contains one gene and a conserved noncoding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, were discovered in minicircle transcripts but not in genes transferred to the nucleus. In contrast to DNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in DNA minicircles are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. It also increases protein plasticity necessary to initiate photosystem complex assembly. PMID:24881086

  9. The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101

    NASA Astrophysics Data System (ADS)

    Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.

    2014-08-01

    Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.

  10. Re-annotation, improved large-scale assembly and establishment of a catalogue of noncoding loci for the genome of the model brown alga Ectocarpus.

    PubMed

    Cormier, Alexandre; Avia, Komlan; Sterck, Lieven; Derrien, Thomas; Wucher, Valentin; Andres, Gwendoline; Monsoor, Misharl; Godfroy, Olivier; Lipinska, Agnieszka; Perrineau, Marie-Mathilde; Van De Peer, Yves; Hitte, Christophe; Corre, Erwan; Coelho, Susana M; Cock, J Mark

    2017-04-01

    The genome of the filamentous brown alga Ectocarpus was the first to be completely sequenced from within the brown algal group and has served as a key reference genome both for this lineage and for the stramenopiles. We present a complete structural and functional reannotation of the Ectocarpus genome. The large-scale assembly of the Ectocarpus genome was significantly improved and genome-wide gene re-annotation using extensive RNA-seq data improved the structure of 11 108 existing protein-coding genes and added 2030 new loci. A genome-wide analysis of splicing isoforms identified an average of 1.6 transcripts per locus. A large number of previously undescribed noncoding genes were identified and annotated, including 717 loci that produce long noncoding RNAs. Conservation of lncRNAs between Ectocarpus and another brown alga, the kelp Saccharina japonica, suggests that at least a proportion of these loci serve a function. Finally, a large collection of single nucleotide polymorphism-based markers was developed for genetic analyses. These resources are available through an updated and improved genome database. This study significantly improves the utility of the Ectocarpus genome as a high-quality reference for the study of many important aspects of brown algal biology and as a reference for genomic analyses across the stramenopiles. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  11. Chimeric mitochondrial minichromosomes of the human body louse, Pediculus humanus: evidence for homologous and non-homologous recombination.

    PubMed

    Shao, Renfu; Barker, Stephen C

    2011-02-15

    The mitochondrial (mt) genome of the human body louse, Pediculus humanus, consists of 18 minichromosomes. Each minichromosome is 3 to 4 kb long and has 1 to 3 genes. There is unequivocal evidence for recombination between different mt minichromosomes in P. humanus. It is not known, however, how these minichromosomes recombine. Here, we report the discovery of eight chimeric mt minichromosomes in P. humanus. We classify these chimeric mt minichromosomes into two groups: Group I and Group II. Group I chimeric minichromosomes contain parts of two different protein-coding genes that are from different minichromosomes. The two parts of protein-coding genes in each Group I chimeric minichromosome are joined at a microhomologous nucleotide sequence; microhomologous nucleotide sequences are hallmarks of non-homologous recombination. Group II chimeric minichromosomes contain all of the genes and the non-coding regions of two different minichromosomes. The conserved sequence blocks in the non-coding regions of Group II chimeric minichromosomes resemble the "recombination repeats" in the non-coding regions of the mt genomes of higher plants. These repeats are essential to homologous recombination in higher plants. Our analyses of the nucleotide sequences of chimeric mt minichromosomes indicate both homologous and non-homologous recombination between minichromosomes in the mitochondria of the human body louse. Copyright © 2010 Elsevier B.V. All rights reserved.

  12. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) – Definition of a Distinct Class of Begomovirus-Associated Satellites

    PubMed Central

    Lozano, Gloria; Trenado, Helena P.; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W.; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem–loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem–loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037

  13. The presence, role and clinical use of spermatozoal RNAs

    PubMed Central

    Jodar, Meritxell; Selvaraju, Sellappan; Sendler, Edward; Diamond, Michael P.; Krawetz, Stephen A.

    2013-01-01

    BACKGROUND Spermatozoa are highly differentiated, transcriptionally inert cells characterized by a compact nucleus with minimal cytoplasm. Nevertheless they contain a suite of unique RNAs that are delivered to the oocyte upon fertilization. They are likely integrated as part of many different processes including genome recognition, consolidation-confrontation, early embryonic development and epigenetic transgenerational inheritance. Spermatozoal RNAs also provide a window into the developmental history of each sperm thereby providing biomarkers of fertility and pregnancy outcome which are being intensely studied. METHODS Literature searches were performed to review the majority of spermatozoal RNA studies that described potential functions and clinical applications with emphasis on Next-Generation Sequencing. Human, mouse, bovine and stallion were compared as their distribution and composition of spermatozoal RNAs, using these techniques, have been described. RESULTS Comparisons highlighted the complexity of the population of spermatozoal RNAs that comprises rRNA, mRNA and both large and small non-coding RNAs. RNA-seq analysis has revealed that only a fraction of the larger RNAs retain their structure. While rRNAs are the most abundant and are highly fragmented, ensuring a translationally quiescent state, other RNAs including some mRNAs retain their functional potential, thereby increasing the opportunity for regulatory interactions. Abundant small non-coding RNAs retained in spermatozoa include miRNAs and piRNAs. Some, like miR-34c, are essential to the early embryo development required for the first cellular division. Others, like the piRNAs, are likely part of the genomic dance of confrontation and consolidation. Other non-coding spermatozoal RNAs include transposable elements, annotated lnc-RNAs, intronic retained elements, exonic elements, chromatin-associated RNAs, small-nuclear ILF3/NF30 associated RNAs, quiescent RNAs, mse-tRNAs and YRNAs. Some non-coding RNAs are known to act as epigenetic modifiers, inducing histone modifications and DNA methylation, perhaps playing a role in transgenerational epigenetic inheritance. Transcript profiling holds considerable potential for the discovery of fertility biomarkers for both agriculture and human medicine. Comparing the differential RNA profiles of infertile and fertile individuals, as well as assessing species similarities, should resolve the regulatory pathways contributing to male factor infertility. CONCLUSIONS Dad delivers a complex population of RNAs to the oocyte at fertilization that likely influences fertilization, embryo development, the phenotype of the offspring and possibly future generations. Development is continuing on the use of spermatozoal RNA profiles as phenotypic markers of male factor status for use as clinical diagnostics of the father's contribution to the birth of a healthy child. PMID:23856356

  14. A long noncoding RNA contributes to neuropathic pain by silencing Kcna2 in primary afferent neurons

    PubMed Central

    Zhao, Xiuli; Tang, Zongxiang; Zhang, Hongkang; Atianjoh, Fidelis E.; Zhao, Jian-Yuan; Liang, Lingli; Wang, Wei; Guan, Xiaowei; Kao, Sheng-Chin; Tiwari, Vinod; Gao, Yong-Jing; Hoffman, Paul N.; Cui, Hengmi; Li, Min; Dong, Xinzhong; Tao, Yuan-Xiang

    2013-01-01

    Neuropathic pain is a refractory disease characterized by maladaptive changes in gene transcription and translation within the sensory pathway. Long noncoding RNAs (lncRNAs) are emerging as new players in gene regulation, but how lncRNAs operate in the development of neuropathic pain is unclear. Here we identify a conserved lncRNA for Kcna2 (named Kcna2 antisense RNA) in first-order sensory neurons of rat dorsal root ganglion (DRG). Peripheral nerve injury increases Kcna2 antisense RNA expression in injured DRG through activation of myeloid zinc finger protein 1, a transcription factor that binds to the Kcna2 antisense RNA gene promoter. Mimicking this increase downregulates Kcna2, reduces total Kv current, increases excitability in DRG neurons, and produces neuropathic pain symptoms. Blocking this increase reverses nerve injury-induced downregulation of DRG Kcna2 and attenuates development and maintenance of neuropathic pain. These findings suggest native Kcna2 antisense RNA as a new therapeutic target for the treatment of neuropathic pain. PMID:23792947

  15. Perspectives on the mechanism of transcriptional regulation by long non-coding RNAs.

    PubMed

    Roberts, Thomas C; Morris, Kevin V; Weinberg, Marc S

    2014-01-01

    Long non-coding RNAs (lncRNAs) are increasingly being recognized as epigenetic regulators of gene transcription. The diversity and complexity of lncRNA genes means that they exert their regulatory effects by a variety of mechanisms. Although there is still much to be learned about the mechanism of lncRNA function, general principles are starting to emerge. In particular, the application of high throughput (deep) sequencing methodologies has greatly advanced our understanding of lncRNA gene function. lncRNAs function as adaptors that link specific chromatin loci with chromatin-remodeling complexes and transcription factors. lncRNAs can act in cis or trans to guide epigenetic-modifier complexes to distinct genomic sites, or act as scaffolds which recruit multiple proteins simultaneously, thereby coordinating their activities. In this review we discuss the genomic organization of lncRNAs, the importance of RNA secondary structure to lncRNA functionality, the multitude of ways in which they interact with the genome, and what evolutionary conservation tells us about their function.

  16. Mitochondrial genome evolution in the Saccharomyces sensu stricto complex.

    PubMed

    Ruan, Jiangxing; Cheng, Jian; Zhang, Tongcun; Jiang, Huifeng

    2017-01-01

    Exploring the evolutionary patterns of mitochondrial genomes is important for our understanding of the Saccharomyces sensu stricto (SSS) group, which is a model system for genomic evolution and ecological analysis. In this study, we first obtained the complete mitochondrial sequences of two important species, Saccharomyces mikatae and Saccharomyces kudriavzevii. We then compared the mitochondrial genomes in the SSS group with those of close relatives, and found that the non-coding regions evolved rapidly, including dramatic expansion of intergenic regions, fast evolution of introns and almost 20-fold higher rearrangement rates than those of the nuclear genomes. However, the coding regions, and especially the protein-coding genes, are more conserved than those in the nuclear genomes of the SSS group. The different evolutionary patterns of coding and non-coding regions in the mitochondrial and nuclear genomes may be related to the origin of the aerobic fermentation lifestyle in this group. Our analysis thus provides novel insights into the evolution of mitochondrial genomes.

  17. DUSP11 – An RNA phosphatase that regulates host and viral non-coding RNAs in mammalian cells

    PubMed Central

    Burke, James M.; Sullivan, Christopher S.

    2017-01-01

    Dual-specificity phosphatase 11 (DUSP11) is a conserved protein tyrosine phosphatase (PTP) in metazoans. The cellular substrates and physiologic activities of DUSP11 remain largely unknown. In nematodes, DUSP11 is required for normal development and RNA interference against endogenous RNAs (endo-RNAi) via molecular mechanisms that are not well understood. However, mammals lack analogous endo-RNAi pathways and consequently, a role for DUSP11 in mammalian RNA silencing was unanticipated. Recent work from our laboratory demonstrated that DUSP11 activity alters the silencing potential of noncanonical viral miRNAs in mammalian cells. Our studies further uncovered direct cellular substrates of DUSP11 and suggest that DUSP11 is part of a regulatory pathway that controls the abundance of select triphosphorylated noncoding RNAs. Here, we highlight recent findings and present new data that advance understanding of mammalian DUSP11 during gene silencing and discuss the emerging biological activities of DUSP11 in mammalian cells. PMID:28296624

  18. Dissecting non-coding RNA mechanisms in cellulo by single-molecule high-resolution localization and counting

    PubMed Central

    Pitchiaya, Sethuramasundaram; Krishnan, Vishalakshi; Custer, Thomas C.; Walter, Nils G.

    2013-01-01

    Non-coding RNAs (ncRNAs) recently were discovered to outnumber their protein-coding counterparts, yet their diverse functions are still poorly understood. Here we report on a method for the intracellular Single-molecule High Resolution Localization and Counting (iSHiRLoC) of microRNAs (miRNAs), a conserved, ubiquitous class of regulatory ncRNAs that controls the expression of over 60% of all mammalian protein coding genes post-transcriptionally, by a mechanism shrouded by seemingly contradictory observations. We present protocols to execute single particle tracking (SPT) and single-molecule counting of functional microinjected, fluorophore-labeled miRNAs and thereby extract diffusion coefficients and molecular stoichiometries of micro-ribonucleoprotein (miRNP) complexes from living and fixed cells, respectively. This probing of miRNAs at the single molecule level sheds new light on the intracellular assembly/disassembly of miRNPs, thus beginning to unravel the dynamic nature of this important gene regulatory pathway and facilitating the development of a parsimonious model for their obscured mechanism of action. PMID:23820309

  19. The molecular dynamics of long noncoding RNA control of transcription in PTEN and its pseudogene

    PubMed Central

    Lister, Nicholas; Shevchenko, Galina; Walshe, James L.; Groen, Jessica; Johnsson, Per; Vidarsdóttir, Linda; Grander, Dan; Ataide, Sandro F.; Morris, Kevin V.

    2017-01-01

    RNA has been found to interact with chromatin and modulate gene transcription. In human cells, little is known about how long noncoding RNAs (lncRNAs) interact with target loci in the context of chromatin. We find here, using the phosphatase and tensin homolog (PTEN) pseudogene as a model system, that antisense lncRNAs interact first with a 5′ UTR-containing promoter-spanning transcript, which is then followed by the recruitment of DNA methyltransferase 3a (DNMT3a), ultimately resulting in the transcriptional and epigenetic control of gene expression. Moreover, we find that the interaction between the lncRNA and the promoter-spanning transcript is based on a combination of structural and sequence components of the antisense lncRNA. These observations suggest, on the basis of this one example, that evolutionary pressures may be placed on RNA structure more so than sequence conservation. Collectively, the observations presented here suggest a much more complex and vibrant RNA regulatory world may be operative in the regulation of gene expression. PMID:28847966

  20. Nuclear factor 90 uses an ADAR2-like binding mode to recognize specific bases in dsRNA.

    PubMed

    Jayachandran, Uma; Grey, Heather; Cook, Atlanta G

    2016-02-29

    Nuclear factors 90 and 45 (NF90 and NF45) form a protein complex involved in the post-transcriptional control of many genes in vertebrates. NF90 is a member of the dsRNA binding domain (dsRBD) family of proteins. RNA binding partners identified so far include elements in 3' untranslated regions of specific mRNAs and several non-coding RNAs. In NF90, a tandem pair of dsRBDs separated by a natively unstructured segment confers dsRNA binding activity. We determined a crystal structure of the tandem dsRBDs of NF90 in complex with a synthetic dsRNA. This complex shows surprising similarity to the tandem dsRBDs from an adenosine-to-inosine editing enzyme, ADAR2 in complex with a substrate RNA. Residues involved in unusual base-specific recognition in the minor groove of dsRNA are conserved between NF90 and ADAR2. These data suggest that, like ADAR2, underlying sequences in dsRNA may influence how NF90 recognizes its target RNAs. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Splicing-independent loading of TREX on nascent RNA is required for efficient expression of dual-strand piRNA clusters in Drosophila

    PubMed Central

    Hur, Junho K.; Luo, Yicheng; Moon, Sungjin; Ninova, Maria; Marinov, Georgi K.; Chung, Yun D.; Aravin, Alexei A.

    2016-01-01

    The conserved THO/TREX (transcription/export) complex is critical for pre-mRNA processing and mRNA nuclear export. In metazoa, TREX is loaded on nascent RNA transcribed by RNA polymerase II in a splicing-dependent fashion; however, how TREX functions is poorly understood. Here we show that Thoc5 and other TREX components are essential for the biogenesis of piRNA, a distinct class of small noncoding RNAs that control expression of transposable elements (TEs) in the Drosophila germline. Mutations in TREX lead to defects in piRNA biogenesis, resulting in derepression of multiple TE families, gametogenesis defects, and sterility. TREX components are enriched on piRNA precursors transcribed from dual-strand piRNA clusters and colocalize in distinct nuclear foci that overlap with sites of piRNA transcription. The localization of TREX in nuclear foci and its loading on piRNA precursor transcripts depend on Cutoff, a protein associated with chromatin of piRNA clusters. Finally, we show that TREX is required for accumulation of nascent piRNA precursors. Our study reveals a novel splicing-independent mechanism for TREX loading on nascent RNA and its importance in piRNA biogenesis. PMID:27036967

  2. Global Identification and Characterization of Transcriptionally Active Regions in the Rice Genome

    PubMed Central

    Stolc, Viktor; Deng, Wei; He, Hang; Korbel, Jan; Chen, Xuewei; Tongprasit, Waraporn; Ronald, Pamela; Chen, Runsheng; Gerstein, Mark; Wang Deng, Xing

    2007-01-01

    Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome. PMID:17372628

  3. Genomicus update 2015: KaryoView and MatrixView provide a genome-wide perspective to multispecies comparative genomics.

    PubMed

    Louis, Alexandra; Nguyen, Nga Thi Thuy; Muffato, Matthieu; Roest Crollius, Hugues

    2015-01-01

    The Genomicus web server (http://www.genomicus.biologie.ens.fr/genomicus) is a visualization tool allowing comparative genomics in four different phyla (Vertebrate, Fungi, Metazoan and Plants). It provides access to genomic information from extant species, as well as ancestral gene content and gene order for vertebrates and flowering plants. Here we present the new features available for vertebrate genomes, with a focus on new graphical tools. The interface to enter the database has been improved, two pairwise genome comparison tools are now available (KaryoView and MatrixView) and the multiple genome comparison tools (PhyloView and AlignView) offer three new kinds of representation and a more intuitive menu. These new developments have been implemented for the Genomicus portal dedicated to vertebrates. This allows the analysis of 68 extant animal genomes, as well as 58 ancestral reconstructed genomes. The Genomicus server also provides access to ancestral gene orders, to facilitate evolutionary and comparative genomics studies, as well as computationally predicted regulatory interactions, thanks to the representation of conserved non-coding elements with their putative gene targets. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Evolution in the block: common elements of 5S rDNA organization and evolutionary patterns in distant fish genera.

    PubMed

    Campo, Daniel; García-Vázquez, Eva

    2012-01-01

    The 5S rDNA is organized in the genome as tandemly repeated copies of a structural unit composed of a coding sequence plus a nontranscribed spacer (NTS). The coding region is highly conserved in evolution, whereas the NTS varies in both length and sequence. It has been proposed that 5S rRNA genes are members of a gene family that has arisen through concerted evolution. In this study, we describe the molecular organization and evolution of the 5S rDNA in the genera Lepidorhombus and Scophthalmus (Scophthalmidae) and compare it with the already known 5S rDNA of the very different genera Merluccius (Merluccidae) and Salmo (Salmoninae), to identify common structural elements or patterns for understanding 5S rDNA evolution in fish. High intra- and interspecific diversity within the 5S rDNA family in all the genera can be explained by a combination of duplications, deletions, and transposition events. Sequence blocks with high similarity in all the 5S rDNA members across species were identified for the four studied genera, with evidence of intense gene conversion within noncoding regions. We propose a model to explain the evolution of the 5S rDNA, in which the evolutionary units are blocks of nucleotides rather than the entire sequences or single nucleotides. This model implies a "two-speed" evolution: slow within blocks (homogenized by recombination) and fast within the gene family (diversified by duplications and deletions).

  5. Comparative Genomics in Drosophila.

    PubMed

    Oti, Martin; Pane, Attilio; Sammeth, Michael

    2018-01-01

    Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have contributed tremendously to unveiling the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes from 12 Drosophila species represents a milestone achievement in modern biology, which allowed a plethora of different studies ranging from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data that were produced over the past 15 years is far from being fully explored. In this chapter, we will review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Setting off from alignments of the entire genomic sequences, the degree of conservation can be separately evaluated for every region of the genome, providing first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost of protein-coding loci, but also of transcribed yet apparently noncoding regions, which were once considered "junk" DNA. Eventually, the synergy between functional and comparative genomics also facilitates in silico and in vivo studies on cis-acting regulatory elements, like transcription factor binding sites, which, owing to their high degree of sequence variability, usually pose greater challenges for bioinformatics approaches.
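
    The idea of evaluating the degree of conservation region by region from a whole-genome alignment can be illustrated with a toy sliding-window score. This is a minimal sketch, not the chapter's method: the function names, the rule that a column is conserved only when every species shares the same base and none has a gap, and the toy alignment are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def column_is_conserved(column: Sequence[str]) -> bool:
    """Assumed rule: identical base in every species and no gap character."""
    return "-" not in column and len(set(column)) == 1


def windowed_conservation(
    alignment: List[str], window: int, step: int = 1
) -> List[Tuple[int, float]]:
    """Return (window start, fraction of conserved columns) along the alignment."""
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # One boolean per alignment column, computed once and reused by every window.
    conserved = [column_is_conserved(col) for col in zip(*alignment)]
    return [
        (start, sum(conserved[start:start + window]) / window)
        for start in range(0, length - window + 1, step)
    ]


# Hypothetical three-species alignment, for illustration only.
toy_alignment = ["ACGTAC", "ACGTTC", "ACG-AC"]
scores = windowed_conservation(toy_alignment, window=3)
for start, score in scores:
    print(f"window at {start}: {score:.2f}")
```

    Windows with a high score would be candidates for purifying selection in such a scheme; real pipelines typically use phylogeny-aware scores rather than raw column identity.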

  6. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    PubMed Central

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

    2016-01-01

    SUMMARY The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants, and rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000Genomes reference panel (n = 26,534), and de-novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], P_meta = 2×10⁻¹⁴), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2×10⁻¹¹; n_cases = 98,742 and n_controls = 409,511). Using an En1Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turn-over. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, P_meta = 1×10⁻¹¹). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population. PMID:26367794
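
    The sample sizes and allele-frequency bins quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, the "common" label for MAF above 5% is an assumption (the abstract defines only the rare and low-frequency bins), and treating exactly 5% as low-frequency is likewise an assumption.

```python
# Cohort sizes quoted in the abstract for the BMD analysis.
bmd_cohorts = {
    "whole_genome_sequencing": 2_882,
    "whole_exome_sequencing": 3_549,
    "deep_imputation": 26_534,
    "replication_genotyping": 20_271,
}
bmd_total = sum(bmd_cohorts.values())

# Fracture cases and controls quoted in the abstract.
fracture_total = 98_742 + 409_511


def classify_maf(maf: float) -> str:
    """Bin a minor allele frequency given as a fraction between 0 and 0.5."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie between 0 and 0.5")
    if maf <= 0.01:
        return "rare"            # MAF <= 1%, as defined in the abstract
    if maf <= 0.05:
        return "low-frequency"   # MAF of 1-5%, as defined in the abstract
    return "common"              # assumed label, not stated in the abstract


print(bmd_total, fracture_total)
print(classify_maf(0.017), classify_maf(0.011))
```

    Both reported variants (MAF = 1.7% near EN1 and MAF = 1.1% near WNT16) fall in the low-frequency bin under these definitions, and the cohort sizes add up to the totals stated in the abstract.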

  7. Discovery of stimulation-responsive immune enhancers with CRISPR activation

    PubMed Central

    Simeonov, Dimitre R.; Gowen, Benjamin G.; Boontanrart, Mandy; Roth, Theodore L.; Gagnon, John D.; Mumbach, Maxwell R.; Satpathy, Ansuman T.; Lee, Youjin; Bray, Nicolas L.; Chan, Alice Y.; Lituiev, Dmytro S.; Nguyen, Michelle L.; Gate, Rachel E.; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M.; Mitros, Therese; Ray, Graham J.; Curie, Gemma L.; Naddaf, Nicki; Chu, Julia S.; Ma, Hong; Boyer, Eric; Van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R.; Schumann, Kathrin; Daly, Mark J.; Farh, Kyle K; Ansel, K. Mark; Ye, Chun J.; Greenleaf, William J.; Anderson, Mark S.; Bluestone, Jeffrey A.; Chang, Howard Y.; Corn, Jacob E.; Marson, Alexander

    2017-01-01

    The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs. PMID:28854172

  8. Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    PubMed Central

    2012-01-01

    Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826

  9. Discovery of stimulation-responsive immune enhancers with CRISPR activation.

    PubMed

    Simeonov, Dimitre R; Gowen, Benjamin G; Boontanrart, Mandy; Roth, Theodore L; Gagnon, John D; Mumbach, Maxwell R; Satpathy, Ansuman T; Lee, Youjin; Bray, Nicolas L; Chan, Alice Y; Lituiev, Dmytro S; Nguyen, Michelle L; Gate, Rachel E; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M; Mitros, Therese; Ray, Graham J; Curie, Gemma L; Naddaf, Nicki; Chu, Julia S; Ma, Hong; Boyer, Eric; Van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R; Schumann, Kathrin; Daly, Mark J; Farh, Kyle K; Ansel, K Mark; Ye, Chun J; Greenleaf, William J; Anderson, Mark S; Bluestone, Jeffrey A; Chang, Howard Y; Corn, Jacob E; Marson, Alexander

    2017-09-07

    The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs.

  10. Discovery of stimulation-responsive immune enhancers with CRISPR activation

    NASA Astrophysics Data System (ADS)

    Simeonov, Dimitre R.; Gowen, Benjamin G.; Boontanrart, Mandy; Roth, Theodore L.; Gagnon, John D.; Mumbach, Maxwell R.; Satpathy, Ansuman T.; Lee, Youjin; Bray, Nicolas L.; Chan, Alice Y.; Lituiev, Dmytro S.; Nguyen, Michelle L.; Gate, Rachel E.; Subramaniam, Meena; Li, Zhongmei; Woo, Jonathan M.; Mitros, Therese; Ray, Graham J.; Curie, Gemma L.; Naddaf, Nicki; Chu, Julia S.; Ma, Hong; Boyer, Eric; van Gool, Frederic; Huang, Hailiang; Liu, Ruize; Tobin, Victoria R.; Schumann, Kathrin; Daly, Mark J.; Farh, Kyle K.; Ansel, K. Mark; Ye, Chun J.; Greenleaf, William J.; Anderson, Mark S.; Bluestone, Jeffrey A.; Chang, Howard Y.; Corn, Jacob E.; Marson, Alexander

    2017-09-01

    The majority of genetic variants associated with common human diseases map to enhancers, non-coding elements that shape cell-type-specific transcriptional programs and responses to extracellular cues. Systematic mapping of functional enhancers and their biological contexts is required to understand the mechanisms by which variation in non-coding genetic sequences contributes to disease. Functional enhancers can be mapped by genomic sequence disruption, but this approach is limited to the subset of enhancers that are necessary in the particular cellular context being studied. We hypothesized that recruitment of a strong transcriptional activator to an enhancer would be sufficient to drive target gene expression, even if that enhancer was not currently active in the assayed cells. Here we describe a discovery platform that can identify stimulus-responsive enhancers for a target gene independent of stimulus exposure. We used tiled CRISPR activation (CRISPRa) to synthetically recruit a transcriptional activator to sites across large genomic regions (more than 100 kilobases) surrounding two key autoimmunity risk loci, CD69 and IL2RA. We identified several CRISPRa-responsive elements with chromatin features of stimulus-responsive enhancers, including an IL2RA enhancer that harbours an autoimmunity risk variant. Using engineered mouse models, we found that sequence perturbation of the disease-associated Il2ra enhancer did not entirely block Il2ra expression, but rather delayed the timing of gene activation in response to specific extracellular signals. Enhancer deletion skewed polarization of naive T cells towards a pro-inflammatory T helper (TH17) cell state and away from a regulatory T cell state. This integrated approach identifies functional enhancers and reveals how non-coding variation associated with human immune dysfunction alters context-specific gene programs.

  11. AP1 Keeps Chromatin Poised for Action | Center for Cancer Research

    Cancer.gov

    The human genome harbors gene-encoding DNA, the blueprint for building proteins that regulate cellular function. Embedded across the genome, in non-coding regions, are DNA elements to which regulatory factors bind. The interaction of regulatory factors with DNA at these sites modifies gene expression to modulate cell activity. In cells, DNA exists in a complex with proteins

  12. Genome-wide piRNA profiles of virus transmitting whitefly Bemisia tabaci during feeding on TYLCV-infected tomato

    USDA-ARS's Scientific Manuscript database

    Small RNAs (sRNAs) are 20-31 nucleotide (nt) non-coding regulatory elements commonly found in plants and animals, which are classified as short interfering RNA (siRNA), microRNA (miRNA) and Piwi-interacting RNA (piRNA). The whitefly Bemisia tabaci MEAM1 is a vector capable of transmitting many devas...

  13. Transcription regulation by distal enhancers

    PubMed Central

    Stadhouders, Ralph; van den Heuvel, Anita; Kolovos, Petros; Jorna, Ruud; Leslie, Kris; Grosveld, Frank; Soler, Eric

    2012-01-01

    Genome-wide chromatin profiling efforts have shown that enhancers are often located at large distances from gene promoters within the noncoding genome. Whereas enhancers can stimulate transcription initiation by communicating with promoters via chromatin looping mechanisms, we propose that enhancers may also stimulate transcription elongation by physical interactions with intronic elements. We review here recent findings derived from the study of the hematopoietic system. PMID:22771987

  14. Connected Gene Communities Underlie Transcriptional Changes in Cornelia de Lange Syndrome.

    PubMed

    Boudaoud, Imène; Fournier, Éric; Baguette, Audrey; Vallée, Maxime; Lamaze, Fabien C; Droit, Arnaud; Bilodeau, Steve

    2017-09-01

    Cornelia de Lange syndrome (CdLS) is a complex multisystem developmental disorder caused by mutations in cohesin subunits and regulators. While its precise molecular mechanisms are not well defined, they point toward a global deregulation of the transcriptional gene expression program. Cohesin is associated with the boundaries of chromosome domains and with enhancer and promoter regions connecting the three-dimensional genome organization with transcriptional regulation. Here, we show that connected gene communities, structures emerging from the interactions of noncoding regulatory elements and genes in the three-dimensional chromosomal space, provide a molecular explanation for the pathoetiology of CdLS associated with mutations in the cohesin-loading factor NIPBL and the cohesin subunit SMC1A. NIPBL and cohesin are important constituents of connected gene communities that are centrally positioned at noncoding regulatory elements. Accordingly, genes deregulated in CdLS are positioned within reach of NIPBL- and cohesin-occupied regions through promoter-promoter interactions. Our findings suggest a dynamic model where NIPBL loads cohesin to connect genes in communities, offering an explanation for the gene expression deregulation in CdLS. Copyright © 2017 by the Genetics Society of America.

  15. Identification of antisense long noncoding RNAs that function as SINEUPs in human cells.

    PubMed

    Schein, Aleks; Zucchelli, Silvia; Kauppinen, Sakari; Gustincich, Stefano; Carninci, Piero

    2016-09-20

    Mammalian genomes encode numerous natural antisense long noncoding RNAs (lncRNAs) that regulate gene expression. Recently, an antisense lncRNA to mouse Ubiquitin carboxyl-terminal hydrolase L1 (Uchl1) was reported to increase UCHL1 protein synthesis, representing a new functional class of lncRNAs, designated as SINEUPs, for SINE element-containing translation UP-regulators. Here, we show that an antisense lncRNA to the human protein phosphatase 1 regulatory subunit 12A (PPP1R12A), named as R12A-AS1, which overlaps with the 5' UTR and first coding exon of the PPP1R12A mRNA, functions as a SINEUP, increasing PPP1R12A protein translation in human cells. The SINEUP activity depends on the aforementioned sense-antisense interaction and a free right Alu monomer repeat element at the 3' end of R12A-AS1. In addition, we identify another human antisense lncRNA with SINEUP activity. Our results demonstrate for the first time that human natural antisense lncRNAs can up-regulate protein translation, suggesting that endogenous SINEUPs may be widespread and present in many mammalian species.

  16. Transcriptional regulatory elements in the noncoding region of human papillomavirus type 6

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Tzyy-Choou.

    1989-01-01

    The structure and function of the transcriptional regulatory region of human papillomavirus type 6 (HPV-6) have been investigated. To investigate tissue-specific gene expression, a sensitive method to detect and localize HPV-6 viral DNA, mRNA and protein in plastic-embedded tissue sections of genital and respiratory tract papillomata by using in situ hybridization and immunoperoxidase assays has been developed. This method, using ultrathin sections and strand-specific ³H-labeled riboprobes, offers the advantages of superior morphological preservation and detection of viral genomes at low copy number with good resolution, and the modified immunocytochemistry provides better sensitivity. The results suggest that genital tract epithelium is more permissive for HPV-6 replication than respiratory tract epithelium. To study the tissue tropism of HPV-6 at the level of regulation of viral gene expression, the polymerase chain reaction was used to isolate the noncoding region (NCR) of HPV-6 in independent isolates. Nucleotide sequence analysis of molecularly cloned DNA identified base substitutions, deletions/insertions and tandem duplications. Transcriptional regulatory elements in the NCR were assayed in recombinant plasmids containing the bacterial gene for chloramphenicol acetyl transferase.

  17. 10 CFR Appendix D to Part 436 - Energy Program Conservation Elements

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 10 Energy 3 2013-01-01 2013-01-01 false Energy Program Conservation Elements D Appendix D to Part 436 Energy DEPARTMENT OF ENERGY ENERGY CONSERVATION FEDERAL ENERGY MANAGEMENT AND PLANNING PROGRAMS Pt. 436, App. D Appendix D to Part 436—Energy Program Conservation Elements (a) In all successful energy...

  18. 10 CFR Appendix D to Part 436 - Energy Program Conservation Elements

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 10 Energy 3 2014-01-01 2014-01-01 false Energy Program Conservation Elements D Appendix D to Part 436 Energy DEPARTMENT OF ENERGY ENERGY CONSERVATION FEDERAL ENERGY MANAGEMENT AND PLANNING PROGRAMS Pt. 436, App. D Appendix D to Part 436—Energy Program Conservation Elements (a) In all successful energy...

  19. 10 CFR Appendix D to Part 436 - Energy Program Conservation Elements

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 Energy 3 2011-01-01 2011-01-01 false Energy Program Conservation Elements D Appendix D to Part 436 Energy DEPARTMENT OF ENERGY ENERGY CONSERVATION FEDERAL ENERGY MANAGEMENT AND PLANNING PROGRAMS Pt. 436, App. D Appendix D to Part 436—Energy Program Conservation Elements (a) In all successful energy...

  20. 10 CFR Appendix D to Part 436 - Energy Program Conservation Elements

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 Energy 3 2012-01-01 2012-01-01 false Energy Program Conservation Elements D Appendix D to Part 436 Energy DEPARTMENT OF ENERGY ENERGY CONSERVATION FEDERAL ENERGY MANAGEMENT AND PLANNING PROGRAMS Pt. 436, App. D Appendix D to Part 436—Energy Program Conservation Elements (a) In all successful energy...

  1. 10 CFR Appendix D to Part 436 - Energy Program Conservation Elements

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 3 2010-01-01 2010-01-01 false Energy Program Conservation Elements D Appendix D to Part 436 Energy DEPARTMENT OF ENERGY ENERGY CONSERVATION FEDERAL ENERGY MANAGEMENT AND PLANNING PROGRAMS Pt. 436, App. D Appendix D to Part 436—Energy Program Conservation Elements (a) In all successful energy...

  2. Regulation of neural macroRNAs by the transcriptional repressor REST

    PubMed Central

    Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J.; Stanton, Lawrence W.; Lipovich, Leonard

    2009-01-01

    The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs (“macroRNAs”), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer. PMID:19050060
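
    The proximity statistic reported in this abstract (the share of binding sites lying within 10 kb of a gene) can be sketched as follows. This is an illustration under stated assumptions, not the authors' annotation pipeline: binding sites are reduced to single positions, genes to closed intervals, distance is measured to the nearest gene edge, and all coordinates and names are invented toy values.

```python
from typing import List, Optional, Tuple

Site = Tuple[str, int]       # (chromosome, position of the binding site)
Gene = Tuple[str, int, int]  # (chromosome, start, end) with start <= end


def distance_to_gene(site: Site, gene: Gene) -> Optional[int]:
    """Distance in bp from a site to the nearest gene edge; None across chromosomes."""
    chrom, pos = site
    gene_chrom, start, end = gene
    if chrom != gene_chrom:
        return None
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0  # the site lies inside the gene


def fraction_within(sites: List[Site], genes: List[Gene], max_distance: int = 10_000) -> float:
    """Fraction of sites that have at least one gene within max_distance bp."""
    if not sites:
        raise ValueError("at least one binding site is required")
    near = 0
    for site in sites:
        for gene in genes:
            dist = distance_to_gene(site, gene)
            if dist is not None and dist <= max_distance:
                near += 1
                break  # count each site once, however many genes are nearby
    return near / len(sites)


# Hypothetical coordinates, for illustration only.
toy_sites = [("chr1", 5_000), ("chr1", 50_000), ("chr2", 120_000), ("chr3", 1_000)]
toy_genes = [("chr1", 12_000, 20_000), ("chr2", 100_000, 115_000)]
toy_fraction = fraction_within(toy_sites, toy_genes)
print(f"{toy_fraction:.0%} of toy sites lie within 10 kb of a gene")
```

    The nested loop is quadratic and only suitable for small inputs; genome-scale data would normally call for sorted intervals or an interval tree.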

  3. Regulation of neural macroRNAs by the transcriptional repressor REST.

    PubMed

    Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J; Stanton, Lawrence W; Lipovich, Leonard

    2009-01-01

    The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs ("macroRNAs"), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer.

  4. Separating the wheat from the chaff: systematic identification of functionally relevant noncoding variants in ADHD.

    PubMed

    Tong, J H S; Hawi, Z; Dark, C; Cummins, T D R; Johnson, B P; Newman, D P; Lau, R; Vance, A; Heussler, H S; Matthews, N; Bellgrove, M A; Pang, K C

    2016-11-01

    Attention deficit hyperactivity disorder (ADHD) is a highly heritable psychiatric condition with negative lifetime outcomes. Uncovering its genetic architecture should yield important insights into the neurobiology of ADHD and assist development of novel treatment strategies. Twenty years of candidate gene investigations and more recently genome-wide association studies have identified an array of potential association signals. In this context, separating the likely true from false associations ('the wheat' from 'the chaff') will be crucial for uncovering the functional biology of ADHD. Here, we defined a set of 2070 DNA variants that showed evidence of association with ADHD (or were in linkage disequilibrium). More than 97% of these variants were noncoding, and were prioritised for further exploration using two tools, genome-wide annotation of variants (GWAVA) and Combined Annotation-Dependent Depletion (CADD), that were recently developed to rank variants based upon their likely pathogenicity. Capitalising on recent efforts such as the Encyclopaedia of DNA Elements and US National Institutes of Health Roadmap Epigenomics Projects to improve understanding of the noncoding genome, we subsequently identified 65 variants to which we assigned functional annotations, based upon their likely impact on alternative splicing, transcription factor binding and translational regulation. We propose that these 65 variants, which possess not only a high likelihood of pathogenicity but also readily testable functional hypotheses, represent a tractable shortlist for future experimental validation in ADHD. Taken together, this study brings into sharp focus the likely relevance of noncoding variants for the genetic risk associated with ADHD, and more broadly suggests a bioinformatics approach that should be relevant to other psychiatric disorders.

  5. NONCODE v2.0: decoding the non-coding.

    PubMed

    He, Shunmin; Liu, Changning; Skogerbø, Geir; Zhao, Haitao; Wang, Jie; Liu, Tao; Bai, Baoyan; Zhao, Yi; Chen, Runsheng

    2008-01-01

    The NONCODE database is an integrated knowledge database designed for the analysis of non-coding RNAs (ncRNAs). Since NONCODE was first released 3 years ago, the number of known ncRNAs has grown rapidly, and there is growing recognition that ncRNAs play important regulatory roles in most organisms. In the updated version of NONCODE (NONCODE v2.0), the number of collected ncRNAs has reached 206 226, including a wide range of microRNAs, Piwi-interacting RNAs and mRNA-like ncRNAs. The improvements brought to the database include not only new and updated ncRNA data sets, but also an incorporation of BLAST alignment search service and access through our custom UCSC Genome Browser. NONCODE can be found under http://www.noncode.org or http://noncode.bioinfo.org.cn.

  6. Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70.

    PubMed Central

    Hunt, C; Morimoto, R I

    1985-01-01

    We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075

  7. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.

    2014-12-23

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.

  8. Transcription regulation by distal enhancers: who's in the loop?

    PubMed

    Stadhouders, Ralph; van den Heuvel, Anita; Kolovos, Petros; Jorna, Ruud; Leslie, Kris; Grosveld, Frank; Soler, Eric

    2012-01-01

    Genome-wide chromatin profiling efforts have shown that enhancers are often located at large distances from gene promoters within the noncoding genome. Whereas enhancers can stimulate transcription initiation by communicating with promoters via chromatin looping mechanisms, we propose that enhancers may also stimulate transcription elongation by physical interactions with intronic elements. We review here recent findings derived from the study of the hematopoietic system.

  9. Current Research on Non-Coding Ribonucleic Acid (RNA).

    PubMed

    Wang, Jing; Samuels, David C; Zhao, Shilin; Xiang, Yu; Zhao, Ying-Yong; Guo, Yan

    2017-12-05

    Non-coding ribonucleic acid (RNA) has without a doubt captured the interest of biomedical researchers. The ability to screen the entire human genome with high-throughput sequencing technology has greatly enhanced the identification, annotation and prediction of the functionality of non-coding RNAs. In this review, we discuss the current landscape of non-coding RNA research and quantitative analysis. Non-coding RNA will be categorized into two major groups by size: long non-coding RNAs and small RNAs. In long non-coding RNA, we discuss regular long non-coding RNA, pseudogenes and circular RNA. In small RNA, we discuss miRNA, transfer RNA, piwi-interacting RNA, small nucleolar RNA, small nuclear RNA, Y RNA, signal recognition particle RNA, and 7SK RNA. We elaborate on the origin, detection method, and potential association with disease, putative functional mechanisms, and public resources for these non-coding RNAs. We aim to provide readers with a complete overview of non-coding RNAs and incite additional interest in non-coding RNA research.

  10. The emergence of noncoding RNAs as Heracles in autophagy.

    PubMed

    Zhang, Jian; Wang, Peiyuan; Wan, Lin; Xu, Shouping; Pang, Da

    2017-06-03

    Macroautophagy/autophagy is a catabolic process that is widely found in nature. Over the past few decades, mounting evidence has indicated that noncoding RNAs, ranging from small noncoding RNAs to long noncoding RNAs (lncRNAs) and even circular RNAs (circRNAs), mediate the transcriptional and post-transcriptional regulation of autophagy-related genes by participating in autophagy regulatory networks. The differential expression of noncoding RNAs affects autophagy levels at different physiological and pathological stages, including embryonic proliferation and differentiation, cellular senescence, and even diseases such as cancer. We summarize the current knowledge regarding noncoding RNA dysregulation in autophagy and investigate the molecular regulatory mechanisms underlying noncoding RNA involvement in autophagy regulatory networks. Then, we integrate public resources to predict autophagy-related noncoding RNAs across species and discuss strategies for and the challenges of identifying autophagy-related noncoding RNAs. This article will deepen our understanding of the relationship between noncoding RNAs and autophagy, and provide new insights to specifically target noncoding RNAs in autophagy-associated therapeutic strategies.

  11. Roles of Non-Coding RNA in Sugarcane-Microbe Interaction.

    PubMed

    Thiebaut, Flávia; Rojas, Cristian A; Grativol, Clícia; Calixto, Edmundo P da R; Motta, Mariana R; Ballesteros, Helkin G F; Peixoto, Barbara; de Lima, Berenice N S; Vieira, Lucas M; Walter, Maria Emilia; de Armas, Elvismary M; Entenza, Júlio O P; Lifschitz, Sergio; Farinelli, Laurent; Hemerly, Adriana S; Ferreira, Paulo C G

    2017-12-20

    Studies have highlighted the importance of non-coding RNA regulation in plant-microbe interaction. However, the roles of sugarcane microRNAs (miRNAs) in the regulation of disease responses have not been investigated. Firstly, we screened the sRNA transcriptome of sugarcane infected with Acidovorax avenae. Conserved and novel miRNAs were identified. Additionally, small interfering RNAs (siRNAs) were aligned to differentially expressed sequences from the sugarcane transcriptome. Interestingly, many siRNAs aligned to a transcript encoding a copper-transporter gene whose expression was induced in the presence of A. avenae, while the siRNAs were repressed in the presence of A. avenae. Moreover, a long intergenic non-coding RNA was identified as a potential target or decoy of miR408. To extend the bioinformatics analysis, we carried out independent inoculations and the expression patterns of six miRNAs were validated by quantitative reverse transcription-PCR (qRT-PCR). Among these miRNAs, miR408, a copper-microRNA, was downregulated. The cleavage of a putative miR408 target, a laccase, was confirmed by a modified 5'RACE (rapid amplification of cDNA ends) assay. MiR408 was also downregulated in samples infected with other pathogens, but it was upregulated in the presence of a beneficial diazotrophic bacterium. Our results suggest that regulation by miR408 is important in sugarcane sensing whether microorganisms are either pathogenic or beneficial, triggering specific miRNA-mediated regulatory mechanisms accordingly.

  12. Roles of Non-Coding RNA in Sugarcane-Microbe Interaction

    PubMed Central

    Grativol, Clícia; Motta, Mariana R.; Ballesteros, Helkin G. F.; Peixoto, Barbara; Vieira, Lucas M.; Walter, Maria Emilia; de Armas, Elvismary M.; Entenza, Júlio O. P.; Lifschitz, Sergio; Farinelli, Laurent; Hemerly, Adriana S.

    2017-01-01

    Studies have highlighted the importance of non-coding RNA regulation in plant-microbe interaction. However, the roles of sugarcane microRNAs (miRNAs) in the regulation of disease responses have not been investigated. Firstly, we screened the sRNA transcriptome of sugarcane infected with Acidovorax avenae. Conserved and novel miRNAs were identified. Additionally, small interfering RNAs (siRNAs) were aligned to differentially expressed sequences from the sugarcane transcriptome. Interestingly, many siRNAs aligned to a transcript encoding a copper-transporter gene whose expression was induced in the presence of A. avenae, while the siRNAs were repressed in the presence of A. avenae. Moreover, a long intergenic non-coding RNA was identified as a potential target or decoy of miR408. To extend the bioinformatics analysis, we carried out independent inoculations and the expression patterns of six miRNAs were validated by quantitative reverse transcription-PCR (qRT-PCR). Among these miRNAs, miR408, a copper-microRNA, was downregulated. The cleavage of a putative miR408 target, a laccase, was confirmed by a modified 5′RACE (rapid amplification of cDNA ends) assay. MiR408 was also downregulated in samples infected with other pathogens, but it was upregulated in the presence of a beneficial diazotrophic bacterium. Our results suggest that regulation by miR408 is important in sugarcane sensing whether microorganisms are either pathogenic or beneficial, triggering specific miRNA-mediated regulatory mechanisms accordingly. PMID:29657296

  13. Many human accelerated regions are developmental enhancers

    PubMed Central

    Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.

    2013-01-01

    The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637

  14. The Mitochondrial Cytochrome Oxidase Subunit I Gene Occurs on a Minichromosome with Extensive Heteroplasmy in Two Species of Chewing Lice, Geomydoecus aurei and Thomomydoecus minor

    PubMed Central

    Pietan, Lucas L.; Spradling, Theresa A.

    2016-01-01

    In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589

  15. Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs

    PubMed Central

    Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.

    2013-01-01

    Cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite the wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle against non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175

  16. The Yersinia pestis gcvB gene encodes two small regulatory RNA molecules

    PubMed Central

    McArthur, Sarah D; Pulvermacher, Sarah C; Stauffer, George V

    2006-01-01

    Background In recent years it has become clear that small non-coding RNAs function as regulatory elements in bacterial virulence and bacterial stress responses. We tested for the presence of the small non-coding GcvB RNAs in Y. pestis as possible regulators of gene expression in this organism. Results In this study, we report that the Yersinia pestis KIM6 gcvB gene encodes two small RNAs. Transcription of gcvB is activated by the GcvA protein and repressed by the GcvR protein. The gcvB-encoded RNAs are required for repression of the Y. pestis dppA gene, encoding the periplasmic-binding protein component of the dipeptide transport system, showing that the GcvB RNAs have regulatory activity. A deletion of the gcvB gene from the Y. pestis KIM6 chromosome results in a decrease in the generation time of the organism as well as a change in colony morphology. Conclusion The results of this study indicate that the Y. pestis gcvB gene encodes two small non-coding regulatory RNAs that repress dppA expression. A gcvB deletion is pleiotropic, suggesting that the sRNAs are likely involved in controlling genes in addition to dppA. PMID:16768793

  17. Transcriptional role of androgen receptor in the expression of long non-coding RNA Sox2OT in neurogenesis

    PubMed Central

    Tosetti, Valentina; Sassone, Jenny; Ferri, Anna L. M.; Taiana, Michela; Bedini, Gloria; Nava, Sara; Brenna, Greta; Di Resta, Chiara; Pareyson, Davide; Di Giulio, Anna Maria; Carelli, Stephana

    2017-01-01

    The complex architecture of the adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at the transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR) is a transcription factor that is already expressed during early embryonic days. However, the role of AR in the regulation of gene expression at early embryonic stages is yet to be determined. Long non-coding RNA (lncRNA) Sox2 overlapping transcript (Sox2OT) plays a crucial role in gene expression control during development but its transcriptional regulation is still to be clearly defined. Here, using Bicalutamide to pharmacologically inactivate AR, we investigated whether AR participates in the regulation of the transcription of the lncRNA Sox2OT at early embryonic stages. We identified a new DNA binding region upstream of the Sox2 locus containing three androgen response elements (ARE), and found that AR binds this sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT. PMID:28704421

  18. Transcriptional role of androgen receptor in the expression of long non-coding RNA Sox2OT in neurogenesis.

    PubMed

    Tosetti, Valentina; Sassone, Jenny; Ferri, Anna L M; Taiana, Michela; Bedini, Gloria; Nava, Sara; Brenna, Greta; Di Resta, Chiara; Pareyson, Davide; Di Giulio, Anna Maria; Carelli, Stephana; Parati, Eugenio A; Gorio, Alfredo

    2017-01-01

    The complex architecture of the adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at the transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR) is a transcription factor that is already expressed during early embryonic days. However, the role of AR in the regulation of gene expression at early embryonic stages is yet to be determined. Long non-coding RNA (lncRNA) Sox2 overlapping transcript (Sox2OT) plays a crucial role in gene expression control during development but its transcriptional regulation is still to be clearly defined. Here, using Bicalutamide to pharmacologically inactivate AR, we investigated whether AR participates in the regulation of the transcription of the lncRNA Sox2OT at early embryonic stages. We identified a new DNA binding region upstream of the Sox2 locus containing three androgen response elements (ARE), and found that AR binds this sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT.

  19. The 'dark matter' in the plant genomes: non-coding and unannotated DNA sequences associated with open chromatin.

    PubMed

    Jiang, Jiming

    2015-04-01

    Sequencing of complete plant genomes has become increasingly more routine since the advent of the next-generation sequencing technology. Identification and annotation of large amounts of noncoding but functional DNA sequences, including cis-regulatory DNA elements (CREs), have become a new frontier in plant genome research. Genomic regions containing active CREs bound to regulatory proteins are hypersensitive to DNase I digestion and are called DNase I hypersensitive sites (DHSs). Several recent DHS studies in plants illustrate that DHS datasets produced by DNase I digestion followed by next-generation sequencing (DNase-seq) are highly valuable for the identification and characterization of CREs associated with plant development and responses to environmental cues. DHS-based genomic profiling has opened a door to identify and annotate the 'dark matter' in sequenced plant genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. RNA regulatory networks in animals and plants: a long noncoding RNA perspective.

    PubMed

    Bai, Youhuang; Dai, Xiaozhuan; Harrison, Andrew P; Chen, Ming

    2015-03-01

    A recent highlight of genomics research has been the discovery of many families of transcripts which have function but do not code for proteins. An important group is long noncoding RNAs (lncRNAs), which are typically longer than 200 nt, and whose members originate from thousands of loci across genomes. We review progress in understanding the biogenesis and regulatory mechanisms of lncRNAs. We describe diverse computational and high throughput technologies for identifying and studying lncRNAs. We discuss the current knowledge of functional elements embedded in lncRNAs as well as insights into the lncRNA-based regulatory network in animals. We also describe genome-wide studies of large amount of lncRNAs in plants, as well as knowledge of selected plant lncRNAs with a focus on biotic/abiotic stress-responsive lncRNAs. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  1. A Looking-Glass of Non-Coding RNAs in Oral Cancer

    PubMed Central

    Irimie, Alexandra Iulia; Braicu, Cornelia; Sonea, Laura; Zimta, Alina Andreea; Diudea, Diana; Buduru, Smaranda; Berindan-Neagoe, Ioana

    2017-01-01

    Oral cancer is a multifactorial pathology and is characterized by the lack of efficient treatment and accurate diagnostic tools. This is mainly due the late diagnosis; therefore, reliable biomarkers for the timely detection of the disease and patient stratification are required. Non-coding RNAs (ncRNAs) are key elements in the physiological and pathological processes of various cancers, which is also reflected in oral cancer development and progression. A better understanding of their role could give a more thorough perspective on the future treatment options for this cancer type. This review offers a glimpse into the ncRNA involvement in oral cancer, which can help the medical community tap into the world of ncRNAs and lay the ground for more powerful diagnostic, prognostic and treatment tools for oral cancer that will ultimately help build a brighter future for these patients. PMID:29206174

  2. [The ENCODE project and functional genomics studies].

    PubMed

    Ding, Nan; Qu, Hongzhu; Fang, Xiangdong

    2014-03-01

    Since the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated the database for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatics technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.

  3. Sequence and structure-specific elements of HERG mRNA determine channel synthesis and trafficking efficiency

    PubMed Central

    Sroubek, Jakub; Krishnan, Yamini; McDonald, Thomas V.

    2013-01-01

    Human ether-á-gogo-related gene (HERG) encodes a potassium channel that is highly susceptible to deleterious mutations resulting in susceptibility to fatal cardiac arrhythmias. Most mutations adversely affect HERG channel assembly and trafficking. Why the channel is so vulnerable to missense mutations is not well understood. Since nothing is known of how mRNA structural elements factor in channel processing, we synthesized a codon-modified HERG cDNA (HERG-CM) where the codons were synonymously changed to reduce GC content, secondary structure, and rare codon usage. HERG-CM produced typical IKr-like currents; however, channel synthesis and processing were markedly different. Translation efficiency was reduced for HERG-CM, as determined by heterologous expression, in vitro translation, and polysomal profiling. Trafficking efficiency to the cell surface was greatly enhanced, as assayed by immunofluorescence, subcellular fractionation, and surface labeling. Chimeras of HERG-NT/CM indicated that trafficking efficiency was largely dependent on 5′ sequences, while translation efficiency involved multiple areas. These results suggest that HERG translation and trafficking rates are independently governed by noncoding information in various regions of the mRNA molecule. Noncoding information embedded within the mRNA may play a role in the pathogenesis of hereditary arrhythmia syndromes and could provide an avenue for targeted therapeutics.—Sroubek, J., Krishnan, Y., McDonald, T V. Sequence- and structure-specific elements of HERG mRNA determine channel synthesis and trafficking efficiency. PMID:23608144

  4. Massive gene transfer and extensive RNA editing of a symbiotic dinoflagellate plastid genome.

    PubMed

    Mungpakdee, Sutada; Shinzato, Chuya; Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Hisata, Kanako; Tanaka, Makiko; Goto, Hiroki; Fujie, Manabu; Lin, Senjie; Satoh, Nori; Shoguchi, Eiichi

    2014-05-31

    Genome sequencing of Symbiodinium minutum revealed that 95 of 109 plastid-associated genes have been transferred to the nuclear genome and subsequently expanded by gene duplication. Only 14 genes remain in plastids and occur as DNA minicircles. Each minicircle (1.8-3.3 kb) contains one gene and a conserved noncoding region containing putative promoters and RNA-binding sites. Nine types of RNA editing, including a novel G/U type, were discovered in minicircle transcripts but not in genes transferred to the nucleus. In contrast to DNA editing sites in dinoflagellate mitochondria, which tend to be highly conserved across all taxa, editing sites employed in DNA minicircles are highly variable from species to species. Editing is crucial for core photosystem protein function. It restores evolutionarily conserved amino acids and increases peptidyl hydropathy. It also increases protein plasticity necessary to initiate photosystem complex assembly. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  5. Identification of microRNAs and their targets in Finger millet by high throughput sequencing.

    PubMed

    Usha, S; Jyothi, M N; Sharadamma, N; Dixit, Rekha; Devaraj, V R; Nagesh Babu, R

    2015-12-15

    MicroRNAs are short non-coding RNAs which play an important role in regulating gene expression by mRNA cleavage or by translational repression. The majority of identified miRNAs were evolutionarily conserved; however, others were expressed in a species-specific manner. Finger millet is an important cereal crop; nonetheless, no practical information is available on its microRNAs to date. In this study, we have identified 95 conserved microRNAs belonging to 39 families and 3 novel microRNAs by high throughput sequencing. For the identified conserved and novel miRNAs a total of 507 targets were predicted. Eleven miRNAs were validated and their tissue specificity was determined by stem-loop RT-qPCR and Northern blot. GO analyses revealed that targets of the miRNAs were involved in a wide range of regulatory functions. This study implies that a large number of known and novel miRNAs are found in Finger millet, which may play an important role in growth and development. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. The Big Entity of New RNA World: Long Non-Coding RNAs in Microvascular Complications of Diabetes.

    PubMed

    Raut, Satish K; Khullar, Madhu

    2018-01-01

    A major part of the genome is known to be transcribed into non-protein coding RNAs (ncRNAs), such as microRNA and long non-coding RNA (lncRNA). The importance of ncRNAs is being increasingly recognized in physiological and pathological processes. lncRNAs are a novel class of ncRNAs that do not code for proteins and are important regulators of gene expression. In the past, these molecules were thought to be transcriptional "noise" with low levels of evolutionary conservation. However, recent studies provide strong evidence indicating that lncRNAs are (i) regulated during various cellular processes, (ii) exhibit cell type-specific expression, (iii) localize to specific organelles, and (iv) associated with human diseases. Emerging evidence indicates an aberrant expression of lncRNAs in diabetes and diabetes-related microvascular complications. In the present review, we discuss the current state of knowledge of lncRNAs, their genesis from genome, and the mechanism of action of individual lncRNAs in the pathogenesis of microvascular complications of diabetes and therapeutic approaches.

  7. Regulated Formation of lncRNA-DNA Hybrids Enables Faster Transcriptional Induction and Environmental Adaptation.

    PubMed

    Cloutier, Sara C; Wang, Siwen; Ma, Wai Kit; Al Husini, Nadra; Dhoondia, Zuzer; Ansari, Athar; Pascuzzi, Pete E; Tran, Elizabeth J

    2016-02-04

    Long non-coding (lnc)RNAs, once thought to merely represent noise from imprecise transcription initiation, have now emerged as major regulatory entities in all eukaryotes. In contrast to the rapidly expanding identification of individual lncRNAs, mechanistic characterization has lagged behind. Here we provide evidence that the GAL lncRNAs in the budding yeast S. cerevisiae promote transcriptional induction in trans by formation of lncRNA-DNA hybrids or R-loops. The evolutionarily conserved RNA helicase Dbp2 regulates formation of these R-loops as genomic deletion or nuclear depletion results in accumulation of these structures across the GAL cluster gene promoters and coding regions. Enhanced transcriptional induction is manifested by lncRNA-dependent displacement of the Cyc8 co-repressor and subsequent gene looping, suggesting that these lncRNAs promote induction by altering chromatin architecture. Moreover, the GAL lncRNAs confer a competitive fitness advantage to yeast cells because expression of these non-coding molecules correlates with faster adaptation in response to an environmental switch. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    PubMed

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study, however it is still not trivial to accurately distinguish the lncRNA transcripts (LNCTs) from the protein coding ones (PCTs). As various information and data about lncRNAs are preserved by previous studies, it is appealing to develop novel methods to identify the lncRNAs more accurately. Our method lncRScan-SVM aims at classifying PCTs and LNCTs using support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, which is evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting the lncRNAs, and it is quite useful for current lncRNA study.
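
    The evaluation criteria named in this abstract (sensitivity, specificity, accuracy and MCC) can all be derived from a binary confusion matrix. The sketch below is only an illustration of those standard formulas, not code from lncRScan-SVM; the function name `classification_metrics` and the example counts are hypothetical, and treating LNCTs as the positive class is an assumption.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    Assumption for illustration: "positive" means a transcript called lncRNA (LNCT)
    and "negative" means a transcript called protein-coding (PCT).
    """
    total = tp + fp + tn + fn
    # Sensitivity: fraction of true positives that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives that were recovered.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all calls that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

    AUC is omitted here because it requires ranked prediction scores rather than a single confusion matrix.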

  9. Evolving nucleotide binding surfaces

    NASA Technical Reports Server (NTRS)

    Kieber-Emmons, T.; Rein, R.

    1981-01-01

    An analysis is presented of the stability and nature of binding of a nucleotide to several known dehydrogenases. The employed approach includes calculation of hydrophobic stabilization of the binding motif and its intermolecular interaction with the ligand. The evolutionary changes of the binding motif are studied by calculating the Euclidean deviation of the respective dehydrogenases. Attention is given to the possible structural elements involved in the origin of nucleotide recognition by non-coded primordial polypeptides.

  10. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development

    PubMed Central

    2011-01-01

    Background: We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. Results: The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Conclusions: Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution. PMID:21854559

  11. Deciphering the Regulatory Logic of an Ancient, Ultraconserved Nuclear Receptor Enhancer Module

    PubMed Central

    Bagamasbad, Pia D.; Bonett, Ronald M.; Sachs, Laurent; Buisine, Nicolas; Raj, Samhitha; Knoedler, Joseph R.; Kyono, Yasuhiro; Ruan, Yijun; Ruan, Xiaoan

    2015-01-01

    Cooperative, synergistic gene regulation by nuclear hormone receptors can increase sensitivity and amplify cellular responses to hormones. We investigated thyroid hormone (TH) and glucocorticoid (GC) synergy on the Krüppel-like factor 9 (Klf9) gene, which codes for a zinc finger transcription factor involved in development and homeostasis of diverse tissues. We identified regions of the Xenopus and mouse Klf9 genes 5–6 kb upstream of the transcription start sites that supported synergistic transactivation by TH plus GC. Within these regions, we found an orthologous sequence of approximately 180 bp that is highly conserved among tetrapods, but absent in other chordates, and possesses chromatin marks characteristic of an enhancer element. The Xenopus and mouse approximately 180-bp DNA element conferred synergistic transactivation by hormones in transient transfection assays, so we designate this the Klf9 synergy module (KSM). We identified binding sites within the mouse KSM for TH receptor, GC receptor, and nuclear factor κB. TH strongly increased recruitment of liganded GC receptor and serine 5 phosphorylated (initiating) RNA polymerase II to chromatin at the KSM, suggesting a mechanism for transcriptional synergy. The KSM is transcribed to generate long noncoding RNAs, which are also synergistically induced by combined hormone treatment, and the KSM interacts with the Klf9 promoter and a far upstream region through chromosomal looping. Our findings support that the KSM plays a central role in hormone regulation of vertebrate Klf9 genes, it evolved in the tetrapod lineage, and has been maintained by strong stabilizing selection. PMID:25866873

  12. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development.

    PubMed

    Renfree, Marilyn B; Papenfuss, Anthony T; Deakin, Janine E; Lindsay, James; Heider, Thomas; Belov, Katherine; Rens, Willem; Waters, Paul D; Pharo, Elizabeth A; Shaw, Geoff; Wong, Emily S W; Lefèvre, Christophe M; Nicholas, Kevin R; Kuroki, Yoko; Wakefield, Matthew J; Zenger, Kyall R; Wang, Chenwei; Ferguson-Smith, Malcolm; Nicholas, Frank W; Hickford, Danielle; Yu, Hongshi; Short, Kirsty R; Siddle, Hannah V; Frankenberg, Stephen R; Chew, Keng Yih; Menzies, Brandon R; Stringer, Jessica M; Suzuki, Shunsuke; Hore, Timothy A; Delbridge, Margaret L; Patel, Hardip R; Mohammadi, Amir; Schneider, Nanette Y; Hu, Yanqiu; O'Hara, William; Al Nadaf, Shafagh; Wu, Chen; Feng, Zhi-Ping; Cocks, Benjamin G; Wang, Jianghui; Flicek, Paul; Searle, Stephen M J; Fairley, Susan; Beal, Kathryn; Herrero, Javier; Carone, Dawn M; Suzuki, Yutaka; Sugano, Sumio; Toyoda, Atsushi; Sakaki, Yoshiyuki; Kondo, Shinji; Nishida, Yuichiro; Tatsumoto, Shoji; Mandiou, Ion; Hsu, Arthur; McColl, Kaighin A; Lansdell, Benjamin; Weinstock, George; Kuczek, Elizabeth; McGrath, Annette; Wilson, Peter; Men, Artem; Hazar-Rethinam, Mehlika; Hall, Allison; Davis, John; Wood, David; Williams, Sarah; Sundaravadanam, Yogi; Muzny, Donna M; Jhangiani, Shalini N; Lewis, Lora R; Morgan, Margaret B; Okwuonu, Geoffrey O; Ruiz, San Juana; Santibanez, Jireh; Nazareth, Lynne; Cree, Andrew; Fowler, Gerald; Kovar, Christie L; Dinh, Huyen H; Joshi, Vandita; Jing, Chyn; Lara, Fremiet; Thornton, Rebecca; Chen, Lei; Deng, Jixin; Liu, Yue; Shen, Joshua Y; Song, Xing-Zhi; Edson, Janette; Troon, Carmen; Thomas, Daniel; Stephens, Amber; Yapa, Lankesha; Levchenko, Tanya; Gibbs, Richard A; Cooper, Desmond W; Speed, Terence P; Fujiyama, Asao; Graves, Jennifer A M; O'Neill, Rachel J; Pask, Andrew J; Forrest, Susan M; Worley, Kim C

    2011-08-29

    We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. The genome has been sequenced to 2 × coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution.

  13. Conserved structures formed by heterogeneous RNA sequences drive silencing of an inflammation responsive post-transcriptional operon

    PubMed Central

    Basu, Abhijit; Jain, Niyati; Tolbert, Blanton S.; Komar, Anton A.

    2017-01-01

    RNA–protein interactions with physiological outcomes usually rely on conserved sequences within the RNA element. By contrast, activity of the diverse gamma-interferon-activated inhibitor of translation (GAIT)-elements relies on the conserved RNA folding motifs rather than the conserved sequence motifs. These elements drive the translational silencing of a group of chemokine (CC/CXC) and chemokine receptor (CCR) mRNAs, thereby helping to resolve physiological inflammation. Despite sequence dissimilarity, these RNA elements adopt common secondary structures (as revealed by 2D-1H NMR spectroscopy), providing a basis for their interaction with the RNA-binding GAIT complex. However, many of these elements (e.g. those derived from CCL22, CXCL13, CCR4 and ceruloplasmin (Cp) mRNAs) have substantially different affinities for GAIT complex binding. Toeprinting analysis shows that different positions within the overall conserved GAIT element structure contribute to differential affinities of the GAIT protein complex towards the elements. Thus, heterogeneity of GAIT elements may provide hierarchical fine-tuning of the resolution of inflammation. PMID:29069516

  14. Biological significance of long non-coding RNA FTX expression in human colorectal cancer.

    PubMed

    Guo, Xiao-Bo; Hua, Zhu; Li, Chen; Peng, Li-Pan; Wang, Jing-Shen; Wang, Bo; Zhi, Qiao-Ming

    2015-01-01

    The purpose of this study was to determine the expression of long non-coding RNA (lncRNA) FTX and analyze its prognostic and biological significance in colorectal cancer (CRC). A quantitative reverse transcription PCR was performed to detect the expression of long non-coding RNA FTX in 35 pairs of colorectal cancer and corresponding noncancerous tissues. The expression of long non-coding RNA FTX was detected in 187 colorectal cancer tissues and its correlations with clinicopathological factors of patients were examined. Univariate and multivariate analyses were performed to analyze the prognostic significance of Long Non-coding RNA FTX expression. The effects of long non-coding RNA FTX expression on malignant phenotypes of colorectal cancer cells and its possible biological significances were further determined. Long non-coding RNA FTX was significantly upregulated in colorectal cancer tissues, and low long non-coding RNA FTX expression was significantly correlated with differentiation grade, lymph vascular invasion, and clinical stage. Patients with high long non-coding RNA FTX showed poorer overall survival than those with low long non-coding RNA FTX. Multivariate analyses indicated that status of long non-coding RNA FTX was an independent prognostic factor for patients. Functional analyses showed that upregulation of long non-coding RNA FTX significantly promoted growth, migration, invasion, and increased colony formation in colorectal cancer cells. Therefore, long non-coding RNA FTX may be a potential biomarker for predicting the survival of colorectal cancer patients and might be a molecular target for treatment of human colorectal cancer.

  15. Hepatic Long Intergenic Noncoding RNAs: High Promoter Conservation and Dynamic, Sex-Dependent Transcriptional Regulation by Growth Hormone

    PubMed Central

    Melia, Tisha; Hao, Pengying; Yilmaz, Feyza

    2015-01-01

    Long intergenic noncoding RNAs (lincRNAs) are increasingly recognized as key chromatin regulators, yet few studies have characterized lincRNAs in a single tissue under diverse conditions. Here, we analyzed 45 mouse liver RNA sequencing (RNA-Seq) data sets collected under diverse conditions to systematically characterize 4,961 liver lincRNAs, 59% of them novel, with regard to gene structures, species conservation, chromatin accessibility, transcription factor binding, and epigenetic states. To investigate the potential for functionality, we focused on the responses of the liver lincRNAs to growth hormone stimulation, which imparts clinically relevant sex differences to hepatic metabolism and liver disease susceptibility. Sex-biased expression characterized 247 liver lincRNAs, with many being nuclear RNA enriched and regulated by growth hormone. The sex-biased lincRNA genes are enriched for nearby and correspondingly sex-biased accessible chromatin regions, as well as sex-biased binding sites for growth hormone-regulated transcriptional activators (STAT5, hepatocyte nuclear factor 6 [HNF6], FOXA1, and FOXA2) and transcriptional repressors (CUX2 and BCL6). Repression of female-specific lincRNAs in male liver, but not that of male-specific lincRNAs in female liver, was associated with enrichment of H3K27me3-associated inactive states and poised (bivalent) enhancer states. Strikingly, we found that liver-specific lincRNA gene promoters are more highly species conserved and have a significantly higher frequency of proximal binding by liver transcription factors than liver-specific protein-coding gene promoters. Orthologs for many liver lincRNAs were identified in one or more supraprimates, including two rat lincRNAs showing the same growth hormone-regulated, sex-biased expression as their mouse counterparts. This integrative analysis of liver lincRNA chromatin states, transcription factor occupancy, and growth hormone regulation provides novel insights into the expression of sex-specific lincRNAs and their potential for regulation of sex differences in liver physiology and disease. PMID:26459762

  16. Metazoan tRNA introns generate stable circular RNAs in vivo

    PubMed Central

    Lu, Zhipeng; Filonov, Grigory S.; Noto, John J.; Schmidt, Casey A.; Hatkevich, Talia L.; Wen, Ying; Jaffrey, Samie R.; Matera, A. Gregory

    2015-01-01

    We report the discovery of a class of abundant circular noncoding RNAs that are produced during metazoan tRNA splicing. These transcripts, termed tRNA intronic circular (tric)RNAs, are conserved features of animal transcriptomes. Biogenesis of tricRNAs requires anciently conserved tRNA sequence motifs and processing enzymes, and their expression is regulated in an age-dependent and tissue-specific manner. Furthermore, we exploited this biogenesis pathway to develop an in vivo expression system for generating “designer” circular RNAs in human cells. Reporter constructs expressing RNA aptamers such as Spinach and Broccoli can be used to follow the transcription and subcellular localization of tricRNAs in living cells. Owing to the superior stability of circular vs. linear RNA isoforms, this expression system has a wide range of potential applications, from basic research to pharmaceutical science. PMID:26194134

  17. Co-chaperone Hsp70/Hsp90-organizing protein (Hop) is required for transposon silencing and Piwi-interacting RNA (piRNA) biogenesis.

    PubMed

    Karam, Joseph A; Parikh, Rasesh Y; Nayak, Dhananjaya; Rosenkranz, David; Gangaraju, Vamsi K

    2017-04-14

    Piwi-interacting RNAs (piRNAs) are 26-30-nucleotide germ line-specific small non-coding RNAs that have evolutionarily conserved function in mobile genetic element (transposons) silencing and maintenance of genome integrity. Drosophila Hsp70/90-organizing protein homolog (Hop), a co-chaperone, interacts with piRNA-binding protein Piwi and mediates silencing of phenotypic variations. However, it is not known whether Hop has a direct role in piRNA biogenesis and transposon silencing. Here, we show that knockdown of Hop in the germ line nurse cells (GLKD) of Drosophila ovaries leads to activation of transposons. Hop GLKD females can lay eggs at the same rate as wild-type counterparts, but the eggs do not hatch into larvae. Hop GLKD leads to the accumulation of γ-H2Av foci in the germ line, indicating increased DNA damage in the ovary. We also show that Hop GLKD-induced transposon up-regulation is due to inefficient piRNA biogenesis. Based on these results, we conclude that Hop is a critical component of the piRNA pathway and that it maintains genome integrity by silencing transposons. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  18. Compound heterozygous deletions in pseudoautosomal region 1 in an infant with mild manifestations of langer mesomelic dysplasia.

    PubMed

    Tsuchiya, Takayoshi; Shibata, Minoru; Numabe, Hironao; Jinno, Tomoko; Nakabayashi, Kazuhiko; Nishimura, Gen; Nagai, Toshiro; Ogata, Tsutomu; Fukami, Maki

    2014-02-01

    Haploinsufficiency of SHOX on the short arm pseudoautosomal region (PAR1) leads to Leri-Weill dyschondrosteosis (LWD), and nullizygosity of SHOX results in Langer mesomelic dysplasia (LMD). Molecular defects of LWD/LMD include various microdeletions in PAR1 that involve exons and/or the putative upstream or downstream enhancer regions of SHOX, as well as several intragenic mutations. Here, we report on a Japanese male infant with mild manifestations of LMD and hitherto unreported microdeletions in PAR1. Clinical analysis revealed mesomelic short stature with various radiological findings indicative of LMD. Molecular analyses identified compound heterozygous deletions, that is, a maternally inherited ∼46 kb deletion involving the upstream region and exons 1-5 of SHOX, and a paternally inherited ∼500 kb deletion started from a position ∼300 kb downstream from SHOX. In silico analysis revealed that the downstream deletion did not affect the known putative enhancer regions of SHOX, although it encompassed several non-coding elements which were well conserved among various species with SHOX orthologs. These results provide the possibility of the presence of a novel enhancer for SHOX in the genomic region ∼300 to ∼800 kb downstream of the start codon. © 2013 Wiley Periodicals, Inc.

  19. Control of myogenesis by rodent SINE-containing lncRNAs

    PubMed Central

    Wang, Jiashi; Gong, Chenguang; Maquat, Lynne E.

    2013-01-01

    Staufen1-mediated mRNA decay (SMD) degrades mRNAs that harbor a Staufen1-binding site (SBS) in their 3′ untranslated regions (UTRs). Human SBSs can form by intermolecular base-pairing between a 3′ UTR Alu element and an Alu element within a long noncoding RNA (lncRNA) called a ½-sbsRNA. Since Alu elements are confined to primates, it was unclear how SMD occurs in rodents. Here we identify mouse mRNA 3′ UTRs and lncRNAs that contain a B1, B2, B4, or identifier (ID) element. We show that SMD occurs in mouse cells via mRNA–lncRNA base-pairing of partially complementary elements and that mouse ½-sbsRNA (m½-sbsRNA)-triggered SMD regulates C2C12 cell myogenesis. Our findings define new roles for lncRNAs as well as B and ID short interspersed elements (SINEs) in mice that undoubtedly influence many developmental and homeostatic pathways. PMID:23558772

  20. Biological significance of long non-coding RNA FTX expression in human colorectal cancer

    PubMed Central

    Guo, Xiao-Bo; Hua, Zhu; Li, Chen; Peng, Li-Pan; Wang, Jing-Shen; Wang, Bo; Zhi, Qiao-Ming

    2015-01-01

    The purpose of this study was to determine the expression of long non-coding RNA (lncRNA) FTX and analyze its prognostic and biological significance in colorectal cancer (CRC). A quantitative reverse transcription PCR was performed to detect the expression of long non-coding RNA FTX in 35 pairs of colorectal cancer and corresponding noncancerous tissues. The expression of long non-coding RNA FTX was detected in 187 colorectal cancer tissues and its correlations with clinicopathological factors of patients were examined. Univariate and multivariate analyses were performed to analyze the prognostic significance of Long Non-coding RNA FTX expression. The effects of long non-coding RNA FTX expression on malignant phenotypes of colorectal cancer cells and its possible biological significances were further determined. Long non-coding RNA FTX was significantly upregulated in colorectal cancer tissues, and low long non-coding RNA FTX expression was significantly correlated with differentiation grade, lymph vascular invasion, and clinical stage. Patients with high long non-coding RNA FTX showed poorer overall survival than those with low long non-coding RNA FTX. Multivariate analyses indicated that status of long non-coding RNA FTX was an independent prognostic factor for patients. Functional analyses showed that upregulation of long non-coding RNA FTX significantly promoted growth, migration, invasion, and increased colony formation in colorectal cancer cells. Therefore, long non-coding RNA FTX may be a potential biomarker for predicting the survival of colorectal cancer patients and might be a molecular target for treatment of human colorectal cancer. PMID:26629053

  1. SoxB2 in sea urchin development: implications in neurogenesis, ciliogenesis and skeletal patterning.

    PubMed

    Anishchenko, Evgeniya; Arnone, Maria Ina; D'Aniello, Salvatore

    2018-01-01

    Current studies in evolutionary developmental biology are focused on the reconstruction of gene regulatory networks in target animal species. For decades, scientific interest in the genetic mechanisms orchestrating embryonic development has been increasing as common features shared by evolutionarily distant phyla have been clarified. In 2011, a study across eumetazoan species showed for the first time the existence of a highly conserved non-coding element controlling the SoxB2 gene, which is involved in the early specification of the nervous system. This discovery raised several questions about SoxB2 function and regulation in deuterostomes from an evolutionary point of view. Owing to its relevant phylogenetic position within deuterostomes, the sea urchin Strongylocentrotus purpuratus represents an advantageous animal model in the field of evolutionary developmental biology. Herein, we present a comprehensive study of SoxB2 functions in sea urchins, in particular its expression pattern in a wide range of developmental stages, and its co-localization with other neurogenic markers, such as SoxB1, SoxC and Elav. Moreover, this work provides a detailed description of the phenotype of sea urchin SoxB2 knockdown embryos, confirming its key function in neurogenesis and revealing, for the first time, its additional roles in oral and aboral ectoderm cilia and skeletal rod morphology. We concluded that SoxB2 in sea urchins has a neurogenic function; however, this gene could have multiple roles in sea urchin embryogenesis, expanding its expression in non-neurogenic cells. We showed that SoxB2 is functionally conserved among deuterostomes and suggested that in S. purpuratus this gene acquired additional functions, being involved in ciliogenesis and skeletal patterning.

  2. Genetic, comparative genomic, and expression analyses of the Mc1r locus in the polychromatic Midas cichlid fish (Teleostei, Cichlidae Amphilophus sp.) species group.

    PubMed

    Henning, Frederico; Renz, Adina Josepha; Fukamachi, Shoji; Meyer, Axel

    2010-05-01

    Natural populations of the Midas cichlid species in several different crater lakes in Nicaragua exhibit a conspicuous color polymorphism. Most individuals are dark and the remainder have a gold coloration. The color morphs mate assortatively and sympatric population differentiation has been shown based on neutral molecular data. We investigated the color polymorphism using segregation analysis and a candidate gene approach. The segregation patterns observed in a mapping cross between a gold and a dark individual were consistent with a single dominant gene as a cause of the gold phenotype. This suggests that a simple genetic architecture underlies some of the speciation events in the Midas cichlids. We compared the expression levels of several candidate color genes, Mc1r, Ednrb1, Slc45a2, and Tfap1a, between the color morphs. Mc1r was found to be upregulated in the gold morph. Given its widespread association with color evolution and its role in melanin synthesis, the Mc1r locus was further investigated using sequences derived from a genomic library. Comparative analysis revealed conserved synteny in relation to the majority of teleosts and highlighted several previously unidentified conserved non-coding elements (CNEs) in the upstream and downstream regions in the vicinity of Mc1r. The identification of the CNE regions allowed the comparison of sequences from gold and dark specimens of natural populations. No polymorphisms were found in the population sample and Mc1r showed no linkage to the gold phenotype in the mapping cross, demonstrating that it is not causally related to the color polymorphism in the Midas cichlid.

  3. NPInter v3.0: an upgraded database of noncoding RNA-associated interactions

    PubMed Central

    Hao, Yajing; Wu, Wei; Li, Hui; Yuan, Jiao; Luo, Jianjun; Zhao, Yi; Chen, Runsheng

    2016-01-01

    Despite the fact that a large quantity of noncoding RNAs (ncRNAs) have been identified, their functions remain unclear. To enable researchers to have a better understanding of ncRNAs’ functions, we updated the NPInter database to version 3.0, which contains experimentally verified interactions between ncRNAs (excluding tRNAs and rRNAs), especially long noncoding RNAs (lncRNAs) and other biomolecules (proteins, mRNAs, miRNAs and genomic DNAs). In NPInter v3.0, interactions pertaining to ncRNAs are not only manually curated from scientific literature but also curated from high-throughput technologies. In addition, we also curated lncRNA–miRNA interactions from in silico predictions supported by AGO CLIP-seq data. When compared with NPInter v2.0, the interactions are more informative (with additional information on tissues or cell lines, binding sites, conservation, co-expression values and other features) and more organized (with divisions on data sets by data sources, tissues or cell lines, experiments and other criteria). NPInter v3.0 expands the data set to 491,416 interactions in 188 tissues (or cell lines) from 68 kinds of experimental technologies. NPInter v3.0 also improves the user interface and adds new web services, including a local UCSC Genome Browser to visualize binding sites. Additionally, NPInter v3.0 defined a high-confidence set of interactions and predicted the functions of lncRNAs in human and mouse based on the interactions curated in the database. NPInter v3.0 is available at http://www.bioinfo.org/NPInter/. Database URL: http://www.bioinfo.org/NPInter/ PMID:27087310

  4. The Non-Coding RNA Ncr0700/PmgR1 is Required for Photomixotrophic Growth and the Regulation of Glycogen Accumulation in the Cyanobacterium Synechocystis sp. PCC 6803.

    PubMed

    de Porcellinis, Alice J; Klähn, Stephan; Rosgaard, Lisa; Kirsch, Rebekka; Gutekunst, Kirstin; Georg, Jens; Hess, Wolfgang R; Sakuragi, Yumiko

    2016-10-01

    Carbohydrate metabolism is a tightly regulated process in photosynthetic organisms. In the cyanobacterium Synechocystis sp. PCC 6803, the photomixotrophic growth protein A (PmgA) is involved in the regulation of glucose and storage carbohydrate (i.e. glycogen) metabolism, while its biochemical activity and possible factors acting downstream of PmgA are unknown. Here, a genome-wide microarray analysis of a ΔpmgA strain identified the expression of 36 protein-coding genes and 42 non-coding transcripts as significantly altered. From these, the non-coding RNA Ncr0700 was identified as the transcript most strongly reduced in abundance. Ncr0700 is widely conserved among cyanobacteria. In Synechocystis its expression is inversely correlated with light intensity. Similarly to a ΔpmgA mutant, a Δncr0700 deletion strain showed an approximately 2-fold increase in glycogen content under photoautotrophic conditions and wild-type-like growth. Moreover, its growth was arrested by 38 h after a shift to photomixotrophic conditions. Ectopic expression of Ncr0700 in Δncr0700 and ΔpmgA restored the glycogen content and photomixotrophic growth to wild-type levels. These results indicate that Ncr0700 is required for photomixotrophic growth and the regulation of glycogen accumulation, and acts downstream of PmgA. Hence Ncr0700 is renamed here as PmgR1 for photomixotrophic growth RNA 1. © The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  5. Transposable elements at the center of the crossroads between embryogenesis, embryonic stem cells, reprogramming, and long non-coding RNAs.

    PubMed

    Hutchins, Andrew Paul; Pei, Duanqing

    Transposable elements (TEs) are mobile genomic sequences of DNA capable of autonomous and non-autonomous duplication. TEs have been highly successful, and nearly half of the human genome now consists of various families of TEs. Originally thought to be non-functional, these elements have been co-opted by animal genomes to perform a variety of physiological functions ranging from TE-derived proteins acting directly in normal biological functions, to innovations in transcription factor logic and influence on epigenetic control of gene expression. During embryonic development, when the genome is epigenetically reprogrammed and DNA-demethylated, TEs are released from repression and show embryonic stage-specific expression, and in human and mouse embryos, intact TE-derived endogenous viral particles can even be detected. A similar process occurs during the reprogramming of somatic cells to pluripotent cells: When the somatic DNA is demethylated, TEs are released from repression. In embryonic stem cells (ESCs), where DNA is hypomethylated, an elaborate system of epigenetic control is employed to suppress TEs, a system that often overlaps with normal epigenetic control of ESC gene expression. Finally, many long non-coding RNAs (lncRNAs) involved in normal ESC function and those assisting or impairing reprogramming contain multiple TEs in their RNA. These TEs may act as regulatory units to recruit RNA-binding proteins and epigenetic modifiers. This review covers how TEs are interlinked with the epigenetic machinery and lncRNAs, and how these links influence each other to modulate aspects of ESCs, embryogenesis, and somatic cell reprogramming.

  6. Evolutionary analysis reveals regulatory and functional landscape of coding and non-coding RNA editing.

    PubMed

    Zhang, Rui; Deng, Patricia; Jacobson, Dionna; Li, Jin Billy

    2017-02-01

    Adenosine-to-inosine RNA editing diversifies the transcriptome and promotes functional diversity, particularly in the brain. A plethora of editing sites has been recently identified; however, how they are selected and regulated and which are functionally important are largely unknown. Here we show the cis-regulation and stepwise selection of RNA editing during Drosophila evolution and pinpoint a large number of functional editing sites. We found that the establishment of editing and variation in editing levels across Drosophila species are largely explained and predicted by cis-regulatory elements. Furthermore, editing events that arose early in the species tree tend to be more highly edited in clusters and enriched in slowly-evolved neuronal genes, thus suggesting that the main role of RNA editing is for fine-tuning neurological functions. While nonsynonymous editing events have been long recognized as playing a functional role, in addition to nonsynonymous editing sites, a large fraction of 3'UTR editing sites is evolutionarily constrained, highly edited, and thus likely functional. We find that these 3'UTR editing events can alter mRNA stability and affect miRNA binding and thus highlight the functional roles of noncoding RNA editing. Our work, through evolutionary analyses of RNA editing in Drosophila, uncovers novel insights of RNA editing regulation as well as its functions in both coding and non-coding regions.

  7. Evolutionary analysis reveals regulatory and functional landscape of coding and non-coding RNA editing

    PubMed Central

    Jacobson, Dionna

    2017-01-01

    Adenosine-to-inosine RNA editing diversifies the transcriptome and promotes functional diversity, particularly in the brain. A plethora of editing sites has been recently identified; however, how they are selected and regulated and which are functionally important are largely unknown. Here we show the cis-regulation and stepwise selection of RNA editing during Drosophila evolution and pinpoint a large number of functional editing sites. We found that the establishment of editing and variation in editing levels across Drosophila species are largely explained and predicted by cis-regulatory elements. Furthermore, editing events that arose early in the species tree tend to be more highly edited in clusters and enriched in slowly-evolved neuronal genes, thus suggesting that the main role of RNA editing is for fine-tuning neurological functions. While nonsynonymous editing events have been long recognized as playing a functional role, in addition to nonsynonymous editing sites, a large fraction of 3’UTR editing sites is evolutionarily constrained, highly edited, and thus likely functional. We find that these 3’UTR editing events can alter mRNA stability and affect miRNA binding and thus highlight the functional roles of noncoding RNA editing. Our work, through evolutionary analyses of RNA editing in Drosophila, uncovers novel insights of RNA editing regulation as well as its functions in both coding and non-coding regions. PMID:28166241

  8. Noncoding transcripts in sense and antisense orientation regulate the epigenetic state of ribosomal RNA genes.

    PubMed

    Bierhoff, H; Schmitz, K; Maass, F; Ye, J; Grummt, I

    2010-01-01

    Alternative transcription of the same gene in sense and antisense orientation regulates expression of protein-coding genes. Here we show that noncoding RNA (ncRNA) in sense and antisense orientation also controls transcription of rRNA genes (rDNA). rDNA exists in two types of chromatin--a euchromatic conformation that is permissive to transcription and a heterochromatic conformation that is transcriptionally silent. Silencing of rDNA is mediated by NoRC, a chromatin-remodeling complex that triggers heterochromatin formation. NoRC function requires RNA that is complementary to the rDNA promoter (pRNA). pRNA forms a DNA:RNA triplex with a regulatory element in the rDNA promoter, and this triplex structure is recognized by DNMT3b. The results imply that triplex-mediated targeting of DNMT3b to specific sequences may be a common pathway in epigenetic regulation. We also show that rDNA is transcribed in antisense orientation. The level of antisense RNA (asRNA) is down-regulated in cancer cells and up-regulated in senescent cells. Ectopic asRNA triggers trimethylation of histone H4 at lysine 20 (H4K20me3), suggesting that antisense transcripts guide the histone methyltransferase Suv4-20 to rDNA. The results reveal that noncoding RNAs in sense and antisense orientation are important determinants of the epigenetic state of rDNA.

  9. Non coding RNAs in vascular disease - from basic science to clinical applications: Scientific update from the Working Group of Myocardial Function of the European Society of Cardiology

    PubMed

    Fiedler, Jan; Baker, Andrew H; Dimmeler, Stefanie; Heymans, Stephane; Mayr, Manuel; Thum, Thomas

    2018-05-23

    Non-coding RNAs are increasingly recognized not only as regulators of various biological functions but also as targets for a new generation of RNA therapeutics and biomarkers. We hereby review recent insights relating to non-coding RNAs including microRNAs (e.g. miR-126, miR-146a), long non-coding RNAs (e.g. MIR503HG, GATA6-AS, SMILR) and circular RNAs (e.g. cZNF292) and their role in vascular diseases. This includes identification and therapeutic use of hypoxia-regulated non-coding RNAs and endogenous non-coding RNAs that regulate intrinsic smooth muscle cell signalling, age-related non-coding RNAs and non-coding RNAs involved in the regulation of mitochondrial biology and metabolic control. Finally, we discuss non-coding RNA species with biomarker potential.

  10. A Summary of the Space-Time Conservation Element and Solution Element (CESE) Method

    NASA Technical Reports Server (NTRS)

    Wang, Xiao-Yen J.

    2015-01-01

    The space-time Conservation Element and Solution Element (CESE) method for solving conservation laws is examined for its development motivation and design requirements. The characteristics of the resulting scheme are discussed. The discretization of the Euler equations is presented to show readers how to construct a scheme based on the CESE method. The differences and similarities between the CESE method and other traditional methods are discussed. The strengths and weaknesses of the method are also addressed.

  11. The non-coding RNA landscape of human hematopoiesis and leukemia.

    PubMed

    Schwarzer, Adrian; Emmrich, Stephan; Schmidt, Franziska; Beck, Dominik; Ng, Michelle; Reimer, Christina; Adams, Felix Ferdinand; Grasedieck, Sarah; Witte, Damian; Käbler, Sebastian; Wong, Jason W H; Shah, Anushi; Huang, Yizhou; Jammal, Razan; Maroz, Aliaksandra; Jongen-Lavrencic, Mojca; Schambach, Axel; Kuchenbauer, Florian; Pimanda, John E; Reinhardt, Dirk; Heckl, Dirk; Klusmann, Jan-Henning

    2017-08-09

    Non-coding RNAs have emerged as crucial regulators of gene expression and cell fate decisions. However, their expression patterns and regulatory functions during normal and malignant human hematopoiesis are incompletely understood. Here we present a comprehensive resource defining the non-coding RNA landscape of the human hematopoietic system. Based on highly specific non-coding RNA expression portraits per blood cell population, we identify unique fingerprint non-coding RNAs (such as LINC00173 in granulocytes) and assign these to critical regulatory circuits involved in blood homeostasis. Following the incorporation of acute myeloid leukemia samples into the landscape, we further uncover prognostically relevant non-coding RNA stem cell signatures shared between acute myeloid leukemia blasts and healthy hematopoietic stem cells. Our findings highlight the importance of the non-coding transcriptome in the formation and maintenance of the human blood hierarchy. While micro-RNAs are known regulators of haematopoiesis and leukemogenesis, the role of long non-coding RNAs is less clear. Here the authors provide a non-coding RNA expression landscape of the human hematopoietic system, highlighting their role in the formation and maintenance of the human blood hierarchy.

  12. Circular RNA: a new star in neurological diseases.

    PubMed

    Li, Tao-Ran; Jia, Yan-Jie; Wang, Qun; Shao, Xiao-Qiu; Lv, Rui-Juan

    2017-08-01

    Circular RNAs (circRNAs) are novel endogenous non-coding RNAs characterized by the presence of a covalent bond linking the 3' and 5' ends generated by backsplicing. In this review, we summarize a number of the latest theories regarding the biogenesis, properties and functions of circRNAs. Specifically, we focus on the advancing characteristics and functions of circRNAs in the brain and neurological diseases. CircRNAs exhibit the characteristics of species conservation, abundance and tissue/developmental-stage-specific expression in the brain. We also describe the relationship between circRNAs and several neurological diseases and highlight their functions in neurological diseases.

  13. The ribonucleoprotein Csr network.

    PubMed

    Seyll, Ethel; Van Melderen, Laurence

    2013-11-08

    Ribonucleoprotein complexes are essential regulatory components in bacteria. In this review, we focus on the carbon storage regulator (Csr) network, which is well conserved in the bacterial world. This regulatory network is composed of the CsrA master regulator, its targets and regulators. CsrA binds to mRNA targets and regulates translation either negatively or positively. Binding to small non-coding RNAs controls activity of this protein. Expression of these regulators is tightly regulated at the level of transcription and stability by various global regulators (RNAses, two-component systems, alarmone). We discuss the implications of these complex regulations in bacterial adaptation.

  14. The complete mitogenome of the Australian tadpole shrimp Triops australiensis (Spencer & Hall, 1895) (Crustacea: Branchiopoda: Notostraca).

    PubMed

    Gan, Han Ming; Tan, Mun Hua; Lee, Yin Peng; Austin, Christopher M

    2016-05-01

    The mitochondrial genome sequence of the Australian tadpole shrimp, Triops australiensis is presented (GenBank Accession Number: NC_024439) and compared with other Triops species. Triops australiensis has a mitochondrial genome of 15,125 base pairs consisting of 13 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and a non-coding AT-rich region. The T. australiensis mitogenome is composed of 36.4% A, 16.1% C, 12.3% G and 35.1% T. The mitogenome gene order conforms to the primitive arrangement for Branchiopod crustaceans, which is also conserved within the Pancrustacean.
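
    As a quick arithmetic check on the base composition reported in this record, the sketch below sums the four percentages and derives the combined A+T fraction. The name `at_content` and the dictionary layout are illustrative assumptions and do not come from the record.

```python
# Base composition as reported in the abstract (percent of the 15,125 bp mitogenome).
composition = {"A": 36.4, "C": 16.1, "G": 12.3, "T": 35.1}


def at_content(comp):
    """Return the combined A+T percentage from a base-composition mapping."""
    return comp["A"] + comp["T"]


# Sum of all four reported percentages; rounding in the source can leave a small gap to 100.
total = sum(composition.values())
at = at_content(composition)

print(f"sum of reported percentages: {total:.1f}")
print(f"A+T content: {at:.1f}")
```

    The reported values sum to 99.9 rather than 100, presumably because each figure was rounded to one decimal place; the A+T share works out to 71.5 percent.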

  15. Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters.

    PubMed

    Javierre, Biola M; Burren, Oliver S; Wilder, Steven P; Kreuzhuber, Roman; Hill, Steven M; Sewitz, Sven; Cairns, Jonathan; Wingett, Steven W; Várnai, Csilla; Thiecke, Michiel J; Burden, Frances; Farrow, Samantha; Cutler, Antony J; Rehnström, Karola; Downes, Kate; Grassi, Luigi; Kostadima, Myrto; Freire-Pritchett, Paula; Wang, Fan; Stunnenberg, Hendrik G; Todd, John A; Zerbino, Daniel R; Stegle, Oliver; Ouwehand, Willem H; Frontini, Mattia; Wallace, Chris; Spivakov, Mikhail; Fraser, Peter

    2016-11-17

    Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  16. Long non-coding RNAs as molecular players in plant defense against pathogens.

    PubMed

    Zaynab, Madiha; Fatima, Mahpara; Abbas, Safdar; Umair, Muhammad; Sharif, Yasir; Raza, Muhammad Ammar

    2018-05-31

    Long non-coding RNAs (lncRNAs) have a significant role in gene expression and silencing pathways for several biological processes in eukaryotes. lncRNAs have been reported as key players in remodeling chromatin and genome architecture, RNA stabilization and transcription regulation, including enhancer-associated activity. Host lncRNAs are regarded as essential elements of plant defense. In response to pathogen attack, plants protect themselves with the help of lncRNA-dependent immune systems in which lncRNAs regulate pathogen-associated molecular patterns (PAMPs) and other effectors. The role of lncRNAs in plant-microbe interactions has been studied extensively, but the regulation of several lncRNAs still needs extensive research. In this study we discuss and provide an overview of recent advancements and findings relevant to pathogen attack and plant defense mediated by lncRNAs. It is hoped that lncRNAs can be exploited as a mainstream player to achieve food security by tackling different plant diseases. Copyright © 2018. Published by Elsevier Ltd.

  17. Destabilization of B2 RNA by EZH2 activates the stress response

    PubMed Central

    Zovoilis, Athanasios; Cifuentes-Rojas, Catherine; Chu, Hsueh-Ping; Hernandez, Alfredo J.; Lee, Jeannie T.

    2017-01-01

    SUMMARY More than 98% of the mammalian genome is noncoding and interspersed transposable elements account for ~50% of noncoding space. Here, we demonstrate that a specific interaction between the Polycomb protein, EZH2, and RNA made from B2 SINE retrotransposons controls stress-responsive genes in mouse cells. In the heat shock model, B2 RNA binds stress genes and suppresses their transcription. Upon stress, EZH2 is recruited and triggers cleavage of B2 RNA. B2 degradation in turn upregulates stress genes. Evidence indicates that B2 RNA operates as a “speed bump” against advancement of RNA Polymerase II, and temperature stress releases the brakes on transcriptional elongation. These data attribute a new function to EZH2 that is independent of its histone methyltransferase activity and reconcile how EZH2 can be associated with both gene repression and activation. Our study reveals that EZH2 and B2 together control activation of a large network of genes involved in thermal stress. PMID:27984727

  18. Structural insights into the stabilization of MALAT1 noncoding RNA by a bipartite triple helix

    PubMed Central

    Brown, Jessica A.; Bulkley, David; Wang, Jimin; Valenstein, Max L.; Yario, Therese A.; Steitz, Thomas A.; Steitz, Joan A.

    2014-01-01

    Metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) is a highly-abundant nuclear long noncoding RNA that promotes malignancy. A 3′-stem-loop structure is predicted to confer stability by engaging a downstream A-rich tract in a triple helix, similar to the expression and nuclear retention element (ENE) from the KSHV polyadenylated nuclear RNA. The 3.1-Å resolution crystal structure of the human MALAT1 ENE and A-rich tract reveals a bipartite triple helix containing stacks of five and four U•A-U triples separated by a C+•G-C triplet and C-G doublet, extended by two A-minor interactions. In vivo decay assays indicate that this blunt-ended triple helix, with the 3′ nucleotide in a U•A-U triple, inhibits rapid nuclear RNA decay. Interruption of the triple helix by the C-G doublet induces a “helical reset” that explains why triple-helical stacks longer than six do not occur in nature. PMID:24952594

  19. Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations

    PubMed Central

    Araya, Carlos L.; Cenik, Can; Reuter, Jason A.; Kiss, Gert; Pande, Vijay S.; Snyder, Michael P.; Greenleaf, William J.

    2015-01-01

    Cancer sequencing studies have primarily identified cancer-driver genes by the accumulation of protein-altering mutations. An improved method would be annotation-independent, sensitive to unknown distributions of functions within proteins, and inclusive of non-coding drivers. We employed density-based clustering methods in 21 tumor types to detect variably-sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and non-coding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs reveal spatial clustering of mutations at molecular domains and interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated among tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally-agnostic driver identification. PMID:26691984
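
    To make the idea of density-based detection of mutated regions concrete, here is a minimal one-dimensional sketch. It is a simplified stand-in and not the authors' pipeline: the function name `cluster_positions`, the parameters `eps` and `min_pts`, and the toy coordinates are all hypothetical. Positions are grouped when neighbouring mutations lie within `eps` bases of each other, and groups with fewer than `min_pts` mutations are treated as background.

```python
def cluster_positions(positions, eps=10, min_pts=3):
    """Group 1-D positions into regions whose neighbouring points are <= eps apart.

    Returns a list of (start, end, count) tuples; groups with fewer than
    min_pts positions are discarded as background.
    """
    regions = []
    current = []
    for pos in sorted(positions):
        # A gap larger than eps closes the current group.
        if current and pos - current[-1] > eps:
            if len(current) >= min_pts:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Flush the last open group.
    if len(current) >= min_pts:
        regions.append((current[0], current[-1], len(current)))
    return regions


# Toy mutation coordinates, made up for illustration only.
mutations = [100, 104, 109, 111, 500, 900, 905, 907, 2000]
regions = cluster_positions(mutations, eps=10, min_pts=3)
print(regions)
```

    Because region boundaries come from the data rather than from gene annotations, a sketch like this yields variably sized regions, which is the property the abstract emphasizes. A real analysis would also need a significance test against a background mutation rate, which is omitted here.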

  20. Stable CoT-1 repeat RNA is abundant and associated with euchromatic interphase chromosomes

    PubMed Central

    Hall, Lisa L.; Carone, Dawn M.; Gomez, Alvin; Kolpa, Heather J.; Byron, Meg; Mehta, Nitish; Fackelmayer, Frank O.; Lawrence, Jeanne B.

    2014-01-01

    SUMMARY Recent studies recognize a vast diversity of non-coding RNAs with largely unknown functions, but few have examined interspersed repeat sequences, which constitute almost half our genome. RNA hybridization in situ using CoT-1 (highly repeated) DNA probes detects surprisingly abundant euchromatin-associated RNA comprised predominantly of repeat sequences (“CoT-1 RNA”), including LINE-1. CoT-1-hybridizing RNA strictly localizes to the interphase chromosome territory in cis, and remains stably associated with the chromosome territory following prolonged transcriptional inhibition. The CoT-1 RNA territory resists mechanical disruption and fractionates with the non-chromatin scaffold, but can be experimentally released. Loss of repeat-rich, stable nuclear RNAs from euchromatin corresponds to aberrant chromatin distribution and condensation. CoT-1 RNA has several properties similar to XIST chromosomal RNA, but is excluded from chromatin condensed by XIST. These findings impact two “black boxes” of genome science: the poorly understood diversity of non-coding RNA and the unexplained abundance of repetitive elements. PMID:24581492

  1. Advances in esophageal cancer: A new perspective on pathogenesis associated with long non-coding RNAs.

    PubMed

    Huang, Xiaomei; Zhou, Xi; Hu, Qing; Sun, Binyu; Deng, Mingming; Qi, Xiaolong; Lü, Muhan

    2018-01-28

    Esophageal cancer is a malignant digestive tract cancer with high mortality. Although studies have found that esophageal cancer is involved in a complex and important gene regulation network, the pathogenesis remains unclear. The recently described long non-coding RNAs (lncRNAs) are one effective part of the gene regulation network. However, in past decades, lncRNAs were thought to be "transcript noise" or "pseudogenes" and were thus ignored. Early studies indicated that lncRNAs play pivotal roles during evolution. However, in recent years, increasing research has revealed that many lncRNAs are associated with tumorigenesis. In particular, lncRNAs may act as important elements for epigenetic regulation, transcription, post-transcriptional regulation and post-translational modification of proteins. Additionally, they may be novel biomarkers for tumors and therapeutic targets in cancer. Here, we summarize the functions of lncRNAs in esophageal cancer, with an emphasis on lncRNA-mediated regulatory mechanisms that affect the biological characteristics of esophageal cancer. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Argonaute Proteins and Mechanisms of RNA Interference in Eukaryotes and Prokaryotes.

    PubMed

    Olina, A V; Kulbachinskiy, A V; Aravin, A A; Esyunina, D M

    2018-05-01

    Noncoding RNAs play essential roles in genetic regulation in all organisms. In eukaryotic cells, many small noncoding RNAs act in complex with Argonaute proteins and regulate gene expression by recognizing complementary RNA targets. The complexes of Argonaute proteins with small RNAs also play a key role in silencing of mobile genetic elements and, in some cases, viruses. These processes are collectively called RNA interference. RNA interference is a powerful tool for specific gene silencing in both basic research and therapeutic applications. Argonaute proteins are also found in prokaryotic organisms. Recent studies have shown that prokaryotic Argonautes can also cleave their target nucleic acids, in particular DNA. This activity of prokaryotic Argonautes might potentially be used to edit eukaryotic genomes. However, the molecular mechanisms of small nucleic acid biogenesis and the functions of Argonaute proteins, in particular in bacteria and archaea, remain largely unknown. Here we briefly review available data on the RNA interference processes and Argonaute proteins in eukaryotes and prokaryotes.

  3. Metazoan tRNA introns generate stable circular RNAs in vivo.

    PubMed

    Lu, Zhipeng; Filonov, Grigory S; Noto, John J; Schmidt, Casey A; Hatkevich, Talia L; Wen, Ying; Jaffrey, Samie R; Matera, A Gregory

    2015-09-01

    We report the discovery of a class of abundant circular noncoding RNAs that are produced during metazoan tRNA splicing. These transcripts, termed tRNA intronic circular (tric)RNAs, are conserved features of animal transcriptomes. Biogenesis of tricRNAs requires anciently conserved tRNA sequence motifs and processing enzymes, and their expression is regulated in an age-dependent and tissue-specific manner. Furthermore, we exploited this biogenesis pathway to develop an in vivo expression system for generating "designer" circular RNAs in human cells. Reporter constructs expressing RNA aptamers such as Spinach and Broccoli can be used to follow the transcription and subcellular localization of tricRNAs in living cells. Owing to the superior stability of circular vs. linear RNA isoforms, this expression system has a wide range of potential applications, from basic research to pharmaceutical science. © 2015 Lu et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  4. Transcriptional landscapes of Axolotl (Ambystoma mexicanum).

    PubMed

    Caballero-Pérez, Juan; Espinal-Centeno, Annie; Falcon, Francisco; García-Ortega, Luis F; Curiel-Quesada, Everardo; Cruz-Hernández, Andrés; Bako, Laszlo; Chen, Xuemei; Martínez, Octavio; Alberto Arteaga-Vázquez, Mario; Herrera-Estrella, Luis; Cruz-Ramírez, Alfredo

    2018-01-15

    The axolotl (Ambystoma mexicanum) is the vertebrate model system with the highest regeneration capacity. Experimental tools established over the past 100 years have been fundamental to start unraveling the cellular and molecular basis of tissue and limb regeneration. In the absence of a reference genome for the axolotl, transcriptomic analysis becomes fundamental to understanding the genetic basis of regeneration. Here we present one of the most diverse transcriptomic data sets for the axolotl by profiling coding and non-coding RNAs from diverse tissues. We reconstructed a population of 115,906 putative protein-coding mRNAs as full ORFs (including isoforms). We also identified 352 conserved miRNAs and 297 novel putative mature miRNAs. Systematic enrichment analysis of gene expression allowed us to identify tissue-specific protein-coding transcripts. We also found putative novel and conserved microRNAs that potentially target mRNAs reported as important disease candidates in heart and liver. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Strategies and tools for whole genome alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas

    2002-11-25

    The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.

  6. Computational analysis of conserved RNA secondary structure in transcriptomes and genomes.

    PubMed

    Eddy, Sean R

    2014-01-01

    Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.

  7. Long Noncoding RNA-Associated Transcriptomic Changes in Resiliency or Susceptibility to Depression and Response to Antidepressant Treatment

    PubMed Central

    Roy, Bhaskar; Wang, Qingzhong; Dwivedi, Yogesh

    2018-01-01

    Background: Recent emergence of long noncoding RNAs in regulating gene expression and thereby modulating physiological functions in brain has manifested their possible role in psychiatric disorders. In this study, the roles of long noncoding RNAs in susceptibility and resiliency to develop stress-induced depression and their response to antidepressant treatment were examined. Methods: Microarray-based transcriptome-wide changes in long noncoding RNAs were determined in hippocampus of male Holtzman rats that showed susceptibility (learned helplessness) or resiliency (nonlearned helplessness) to develop depression. Changes in long noncoding RNA expression were also ascertained after subchronic administration of fluoxetine to learned helplessness rats. Bioinformatic and target prediction analyses (cis- and trans-acting) and qPCR-based assays were performed to decipher the functional role of altered long noncoding RNAs. Results: Group-wise comparison showed an overrepresented class of long noncoding RNAs that were uniquely associated with nonlearned helplessness or learned helplessness behavior. Chromosomal mapping within the 5-kbp flank region of the top 20 dysregulated long noncoding RNAs in the learned helplessness group showed several target genes that were regulated through cis- or trans-actions, including Zbtb20 and Zfp385b from zinc finger binding protein family. Genomic context of differentially expressed long noncoding RNAs showed an overall blunted response in the learned helplessness group regardless of the long noncoding RNA classes analyzed. Gene ontology exhibited the functional clustering for anatomical structure development, cellular architecture modulation, protein metabolism, and cellular communications. Fluoxetine treatment reversed learned helplessness-induced changes in many long noncoding RNAs and target genes. Conclusions: The involvement of specific classes of long noncoding RNAs with distinctive roles in modulating target gene expression could confer the role of long noncoding RNAs in resiliency or susceptibility to develop depression with a reciprocal response to antidepressant treatment. PMID:29390069

  8. Promoter of lncRNA Gene PVT1 Is a Tumor-Suppressor DNA Boundary Element. | Office of Cancer Genomics

    Cancer.gov

    Noncoding mutations in cancer genomes are frequent but challenging to interpret. PVT1 encodes an oncogenic lncRNA, but recurrent translocations and deletions in human cancers suggest alternative mechanisms. Here, we show that the PVT1 promoter has a tumor-suppressor function that is independent of PVT1 lncRNA. CRISPR interference of PVT1 promoter enhances breast cancer cell competition and growth in vivo.

  9. A family of long intergenic non-coding RNA genes in human chromosomal region 22q11.2 carry a DNA translocation breakpoint/AT-rich sequence

    PubMed Central

    2018-01-01

    FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13), is a member of lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence duplicated in chr22, and are specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes. PMID:29668722

  10. Target Discovery for Precision Medicine Using High-Throughput Genome Engineering.

    PubMed

    Guo, Xinyi; Chitale, Poonam; Sanjana, Neville E

    2017-01-01

    Over the past few years, programmable RNA-guided nucleases such as the CRISPR/Cas9 system have ushered in a new era of precision genome editing in diverse model systems and in human cells. Functional screens using large libraries of RNA guides can interrogate a large hypothesis space to pinpoint particular genes and genetic elements involved in fundamental biological processes and disease-relevant phenotypes. Here, we review recent high-throughput CRISPR screens (e.g. loss-of-function, gain-of-function, and targeting noncoding elements) and highlight their potential for uncovering novel therapeutic targets, such as those involved in cancer resistance to small molecular drugs and immunotherapies, tumor evolution, infectious disease, inborn genetic disorders, and other therapeutic challenges.

  11. A new numerical framework for solving conservation laws: The method of space-time conservation element and solution element

    NASA Technical Reports Server (NTRS)

    Chang, Sin-Chung; To, Wai-Ming

    1991-01-01

    A new numerical framework for solving conservation laws is being developed. It employs: (1) a nontraditional formulation of the conservation laws in which space and time are treated on the same footing, and (2) a nontraditional use of discrete variables such that numerical marching can be carried out by using a set of relations that represents both local and global flux conservation.
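
As a hedged illustration of what treating space and time on the same footing can mean, consider a 1D scalar conservation law. The notation below (u for the conserved variable, f for its flux, h for the space-time flux vector, V for an arbitrary space-time region with boundary S(V)) is assumed here for illustration and is not taken from the abstract. Applying the divergence theorem in the (x, t) plane turns the differential form into a flux balance over the boundary of any space-time region:

```latex
\frac{\partial u}{\partial t} + \frac{\partial f(u)}{\partial x} = 0
\quad\Longleftrightarrow\quad
\nabla_{(x,t)} \cdot \mathbf{h} = 0, \qquad \mathbf{h} = (f,\, u),
\qquad\Longrightarrow\qquad
\oint_{S(V)} \mathbf{h} \cdot d\mathbf{s} = 0 .
```

Enforcing this balance on each small space-time cell gives local flux conservation, and summing over cells whose shared boundaries cancel gives global flux conservation.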

  12. Noncoding Subgenomic Flavivirus RNA Is Processed by the Mosquito RNA Interference Machinery and Determines West Nile Virus Transmission by Culex pipiens Mosquitoes.

    PubMed

    Göertz, G P; Fros, J J; Miesen, P; Vogels, C B F; van der Bent, M L; Geertsema, C; Koenraadt, C J M; van Rij, R P; van Oers, M M; Pijlman, G P

    2016-11-15

    Flaviviruses, such as Zika virus, yellow fever virus, dengue virus, and West Nile virus (WNV), are a serious concern for human health. Flaviviruses produce an abundant noncoding subgenomic flavivirus RNA (sfRNA) in infected cells. sfRNA results from stalling of the host 5'-3' exoribonuclease XRN1/Pacman on conserved RNA structures in the 3' untranslated region (UTR) of the viral genomic RNA. sfRNA production is conserved in insect-specific, mosquito-borne, and tick-borne flaviviruses and flaviviruses with no known vector, suggesting a pivotal role for sfRNA in the flavivirus life cycle. Here, we investigated the function of sfRNA during WNV infection of Culex pipiens mosquitoes and evaluated its role in determining vector competence. An sfRNA1-deficient WNV was generated that displayed growth kinetics similar to those of wild-type WNV in both RNA interference (RNAi)-competent and -compromised mosquito cell lines. Small-RNA deep sequencing of WNV-infected mosquitoes indicated an active small interfering RNA (siRNA)-based antiviral response for both the wild-type and sfRNA1-deficient viruses. Additionally, we provide the first evidence that sfRNA is an RNAi substrate in vivo. Two reproducible small-RNA hot spots within the 3' UTR/sfRNA of the wild-type virus mapped to RNA stem-loops SL-III and 3' SL, which stick out of the three-dimensional (3D) sfRNA structure model. Importantly, we demonstrate that sfRNA-deficient WNV displays significantly decreased infection and transmission rates in vivo when administered via the blood meal. Finally, we show that transmission and infection rates are not affected by sfRNA after intrathoracic injection, thereby identifying sfRNA as a key driver to overcome the mosquito midgut infection barrier. This is the first report to describe a key biological function of sfRNA for flavivirus infection of the arthropod vector, providing an explanation for the strict conservation of sfRNA production.
Understanding the flavivirus transmission cycle is important to identify novel targets to interfere with disease and to aid development of virus control strategies. Flaviviruses produce an abundant noncoding viral RNA called sfRNA in both arthropod and mammalian cells. To evaluate the role of sfRNA in flavivirus transmission, we infected mosquitoes with the flavivirus West Nile virus and an sfRNA-deficient mutant West Nile virus. We demonstrate that sfRNA determines the infection and transmission rates of West Nile virus in Culex pipiens mosquitoes. Comparison of infection via the blood meal versus intrathoracic injection, which bypasses the midgut, revealed that sfRNA is important to overcome the mosquito midgut barrier. We also show that sfRNA is processed by the antiviral RNA interference machinery in mosquitoes. This is the first report to describe a pivotal biological function of sfRNA in arthropods. The results explain why sfRNA production is evolutionarily conserved. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  13. Noncoding Subgenomic Flavivirus RNA Is Processed by the Mosquito RNA Interference Machinery and Determines West Nile Virus Transmission by Culex pipiens Mosquitoes

    PubMed Central

    Göertz, G. P.; Fros, J. J.; Miesen, P.; Vogels, C. B. F.; van der Bent, M. L.; Geertsema, C.; Koenraadt, C. J. M.; van Oers, M. M.

    2016-01-01

    ABSTRACT Flaviviruses, such as Zika virus, yellow fever virus, dengue virus, and West Nile virus (WNV), are a serious concern for human health. Flaviviruses produce an abundant noncoding subgenomic flavivirus RNA (sfRNA) in infected cells. sfRNA results from stalling of the host 5′-3′ exoribonuclease XRN1/Pacman on conserved RNA structures in the 3′ untranslated region (UTR) of the viral genomic RNA. sfRNA production is conserved in insect-specific, mosquito-borne, and tick-borne flaviviruses and flaviviruses with no known vector, suggesting a pivotal role for sfRNA in the flavivirus life cycle. Here, we investigated the function of sfRNA during WNV infection of Culex pipiens mosquitoes and evaluated its role in determining vector competence. An sfRNA1-deficient WNV was generated that displayed growth kinetics similar to those of wild-type WNV in both RNA interference (RNAi)-competent and -compromised mosquito cell lines. Small-RNA deep sequencing of WNV-infected mosquitoes indicated an active small interfering RNA (siRNA)-based antiviral response for both the wild-type and sfRNA1-deficient viruses. Additionally, we provide the first evidence that sfRNA is an RNAi substrate in vivo. Two reproducible small-RNA hot spots within the 3′ UTR/sfRNA of the wild-type virus mapped to RNA stem-loops SL-III and 3′ SL, which stick out of the three-dimensional (3D) sfRNA structure model. Importantly, we demonstrate that sfRNA-deficient WNV displays significantly decreased infection and transmission rates in vivo when administered via the blood meal. Finally, we show that transmission and infection rates are not affected by sfRNA after intrathoracic injection, thereby identifying sfRNA as a key driver to overcome the mosquito midgut infection barrier. This is the first report to describe a key biological function of sfRNA for flavivirus infection of the arthropod vector, providing an explanation for the strict conservation of sfRNA production. 
IMPORTANCE Understanding the flavivirus transmission cycle is important to identify novel targets to interfere with disease and to aid development of virus control strategies. Flaviviruses produce an abundant noncoding viral RNA called sfRNA in both arthropod and mammalian cells. To evaluate the role of sfRNA in flavivirus transmission, we infected mosquitoes with the flavivirus West Nile virus and an sfRNA-deficient mutant West Nile virus. We demonstrate that sfRNA determines the infection and transmission rates of West Nile virus in Culex pipiens mosquitoes. Comparison of infection via the blood meal versus intrathoracic injection, which bypasses the midgut, revealed that sfRNA is important to overcome the mosquito midgut barrier. We also show that sfRNA is processed by the antiviral RNA interference machinery in mosquitoes. This is the first report to describe a pivotal biological function of sfRNA in arthropods. The results explain why sfRNA production is evolutionarily conserved. PMID:27581979

  14. miPrimer: an empirical-based qPCR primer design method for small noncoding microRNA

    PubMed Central

    Kang, Shih-Ting; Hsieh, Yi-Shan; Feng, Chi-Ting; Chen, Yu-Ting; Yang, Pok Eric; Chen, Wei-Ming

    2018-01-01

    MicroRNAs (miRNAs) are 18–25 nucleotides (nt) of highly conserved, noncoding RNAs involved in gene regulation. Because of miRNAs’ short length, the design of miRNA primers for PCR amplification remains a significant challenge. Adding to the challenge are miRNAs similar in sequence and miRNA family members that often only differ in sequences by 1 nt. Here, we describe a novel empirical-based method, miPrimer, which greatly reduces primer dimerization and increases primer specificity by factoring various intrinsic primer properties and employing four primer design strategies. The resulting primer pairs displayed an acceptable qPCR efficiency of between 90% and 110%. When tested on miRNA families, miPrimer-designed primers are capable of discriminating among members of miRNA families, as validated by qPCR assays using Quark Biosciences’ platform. Of the 120 miRNA primer pairs tested, 95.6% and 93.3% were successful in amplifying specifically non-family and family miRNA members, respectively, after only one design trial. In summary, miPrimer provides a cost-effective and valuable tool for designing miRNA primers. PMID:29208706
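
For context on the 90% to 110% efficiency window mentioned above: qPCR amplification efficiency is conventionally estimated from the slope of a standard curve (Cq plotted against log10 of template input) as E = 10^(-1/slope) - 1. The abstract does not say how the miPrimer efficiencies were computed, so the sketch below only illustrates that conventional calculation. The function names and the dilution-series values are hypothetical.

```python
def standard_curve_slope(log10_inputs, cq_values):
    """Least-squares slope of Cq against log10(template input)."""
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(cq_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, cq_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    return sxy / sxx


def qpcr_efficiency(slope):
    """Efficiency as a fraction; 1.0 means the product doubles every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def is_acceptable(efficiency, low=0.90, high=1.10):
    """Check the 90%-110% window quoted in the abstract."""
    return low <= efficiency <= high


# Hypothetical 10-fold dilution series: Cq rises 3.4 cycles per dilution step.
log10_inputs = [5, 4, 3, 2, 1]
cq_values = [15.0, 18.4, 21.8, 25.2, 28.6]

slope = standard_curve_slope(log10_inputs, cq_values)
efficiency = qpcr_efficiency(slope)
print(f"slope={slope:.2f}, efficiency={efficiency:.1%}, ok={is_acceptable(efficiency)}")
```

A slope near -3.32 corresponds to perfect doubling (100% efficiency); shallower or steeper slopes push the estimate above or below that value.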

  15. Upregulation of Haploinsufficient Gene Expression in the Brain by Targeting a Long Non-coding RNA Improves Seizure Phenotype in a Model of Dravet Syndrome.

    PubMed

    Hsiao, J; Yuan, T Y; Tsai, M S; Lu, C Y; Lin, Y C; Lee, M L; Lin, S W; Chang, F C; Liu Pimentel, H; Olive, C; Coito, C; Shen, G; Young, M; Thorne, T; Lawrence, M; Magistri, M; Faghihi, M A; Khorkova, O; Wahlestedt, C

    2016-07-01

    Dravet syndrome is a devastating genetic brain disorder caused by heterozygous loss-of-function mutation in the voltage-gated sodium channel gene SCN1A. There are currently no treatments, but upregulation of the healthy SCN1A allele represents an appealing therapeutic strategy. In this study we identified a novel, evolutionarily conserved mechanism controlling the expression of SCN1A that is mediated by an antisense non-coding RNA (SCN1ANAT). Using oligonucleotide-based compounds (AntagoNATs) targeting SCN1ANAT we were able to induce specific upregulation of SCN1A both in vitro and in vivo, in the brain of a Dravet knock-in mouse model and a non-human primate. AntagoNAT-mediated upregulation of Scn1a in postnatal Dravet mice led to significant improvements in seizure phenotype and excitability of hippocampal interneurons. These results further elucidate the pathophysiology of Dravet syndrome and outline a possible new approach for the treatment of this and other genetic disorders with similar etiology. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  16. A long non-coding RNA, LncMyoD, regulates skeletal muscle differentiation by blocking IMP2-mediated mRNA translation.

    PubMed

    Gong, Chenguang; Li, Zhizhong; Ramanujan, Krishnan; Clay, Ieuan; Zhang, Yunyu; Lemire-Brachat, Sophie; Glass, David J

    2015-07-27

    Increasing evidence suggests that long non-coding RNAs (LncRNAs) represent a new class of regulators of stem cells. However, the roles of LncRNAs in stem cell maintenance and myogenesis remain largely unexamined. For this study, hundreds of intergenic LncRNAs were identified that are expressed in myoblasts and regulated during differentiation. One of these LncRNAs, termed LncMyoD, is encoded next to the Myod gene and is directly activated by MyoD during myoblast differentiation. Knockdown of LncMyoD strongly inhibits terminal muscle differentiation, largely due to a failure to exit the cell cycle. LncMyoD directly binds to IGF2-mRNA-binding protein 2 (IMP2) and negatively regulates IMP2-mediated translation of proliferation genes such as N-Ras and c-Myc. While the RNA sequence of LncMyoD is not well conserved between human and mouse, its locus, gene structure, and function are preserved. The MyoD-LncMyoD-IMP2 pathway elucidates a mechanism as to how MyoD blocks proliferation to create a permissive state for differentiation. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Deciphering the transcriptional cis-regulatory code.

    PubMed

    Yáñez-Cuna, J Omar; Kvon, Evgeny Z; Stark, Alexander

    2013-01-01

    Information about developmental gene expression resides in defined regulatory elements, called enhancers, in the non-coding part of the genome. Although cells reliably utilize enhancers to orchestrate gene expression, a cis-regulatory code that would allow their interpretation has remained one of the greatest challenges of modern biology. In this review, we summarize studies from the past three decades that describe progress towards revealing the properties of enhancers and discuss how recent approaches are providing unprecedented insights into regulatory elements in animal genomes. Over the next years, we believe that the functional characterization of regulatory sequences in entire genomes, combined with recent computational methods, will provide a comprehensive view of genomic regulatory elements and their building blocks and will enable researchers to begin to understand the sequence basis of the cis-regulatory code. Copyright © 2012 Elsevier Ltd. All rights reserved.

  18. The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.

    PubMed

    Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir

    2015-08-06

    Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, coelacanth in the study augmented our appreciation of the vertebrate cis-regulatory evolution during water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings triggered us to further investigate the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. An expanding universe of noncoding RNAs between the poles of basic science and clinical investigations.

    PubMed

    Weil, Patrick P; Hensel, Kai O; Weber, David; Postberg, Jan

    2016-03-01

    The Keystone Symposium 'MicroRNAs and Noncoding RNAs in Cancer', Keystone, CO, USA, 7-12 June 2015. Since the discovery of RNAi, great efforts have been undertaken to unleash the potential biomedical applicability of small noncoding RNAs, mainly miRNAs, involving their use as biomarkers for personalized diagnostics or their usability as active agents or therapy targets. The research focus on the noncoding RNA world is now slowly moving from a phase of basic discoveries into a new phase, where every single molecule out of many hundreds of cataloged noncoding RNAs is dissected in order to investigate its biomedical relevance. In addition, previously neglected RNA classes, such as long noncoding RNAs or circular RNAs, attract more attention. Numerous timely results and hypotheses were presented at the 2015 Keystone Symposium 'MicroRNAs and Noncoding RNAs in Cancer'.

  20. Facts and updates about cardiovascular non-coding RNAs in heart failure.

    PubMed

    Thum, Thomas

    2015-09-01

    About 11% of all deaths include heart failure as a contributing cause. The annual cost of heart failure amounts to US $34,000,000,000 in the United States alone. With the exception of heart transplantation, there is no curative therapy available. Only occasionally there are new areas in science that develop into completely new research fields. The topic on non-coding RNAs, including microRNAs, long non-coding RNAs, and circular RNAs, is such a field. In this short review, we will discuss the latest developments about non-coding RNAs in cardiovascular disease. MicroRNAs are short regulatory non-coding endogenous RNA species that are involved in virtually all cellular processes. Long non-coding RNAs also regulate gene and protein levels; however, by much more complicated and diverse mechanisms. In general, non-coding RNAs have been shown to be of great value as therapeutic targets in adverse cardiac remodelling and also as diagnostic and prognostic biomarkers for heart failure. In the future, non-coding RNA-based therapeutics are likely to enter the clinical reality offering a new treatment approach of heart failure.

  1. Noncoding RNAs in human intervertebral disc degeneration: An integrated microarray study.

    PubMed

    Liu, Xu; Che, Lu; Xie, Yan-Ke; Hu, Qing-Jie; Ma, Chi-Jiao; Pei, Yan-Jun; Wu, Zhi-Gang; Liu, Zhi-Heng; Fan, Li-Ying; Wang, Hai-Qiang

    2015-09-01

    Accumulating evidence indicates that noncoding RNAs play important roles in a multitude of biological processes. The striking findings of miRNAs (microRNAs) and lncRNAs (long noncoding RNAs) as members of noncoding RNAs open up an exciting era in the studies of gene regulation. More recently, the reports of circRNAs (circular RNAs) add fuel to noncoding RNA research. Human intervertebral disc degeneration (IDD) is a main cause of low back pain as a disabling spinal disease. We have addressed the expression profiles of miRNAs, lncRNAs and mRNAs in IDD (Wang et al., J Pathology, 2011 and Wan et al., Arthritis Res Ther, 2014). Furthermore, we thoroughly analysed noncoding RNAs, including miRNAs, lncRNAs and circRNAs, in IDD using the very same samples. Here we delineate in detail the contents of the aforementioned microarray analyses. Microarray and sample annotation data were deposited in GEO under accession number GSE67567 as SuperSeries. The integrated analyses of these noncoding RNAs will shed a novel light on coding-noncoding regulatory machinery.

  2. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

    PubMed

    Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

    2011-11-01

    Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

  3. Evolution of the unspliced transcriptome.

    PubMed

    Engelhardt, Jan; Stadler, Peter F

    2015-08-20

    Despite their abundance, unspliced EST data have received little attention as a source of information on non-coding RNAs. Very little is known, therefore, about the genomic distribution of unspliced non-coding transcripts and their relationship with the much better studied regularly spliced products. In particular, their evolution has remained virtually unstudied. We systematically study the evidence on unspliced transcripts available in EST annotation tracks for human and mouse, comprising 104,980 and 66,109 unspliced EST clusters, respectively. Roughly one third of these are located totally inside introns of known genes (TINs) and another third overlaps exonic regions (PINs). Eleven percent are "intergenic", far away from any annotated gene. Direct evidence for the independent transcription of many PINs and TINs is obtained from CAGE tag and chromatin data. We predict more than 2000 3'UTR-associated RNA candidates for each of human and mouse. Fifteen to twenty percent of the unspliced EST clusters are conserved between human and mouse. With the exception of TINs, the sequences of unspliced EST clusters evolve significantly slower than genomic background. Furthermore, like spliced lincRNAs, they show highly tissue-specific expression patterns. Unspliced long non-coding RNAs are an important, rapidly evolving, component of mammalian transcriptomes. Their analysis is complicated by their preferential association with complex transcribed loci that usually also harbor a plethora of spliced transcripts. Unspliced EST data, although typically disregarded in transcriptome analysis, can be used to gain insights into this rarely investigated transcriptome component. The frequently postulated connection between lack of splicing and nuclear retention and the surprising overlap of chromatin-associated transcripts suggests that this class of transcripts might be involved in chromatin organization and possibly other mechanisms of epigenetic control.

  4. Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome.

    PubMed

    Bush, Stephen J; Muriuki, Charity; McCulloch, Mary E B; Farquhar, Iseabail L; Clark, Emily L; Hume, David A

    2018-04-24

    mRNA-like long non-coding RNAs (lncRNAs) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. Thus, in many cases lncRNA detection by RNA-sequencing (RNA-seq) is compromised by stochastic sampling. To account for this and create a catalogue of ruminant lncRNAs, we compared de novo assembled lncRNAs derived from large RNA-seq datasets in transcriptional atlas projects for sheep and goats with previous lncRNAs assembled in cattle and human. We then combined the novel lncRNAs with the sheep transcriptional atlas to identify co-regulated sets of protein-coding and non-coding loci. Few lncRNAs could be reproducibly assembled from a single dataset, even with deep sequencing of the same tissues from multiple animals. Furthermore, there was little sequence overlap between lncRNAs that were assembled from pooled RNA-seq data. We combined positional conservation (synteny) with cross-species mapping of candidate lncRNAs to identify a consensus set of ruminant lncRNAs and then used the RNA-seq data to demonstrate detectable and reproducible expression in each species. In sheep, 20 to 30% of lncRNAs were located close to protein-coding genes with which they are strongly co-expressed, which is consistent with the evolutionary origin of some ncRNAs in enhancer sequences. Nevertheless, most of the lncRNAs are not co-expressed with neighbouring protein-coding genes. Alongside substantially expanding the ruminant lncRNA repertoire, the outcomes of our analysis demonstrate that stochastic sampling can be partly overcome by combining RNA-seq datasets from related species. This has practical implications for the future discovery of lncRNAs in other species.

  5. A transcriptional serenAID: the role of noncoding RNAs in class switch recombination

    PubMed Central

    Yewdell, William T.; Chaudhuri, Jayanta

    2017-01-01

    During an immune response, activated B cells may undergo class switch recombination (CSR), a molecular rearrangement that allows B cells to switch from expressing IgM and IgD to a secondary antibody heavy chain isotype such as IgG, IgA or IgE. Secondary antibody isotypes provide the adaptive immune system with distinct effector functions to optimally combat various pathogens. CSR occurs between repetitive DNA elements within the immunoglobulin heavy chain (Igh) locus, termed switch (S) regions and requires the DNA-modifying enzyme activation-induced cytidine deaminase (AID). AID-mediated DNA deamination within S regions initiates the formation of DNA double-strand breaks, which serve as biochemical beacons for downstream DNA repair pathways that coordinate the ligation of DNA breaks. Myriad factors contribute to optimal AID targeting; however, many of these factors also localize to genomic regions outside of the Igh locus. Thus, a current challenge is to explain the specific targeting of AID to the Igh locus. Recent studies have implicated noncoding RNAs in CSR, suggesting a provocative mechanism that incorporates Igh-specific factors to enable precise AID targeting. Here, we chronologically recount the rich history of noncoding RNAs functioning in CSR to provide a comprehensive context for recent and future discoveries. We present a model for the RNA-guided targeting of AID that attempts to integrate historical and recent findings, and highlight potential caveats. Lastly, we discuss testable hypotheses ripe for current experimentation, and explore promising ideas for future investigations. PMID:28535205

  6. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.

    2002-01-01

    Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs, is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  7. Refined mapping of autoimmune disease associated genetic variants with gene expression suggests an important role for non-coding RNAs.

    PubMed

    Ricaño-Ponce, Isis; Zhernakova, Daria V; Deelen, Patrick; Luo, Oscar; Li, Xingwang; Isaacs, Aaron; Karjalainen, Juha; Di Tommaso, Jennifer; Borek, Zuzanna Agnieszka; Zorro, Maria M; Gutierrez-Achury, Javier; Uitterlinden, Andre G; Hofman, Albert; van Meurs, Joyce; Netea, Mihai G; Jonkers, Iris H; Withoff, Sebo; van Duijn, Cornelia M; Li, Yang; Ruan, Yijun; Franke, Lude; Wijmenga, Cisca; Kumar, Vinod

    2016-04-01

    Genome-wide association and fine-mapping studies in 14 autoimmune diseases (AID) have implicated more than 250 loci in one or more of these diseases. As more than 90% of AID-associated SNPs are intergenic or intronic, pinpointing the causal genes is challenging. We performed a systematic analysis to link 460 SNPs that are associated with 14 AID to causal genes using transcriptomic data from 629 blood samples. We were able to link 71 (39%) of the AID-SNPs to two or more nearby genes, providing evidence that for part of the AID loci multiple causal genes exist. While 54 of the AID loci are shared by one or more AID, 17% of them do not share candidate causal genes. In addition to finding novel genes such as ULK3, we also implicate novel disease mechanisms and pathways like autophagy in celiac disease pathogenesis. Furthermore, 42 of the AID SNPs specifically affected the expression of 53 non-coding RNA genes. To further understand how the non-coding genome contributes to AID, the SNPs were linked to functional regulatory elements, which suggest a model where AID genes are regulated by network of chromatin looping/non-coding RNAs interactions. The looping model also explains how a causal candidate gene is not necessarily the gene closest to the AID SNP, which was the case in nearly 50% of cases. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  8. DDM1 represses noncoding RNA expression and RNA-directed DNA methylation in heterochromatin.

    PubMed

    Tan, Feng; Lu, Yue; Jiang, Wei; Zhao, Yu; Wu, Tian; Zhang, Ruoyu; Zhou, Dao-Xiu

    2018-05-24

    Cytosine methylation of DNA, which occurs at CG, CHG, and CHH (H=A, C, or T) sequences in plants, is a hallmark for epigenetic repression of repetitive sequences. The chromatin remodeling factor DECREASE IN DNA METHYLATION1 (DDM1) is essential for DNA methylation, especially at CG and CHG sequences. However, its potential role in RNA-directed DNA methylation (RdDM) and in chromatin function is not completely understood in rice (Oryza sativa). In this work, we used high-throughput approaches to study the function of rice DDM1 (OsDDM1) in RdDM and the expression of non-coding RNA (ncRNA). We show that loss of function of OsDDM1 results in ectopic CHH methylation of transposable elements and repeats. The ectopic CHH methylation was dependent on rice DOMAINS REARRANGED METHYLTRANSFERASE2 (OsDRM2), a DNA methyltransferase involved in RdDM. Mutations in OsDDM1 lead to decreases of histone H3K9me2 and increases in the levels of heterochromatic small RNA (sRNA) and long noncoding RNA (lncRNA). In particular, OsDDM1 was found to be essential to repress transcription of the two repetitive sequences, Centromeric Retrotransposons of Rice1 (CRR1) and the dominant centromeric CentO repeats. These results suggest that OsDDM1 antagonizes RdDM at heterochromatin and represses tissue-specific expression of ncRNA from repetitive sequences in the rice genome. © 2018 American Society of Plant Biologists. All rights reserved.
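
The sequence contexts named above (CG, CHG, and CHH, with H = A, C, or T) are defined by the two bases immediately 3' of a cytosine. The sketch below classifies each cytosine on the forward strand of a short sequence. It illustrates the context definitions only and is not the authors' analysis pipeline; the function name and the example sequence are hypothetical.

```python
def cytosine_contexts(seq):
    """Return (position, context) for each forward-strand cytosine.

    Context is 'CG', 'CHG', or 'CHH' (H = A, C, or T). Cytosines too close
    to the 3' end to classify, or followed by ambiguous bases, are skipped.
    """
    seq = seq.upper()
    h_bases = {"A", "C", "T"}
    calls = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        nxt = seq[i + 1] if i + 1 < len(seq) else None
        nxt2 = seq[i + 2] if i + 2 < len(seq) else None
        if nxt == "G":
            calls.append((i, "CG"))
        elif nxt in h_bases and nxt2 == "G":
            calls.append((i, "CHG"))
        elif nxt in h_bases and nxt2 in h_bases:
            calls.append((i, "CHH"))
    return calls


# Hypothetical sequence with one cytosine in each context (0-based positions).
example = cytosine_contexts("ACGTCAGCTTC")
print(example)
```

A full methylome analysis would also classify cytosines on the reverse strand (guanines on the forward strand), which this sketch leaves out for brevity.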

  9. High-Resolution Genuinely Multidimensional Solution of Conservation Laws by the Space-Time Conservation Element and Solution Element Method

    NASA Technical Reports Server (NTRS)

    Himansu, Ananda; Chang, Sin-Chung; Yu, Sheng-Tao; Wang, Xiao-Yen; Loh, Ching-Yuen; Jorgenson, Philip C. E.

    1999-01-01

    In this overview paper, we review the basic principles of the method of space-time conservation element and solution element for solving the conservation laws in one and two spatial dimensions. The present method is developed on the basis of local and global flux conservation in a space-time domain, in which space and time are treated in a unified manner. In contrast to the modern upwind schemes, the approach here does not use the Riemann solver and the reconstruction procedure as the building blocks. The drawbacks of the upwind approach, such as the difficulty of rationally extending the 1D scalar approach to systems of equations and particularly to multiple dimensions, are here contrasted with the uniformity and ease of generalization of the Conservation Element and Solution Element (CE/SE) 1D scalar schemes to systems of equations and to multiple spatial dimensions. The assured compatibility with the simplest type of unstructured meshes, and the uniquely simple nonreflecting boundary conditions of the present method are also discussed. The present approach has yielded high-resolution shocks, rarefaction waves, acoustic waves, vortices, ZND detonation waves, and shock/acoustic waves/vortices interactions. Moreover, since no directional splitting is employed, numerical resolution of two-dimensional calculations is comparable to that of the one-dimensional calculations. Some sample applications displaying the strengths and broad applicability of the CE/SE method are reviewed.

  10. A 3-dimensional mass conserving element for compressible flows

    NASA Technical Reports Server (NTRS)

    Fix, G.; Suri, M.

    1985-01-01

    A variety of finite element schemes has been used in the numerical approximation of compressible flows particularly in underwater acoustics. In many instances instabilities have been generated due to the lack of mass conservation. Two- and three-dimensional elements are developed which avoid these problems.

  11. Identification of evolutionarily conserved Momordica charantia microRNAs using computational approach and its utility in phylogeny analysis.

    PubMed

    Thirugnanasambantham, Krishnaraj; Saravanan, Subramanian; Karikalan, Kulandaivelu; Bharanidharan, Rajaraman; Lalitha, Perumal; Ilango, S; HairulIslam, Villianur Ibrahim

    2015-10-01

    Momordica charantia (bitter gourd, bitter melon) is a monoecious Cucurbitaceae with anti-oxidant, anti-microbial, anti-viral and anti-diabetic potential. Molecular studies on this economically valuable plant are essential to understand its phylogeny and evolution. MicroRNAs (miRNAs) are conserved, small, non-coding RNAs with the ability to regulate gene expression by binding the 3' UTR region of target mRNA, and they have evolved at different rates in different plant species. In this study we have utilized a homology-based computational approach and identified 27 mature miRNAs for the first time from this bio-medically important plant. The phylogenetic tree developed from binary data on the presence/absence of the identified miRNAs was found to be uncertain and biased. Most of the identified miRNAs were highly conserved among the plant species, and sequence-based phylogeny analysis of miRNAs resolved the above difficulties in the phylogeny approach using miRNA. Predicted gene targets of the identified miRNAs revealed their importance in regulation of plant developmental processes. Reported miRNAs held sequence conservation in mature miRNAs and the detailed phylogeny analysis of pre-miRNA sequences revealed genus specific segregation of clusters. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Rare pseudoautosomal copy-number variations involving SHOX and/or its flanking regions in individuals with and without short stature.

    PubMed

    Fukami, Maki; Naiki, Yasuhiro; Muroya, Koji; Hamajima, Takashi; Soneda, Shun; Horikawa, Reiko; Jinno, Tomoko; Katsumi, Momori; Nakamura, Akie; Asakura, Yumi; Adachi, Masanori; Ogata, Tsutomu; Kanzaki, Susumu

    2015-09-01

    Pseudoautosomal region 1 (PAR1) contains SHOX, in addition to seven highly conserved non-coding DNA elements (CNEs) with cis-regulatory activity. Microdeletions involving SHOX exons 1-6a and/or the CNEs result in idiopathic short stature (ISS) and Leri-Weill dyschondrosteosis (LWD). Here, we report six rare copy-number variations (CNVs) in PAR1 identified through copy-number analyses of 245 ISS/LWD patients and 15 unaffected individuals. The six CNVs consisted of three microduplications encompassing SHOX and some of the CNEs, two microduplications in the SHOX 3'-region affecting one or four of the downstream CNEs, and a microdeletion involving SHOX exon 6b and its neighboring CNE. The amplified DNA fragments of two SHOX-containing duplications were detected at chromosomal regions adjacent to the original positions. The breakpoints of a SHOX-containing duplication resided within Alu repeats. A microduplication encompassing four downstream CNEs was identified in an unaffected father-daughter pair, whereas the other five CNVs were detected in ISS patients. These results suggest that microduplications involving SHOX cause ISS by disrupting the cis-regulatory machinery of this gene and that at least some of the microduplications in PAR1 arise from Alu-mediated non-allelic homologous recombination. The pathogenicity of other rare PAR1-linked CNVs, such as CNE-containing microduplications and exon 6b-flanking microdeletions, merits further investigation.

  13. The SHOX region and its mutations.

    PubMed

    Capone, L; Iughetti, L; Sabatini, S; Bacciaglia, A; Forabosco, A

    2010-06-01

    The short stature homeobox-containing (SHOX) gene lies in the pseudoautosomal region 1 (PAR1) that comprises 2.6 Mb of the short-arm tips of both the X and Y chromosomes. It is known that its heterozygous mutations cause Leri-Weill dyschondrosteosis (LWD) (OMIM #127300), while its homozygous mutations cause a severe form of dwarfism known as Langer mesomelic dysplasia (LMD) (OMIM #249700). The analysis of 238 LWD patients between 1998 and 2007 by multiple authors shows a prevalence of deletions (46.4%) compared to point mutations (21.2%). On the whole, deletions and point mutations account for about 67% of LWD patients. SHOX is located within a 1000 kb desert region without genes. The comparative genomic analysis of this region between genomes of different vertebrates has led to the identification of evolutionarily conserved non-coding DNA elements (CNE). Further functional studies have shown that one of these CNE downstream of the SHOX gene is necessary for the expression of SHOX; this is considered to be typical "enhancer" activity. Including the enhancer, the overall mutation of the SHOX region in LWD patients does not hold in 100% of cases. Various authors have demonstrated the existence of other CNE both downstream and upstream of SHOX regions. The resulting conclusion is that it is necessary to reanalyze all LWD/LMD patients without SHOX mutations for the presence of mutations in the 5'- and 3'-flanking SHOX regions.

  14. Structural basis for MTR4-ZCCHC8 interactions that stimulate the MTR4 helicase in the nuclear exosome-targeting complex.

    PubMed

    Puno, M Rhyan; Lima, Christopher D

    2018-06-12

    The nuclear exosome-targeting (NEXT) complex functions as an RNA exosome cofactor and is involved in surveillance and turnover of aberrant transcripts and noncoding RNAs. NEXT is a ternary complex composed of the RNA-binding protein RBM7, the scaffold zinc-knuckle protein ZCCHC8, and the helicase MTR4. While RNA interactions with RBM7 are known, it remains unclear how NEXT subunits collaborate to recognize and prepare substrates for degradation. Here, we show that MTR4 helicase activity is enhanced when associated with RBM7 and ZCCHC8. While uridine-rich substrates interact with RBM7 and are preferred, optimal activity is observed when substrates include a polyadenylated 3' end. We identify a bipartite interaction of ZCCHC8 with MTR4 and uncover a role for the conserved C-terminal domain of ZCCHC8 in stimulating MTR4 helicase and ATPase activities. A crystal structure reveals that the ZCCHC8 C-terminal domain binds the helicase core in a manner that is distinct from that observed for Saccharomyces cerevisiae exosome cofactors Trf4p and Air2p. Our results are consistent with a model whereby effective targeting of substrates by NEXT entails recognition of elements within the substrate and activation of MTR4 helicase activity. Copyright © 2018 the Author(s). Published by PNAS.

  15. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

    PubMed Central

    2012-01-01

    Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678

  16. Southern Great Plains Rapid Ecoregional Assessment: pre-assessment report

    USGS Publications Warehouse

    Assal, Timothy J.; Melcher, Cynthia P.; Carr, Natasha B.

    2015-01-01

    An overview on the ecology and management issues for each Conservation Element is provided, including distribution and ecology, landscape structure and dynamics, and associated species of management concern affiliated with each Conservation Element. For each Conservation Element, effects of the Change Agents are described. An overview of potential key ecological attributes and potential Change Agents are summarized by conceptual models and tables. The tables provide an organizational framework and background information for evaluating the key ecological attributes and Change Agents in Phase II.

  17. Cis-acting RNA elements in the Hepatitis C virus RNA genome

    PubMed Central

    Sagan, Selena M.; Chahal, Jasmin; Sarnow, Peter

    2017-01-01

    Hepatitis C virus (HCV) infection is a rapidly increasing global health problem with an estimated 170 million people infected worldwide. HCV is a hepatotropic, positive-sense RNA virus of the family Flaviviridae. As a positive-sense RNA virus, the HCV genome itself must serve as a template for translation, replication and packaging. The viral RNA must therefore be a dynamic structure that is able to readily accommodate structural changes to expose different regions of the genome to viral and cellular proteins to carry out the HCV life cycle. The ∼9600 nucleotide viral genome contains a single long open reading frame flanked by 5′ and 3′ non-coding regions that contain cis-acting RNA elements important for viral translation, replication and stability. Additional cis-acting RNA elements have also been identified in the coding sequences as well as in the 3′ end of the negative-strand replicative intermediate. Herein, we provide an overview of the importance of these cis-acting RNA elements in the HCV life cycle. PMID:25576644

  18. Homeland security in the C. elegans germ line: insights into the biogenesis and function of piRNAs.

    PubMed

    Kasper, Dionna M; Gardner, Kathryn E; Reinke, Valerie

    2014-01-01

    While most eukaryotic genomes contain transposable elements that can provide select evolutionary advantages to a given organism, failure to tightly control the mobility of such transposable elements can result in compromised genomic integrity of both parental and subsequent generations. Together with the Piwi subfamily of Argonaute proteins, small, non-coding Piwi-interacting RNAs (piRNAs) primarily function in the germ line to defend the genome against the potentially deleterious effects that can be caused by transposition. Here, we describe recent discoveries concerning the biogenesis and function of piRNAs in the nematode Caenorhabditis elegans, illuminating how the faithful production of these mature species can impart a robust defense mechanism for the germ line to counteract problems caused by foreign genetic elements across successive generations by contributing to the epigenetic memory of non-self vs. self.

  19. Nezha, a novel active miniature inverted-repeat transposable element in cyanobacteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou Fengfeng; Tran Thao; Xu Ying

    2008-01-25

    Miniature inverted-repeat transposable elements (MITEs) were first identified in plants and exerted extensive proliferations throughout eukaryotic and archaeal genomes. But very few MITEs have been characterized in bacteria. We identified a novel MITE, called Nezha, in cyanobacteria Anabaena variabilis ATCC 29413 and Nostoc sp. PCC 7120. Nezha, like most previously known MITEs in other organisms, is small in size, non-coding, carrying TIR and DR signals, and of potential to form a stable RNA secondary structure, and it tends to insert into A+T-rich regions. Recent transpositions of Nezha were observed in A. variabilis ATCC 29413 and Nostoc sp. PCC 7120, respectively. Nezha might have proliferated recently with aid from the transposase encoded by ISNpu3-like elements. A possible horizontal transfer event of Nezha from cyanobacteria to Polaromonas JS666 is also observed.

  20. Conservative discretization of the Landau collision integral

    DOE PAGES

    Hirvijoki, E.; Adams, M. F.

    2017-03-28

    Here we describe a density-, momentum-, and energy-conserving discretization of the nonlinear Landau collision integral. The method is suitable for both the finite-element and discontinuous Galerkin methods and does not require structured meshes. The conservation laws for the discretization are proven algebraically and demonstrated numerically for an axially symmetric nonlinear relaxation problem using a finite-element implementation.

  1. Segmenting the human genome based on states of neutral genetic divergence.

    PubMed

    Kuruppumullage Don, Prabhani; Ananda, Guruprasad; Chiaromonte, Francesca; Makova, Kateryna D

    2013-09-03

    Many studies have demonstrated that divergence levels generated by different mutation types vary and covary across the human genome. To improve our still-incomplete understanding of the mechanistic basis of this phenomenon, we analyze several mutation types simultaneously, anchoring their variation to specific regions of the genome. Using hidden Markov models on insertion, deletion, nucleotide substitution, and microsatellite divergence estimates inferred from human-orangutan alignments of neutrally evolving genomic sequences, we segment the human genome into regions corresponding to different divergence states--each uniquely characterized by specific combinations of divergence levels. We then parsed the mutagenic contributions of various biochemical processes associating divergence states with a broad range of genomic landscape features. We find that high divergence states inhabit guanine- and cytosine (GC)-rich, highly recombining subtelomeric regions; low divergence states cover inner parts of autosomes; chromosome X forms its own state with lowest divergence; and a state of elevated microsatellite mutability is interspersed across the genome. These general trends are mirrored in human diversity data from the 1000 Genomes Project, and departures from them highlight the evolutionary history of primate chromosomes. We also find that genes and noncoding functional marks [annotations from the Encyclopedia of DNA Elements (ENCODE)] are concentrated in high divergence states. Our results provide a powerful tool for biomedical data analysis: segmentations can be used to screen personal genome variants--including those associated with cancer and other diseases--and to improve computational predictions of noncoding functional elements.
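
The segmentation idea in this abstract, assigning each genomic window to a hidden divergence state, can be illustrated with a toy Viterbi decoder. This is a hedged sketch only: the study fits hidden Markov models to several continuous divergence estimates at once, whereas the example below assumes two hypothetical states ("low", "high"), a single discretized observation per window ("L" or "H"), and made-up probabilities.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path for obs (log-space Viterbi)."""
    # best[t][s]: log-probability of the best path ending in state s at t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of s on the best path at t
    for o in obs[1:]:
        prev = best[-1]
        row, ptr = {}, {}
        for s in states:
            p = max(states, key=lambda q: prev[q] + log_trans[q][s])
            row[s] = prev[p] + log_trans[p][s] + log_emit[s][o]
            ptr[s] = p
        best.append(row)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Illustrative parameters (assumptions, not values from the study)
states = ["low", "high"]
log = math.log
log_start = {"low": log(0.5), "high": log(0.5)}
log_trans = {"low": {"low": log(0.9), "high": log(0.1)},
             "high": {"low": log(0.1), "high": log(0.9)}}
log_emit = {"low": {"L": log(0.9), "H": log(0.1)},
            "high": {"L": log(0.1), "H": log(0.9)}}

segments = viterbi("LLLLHHHH", states, log_start, log_trans, log_emit)
print(segments)
```

With these sticky transitions, the decoder places a single state change where the observations switch from "L" to "H".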

  2. RNA Helicase Associated with AU-rich Element (RHAU/DHX36) Interacts with the 3′-Tail of the Long Non-coding RNA BC200 (BCYRN1)*

    PubMed Central

    Booy, Evan P.; McRae, Ewan K. S.; Howard, Ryan; Deo, Soumya R.; Ariyo, Emmanuel O.; Dzananovic, Edis; Meier, Markus; Stetefeld, Jörg; McKenna, Sean A.

    2016-01-01

    RNA helicase associated with AU-rich element (RHAU) is an ATP-dependent RNA helicase that demonstrates high affinity for quadruplex structures in DNA and RNA. To elucidate the significance of these quadruplex-RHAU interactions, we have performed RNA co-immunoprecipitation screens to identify novel RNAs bound to RHAU and characterize their function. In the course of this study, we have identified the non-coding RNA BC200 (BCYRN1) as specifically enriched upon RHAU immunoprecipitation. Although BC200 does not adopt a quadruplex structure and does not bind the quadruplex-interacting motif of RHAU, it has direct affinity for RHAU in vitro. Specifically designed BC200 truncations and RNase footprinting assays demonstrate that RHAU binds to an adenosine-rich region near the 3′-end of the RNA. RHAU truncations support binding that is dependent upon a region within the C terminus and is specific to RHAU isoform 1. Tests performed to assess whether BC200 interferes with RHAU helicase activity have demonstrated the ability of BC200 to act as an acceptor of unwound quadruplexes via a cytosine-rich region near the 3′-end of the RNA. Furthermore, an interaction between BC200 and the quadruplex-containing telomerase RNA was confirmed by pull-down assays of the endogenous RNAs. This leads to the possibility that RHAU may direct BC200 to bind and exert regulatory functions at quadruplex-containing RNA or DNA sequences. PMID:26740632

  3. RNA connectivity requirements between conserved elements in the core of the yeast telomerase RNP

    PubMed Central

    Mefford, Melissa A; Rafiq, Qundeel; Zappulla, David C

    2013-01-01

    Telomerase is a specialized chromosome end-replicating enzyme required for genome duplication in many eukaryotes. An RNA and reverse transcriptase protein subunit comprise its enzymatic core. Telomerase is evolving rapidly, particularly its RNA component. Nevertheless, nearly all telomerase RNAs, including those of H. sapiens and S. cerevisiae, share four conserved structural elements: a core-enclosing helix (CEH), template-boundary element, template, and pseudoknot, in this order along the RNA. It is not clear how these elements coordinate telomerase activity. We find that although rearranging the order of the four conserved elements in the yeast telomerase RNA subunit, TLC1, disrupts activity, the RNA ends can be moved between the template and pseudoknot in vitro and in vivo. However, the ends disrupt activity when inserted between the other structured elements, defining an Area of Required Connectivity (ARC). Within the ARC, we find that only the junction nucleotides between the pseudoknot and CEH are essential. Integrating all of our findings provides a basic map of functional connections in the core of the yeast telomerase RNP and a framework to understand conserved element coordination in telomerase mechanism. PMID:24129512

  4. The Regulatory Properties of Autonomous Subtelomeric P Elements Are Sensitive to a Suppressor of Variegation in Drosophila Melanogaster

    PubMed Central

    Ronsseray, S.; Lehmann, M.; Nouaud, D.; Anxolabehere, D.

    1996-01-01

    Genetic recombination was used in Drosophila melanogaster to isolate P elements, inserted at the telomeres of X chromosomes (cytological site 1A) from natural populations, in a genetic background devoid of other P elements. We show that complete maternally inherited P repression in the germline (P cytotype) can be elicited by only two autonomous P elements at 1A and that a single element at this site has partial regulatory properties. The analysis of the surrounding chromosomal regions of the P elements at 1A shows that in all cases these elements are flanked by Telomeric Associated Sequences, tandemly repetitive noncoding sequences that have properties of heterochromatin. In addition, we show that the regulatory properties of P elements at 1A can be inhibited by some of the mutant alleles of the Su(var)205 gene and by a deficiency of this gene. However, the regulatory properties of reference P strains (Harwich and Texas 007) are not impaired by Su(var)205 mutations. Su(var)205 encodes Heterochromatin Protein 1 (HP1). These results suggest that the HP1 dosage effect on the P element properties is site-dependent and could involve the structure of the chromatin. PMID:8844154

  5. piRNA pathway targets active LINE1 elements to establish the repressive H3K9me3 mark in germ cells

    PubMed Central

    Pezic, Dubravka; Manakov, Sergei A.; Sachidanandam, Ravi; Aravin, Alexei A.

    2014-01-01

    Transposable elements (TEs) occupy a large fraction of metazoan genomes and pose a constant threat to genomic integrity. This threat is particularly critical in germ cells, as changes in the genome that are induced by TEs will be transmitted to the next generation. Small noncoding piwi-interacting RNAs (piRNAs) recognize and silence a diverse set of TEs in germ cells. In mice, piRNA-guided transposon repression correlates with establishment of CpG DNA methylation on their sequences, yet the mechanism and the spectrum of genomic targets of piRNA silencing are unknown. Here we show that in addition to DNA methylation, the piRNA pathway is required to maintain a high level of the repressive H3K9me3 histone modification on long interspersed nuclear elements (LINEs) in germ cells. piRNA-dependent chromatin repression targets exclusively full-length elements of actively transposing LINE families, demonstrating the remarkable ability of the piRNA pathway to recognize active elements among the large number of genomic transposon fragments. PMID:24939875

  6. [Relevance of long non-coding RNAs in tumour biology].

    PubMed

    Nagy, Zoltán; Szabó, Diána Rita; Zsippai, Adrienn; Falus, András; Rácz, Károly; Igaz, Péter

    2012-09-23

    The discovery of the biological relevance of non-coding RNA molecules represents one of the most significant advances in contemporary molecular biology. It has turned out that a major fraction of the non-coding part of the genome is transcribed. Beside small RNAs (including microRNAs) more and more data are disclosed concerning long non-coding RNAs of 200 nucleotides to 100 kb length that are implicated in the regulation of several basic molecular processes (cell proliferation, chromatin functioning, microRNA-mediated effects, etc.). Some of these long non-coding RNAs have been associated with human tumours, including H19, HOTAIR, MALAT1, etc., the different expression of which has been noted in various neoplasms relative to healthy tissues. Long non-coding RNAs may represent novel markers of molecular diagnostics and they might even turn out to be targets of therapeutic intervention.

  7. Wyoming Basin Rapid Ecoregional Assessment: Work Plan

    USGS Publications Warehouse

    Carr, Natasha B.; Garman, Steven L.; Walters, Annika; Ray, Andrea; Melcher, Cynthia P.; Wesner, Jeff S.; O’Donnell, Michael S.; Sherrill, Kirk R.; Babel, Nils C.; Bowen, Zachary H.

    2013-01-01

    The overall goal of the Rapid Ecoregional Assessments (REAs) being conducted for the Bureau of Land Management (BLM) is to provide information that supports regional planning and analysis for the management of ecological resources. The REA provides an assessment of baseline ecological conditions, an evaluation of current risks from drivers of ecosystem change, and a predictive capacity for evaluating future risks. The REA also may be used for identifying priority areas for conservation or restoration and for assessing the cumulative effects of a variety of land uses. There are several components of the REAs. Management Questions, developed by the BLM and partners for the ecoregion, identify the information needed for addressing land-management responsibilities. Conservation Elements represent regionally significant aquatic and terrestrial species and communities that are to be conserved and (or) restored. The REA also will evaluate major drivers of ecosystem change (Change Agents) currently affecting or likely to affect the status of Conservation Elements. We selected 8 major biomes and 19 species or species assemblages to be included as Conservation Elements. We will address the four primary Change Agents—development, fire, invasive species, and climate change—required for the REA. The purpose of the work plan for the Wyoming Basin REA is to document the selection process for, and final list of, Management Questions, Conservation Elements, and Change Agents. The work plan also presents the overall assessment framework that will be used to assess the status of Conservation Elements and answer Management Questions.

  8. The statistics of Pearce element diagrams and the Chayes closure problem

    NASA Astrophysics Data System (ADS)

    Nicholls, J.

    1988-05-01

    Pearce element ratios are defined as having a constituent in their denominator that is conserved in a system undergoing change. The presence of a conserved element in the denominator simplifies the statistics of such ratios and renders them subject to statistical tests, especially tests of significance of the correlation coefficient between Pearce element ratios. Pearce element ratio diagrams provide unambiguous tests of petrologic hypotheses because they are based on the stoichiometry of rock-forming minerals. There are three ways to recognize a conserved element: 1. The petrologic behavior of the element can be used to select conserved ones. They are usually the incompatible elements. 2. The ratio of two conserved elements will be constant in a comagmatic suite. 3. An element ratio diagram that is not constructed with a conserved element in the denominator will have a trend with a near-zero intercept. The last two criteria can be tested statistically. The significance of the slope, intercept and correlation coefficient can be tested by estimating the probability of obtaining the observed values from a random population of arrays. This population of arrays must satisfy two criteria: 1. The population must contain at least one array that has the means and variances of the array of analytical data for the rock suite. 2. Arrays with the means and variances of the data must not be so abundant in the population that nearly every array selected at random has the properties of the data. The population of random closed arrays can be obtained from a population of open arrays whose elements are randomly selected from probability distributions. The means and variances of these probability distributions are themselves selected from probability distributions which have means and variances equal to a hypothetical open array that would give the means and variances of the data on closure. This hypothetical open array is called the Chayes array.
    Alternatively, the population of random closed arrays can be drawn from the compositional space available to rock-forming processes. The minerals comprising the available space can be described with one additive component per mineral phase and a small number of exchange components. This space is called Thompson space. Statistics based on either space lead to the conclusion that Pearce element ratios are statistically valid and that Pearce element diagrams depict the processes that create chemical inhomogeneities in igneous rock suites.
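
The idea of testing a correlation coefficient against a population of random arrays can be sketched in a few lines. Note the swapped-in technique: the abstract draws random closed arrays from a Chayes array or from Thompson space, while the sketch below uses a plain permutation test as a simpler stand-in. The function names `pearce_ratios`, `pearson_r`, and `permutation_p` are hypothetical, and the inputs are assumed to be molar amounts.

```python
import random


def pearce_ratios(numerator, conserved):
    """Divide each molar amount by the conserved element in the same sample."""
    return [n / c for n, c in zip(numerator, conserved)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p(x, y, n_perm=2000, seed=0):
    """Estimate how often shuffled data reach the observed |r| (stand-in test)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)
```

In use, `x` and `y` would each be the output of `pearce_ratios` for a different numerator over the same conserved element, and a small value from `permutation_p` would indicate that the observed correlation is unlikely under random pairing.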

  9. The sequence, structure and evolutionary features of HOTAIR in mammals

    PubMed Central

    2011-01-01

    Background An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. 
    The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals. Conclusions HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR. PMID:21496275

  10. Breast cancer risk-associated SNPs modulate the affinity of chromatin for FOXA1 and alter gene expression

    PubMed Central

    Cowper-Sal·lari, Richard; Zhang, Xiaoyang; Wright, Jason B.; Bailey, Swneke D.; Cole, Michael D.; Eeckhoute, Jerome; Moore, Jason H.; Lupien, Mathieu

    2012-01-01

    Genome-wide association studies (GWASs) have identified thousands of single nucleotide polymorphisms (SNPs) associated with human traits and diseases. But because the vast majority of these SNPs are located in the noncoding regions of the genome, their risk-promoting mechanisms are elusive. Employing a new methodology combining cistromics, epigenomics and genotype imputation, we annotate the noncoding regions of the genome in breast cancer cells and systematically identify the functional nature of SNPs associated with breast cancer risk. Our results demonstrate that breast cancer risk-associated SNPs are enriched in the cistromes of FOXA1 and ESR1 and the epigenome of H3K4me1 in a cancer and cell-type-specific manner. Furthermore, the majority of these risk-associated SNPs modulate the affinity of chromatin for FOXA1 at distal regulatory elements, which results in allele-specific gene expression, exemplified by the effect of the rs4784227 SNP on the TOX3 gene found within the 16q12.1 risk locus. PMID:23001124

  11. Epigenetics and the Developmental Origins of Health and ...

    EPA Pesticide Factsheets

    Epigenetic programming is likely to be an important mechanism underlying the lasting influence of the developmental environment on lifelong health, a concept known as the Developmental Origins of Health and Disease (DOHaD). DNA methylation, posttranslational histone protein modifications, noncoding RNAs and recruited protein complexes are elements of the epigenetic regulation of gene transcription. These heritable but reversible changes in gene function are dynamic and labile during specific stages of the reproductive cycle and development. Epigenetic marks may be maintained throughout an individual's lifespan and can alter the life-long risk of disease; the nature of these epigenetic marks and their potential alteration by environmental factors is an area of active research. This chapter provides an overview of epigenetic regulation, particularly as it occurs as an essential component of embryo-fetal development. In this chapter we will present key features of DNA methylation and histone protein modifications, including the enzymes involved and the effects of these modifications on gene transcription. We will discuss the interplay of these dynamic modifications and the emerging role of noncoding RNAs in epigenetic gene regulation.

  12. Transcription of tandemly repetitive DNA: functional roles.

    PubMed

    Biscotti, Maria Assunta; Canapa, Adriana; Forconi, Mariko; Olmo, Ettore; Barucca, Marco

    2015-09-01

    A considerable fraction of the eukaryotic genome is made up of satellite DNA constituted of tandemly repeated sequences. These elements are mainly located at centromeres, pericentromeres, and telomeres and are major components of constitutive heterochromatin. Although originally satellite DNA was thought silent and inert, an increasing number of studies are providing evidence on its transcriptional activity supporting, on the contrary, an unexpected dynamicity. This review summarizes the multiple structural roles of satellite noncoding RNAs at chromosome level. Indeed, satellite noncoding RNAs play a role in the establishment of a heterochromatic state at centromere and telomere. These highly condensed structures are indispensable to preserve chromosome integrity and genome stability, preventing recombination events, and ensuring the correct chromosome pairing and segregation. Moreover, these RNA molecules seem to be involved also in maintaining centromere identity and in elongation, capping, and replication of telomere. Finally, the abnormal variation of centromeric and pericentromeric DNA transcription across major eukaryotic lineages in stress condition and disease has evidenced the critical role that these transcripts may play and the potentially dire consequences for the organism.

  13. Roles of long non-coding RNAs in gastric cancer metastasis

    PubMed Central

    Yang, Zi-Guo; Gao, Ling; Guo, Xiao-Bo; Shi, Yu-Long

    2015-01-01

    Gastric cancer is the second leading cause of cancer-related deaths. Metastasis, which is an important element of gastric cancer, leads to a high mortality rate and to a poor prognosis. Gastric cancer metastasis has a complex progression that involves multiple biological processes. The comprehensive mechanisms of metastasis remain unclear, though traditional regulation modulates the molecular functions associated with metastasis. Long non-coding RNAs (lncRNAs) have a role in different gene regulatory pathways by epigenetic modification and by transcriptional and post-transcription regulation. lncRNAs participate in various diseases, including Alzheimer’s disease, cardiovascular disease, and cancer. The altered expressions of certain lncRNAs are linked to gastric cancer metastasis and invasion, as with tumor suppressor genes or oncogenes. Studies have partly elucidated the roles of lncRNAs as biomarkers and in therapies, as well as their gene regulatory mechanisms. However, comprehensive knowledge regarding the functional mechanisms of gene regulation in metastatic gastric cancer remains scarce. To provide a theoretical basis for therapeutic intervention in metastatic gastric cancer, we reviewed the functions of lncRNAs and their regulatory roles in gastric cancer metastasis. PMID:25954095

  14. A novel RNA binding surface of the TAM domain of TIP5/BAZ2A mediates epigenetic regulation of rRNA genes.

    PubMed

    Anosova, Irina; Melnik, Svitlana; Tripsianes, Konstantinos; Kateb, Fatiha; Grummt, Ingrid; Sattler, Michael

    2015-05-26

    The chromatin remodeling complex NoRC, comprising the subunits SNF2h and TIP5/BAZ2A, mediates heterochromatin formation at major clusters of repetitive elements, including rRNA genes, centromeres and telomeres. Association with chromatin requires the interaction of the TAM (TIP5/ARBP/MBD) domain of TIP5 with noncoding RNA, which targets NoRC to specific genomic loci. Here, we show that the NMR structure of the TAM domain of TIP5 resembles the fold of the MBD domain, found in methyl-CpG binding proteins. However, the TAM domain exhibits an extended MBD fold with unique C-terminal extensions that constitute a novel surface for RNA binding. Mutation of critical amino acids within this surface abolishes RNA binding in vitro and in vivo. Our results explain the distinct binding specificities of TAM and MBD domains to RNA and methylated DNA, respectively, and reveal structural features for the interaction of NoRC with non-coding RNA. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Noncoding sequence classification based on wavelet transform analysis: part II

    NASA Astrophysics Data System (ADS)

    Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez-Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

    2017-09-01

    DNA sequences in the human genome can be divided into coding and noncoding ones. We hypothesize that the characteristic periodicities of the noncoding sequences are related to their function. We describe the procedure to identify these characteristic periodicities using wavelet analysis. Our results show that three groups of noncoding sequences, each with a different biological function, may be differentiated by their wavelet coefficients within a specific frequency range.
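
    The abstract above does not give the procedure itself, so the following is only a minimal sketch of the general idea: a DNA sequence is mapped to a numeric signal and decomposed with a Haar wavelet, and the energy of the detail coefficients at each level indicates at which scale the signal varies. The G/C indicator encoding, the choice of the Haar wavelet, and the function names `haar_step` and `detail_energy_by_level` are assumptions made for illustration and are not taken from the paper.

```python
import math


def haar_step(signal):
    """One level of an orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)  # a trailing unpaired sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def detail_energy_by_level(seq, levels):
    """Energy of Haar detail coefficients per level (level 1 = finest scale).

    Assumption: the sequence is encoded as 1.0 for G/C and 0.0 otherwise.
    """
    signal = [1.0 if base in "GC" else 0.0 for base in seq.upper()]
    energies = []
    for _ in range(levels):
        if len(signal) < 2:
            break
        signal, detail = haar_step(signal)
        energies.append(sum(d * d for d in detail))
    return energies


if __name__ == "__main__":
    # A period-2 pattern concentrates its energy at level 1,
    # a period-4 pattern at level 2.
    print(detail_energy_by_level("GAGAGAGA", 3))
    print(detail_energy_by_level("GGAAGGAA", 3))
```

    Sequences with different dominant periodicities then differ in which level carries most of the detail energy, which is one simple way such groups could be told apart.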

  16. Crosstalk between the Notch signaling pathway and non-coding RNAs in gastrointestinal cancers

    PubMed Central

    Pan, Yangyang; Mao, Yuyan; Jin, Rong; Jiang, Lei

    2018-01-01

    The Notch signaling pathway is one of the main signaling pathways that mediates direct contact between cells, and is essential for normal development. It regulates various cellular processes, including cell proliferation, apoptosis, migration, invasion, angiogenesis and metastasis. It additionally serves an important function in tumor progression. Non-coding RNAs mainly include small microRNAs, long non-coding RNAs and circular RNAs. At present, a large body of literature supports the biological significance of non-coding RNAs in tumor progression. It is also becoming increasingly evident that cross-talk exists between Notch signaling and non-coding RNAs. The present review summarizes the current knowledge of Notch-mediated gastrointestinal cancer cell processes, and the effect of the crosstalk between the three major types of non-coding RNAs and the Notch signaling pathway on the fate of gastrointestinal cancer cells. PMID:29285185

  17. Conserved Structural Elements in the V3 Crown of HIV-1 gp120

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, X.; Burke, V; Totrov, M

    2010-01-01

    Binding of the third variable region (V3) of the HIV-1 envelope glycoprotein gp120 to the cell-surface coreceptors CCR5 or CXCR4 during viral entry suggests that there are conserved structural elements in this sequence-variable region. These conserved elements could serve as epitopes to be targeted by a vaccine against HIV-1. Here we perform a systematic structural analysis of representative human anti-V3 monoclonal antibodies in complex with V3 peptides, revealing that the crown of V3 has four conserved structural elements: an arch, a band, a hydrophobic core and the peptide backbone. These are either unaffected by or are subject to minimal sequence variation. As these regions are targeted by cross-clade neutralizing human antibodies, they provide a blueprint for the design of vaccine immunogens that could elicit broadly cross-reactive protective antibodies.

  18. The impact of age, biogenesis, and genomic clustering on Drosophila microRNA evolution

    PubMed Central

    Mohammed, Jaaved; Flynt, Alex S.; Siepel, Adam; Lai, Eric C.

    2013-01-01

    The molecular evolutionary signatures of miRNAs inform our understanding of their emergence, biogenesis, and function. The known signatures of miRNA evolution have derived mostly from the analysis of deeply conserved, canonical loci. In this study, we examine the impact of age, biogenesis pathway, and genomic arrangement on the evolutionary properties of Drosophila miRNAs. Crucial to the accuracy of our results was our curation of high-quality miRNA alignments, which included nearly 150 corrections to ortholog calls and nucleotide sequences of the global 12-way Drosophilid alignments currently available. Using these data, we studied primary sequence conservation, normalized free-energy values, and types of structure-preserving substitutions. We expand upon common miRNA evolutionary patterns that reflect fundamental features of miRNAs that are under functional selection. We observe that melanogaster-subgroup-specific miRNAs, although recently emerged and rapidly evolving, nonetheless exhibit evolutionary signatures that are similar to well-conserved miRNAs and distinct from other structured noncoding RNAs and bulk conserved non-miRNA hairpins. This provides evidence that even young miRNAs may be selected for regulatory activities. More strikingly, we observe that mirtrons and clustered miRNAs both exhibit distinct evolutionary properties relative to solo, well-conserved miRNAs, even after controlling for sequence depth. These studies highlight the previously unappreciated impact of biogenesis strategy and genomic location on the evolutionary dynamics of miRNAs, and affirm that miRNAs do not evolve as a unitary class. PMID:23882112

  19. Circular RNA: A new star of noncoding RNAs.

    PubMed

    Qu, Shibin; Yang, Xisheng; Li, Xiaolei; Wang, Jianlin; Gao, Yuan; Shang, Runze; Sun, Wei; Dou, Kefeng; Li, Haimin

    2015-09-01

    Circular RNAs (circRNAs) are a novel type of RNA that, unlike linear RNAs, form a covalently closed continuous loop and are highly represented in the eukaryotic transcriptome. Recent studies have discovered thousands of endogenous circRNAs in mammalian cells. CircRNAs are largely generated from exonic or intronic sequences, and reverse complementary sequences or RNA-binding proteins (RBPs) are necessary for circRNA biogenesis. The majority of circRNAs are conserved across species, are stable and resistant to RNase R, and often exhibit tissue/developmental-stage-specific expression. Recent research has revealed that circRNAs can function as microRNA (miRNA) sponges, regulators of splicing and transcription, and modifiers of parental gene expression. Emerging evidence indicates that circRNAs might play important roles in atherosclerotic vascular disease risk, neurological disorders, prion diseases and cancer; exhibit aberrant expression in colorectal cancer (CRC) and pancreatic ductal adenocarcinoma (PDAC); and serve as diagnostic or predictive biomarkers of some diseases. Similar to miRNAs and long noncoding RNAs (lncRNAs), circRNAs are becoming a new research hotspot in the field of RNA and could be widely involved in the processes of life. Herein, we review the formation and properties of circRNAs, their functions, and their potential significance in disease. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  20. Japanese encephalitis virus non-coding RNA inhibits activation of interferon by blocking nuclear translocation of interferon regulatory factor 3.

    PubMed

    Chang, Ruey-Yi; Hsu, Ta-Wen; Chen, Yen-Lin; Liu, Shu-Fan; Tsai, Yi-Jer; Lin, Yun-Tong; Chen, Yi-Shiuan; Fan, Yi-Hsin

    2013-09-27

    Noncoding RNA (ncRNA) plays a critical role in modulating a broad range of diseases. All arthropod-borne flaviviruses produce short fragment ncRNA (sfRNA) collinear with highly conserved regions of the 3'-untranslated region (UTR) in the viral genome. We show that the molar ratio of sfRNA to genomic RNA in Japanese encephalitis virus (JEV) persistently infected cells is greater than that in acutely infected cells, indicating an sfRNA role in establishing persistent infection. Transfecting excess quantities of sfRNA into JEV-infected cells reduced interferon-β (IFN-β) promoter activity by 57% and IFN-β mRNA levels by 52%, compared to mock-transfected cells. Transfection of sfRNA into JEV-infected cells also reduced phosphorylation of interferon regulatory factor-3 (IRF-3), the IFN-β upstream regulator, and blocked roughly 30% of IRF-3 nuclear localization. Furthermore, JEV-infected sfRNA transfected cells produced 23% less IFN-β-stimulated apoptosis than mock-transfected groups did. Taken together, these results suggest that sfRNA plays a role against host-cell antiviral responses, prevents cells from undergoing apoptosis, and thus contributes to viral persistence. Copyright © 2013 Elsevier B.V. All rights reserved.

  1. RNA processing in Neurospora crassa mitochondria: use of transfer RNA sequences as signals.

    PubMed Central

    Breitenberger, C A; Browning, K S; Alzner-DeWeerd, B; RajBhandary, U L

    1985-01-01

    We have used RNA gel transfer hybridization, S1 nuclease mapping and primer extension to analyze transcripts derived from several genes in Neurospora crassa mitochondria. The transcripts studied include those for cytochrome oxidase subunit III, 17S rRNA and an unidentified open reading frame. In all three cases, initial transcripts are long, include tRNA sequences, and are subsequently processed to generate the mature RNAs. We find that endpoints of the most abundant transcripts generally coincide with those of tRNA sequences. We therefore conclude that tRNA sequences in long transcripts act as primary signals for RNA processing in N. crassa mitochondria. The situation is somewhat analogous to that observed in mammalian mitochondrial systems. The difference, however, is that in mammalian mitochondria, noncoding spacers between tRNA, rRNA and protein genes are very short and in many cases non-existent, allowing no room for intergenic RNA processing signals whereas, in N. crassa mtDNA, intergenic non-coding sequences are usually several hundred nucleotides long and contain highly conserved GC-rich palindromic sequences. Since these GC-rich palindromic sequences are retained in the processed mature RNAs, we conclude that they do not serve as signals for RNA processing. PMID:2990893

  2. Extraordinary Structured Noncoding RNAs Revealed by Bacterial Metagenome Analysis

    PubMed Central

    Weinberg, Zasha; Perreault, Jonathan; Meyer, Michelle M.; Breaker, Ronald R.

    2012-01-01

    Estimates of the total number of bacterial species [1-3] suggest that existing DNA sequence databases carry only a tiny fraction of the total amount of DNA sequence space represented by this division of life. Indeed, environmental DNA samples have been shown to encode many previously unknown classes of proteins [4] and RNAs [5]. Bioinformatics searches [6-10] of genomic DNA from bacteria commonly identify novel noncoding RNAs (ncRNAs) [10-12] such as riboswitches [13,14]. In rare instances, RNAs that exhibit more extensive sequence and structural conservation across a wide range of bacteria are encountered [15,16]. Given that large structured RNAs are known to carry out complex biochemical functions such as protein synthesis and RNA processing reactions, identifying more RNAs of great size and intricate structure is likely to reveal additional biochemical functions that can be achieved by RNA. We applied an updated computational pipeline [17] to discover ncRNAs that rival the known large ribozymes in size and structural complexity or that are among the most abundant RNAs in bacteria that encode them. These RNAs would have been difficult or impossible to detect without examining environmental DNA sequences, suggesting that numerous RNAs with extraordinary size, structural complexity, or other exceptional characteristics remain to be discovered in unexplored sequence space. PMID:19956260

  3. Transcriptome-wide discovery of circular RNAs in Archaea

    PubMed Central

    Danan, Miri; Schwartz, Schraga; Edelheit, Sarit; Sorek, Rotem

    2012-01-01

    Circular RNA forms have been described in all domains of life. Such RNAs were shown to have diverse biological functions, including roles in the life cycle of viral and viroid genomes, and in maturation of permuted tRNA genes. Despite their potentially important biological roles, discovery of circular RNAs has so far been mostly serendipitous. We have developed circRNA-seq, a combined experimental/computational approach that enriches for circular RNAs and allows profiling their prevalence in a whole-genome, unbiased manner. Application of this approach to the archaeon Sulfolobus solfataricus P2 revealed multiple circular transcripts, a subset of which was further validated independently. The identified circular RNAs included expected forms, such as excised tRNA introns and rRNA processing intermediates, but were also enriched with non-coding RNAs, including C/D box RNAs and RNase P, as well as circular RNAs of unknown function. Many of the identified circles were conserved in Sulfolobus acidocaldarius, further supporting their functional significance. Our results suggest that circular RNAs, and particularly circular non-coding RNAs, are more prevalent in archaea than previously recognized, and might have yet unidentified biological roles. Our study establishes a specific and sensitive approach for identification of circular RNAs using RNA-seq, and can readily be applied to other organisms. PMID:22140119

  4. Detection of hyper-conserved regions in hepatitis B virus X gene potentially useful for gene therapy.

    PubMed

    González, Carolina; Tabernero, David; Cortese, Maria Francesca; Gregori, Josep; Casillas, Rosario; Riveiro-Barciela, Mar; Godoy, Cristina; Sopena, Sara; Rando, Ariadna; Yll, Marçal; Lopez-Martinez, Rosa; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco

    2018-05-21

    To detect hyper-conserved regions in the hepatitis B virus (HBV) X gene (HBX) 5' region that could be candidates for gene therapy. The study included 27 chronic hepatitis B treatment-naive patients in various clinical stages (from chronic infection to cirrhosis and hepatocellular carcinoma, both HBeAg-negative and HBeAg-positive), and infected with HBV genotypes A-F and H. In a serum sample from each patient with viremia > 3.5 log IU/mL, the HBX 5' end region [nucleotide (nt) 1255-1611] was PCR-amplified and submitted to next-generation sequencing (NGS). We assessed genotype variants by phylogenetic analysis, and evaluated conservation of this region by calculating the information content of each nucleotide position in a multiple alignment of all unique sequences (haplotypes) obtained by NGS. Conservation at the HBx protein amino acid (aa) level was also analyzed. NGS yielded 1333069 sequences from the 27 samples, with a median of 4578 sequences/sample (2487-9279, IQR 2817). In 14/27 patients (51.8%), phylogenetic analysis of viral nucleotide haplotypes showed a complex mixture of genotypic variants. Analysis of the information content in the haplotype multiple alignments detected 2 hyper-conserved nucleotide regions, one in the HBX upstream non-coding region (nt 1255-1286) and the other in the 5' end coding region (nt 1519-1603). This last region coded for a conserved amino acid region (aa 63-76) that partially overlaps a Kunitz-like domain. Two hyper-conserved regions detected in the HBX 5' end may be of value for targeted gene therapy, regardless of the patients' clinical stage or HBV genotype.
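
    The conservation measure named above, information content per alignment position, can be sketched as follows. This is a hedged illustration and not the authors' pipeline: it assumes a gap-free DNA alignment, weights every haplotype equally, applies no small-sample correction, and uses the standard definition of 2 bits minus the Shannon entropy of the column. The function names, the threshold and the minimum run length are hypothetical.

```python
import math
from collections import Counter


def column_information(alignment):
    """Information content in bits (0 to 2) for each column of a DNA alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    info = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(2.0 - entropy)  # log2(4) = 2 bits is the maximum for DNA
    return info


def conserved_runs(info, threshold, min_len):
    """0-based half-open (start, end) runs where info >= threshold for >= min_len columns."""
    runs, start = [], None
    for i, value in enumerate(info):
        if value >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(info) - start >= min_len:
        runs.append((start, len(info)))
    return runs


if __name__ == "__main__":
    toy_alignment = ["ACGTA", "ACGTC", "ACGTG", "ACGTT"]  # invented toy data
    info = column_information(toy_alignment)
    print(info)
    print(conserved_runs(info, threshold=1.9, min_len=3))
```

    In the toy alignment the first four columns are invariant and reach the 2-bit maximum, while the last column, with all four bases equally frequent, carries no information.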

  5. Comprehensive analysis of coding-lncRNA gene co-expression network uncovers conserved functional lncRNAs in zebrafish.

    PubMed

    Chen, Wen; Zhang, Xuan; Li, Jing; Huang, Shulan; Xiang, Shuanglin; Hu, Xiang; Liu, Changning

    2018-05-09

    Zebrafish is a fully developed model system for studying developmental processes and human disease. Recent deep-sequencing studies have discovered a large number of long non-coding RNAs (lncRNAs) in zebrafish. However, only a few of them have been functionally characterized. Therefore, how to take advantage of the mature zebrafish system to investigate lncRNA function and conservation in depth is an intriguing question. We systematically collected and analyzed a series of zebrafish RNA-seq data, then combined them with resources from known databases and the literature. As a result, we obtained by far the most complete dataset of zebrafish lncRNAs, containing 13,604 lncRNA genes (21,128 transcripts) in total. Based on that, a co-expression network of zebrafish coding and lncRNA genes was constructed and analyzed, and used to predict the Gene Ontology (GO) and KEGG annotations of lncRNAs. Meanwhile, we performed a conservation analysis of zebrafish lncRNAs, identifying 1828 conserved zebrafish lncRNA genes (1890 transcripts) that have putative mammalian orthologs. We also found that zebrafish lncRNAs play important roles in regulating the development and function of the nervous system; these conserved lncRNAs show significant sequence and functional conservation with their mammalian counterparts. By integrative data analysis and construction of a coding-lncRNA gene co-expression network, we obtained the most comprehensive dataset of zebrafish lncRNAs to date, as well as their systematic annotations and comprehensive analyses of function and conservation. Our study provides a reliable zebrafish-based platform to explore lncRNA function and mechanism in depth, as well as the lncRNA commonality between zebrafish and human.
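
    The abstract does not describe how the co-expression network was built or how annotations were transferred, so the sketch below only illustrates the general "guilt by association" idea under stated assumptions: Pearson correlation across samples as the co-expression measure, a fixed absolute-correlation cutoff for drawing edges, and a simple vote count over the annotations of co-expressed coding genes in place of a proper enrichment test. All gene names, term labels, expression values and function names are invented for illustration.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(vx * vy)


def coexpressed_partners(lnc_expr, coding_expr, min_abs_r=0.9):
    """Map each lncRNA to the coding genes whose |r| reaches the cutoff."""
    return {
        lnc: sorted(
            gene for gene, values in coding_expr.items()
            if abs(pearson(profile, values)) >= min_abs_r
        )
        for lnc, profile in lnc_expr.items()
    }


def vote_terms(partners, annotations):
    """Count how many co-expressed coding genes carry each annotation term."""
    votes = Counter()
    for gene in partners:
        votes.update(annotations.get(gene, ()))
    return votes.most_common()


if __name__ == "__main__":
    lnc_expr = {"lncA": [1, 2, 3, 4]}  # hypothetical lncRNA profile
    coding_expr = {"g1": [2, 4, 6, 8], "g2": [4, 1, 3, 2], "g3": [8, 6, 4, 2]}
    annotations = {"g1": ["TERM_A", "TERM_B"], "g3": ["TERM_A"]}
    network = coexpressed_partners(lnc_expr, coding_expr)
    print(network)
    print(vote_terms(network["lncA"], annotations))
```

    A real analysis would normally replace the vote count with a statistical enrichment test and correct for multiple testing; the cutoff of 0.9 is arbitrary here.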

  6. Cardiovascular RNA interference therapy: the broadening tool and target spectrum.

    PubMed

    Poller, Wolfgang; Tank, Juliane; Skurk, Carsten; Gast, Martina

    2013-08-16

    Understanding of the roles of noncoding RNAs (ncRNAs) within complex organisms has fundamentally changed. It is increasingly possible to use ncRNAs as diagnostic and therapeutic tools in medicine. Regarding disease pathogenesis, it has become evident that confinement to the analysis of protein-coding regions of the human genome is insufficient because ncRNA variants have been associated with important human diseases. Thus, inclusion of noncoding genomic elements in pathogenetic studies and their consideration as therapeutic targets is warranted. We consider aspects of the evolutionary and discovery history of ncRNAs, as far as they are relevant for the identification and selection of ncRNAs with likely therapeutic potential. Novel therapeutic strategies are based on ncRNAs, and we discuss here RNA interference as a highly versatile tool for gene silencing. RNA interference-mediating RNAs are small, but only parts of a far larger spectrum encompassing ncRNAs up to many kilobasepairs in size. We discuss therapeutic options in cardiovascular medicine offered by ncRNAs and key issues to be solved before clinical translation. Convergence of multiple technical advances is highlighted as a prerequisite for the translational progress achieved in recent years. Regarding safety, we review properties of RNA therapeutics, which may immunologically distinguish them from their endogenous counterparts, all of which underwent sophisticated evolutionary adaptation to specific biological contexts. Although our understanding of the noncoding human genome is only fragmentary to date, it is already feasible to develop RNA interference against a rapidly broadening spectrum of therapeutic targets and to translate this to the clinical setting under certain restrictions.

  7. Structure and mechanism of the T-box riboswitches

    PubMed Central

    Zhang, Jinwei

    2015-01-01

    In most Gram-positive bacteria, including many clinically devastating pathogens from genera such as Bacillus, Clostridium, Listeria and Staphylococcus, T-box riboswitches sense and regulate intracellular availability of amino acids through a multipartite mRNA-tRNA interaction. The T-box mRNA leaders respond to nutrient starvation by specifically binding cognate tRNAs and sensing whether the bound tRNA is aminoacylated, as a proxy for amino acid availability. Based on this readout, T-boxes direct a transcriptional or translational switch to control the expression of downstream genes involved in various aspects of amino acid metabolism: biosynthesis, transport, aminoacylation, transamidation, etc. Two decades after its discovery, the structural and mechanistic underpinnings of the T-box riboswitch were recently elucidated, producing a wealth of insights into how two structured RNAs can recognize each other with robust affinity and exquisite selectivity. The T-box paradigm exemplifies how natural non-coding RNAs can interact not just through sequence complementarity, but can add molecular specificity by precisely juxtaposing RNA structural motifs, exploiting inherently flexible elements and the biophysical properties of post-transcriptional modifications, ultimately achieving a high degree of shape complementarity through mutually induced fit. The T-box also provides a proof-of-principle that compact RNA domains can recognize minute chemical changes (such as tRNA aminoacylation) on another RNA. The unveiling of the structure and mechanism of the T-box system thus expands our appreciation of the range of capabilities and modes of action of structured non-coding RNAs, and hints at the existence of networks of non-coding RNAs that communicate through both, structural and sequence specificity. PMID:25959893

  8. Evolutionary impact of transposable elements on genomic diversity and lineage-specific innovation in vertebrates.

    PubMed

    Warren, Ian A; Naville, Magali; Chalopin, Domitille; Levin, Perrine; Berger, Chloé Suzanne; Galiana, Delphine; Volff, Jean-Nicolas

    2015-09-01

    Since their discovery, a growing body of evidence has emerged demonstrating that transposable elements are important drivers of species diversity. These mobile elements exhibit a great variety in structure, size and mechanisms of transposition, making them important putative actors in organism evolution. The vertebrates represent a highly diverse and successful lineage that has adapted to a wide range of different environments. These animals also possess a rich repertoire of transposable elements, with highly diverse content between lineages and even between species. Here, we review how transposable elements are driving genomic diversity and lineage-specific innovation within vertebrates. We discuss the large differences in TE content between different vertebrate groups and then go on to look at how they affect organisms at a variety of levels: from the structure of chromosomes to their involvement in the regulation of gene expression, as well as in the formation and evolution of non-coding RNAs and protein-coding genes. In the process of doing this, we highlight how transposable elements have been involved in the evolution of some of the key innovations observed within the vertebrate lineage, driving the group's diversity and success.

  9. Regulation of mammalian cell differentiation by long non-coding RNAs

    PubMed Central

    Hu, Wenqian; Alvarez-Dominguez, Juan R; Lodish, Harvey F

    2012-01-01

    Differentiation of specialized cell types from stem and progenitor cells is tightly regulated at several levels, both during development and during somatic tissue homeostasis. Many long non-coding RNAs have been recognized as an additional layer of regulation in the specification of cellular identities; these non-coding species can modulate gene-expression programmes in various biological contexts through diverse mechanisms at the transcriptional, translational or messenger RNA stability levels. Here, we summarize findings that implicate long non-coding RNAs in the control of mammalian cell differentiation. We focus on several representative differentiation systems and discuss how specific long non-coding RNAs contribute to the regulation of mammalian development. PMID:23070366

  10. Long Noncoding RNA H19 Inhibits Cell Viability, Migration, and Invasion Via Downregulation of IRS-1 in Thyroid Cancer Cells

    PubMed Central

    Wang, Peng; Xu, Weimin; Liu, Haixia; Bu, Qingao; Sun, Diwen

    2017-01-01

    Thyroid cancer is a common endocrine gland malignancy whose incidence has increased rapidly worldwide in recent decades. This study aimed to investigate the role of long noncoding RNA H19 in thyroid cancer. Long noncoding RNA H19 was overexpressed or knocked down in the thyroid cancer cell lines SW579 and TPC-1, and the expression of long noncoding RNA H19 was detected by real-time polymerase chain reaction. Cell viability, migration, and invasion were determined by 3-(4, 5-dimethyl-2-thiazolyl)-2, 5-diphenyl-2-H-tetrazolium bromide assay, Transwell assay, and wound healing assay, respectively. Furthermore, cell apoptosis was analyzed by flow cytometry, and the expression of factors related to the phosphatidyl inositide 3-kinases/protein kinase B and nuclear factor κB signal pathways was measured by Western blotting. This study revealed that cell viability and migration/invasion of SW579 and TPC-1 were significantly decreased by long noncoding RNA H19 overexpression compared with the control group (P < .05), whereas cell apoptosis was significantly increased (P < .001). Meanwhile, cell viability and migration/invasion were significantly increased after long noncoding RNA H19 knockdown (P < .05). Furthermore, long noncoding RNA H19 negatively regulated the expression of insulin receptor substrate 1 and thereby affected cell proliferation and apoptosis. Insulin receptor substrate 1 regulated the activation of the phosphatidyl inositide 3-kinases/AKT and nuclear factor κB signal pathways. In conclusion, long noncoding RNA H19 could suppress cell viability, migration, and invasion via downregulation of insulin receptor substrate 1 in SW579 and TPC-1 cells. These results suggest an important role of long noncoding RNA H19 in thyroid cancer, and long noncoding RNA H19 might be a potential target for thyroid cancer treatment. PMID:29332545

  11. Non-coding cancer driver candidates identified with a sample- and position-specific model of the somatic mutation rate

    PubMed Central

    Juul, Malene; Bertl, Johanna; Guo, Qianyun; Nielsen, Morten Muhlig; Świtnicki, Michał; Hornshøj, Henrik; Madsen, Tobias; Hobolth, Asger; Pedersen, Jakob Skou

    2017-01-01

    Non-coding mutations may drive cancer development. Statistical detection of non-coding driver regions is challenged by a varying mutation rate and uncertainty of functional impact. Here, we develop a statistically founded non-coding driver-detection method, ncdDetect, which includes sample-specific mutational signatures, long-range mutation rate variation, and position-specific impact measures. Using ncdDetect, we screened non-coding regulatory regions of protein-coding genes across a pan-cancer set of whole-genomes (n = 505), which top-ranked known drivers and identified new candidates. For individual candidates, presence of non-coding mutations associates with altered expression or decreased patient survival across an independent pan-cancer sample set (n = 5454). This includes an antigen-presenting gene (CD1A), where 5’UTR mutations correlate significantly with decreased survival in melanoma. Additionally, mutations in a base-excision-repair gene (SMUG1) correlate with a C-to-T mutational-signature. Overall, we find that a rich model of mutational heterogeneity facilitates non-coding driver identification and integrative analysis points to candidates of potential clinical relevance. DOI: http://dx.doi.org/10.7554/eLife.21778.001 PMID:28362259

  12. The Inescapable Influence of Noncoding RNAs in Cancer

    PubMed Central

    Adams, Brian D.; Anastasiadou, Eleni; Esteller, Manel; He, Lin; Slack, Frank J.

    2015-01-01

    This report summarizes information presented at the 2015 Keystone Symposium on “MicroRNAs and Noncoding RNAs in Cancer”. Nearly two decades after the discovery of the first microRNA (miRNA), the role of noncoding RNAs in developmental processes and the mechanisms behind their dysregulation in cancer has been steadily elucidated. Excitingly, miRNAs have begun making their way into the clinic to combat diseases such as hepatitis C and various forms of cancer. Therefore, at this Keystone meeting novel findings were presented that enhance our view on how small and long noncoding RNAs control developmental timing and oncogenic processes. Recurring themes included: 1) how miRNAs can be differentially processed, degraded, and regulated by ribonucleoprotein (RNP) complexes, 2) how particular miRNA genetic networks that control developmental processes, when disrupted, can result in cancer, 3) the technologies available to therapeutically deliver RNA to combat diseases such as cancer, and 4) the elucidation of the mechanisms of action for long noncoding RNAs, currently a poorly understood class of noncoding RNA. During the meeting there was an emphasis on presenting unpublished findings, and the breadth of topics covered reflected how inescapable the influence of noncoding RNAs is in development and cancer. PMID:26567137

  13. The central nervous system transcriptome of the weakly electric brown ghost knifefish (Apteronotus leptorhynchus): de novo assembly, annotation, and proteomics validation.

    PubMed

    Salisbury, Joseph P; Sîrbulescu, Ruxandra F; Moran, Benjamin M; Auclair, Jared R; Zupanc, Günther K H; Agar, Jeffrey N

    2015-03-11

    The brown ghost knifefish (Apteronotus leptorhynchus) is a weakly electric teleost fish of particular interest as a versatile model system for a variety of research areas in neuroscience and biology. The comprehensive information available on the neurophysiology and neuroanatomy of this organism has enabled significant advances in such areas as the study of the neural basis of behavior, the development of adult-born neurons in the central nervous system and their involvement in the regeneration of nervous tissue, as well as brain aging and senescence. Despite substantial scientific interest in this species, no genomic resources are currently available. Here, we report the de novo assembly and annotation of the A. leptorhynchus transcriptome. After evaluating several trimming and transcript reconstruction strategies, de novo assembly using Trinity uncovered 42,459 unique contigs containing at least a partial protein-coding sequence based on alignment to a reference set of known Actinopterygii sequences. As many as 11,847 of these contigs contained full or near-full length protein sequences, providing broad coverage of the proteome. A variety of non-coding RNA sequences were also identified and annotated, including conserved long intergenic non-coding RNA and other long non-coding RNA observed previously to be expressed in adult zebrafish (Danio rerio) brain, as well as a variety of miRNA, snRNA, and snoRNA. Shotgun proteomics confirmed translation of open reading frames from over 2,000 transcripts, including alternative splice variants. Assignment of tandem mass spectra was greatly improved by use of the assembly compared to databases of sequences from closely related organisms. The assembly and raw reads have been deposited at DDBJ/EMBL/GenBank under the accession number GBKR00000000. Tandem mass spectrometry data is available via ProteomeXchange with identifier PXD001285. 
    Presented here is the first release of an annotated de novo transcriptome assembly from Apteronotus leptorhynchus, providing a broad overview of RNA expressed in central nervous system tissue. The assembly, which includes substantial coverage of a wide variety of both protein coding and non-coding transcripts, will allow the development of better tools to understand the mechanisms underlying unique characteristics of the knifefish model system, such as their tremendous regenerative capacity and negligible brain senescence.

  14. Birth, coming of age and death: The intriguing life of long noncoding RNAs.

    PubMed

    Samudyata; Castelo-Branco, Gonçalo; Bonetti, Alessandro

    2018-07-01

    Mammalian genomes are pervasively transcribed, with long noncoding RNAs being the most abundant fraction. Recent studies have highlighted the central role played by these transcripts in several physiological and pathological processes. Despite several metabolic features shared between coding and noncoding transcripts, these two classes of RNAs exhibit multiple differences regarding their biogenesis and processing. Here we review such distinctions, focusing on the unique features of specific long noncoding RNAs. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. The complete mitochondrial genome of Pholis nebulosus (Perciformes: Pholidae).

    PubMed

    Wang, Zhongquan; Qin, Kaili; Liu, Jingxi; Song, Na; Han, Zhiqiang; Gao, Tianxiang

    2016-11-01

    In this study, the complete mitochondrial genome (mitogenome) sequence of Pholis nebulosus has been determined by long polymerase chain reaction and primer-walking methods. The mitogenome is a circular molecule of 16 524 bp in length, including the typical structure of 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 2 non-coding regions (L-strand replication origin and control region), the gene contents of which are identical to those observed in most bony fishes. Within the control region, we identified the termination-associated sequence domain (TAS), and the conserved sequence block domain (CSB-F, CSB-E, CSB-D, CSB-C, CSB-B, CSB-A, CSB-1, CSB-2, CSB-3).

  16. Core histone genes of Giardia intestinalis: genomic organization, promoter structure, and expression

    PubMed Central

    Yee, Janet; Tang, Anita; Lau, Wei-Ling; Ritter, Heather; Delport, Dewald; Page, Melissa; Adam, Rodney D; Müller, Miklós; Wu, Gang

    2007-01-01

    Background: Giardia intestinalis is a protist found in freshwaters worldwide, and is the most common cause of parasitic diarrhea in humans. The phylogenetic position of this parasite is still much debated. Histones are small, highly conserved proteins that associate tightly with DNA to form chromatin within the nucleus. There are two classes of core histone genes in higher eukaryotes: DNA replication-independent histones and DNA replication-dependent ones. Results: We identified two copies each of the core histone H2a, H2b and H3 genes, and three copies of the H4 gene, at separate locations on chromosomes 3, 4 and 5 within the genome of Giardia intestinalis, but no gene encoding an H1 linker histone could be recognized. The copies of each gene share extensive DNA sequence identities throughout their coding and 5' noncoding regions, which suggests these copies have arisen from relatively recent gene duplications or gene conversions. The transcription start sites are at triplet A sequences 1–27 nucleotides upstream of the translation start codon for each gene. We determined that a 50 bp region upstream from the start of the histone H4 coding region is the minimal promoter, and a highly conserved 15 bp sequence called the histone motif (him) is essential for its activity. The Giardia core histone genes are constitutively expressed at approximately equivalent levels and their mRNAs are polyadenylated. Competition gel-shift experiments suggest that a factor within the protein complex that binds him may also be a part of the protein complexes that bind other promoter elements described previously in Giardia. Conclusion: In contrast to other eukaryotes, the Giardia genome has only a single class of core histone genes that encode replication-independent histones. Our inability to locate a gene encoding the linker histone H1 leads us to speculate that the H1 protein may not be required for the compaction of Giardia's small and gene-rich genome. PMID:17425802

  17. Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

    PubMed Central

    Borodovsky, M; Rudd, K E; Koonin, E V

    1994-01-01

    The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
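
    The percentages quoted in the abstract above follow directly from the reported ORF counts. The short Python sketch below is only an illustrative check, not part of the cited study: the variable names and the `percent` helper are hypothetical, and the `genemark_only` figure is derived here rather than stated in the abstract.

```python
# Counts quoted in the abstract above (Borodovsky et al., 1994).
genemark_orfs = 354      # putative expressed ORFs predicted by GeneMark
blast_orfs = 208         # ORFs with significant BLASTX/TBLASTN similarity
supported_by_both = 182  # ORFs supported by both GeneMark and BLAST
ancient_families = 73    # putative new genes in ancient conserved families


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Shares reported in the abstract, recomputed from the raw counts.
share_of_genemark = percent(supported_by_both, genemark_orfs)
share_of_blast = percent(supported_by_both, blast_orfs)
ancient_share = percent(ancient_families, genemark_orfs)

# Derived figure (an assumption of this sketch, not stated in the abstract):
# GeneMark predictions that lack support from a BLAST hit.
genemark_only = genemark_orfs - supported_by_both

print(share_of_genemark, share_of_blast, ancient_share, genemark_only)
```

    The recomputed shares agree with the 51.4%, 87.5% and 20.6% given in the abstract.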

  18. 2-D Structure of the A Region of Xist RNA and Its Implication for PRC2 Association

    PubMed Central

    Maenner, Sylvain; Blaud, Magali; Fouillen, Laetitia; Savoye, Anne; Marchand, Virginie; Dubois, Agnès; Sanglier-Cianférani, Sarah; Van Dorsselaer, Alain; Clerc, Philippe; Avner, Philip; Visvikis, Athanase; Branlant, Christiane

    2010-01-01

    In placental mammals, inactivation of one of the X chromosomes in female cells ensures sex chromosome dosage compensation. The 17 kb non-coding Xist RNA is crucial to this process and accumulates on the future inactive X chromosome. The most conserved Xist RNA region, the A region, contains eight or nine repeats separated by U-rich spacers. It is implicated in the recruitment of late inactivated X genes to the silencing compartment and likely in the recruitment of complex PRC2. Little is known about the structure of the A region and more generally about Xist RNA structure. Knowledge of its structure is restricted to an NMR study of a single A repeat element. Our study is the first experimental analysis of the structure of the entire A region in solution. By the use of chemical and enzymatic probes and FRET experiments, using oligonucleotides carrying fluorescent dyes, we resolved problems linked to sequence redundancies and established a 2-D structure for the A region that contains two long stem-loop structures each including four repeats. Interactions formed between repeats and between repeats and spacers stabilize these structures. Conservation of the spacer terminal sequences allows formation of such structures in all sequenced Xist RNAs. By combination of RNP affinity chromatography, immunoprecipitation assays, mass spectrometry, and Western blot analysis, we demonstrate that the A region can associate with components of the PRC2 complex in mouse ES cell nuclear extracts. Whilst a single four-repeat motif is able to associate with components of this complex, recruitment of Suz12 is clearly more efficient when the entire A region is present. Our data with their emphasis on the importance of inter-repeat pairing change fundamentally our conception of the 2-D structure of the A region of Xist RNA and support its possible implication in recruitment of the PRC2 complex. PMID:20052282

  19. The value of countryside elements in the conservation of a threatened arboreal marsupial Petaurus norfolcensis in agricultural landscapes of south-eastern Australia--the disproportional value of scattered trees.

    PubMed

    Crane, Mason J; Lindenmayer, David B; Cunningham, Ross B

    2014-01-01

    Human activities, particularly agriculture, have transformed much of the world's terrestrial environment. Within these anthropogenic landscapes, a variety of relictual and semi-natural habitats exist, which we term countryside elements. The habitat value of countryside elements (hereafter termed 'elements') is increasingly recognised. We quantify the relative value of four kinds of such 'elements' (linear roadside remnants, native vegetation patches, scattered trees and tree plantings) used by a threatened Australian arboreal marsupial, the squirrel glider (Petaurus norfolcensis). We examined relationships between home range size and the availability of each 'element' and whether the usage was relative to predicted levels of use. The use of 'elements' by gliders was largely explained by their availability, but there was a preference for native vegetation patches and scattered trees. We found home range size was significantly smaller with increasing area of scattered trees and a contrasting effect with increasing area of linear roadside remnants or native vegetation patches. Our work showed that each 'element' was used and as such had a role in the conservation of the squirrel glider, but their relative value varied. We illustrate the need to assess the conservation value of countryside elements so they can be incorporated into the holistic management of agricultural landscapes. This work demonstrates the disproportional value of scattered trees, underscoring the need to specifically incorporate and/or enhance the protection and recruitment of scattered trees in biodiversity conservation policy and management.

  20. Noncoding Genomics in Gastric Cancer and the Gastric Precancerous Cascade: Pathogenesis and Biomarkers

    PubMed Central

    Garcia-Bloj, Benjamin; Fry, Jacqueline; Wichmann, Ignacio

    2015-01-01

    Gastric cancer is the fifth most common cancer and the third leading cause of cancer-related death, whose patterns vary among geographical regions and ethnicities. It is a multifactorial disease, and its development depends on infection by Helicobacter pylori (H. pylori) and Epstein-Barr virus (EBV), host genetic factors, and environmental factors. The heterogeneity of the disease has begun to be unraveled by a comprehensive mutational evaluation of primary tumors. The low abundance of mutations suggests that other mechanisms participate in the evolution of the disease, such as those found through analyses of noncoding genomics. Noncoding genomics includes single nucleotide polymorphisms (SNPs), regulation of gene expression through DNA methylation of promoter sites, miRNAs, other noncoding RNAs in regulatory regions, and other topics. These processes and molecules ultimately control gene expression. Potential biomarkers are appearing from analyses of noncoding genomics. This review focuses on noncoding genomics and potential biomarkers in the context of gastric cancer and the gastric precancerous cascade. PMID:26379360

  1. Non-coding RNAs and Berberine: A new mechanism of its anti-diabetic activities.

    PubMed

    Chang, Wenguang

    2017-01-15

    Type 2 Diabetes (T2D) is a metabolic disease with high mortality and morbidity. Non-coding RNAs, including small and long non-coding RNAs, are a novel class of functional RNA molecules that regulate multiple biological functions through diverse mechanisms. Studies in the last decade have demonstrated that non-coding RNAs may represent compelling therapeutic targets and play important roles in regulating the course of insulin resistance and T2D. Berberine, a plant-based alkaloid, has shown promise as an anti-hyperglycaemic, anti-hyperlipidaemic agent against T2D. Previous studies have primarily focused on a diverse array of efficacy end points of berberine in the pathogenesis of metabolic syndromes and inflammation or oxidative stress. Currently, an increasing number of studies have revealed the importance of non-coding RNAs as regulators of the anti-diabetic effects of berberine. The regulation of non-coding RNAs has been associated with several therapeutic actions of berberine in T2D progression. This review therefore summarizes the anti-diabetic mechanisms of berberine, focusing on its role in regulating non-coding RNAs. It shows that berberine exerts global anti-diabetic effects by targeting non-coding RNAs and that these effects involve several miRNAs, lncRNAs and multiple signal pathways. These findings may enhance the current understanding of berberine's anti-diabetic mechanisms of action and provide new pathological targets for the development of berberine-related drugs. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Prediction of plant lncRNA by ensemble machine learning classifiers.

    PubMed

    Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian

    2018-05-02

    In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. 
    Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.

  3. The genome in three dimensions: a new frontier in human brain research.

    PubMed

    Mitchell, Amanda C; Bharadwaj, Rahul; Whittle, Catheryne; Krueger, Winfried; Mirnics, Karoly; Hurd, Yasmin; Rasmussen, Theodore; Akbarian, Schahram

    2014-06-15

    Less than 1.5% of the human genome encodes protein. However, vast portions of the human genome are subject to transcriptional and epigenetic regulation, and many noncoding regulatory DNA elements are thought to regulate the spatial organization of interphase chromosomes. For example, chromosomal "loopings" are pivotal for the orderly process of gene expression, by enabling distal regulatory enhancer or silencer elements to directly interact with proximal promoter and transcription start sites, potentially bypassing hundreds of kilobases of interspersed sequence on the linear genome. To date, however, epigenetic studies in the human brain are mostly limited to the exploration of DNA methylation and posttranslational modifications of the nucleosome core histones. In contrast, very little is known about the regulation of supranucleosomal structures. Here, we show that chromosome conformation capture, a widely used approach to study higher-order chromatin, is applicable to tissue collected postmortem, thereby informing about genome organization in the human brain. We introduce chromosome conformation capture protocols for brain and compare higher-order chromatin structures at the chromosome 6p22.2-22.1 schizophrenia and bipolar disorder susceptibility locus, and additional neurodevelopmental risk genes (DPP10, MCPH1), in adult prefrontal cortex and various cell culture systems, including neurons derived from reprogrammed skin cells. We predict that the exploration of three-dimensional genome architectures and function will open up new frontiers in human brain research and psychiatric genetics and provide novel insights into the epigenetic risk architectures of regulatory noncoding DNA. Copyright © 2014 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  4. Homo sapiens-Specific Binding Site Variants within Brain Exclusive Enhancers Are Subject to Accelerated Divergence across Human Population.

    PubMed

    Zehra, Rabail; Abbasi, Amir Ali

    2018-03-01

    Empirical assessments of human accelerated noncoding DNA fragments have delineated the presence of many cis-regulatory elements. Enhancers make up an important category of such accelerated cis-regulatory elements that efficiently control the spatiotemporal expression of many developmental genes. Establishing plausible reasons for accelerated enhancer sequence divergence in Homo sapiens has been termed significant in various previously published studies. This acceleration by including closely related primates and archaic human data has the potential to open up evolutionary avenues for deducing present-day brain structure. This study relied on empirically confirmed brain exclusive enhancers to avoid any misjudgments about their regulatory status and categorized among them a subset of enhancers with an exceptionally accelerated rate of lineage specific divergence in humans. In this assorted set, 13 distinct transcription factor binding sites were located that possessed unique existence in humans. Three of 13 such sites belonging to transcription factors SOX2, RUNX1/3, and FOS/JUND possessed single nucleotide variants that made them unique to H. sapiens upon comparisons with Neandertal and Denisovan orthologous sequences. These variants modifying the binding sites in modern human lineage were further substantiated as single nucleotide polymorphisms via exploiting 1000 Genomes Project Phase 3 data. Long range haplotype based tests laid out evidence of positive selection to be governing in African population on two of the modern human motif modifying alleles with strongest results for SOX2 binding site. In sum, our study acknowledges acceleration in noncoding regulatory landscape of the genome and highlights functional parts within it to have undergone accelerated divergence in present-day human population.

  5. A locally conservative stabilized continuous Galerkin finite element method for two-phase flow in poroelastic subsurfaces

    NASA Astrophysics Data System (ADS)

    Deng, Q.; Ginting, V.; McCaskill, B.; Torsu, P.

    2017-10-01

    We study the application of a stabilized continuous Galerkin finite element method (CGFEM) in the simulation of multiphase flow in poroelastic subsurfaces. The system involves a nonlinear coupling between the fluid pressure, subsurface's deformation, and the fluid phase saturation, and as such, we represent this coupling through an iterative procedure. Spatial discretization of the poroelastic system employs the standard linear finite element in combination with a numerical diffusion term to maintain stability of the algebraic system. Furthermore, direct calculation of the normal velocities from pressure and deformation does not entail a locally conservative field. To alleviate this drawback, we propose an element based post-processing technique through which local conservation can be established. The performance of the method is validated through several examples illustrating the convergence of the method, the effectivity of the stabilization term, and the ability to achieve locally conservative normal velocities. Finally, the efficacy of the method is demonstrated through simulations of realistic multiphase flow in poroelastic subsurfaces.

  6. GEMPIC: geometric electromagnetic particle-in-cell methods

    NASA Astrophysics Data System (ADS)

    Kraus, Michael; Kormann, Katharina; Morrison, Philip J.; Sonnendrücker, Eric

    2017-08-01

    We present a novel framework for finite element particle-in-cell methods based on the discretization of the underlying Hamiltonian structure of the Vlasov-Maxwell system. We derive a semi-discrete Poisson bracket, which retains the defining properties of a bracket, anti-symmetry and the Jacobi identity, as well as conservation of its Casimir invariants, implying that the semi-discrete system is still a Hamiltonian system. In order to obtain a fully discrete Poisson integrator, the semi-discrete bracket is used in conjunction with Hamiltonian splitting methods for integration in time. Techniques from finite element exterior calculus ensure conservation of the divergence of the magnetic field and Gauss' law as well as stability of the field solver. The resulting methods are gauge invariant, feature exact charge conservation and show excellent long-time energy and momentum behaviour. Due to the generality of our framework, these conservation properties are guaranteed independently of a particular choice of the finite element basis, as long as the corresponding finite element spaces satisfy certain compatibility conditions.
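
    As a brief reminder of the bracket properties named in the abstract above, anti-symmetry, the Jacobi identity, and conservation of Casimir invariants can be written as follows. These are the standard textbook definitions, not formulas taken from the paper; the symbols F, G, H for arbitrary functionals, \mathcal{H} for the Hamiltonian, and C for a Casimir invariant are notation assumed here.

```latex
\begin{align}
\{F, G\} &= -\{G, F\}, \\
\{F, \{G, H\}\} + \{G, \{H, F\}\} + \{H, \{F, G\}\} &= 0, \\
\{C, F\} = 0 \ \text{for all } F \quad &\Longrightarrow \quad \frac{dC}{dt} = \{C, \mathcal{H}\} = 0 .
\end{align}
```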

  7. An upwind space-time conservation element and solution element scheme for solving dusty gas flow model

    NASA Astrophysics Data System (ADS)

    Rehman, Asad; Ali, Ishtiaq; Qamar, Shamsul

    An upwind space-time conservation element and solution element (CE/SE) scheme is extended to numerically approximate the dusty gas flow model. Unlike central CE/SE schemes, the current method uses the upwind procedure to derive the numerical fluxes through the inner boundary of conservation elements. These upwind fluxes are utilized to calculate the gradients of flow variables. For comparison and validation, the central upwind scheme is also applied to solve the same dusty gas flow model. The suggested upwind CE/SE scheme resolves the contact discontinuities more effectively and preserves the positivity of flow variables in low density flows. Several case studies are considered and the results of upwind CE/SE are compared with the solutions of central upwind scheme. The numerical results show better performance of the upwind CE/SE method as compared to the central upwind scheme.

  8. Global alteration of microRNAs and transposon-derived small RNAs in cotton (Gossypium hirsutum) during Cotton leafroll dwarf polerovirus (CLRDV) infection.

    PubMed

    Romanel, Elisson; Silva, Tatiane F; Corrêa, Régis L; Farinelli, Laurent; Hawkins, Jennifer S; Schrago, Carlos E G; Vaslin, Maite F S

    2012-11-01

    Small RNAs (sRNAs) are a class of non-coding RNAs ranging from 20- to 40-nucleotides (nts) that are present in most eukaryotic organisms. In plants, sRNAs are involved in the regulation of development, the maintenance of genome stability and the antiviral response. Viruses, however, can interfere with and exploit the silencing-based regulatory networks, causing the deregulation of sRNAs, including small interfering RNAs (siRNAs) and microRNAs (miRNAs). To understand the impact of viral infection on the plant sRNA pathway, we deep sequenced the sRNAs in cotton leaves infected with Cotton leafroll dwarf virus (CLRDV), which is a member of the economically important virus family Luteoviridae. A total of 60 putative conserved cotton miRNAs were identified, including 19 new miRNA families that had not been previously described in cotton. Some of these miRNAs were clearly misregulated during viral infection, and their possible role in symptom development and disease progression is discussed. Furthermore, we found that the 24-nt heterochromatin-associated siRNAs were quantitatively and qualitatively altered in the infected plant, leading to the reactivation of at least one cotton transposable element. This is the first study to explore the global alterations of sRNAs in virus-infected cotton plants. Our results indicate that some CLRDV-induced symptoms may be correlated with the deregulation of miRNA and/or epigenetic networks.

  9. Discrete conservation properties for shallow water flows using mixed mimetic spectral elements

    NASA Astrophysics Data System (ADS)

    Lee, D.; Palha, A.; Gerritsma, M.

    2018-03-01

    A mixed mimetic spectral element method is applied to solve the rotating shallow water equations. The mixed method uses the recently developed spectral element histopolation functions, which exactly satisfy the fundamental theorem of calculus with respect to the standard Lagrange basis functions in one dimension. These are used to construct tensor product solution spaces which satisfy the generalized Stokes theorem, as well as the annihilation of the gradient operator by the curl and the curl by the divergence. This allows for the exact conservation of first order moments (mass, vorticity), as well as higher moments (energy, potential enstrophy), subject to the truncation error of the time stepping scheme. The continuity equation is solved in the strong form, such that mass conservation holds point wise, while the momentum equation is solved in the weak form such that vorticity is globally conserved. While mass, vorticity and energy conservation hold for any quadrature rule, potential enstrophy conservation is dependent on exact spatial integration. The method possesses a weak form statement of geostrophic balance due to the compatible nature of the solution spaces and arbitrarily high order spatial error convergence.

  10. A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements

    PubMed Central

    Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.

    2008-01-01

    X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons including those with simple tandem repeats detectable in their structure have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625

  11. Potential functions of microRNAs in starch metabolism and development revealed by miRNA transcriptome profiling of cassava cultivars and their wild progenitor.

    PubMed

    Chen, Xin; Xia, Jing; Xia, Zhiqiang; Zhang, Hefang; Zeng, Changying; Lu, Cheng; Zhang, Weixiong; Wang, Wenquan

    2015-02-04

    MicroRNAs (miRNAs) are small (approximately 21 nucleotide) non-coding RNAs that are key post-transcriptional gene regulators in eukaryotic organisms. More than 100 cassava miRNAs have been identified in a conservation analysis and a repertoire of cassava miRNAs has also been characterised by next-generation sequencing (NGS) in recent studies. Here, using NGS, we profiled small non-coding RNAs and mRNA genes in two cassava cultivars and their wild progenitor to identify and characterise miRNAs that are potentially involved in plant growth and starch biosynthesis. Six small RNA and six mRNA libraries from leaves and roots of the two cultivars, KU50 and Arg7, and their wild progenitor, W14, were subjected to NGS. Analysis of the sequencing data revealed 29 conserved miRNA families and 33 new miRNA families. Together, these miRNAs potentially targeted a total of 360 putative target genes. Whereas 16 miRNA families were highly expressed in cultivar leaves, another 13 miRNA families were highly expressed in storage roots of cultivars. Co-expression analysis revealed that the expression level of some targets had a negative relationship with their corresponding miRNAs in storage roots and leaves; these targets included MYB33, ARF10, GRF1, RD19, APL2, NF-YA3 and SPL2, which are known to be involved in plant development, starch biosynthesis and response to environmental stimuli. The identified miRNAs, target mRNAs and target gene ontology annotation all shed light on the possible functions of miRNAs in Manihot species. The differential expression of miRNAs between cultivars and their wild progenitor, together with our analysis of GO annotation and confirmation of miRNA:target pairs, might provide insight into the differences between the wild progenitor and cultivated cassava.

  12. A Multi-Platform Draft de novo Genome Assembly and Comparative Analysis for the Scarlet Macaw (Ara macao)

    PubMed Central

    Seabury, Christopher M.; Dowd, Scot E.; Seabury, Paul M.; Raudsepp, Terje; Brightsmith, Donald J.; Liboriussen, Poul; Halley, Yvette; Fisher, Colleen A.; Owens, Elaine; Viswanathan, Ganesh; Tizard, Ian R.

    2013-01-01

    Data deposition to NCBI Genomes: This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly). The version described in this paper is the first version (AMXX01000000). The scaffolded assembly (SMACv1.1) has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000). Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, new world parrot Ara macao (Scarlet Macaw). Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb) includes more than 997 Mb of unambiguous sequence data (excluding N’s). Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7), which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity) which were independently supported by the results of previous human GWAS studies. We also observed evidence for genes and noncoding loci that displayed extreme conservation across the three avian lineages, thereby reflecting their likely biological and developmental importance among birds. PMID:23667475

  13. Comparative analysis of long non-coding RNAs in Atlantic and Coho salmon reveals divergent transcriptome responses associated with immunity and tissue repair during sea lice infestation.

    PubMed

    Valenzuela-Muñoz, Valentina; Valenzuela-Miranda, Diego; Gallardo-Escárate, Cristian

    2018-05-24

    The increasing capacity of transcriptomic analysis by high throughput sequencing has highlighted the presence of a large proportion of transcripts that do not encode proteins. In particular, long non-coding RNAs (lncRNAs) are sequences with low coding potential and conservation among species. Moreover, cumulative evidence has revealed important roles in post-transcriptional gene modulation in several taxa. In fish, the role of lncRNAs has been scarcely studied and even less so during the immune response against sea lice. In the present study we mined for lncRNAs in Atlantic salmon (Salmo salar) and Coho salmon (Oncorhynchus kisutch), which are affected by the sea louse Caligus rogercresseyi, evaluating the degree of sequence conservation between these two fish species and their putative roles during the infection process. Herein, Atlantic and Coho salmon were infected with 35 lice/fish and evaluated after 7 and 14 days post-infestation (dpi). For RNA sequencing, samples from skin and head kidney were collected. A total of 5658/4140 and 3678/2123 lncRNAs were identified in uninfected/infected Atlantic and Coho salmon transcriptomes, respectively. Species-specific transcription patterns were observed in exclusive lncRNAs according to the tissue analyzed. Furthermore, neighbor gene GO enrichment analysis of the top 100 highly regulated lncRNAs in Atlantic salmon showed that lncRNAs were localized near genes related to the immune response. On the other hand, in Coho salmon the highly regulated lncRNAs were localized near genes involved in tissue repair processes. This study revealed high regulation of lncRNAs closely localized to immune and tissue repair-related genes in Atlantic and Coho salmon, respectively, suggesting putative roles for lncRNAs in salmon against sea lice infestation. Copyright © 2018 Elsevier Ltd. All rights reserved.

  14. Long Non-Coding RNAs Responsive to Salt and Boron Stress in the Hyper-Arid Lluteño Maize from Atacama Desert.

    PubMed

    Huanca-Mamani, Wilson; Arias-Carrasco, Raúl; Cárdenas-Ninasivincha, Steffany; Rojas-Herrera, Marcelo; Sepúlveda-Hermosilla, Gonzalo; Caris-Maldonado, José Carlos; Bastías, Elizabeth; Maracaja-Coutinho, Vinicius

    2018-03-20

    Long non-coding RNAs (lncRNAs) have been defined as transcripts longer than 200 nucleotides, which lack significant protein coding potential and possess critical roles in diverse cellular processes. Long non-coding RNAs have recently been functionally characterized in plant stress-response mechanisms. In the present study, we perform a comprehensive identification of lncRNAs in response to combined stress induced by salinity and excess of boron in the Lluteño maize, a tolerant maize landrace from Atacama Desert, Chile. We use deep RNA sequencing to identify a set of 48,345 different lncRNAs, of which 28,012 (58.1%) are conserved with other maize (B73, Mo17 or Palomero), with the remaining 41.9% belonging to potentially Lluteño exclusive lncRNA transcripts. According to B73 maize reference genome sequence, most Lluteño lncRNAs correspond to intergenic transcripts. Interestingly, Lluteño lncRNAs present an unusual overall higher expression compared to protein coding genes under exposure to stressed conditions. In total, we identified 1710 lncRNAs putatively responsive to the combined stressed conditions of salt and boron exposure. We also identified a set of 848 stress responsive potential trans natural antisense transcripts (trans-NAT) lncRNAs, which seem to regulate genes associated with regulation of transcription, response to stress, and response to abiotic stimulus, and to participate in the nicotianamine metabolic process. Reverse transcription-quantitative PCR (RT-qPCR) experiments were performed in a subset of lncRNAs, validating their existence and expression patterns. Our results suggest that a diverse set of maize lncRNAs from leaves and roots is responsive to combined salt and boron stress, being the first effort to identify lncRNAs from a maize landrace adapted to extreme conditions such as the Atacama Desert. The information generated is a starting point to understand the genomic adaptations that allow this maize to withstand this extremely stressful environment.

  15. Identification of microRNAs and long non-coding RNAs involved in fatty acid biosynthesis in tree peony seeds.

    PubMed

    Yin, Dan-Dan; Li, Shan-Shan; Shu, Qing-Yan; Gu, Zhao-Yu; Wu, Qian; Feng, Cheng-Yong; Xu, Wen-Zhong; Wang, Liang-Sheng

    2018-08-05

    MicroRNAs (miRNAs) and long noncoding RNAs (lncRNAs) act as important molecular regulators in a wide range of biological processes during plant development and seed formation, including oil production. Tree peony seeds contain >90% unsaturated fatty acids (UFAs) and high proportions of α-linolenic acid (ALA, > 40%). To dissect the non-coding RNA (ncRNA) pathways involved in fatty acid synthesis in tree peony seeds, we constructed six small RNA libraries and six transcriptome libraries from developing seeds of two cultivars (J and S) with different fatty acid compositions. After deep sequencing the RNA libraries, the ncRNA expression profiles of tree peony seeds in two cultivars were systematically and comparatively analyzed. A total of 318 known and 153 new miRNAs and 22,430 lncRNAs were identified, among which 106 conserved and 9 novel miRNAs and 2785 lncRNAs were differentially expressed between the two cultivars. In addition, potential target genes of the miRNAs and lncRNAs were also predicted and annotated. Among them, 9 miRNAs and 39 lncRNAs were predicted to target lipid related genes. Results showed that all of miR414, miR156b, miR2673b, miR7826, novel-m0027-5p, TR24651|c0_g1, TR24544|c0_g15, and TR27305|c0_g1 were up-regulated and expressed at a higher level in high-ALA cultivar J when compared to low-ALA cultivar S, suggesting that these ncRNAs and target genes are possibly involved in different fatty acid synthesis and lipid metabolism through post-transcriptional regulation. These results provide a better understanding of the roles of ncRNAs during fatty acid biosynthesis and metabolism in tree peony seeds. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. Complex Tissue-Specific Patterns and Distribution of Multiple RAGE Splice Variants in Different Mammals

    PubMed Central

    López-Díez, Raquel; Rastrojo, Alberto; Villate, Olatz; Aguado, Begoña

    2013-01-01

    The receptor for advanced glycosylation end products (RAGE) is a multiligand receptor involved in diverse cell signaling pathways. Previous studies show that this gene expresses several splice variants in human, mouse, and dog. Alternative splicing (AS) plays an important role in expanding transcriptomic and proteomic diversity, and it has been related to disease. AS is also one of the main evolutionary mechanisms in mammalian genomes. However, limited information is available regarding the AS of RAGE in a wide context of mammalian tissues. In this study, we examined in detail the different RAGE mRNAs generated by AS from six mammals, including two primates (human and monkey), two artiodactyla (cow and pig), and two rodentia (mouse and rat) in 6–18 different tissues including fetal, adult, and tumor. By nested reverse transcription-polymerase chain reaction (RT-PCR) we identified a high number of splice variants including noncoding transcripts and predicted coding ones with different potential protein modifications affecting mainly the transmembrane and ligand-binding domains that could influence their biological function. However, analysis of RNA-seq data enabled detecting only the most abundant splice variants. More than 80% of the detected RT-PCR variants (87 of 101 transcripts) are novel (different exon/intron structure to the previously described ones), and interestingly, 20–60% of the total transcripts (depending on the species) are noncoding ones that present tissue specificity. Our results suggest that RAGE undergoes extensive AS in mammals, with different expression patterns among adult, fetal, and tumor tissues. Moreover, most splice variants seem to be species specific, especially the noncoding variants, with only two (canonical human Tv1-RAGE, and human N-truncated or Tv10-RAGE) conserved among the six different species. This could indicate a special evolution pattern of this gene at mRNA level. PMID:24273313

  17. Long Non-Coding RNAs Responsive to Salt and Boron Stress in the Hyper-Arid Lluteño Maize from Atacama Desert

    PubMed Central

    Huanca-Mamani, Wilson; Arias-Carrasco, Raúl; Cárdenas-Ninasivincha, Steffany; Rojas-Herrera, Marcelo; Sepúlveda-Hermosilla, Gonzalo; Caris-Maldonado, José Carlos; Bastías, Elizabeth; Maracaja-Coutinho, Vinicius

    2018-01-01

    Long non-coding RNAs (lncRNAs) have been defined as transcripts longer than 200 nucleotides, which lack significant protein coding potential and possess critical roles in diverse cellular processes. Long non-coding RNAs have recently been functionally characterized in plant stress–response mechanisms. In the present study, we perform a comprehensive identification of lncRNAs in response to combined stress induced by salinity and excess of boron in the Lluteño maize, a tolerant maize landrace from Atacama Desert, Chile. We use deep RNA sequencing to identify a set of 48,345 different lncRNAs, of which 28,012 (58.1%) are conserved with other maize (B73, Mo17 or Palomero), with the remaining 41.9% belonging to potentially Lluteño exclusive lncRNA transcripts. According to B73 maize reference genome sequence, most Lluteño lncRNAs correspond to intergenic transcripts. Interestingly, Lluteño lncRNAs present an unusual overall higher expression compared to protein coding genes under exposure to stressed conditions. In total, we identified 1710 lncRNAs putatively responsive to the combined stressed conditions of salt and boron exposure. We also identified a set of 848 stress responsive potential trans natural antisense transcripts (trans-NAT) lncRNAs, which seem to regulate genes associated with regulation of transcription, response to stress, and response to abiotic stimulus, and to participate in the nicotianamine metabolic process. Reverse transcription-quantitative PCR (RT-qPCR) experiments were performed in a subset of lncRNAs, validating their existence and expression patterns. Our results suggest that a diverse set of maize lncRNAs from leaves and roots is responsive to combined salt and boron stress, being the first effort to identify lncRNAs from a maize landrace adapted to extreme conditions such as the Atacama Desert. The information generated is a starting point to understand the genomic adaptations that allow this maize to withstand this extremely stressful environment. PMID:29558449

  18. Heavy Chronic Intermittent Ethanol Exposure Alters Small Noncoding RNAs in Mouse Sperm and Epididymosomes.

    PubMed

    Rompala, Gregory R; Mounier, Anais; Wolfe, Cody M; Lin, Qishan; Lefterov, Iliya; Homanics, Gregg E

    2018-01-01

    While the risks of maternal alcohol abuse during pregnancy are well-established, several preclinical studies suggest that chronic preconception alcohol consumption by either parent may also have significant consequences for offspring health and development. Notably, since isogenic male mice used in these studies are not involved in gestation or rearing of offspring, the cross-generational effects of paternal alcohol exposure suggest a germline-based epigenetic mechanism. Many recent studies have demonstrated that the effects of paternal environmental exposures such as stress or malnutrition can be transmitted to the next generation via alterations to small noncoding RNAs in sperm. Therefore, we used high throughput sequencing to examine the effect of preconception ethanol on small noncoding RNAs in sperm. We found that chronic intermittent ethanol exposure altered several small noncoding RNAs from three of the major small RNA classes in sperm, tRNA-derived small RNA (tDR), mitochondrial small RNA, and microRNA. Six of the ethanol-responsive small noncoding RNAs were evaluated with RT-qPCR on a separate cohort of mice and five of the six were confirmed to be altered by chronic ethanol exposure, supporting the validity of the sequencing results. In addition to altered sperm RNA abundance, chronic ethanol exposure affected post-transcriptional modifications to sperm small noncoding RNAs, increasing two nucleoside modifications previously identified in mitochondrial tRNA. Furthermore, we found that chronic ethanol reduced epididymal expression of a tRNA methyltransferase, Nsun2, known to directly regulate tDR biogenesis. Finally, ethanol-responsive sperm tDR are similarly altered in extracellular vesicles of the epididymis (i.e., epididymosomes), supporting the hypothesis that alterations to sperm tDR emerge in the epididymis and that epididymosomes are the primary source of small noncoding RNAs in sperm. These results add chronic ethanol to the growing list of paternal exposures that can affect small noncoding RNA abundance and nucleoside modifications in sperm. As small noncoding RNAs in sperm have been shown to causally induce heritable phenotypes in offspring, additional research is warranted to understand the potential effects of ethanol-responsive sperm small noncoding RNAs on offspring health and development.

  19. Heavy Chronic Intermittent Ethanol Exposure Alters Small Noncoding RNAs in Mouse Sperm and Epididymosomes

    PubMed Central

    Rompala, Gregory R.; Mounier, Anais; Wolfe, Cody M.; Lin, Qishan; Lefterov, Iliya; Homanics, Gregg E.

    2018-01-01

    While the risks of maternal alcohol abuse during pregnancy are well-established, several preclinical studies suggest that chronic preconception alcohol consumption by either parent may also have significant consequences for offspring health and development. Notably, since isogenic male mice used in these studies are not involved in gestation or rearing of offspring, the cross-generational effects of paternal alcohol exposure suggest a germline-based epigenetic mechanism. Many recent studies have demonstrated that the effects of paternal environmental exposures such as stress or malnutrition can be transmitted to the next generation via alterations to small noncoding RNAs in sperm. Therefore, we used high throughput sequencing to examine the effect of preconception ethanol on small noncoding RNAs in sperm. We found that chronic intermittent ethanol exposure altered several small noncoding RNAs from three of the major small RNA classes in sperm, tRNA-derived small RNA (tDR), mitochondrial small RNA, and microRNA. Six of the ethanol-responsive small noncoding RNAs were evaluated with RT-qPCR on a separate cohort of mice and five of the six were confirmed to be altered by chronic ethanol exposure, supporting the validity of the sequencing results. In addition to altered sperm RNA abundance, chronic ethanol exposure affected post-transcriptional modifications to sperm small noncoding RNAs, increasing two nucleoside modifications previously identified in mitochondrial tRNA. Furthermore, we found that chronic ethanol reduced epididymal expression of a tRNA methyltransferase, Nsun2, known to directly regulate tDR biogenesis. Finally, ethanol-responsive sperm tDR are similarly altered in extracellular vesicles of the epididymis (i.e., epididymosomes), supporting the hypothesis that alterations to sperm tDR emerge in the epididymis and that epididymosomes are the primary source of small noncoding RNAs in sperm. These results add chronic ethanol to the growing list of paternal exposures that can affect small noncoding RNA abundance and nucleoside modifications in sperm. As small noncoding RNAs in sperm have been shown to causally induce heritable phenotypes in offspring, additional research is warranted to understand the potential effects of ethanol-responsive sperm small noncoding RNAs on offspring health and development. PMID:29472946

  20. Carnivore-specific SINEs (Can-SINEs): distribution, evolution, and genomic impact.

    PubMed

    Walters-Conte, Kathryn B; Johnson, Diana L E; Allard, Marc W; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequence of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics.

  1. Carnivore-Specific SINEs (Can-SINEs): Distribution, Evolution, and Genomic Impact

    PubMed Central

    Johnson, Diana L.E.; Allard, Marc W.; Pecon-Slattery, Jill

    2011-01-01

    Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequence of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics. PMID:21846743

  2. Stress induced gene expression drives transient DNA methylation changes at adjacent repetitive elements.

    PubMed

    Secco, David; Wang, Chuang; Shou, Huixia; Schultz, Matthew D; Chiarenza, Serge; Nussaume, Laurent; Ecker, Joseph R; Whelan, James; Lister, Ryan

    2015-07-21

    Cytosine DNA methylation (mC) is a genome modification that can regulate the expression of coding and non-coding genetic elements. However, little is known about the involvement of mC in response to environmental cues. Using whole genome bisulfite sequencing to assess the spatio-temporal dynamics of mC in rice grown under phosphate starvation and recovery conditions, we identified widespread phosphate starvation-induced changes in mC, preferentially localized in transposable elements (TEs) close to highly induced genes. These changes in mC occurred after changes in nearby gene transcription, were mostly DCL3a-independent, and could partially be propagated through mitosis; however, no evidence of meiotic transmission was observed. Similar analyses performed in Arabidopsis revealed a very limited effect of phosphate starvation on mC, suggesting a species-specific mechanism. Overall, this suggests that TEs in proximity to environmentally induced genes are silenced via hypermethylation, and establishes the temporal hierarchy of transcriptional and epigenomic changes in response to stress.

  3. Interplay between DNA methylation, histone modification and chromatin remodeling in stem cells and during development.

    PubMed

    Ikegami, Kohta; Ohgane, Jun; Tanaka, Satoshi; Yagi, Shintaro; Shiota, Kunio

    2009-01-01

    Genes constitute only a small proportion of the mammalian genome, the majority of which is composed of non-genic repetitive elements including interspersed repeats and satellites. A unique feature of the mammalian genome is that there are numerous tissue-dependent, differentially methylated regions (T-DMRs) in the non-repetitive sequences, which include genes and their regulatory elements. The epigenetic status of T-DMRs varies from that of repetitive elements and constitutes the DNA methylation profile genome-wide. Since the DNA methylation profile is specific to each cell and tissue type, much like a fingerprint, it can be used as a means of identification. The formation of DNA methylation profiles is the basis for cell differentiation and development in mammals. The epigenetic status of each T-DMR is regulated by the interplay between DNA methyltransferases, histone modification enzymes, histone subtypes, non-histone nuclear proteins and non-coding RNAs. In this review, we will discuss how these epigenetic factors cooperate to establish cell- and tissue-specific DNA methylation profiles.

  4. Virulence Phenotypes of Legionella pneumophila Associated with Noncoding RNA lpr0035

    PubMed Central

    Jayakumar, Deepak; Early, Julie V.

    2012-01-01

    The Philadelphia-1 strain of Legionella pneumophila, the causative organism of Legionnaires' disease, contains a recently discovered noncoding RNA, lpr0035. lpr0035 straddles the 5′ chromosomal junction of a 45-kbp mobile genetic element, pLP45, which can exist as an episome or integrated in the bacterial chromosome. A 121-bp deletion was introduced in strain JR32, a Philadelphia-1 derivative. The deletion inactivated lpr0035, removed the 49-bp direct repeat at the 5′ junction of pLP45, and locked pLP45 in the chromosome. Intracellular multiplication of the deletion mutant was decreased by nearly 3 orders of magnitude in Acanthamoeba castellanii amoebae and nearly 2 orders of magnitude in J774 mouse macrophages. Entry of the deletion mutant into amoebae and macrophages was decreased by >70%. The level of entry in both hosts was restored to that in strain JR32 by plasmid copies of two open reading frames immediately downstream of the 5′ junction and plasmid lpr0035 driven by its endogenous promoter. When induced from a tac promoter, plasmid lpr0035 completely reversed the intracellular multiplication defect in macrophages but was without effect in amoebae. These data are the first evidence of a role for noncoding RNA lpr0035, which has homologs in six other Legionella genomes, in entry of L. pneumophila into amoebae and macrophages and in host-specific intracellular multiplication. The data also demonstrate that deletion of a direct-repeat sequence restricts the mobility of pLP45 and is a means of studying the role of pLP45 mobility in Legionella virulence phenotypes. PMID:22966048

  5. Identification and Characterization of MicroRNAs in Small Brown Planthopper (Laodephax striatellus) by Next-Generation Sequencing

    PubMed Central

    Lou, Yonggen; Cheng, Jia'an; Zhang, Hengmu; Xu, Jian-Hong

    2014-01-01

    MicroRNAs (miRNAs) are endogenous non-coding small RNAs that regulate gene expression at the post-transcriptional level and are thought to play critical roles in many metabolic activities in eukaryotes. The small brown planthopper (Laodephax striatellus Fallén), one of the most destructive agricultural pests, causes great damage to crops including rice, wheat, and maize. However, information about the genome of L. striatellus is limited. In this study, a small RNA library was constructed from a mixed L. striatellus population and sequenced by Solexa sequencing technology. A total of 501 mature miRNAs were identified, including 227 conserved and 274 novel miRNAs belonging to 125 and 250 families, respectively. Sixty-nine conserved miRNAs that are included in 38 families are predicted to have an RNA secondary structure typically found in miRNAs. Many miRNAs were validated by stem-loop RT-PCR. Comparison with the miRNAs in 84 animal species from miRBase showed that the conserved miRNA families we identified are highly conserved in the Arthropoda phylum. Furthermore, miRanda predicted 2701 target genes for 378 miRNAs, which could be categorized into 52 functional groups annotated by gene ontology. The function of miRNA target genes was found to be very similar between conserved and novel miRNAs. This study of miRNAs in L. striatellus will provide new information and enhance the understanding of the role of miRNAs in the regulation of L. striatellus metabolism and development. PMID:25057821

  6. Identification and Functional Prediction of Large Intergenic Noncoding RNAs (lincRNAs) in Rainbow Trout (Oncorhynchus mykiss)

    USDA-ARS's Scientific Manuscript database

    Long noncoding RNAs (lncRNAs) have been recognized in recent years as key regulators of diverse cellular processes. Genome-wide large-scale projects have uncovered thousands of lncRNAs in many model organisms. Large intergenic noncoding RNAs (lincRNAs) are lncRNAs that are transcribed from intergeni...

  7. Role of non-coding RNAs in non-aging-related neurological disorders.

    PubMed

    Vieira, A S; Dogini, D B; Lopes-Cendes, I

    2018-06-11

    Protein coding sequences represent only 2% of the human genome. Recent advances have demonstrated that a significant portion of the genome is actively transcribed as non-coding RNA molecules. These non-coding RNAs are emerging as key players in the regulation of biological processes, and act as "fine-tuners" of gene expression. Neurological disorders are caused by a wide range of genetic mutations, epigenetic and environmental factors, and the exact pathophysiology of many of these conditions is still unknown. It is currently recognized that dysregulations in the expression of non-coding RNAs are present in many neurological disorders and may be relevant in the mechanisms leading to disease. In addition, circulating non-coding RNAs are emerging as potential biomarkers with great potential impact in clinical practice. In this review, we discuss mainly the role of microRNAs and long non-coding RNAs in several neurological disorders, such as epilepsy, Huntington disease, fragile X-associated ataxia, spinocerebellar ataxias, amyotrophic lateral sclerosis (ALS), and pain. In addition, we give information about the conditions where microRNAs have been shown to be potential biomarkers, such as epilepsy, pain, and ALS.

  8. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
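
The detrended fluctuation analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the step mapping (pyrimidine +1, purine -1), the function names, and the choice of box sizes are assumptions made for illustration. The sequence is turned into a cumulative "walk", the walk is cut into boxes of length n, a straight-line trend is removed from each box, and the root-mean-square residual F(n) is computed. The slope of log F(n) against log n is the scaling exponent, which is expected to be near 0.5 for an uncorrelated sequence and larger when long-range correlations are present.

```python
import math
import random


def dna_walk(seq):
    """Map a DNA string to +/-1 steps (assumed rule: C/T -> +1, A/G -> -1)."""
    steps = {"C": 1.0, "T": 1.0, "A": -1.0, "G": -1.0}
    # Characters other than A, C, G, T (for example N) are skipped.
    return [steps[b] for b in seq.upper() if b in steps]


def dfa_fluctuation(steps, n):
    """RMS fluctuation F(n) around per-box linear trends; requires n >= 3."""
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for s in steps:  # integrated, mean-subtracted walk
        total += s - mean
        profile.append(total)

    x_mean = (n - 1) / 2.0
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    n_boxes = len(profile) // n  # a trailing partial box is dropped
    sq_sum = 0.0
    for b in range(n_boxes):
        box = profile[b * n:(b + 1) * n]
        y_mean = sum(box) / n
        # Least-squares slope of the local linear trend in this box.
        slope = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(box)) / sxx
        sq_sum += sum(
            (y - (y_mean + slope * (i - x_mean))) ** 2 for i, y in enumerate(box)
        )
    return math.sqrt(sq_sum / (n_boxes * n))


def dfa_exponent(steps, box_sizes):
    """Slope of log F(n) versus log n over the given box sizes."""
    xs = [math.log(n) for n in box_sizes]
    ys = [math.log(dfa_fluctuation(steps, n)) for n in box_sizes]
    xm, ym = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
    den = sum((x - xm) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    random.seed(0)
    # A synthetic uncorrelated sequence, used only as a sanity check.
    seq = "".join(random.choice("ACGT") for _ in range(20000))
    print(dfa_exponent(dna_walk(seq), [8, 16, 32, 64, 128, 256]))
```

Applied to real coding and non-coding sequences, the same exponent would be compared between the two classes; the sketch makes no claim about what values the original study obtained.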

  9. The expanding universe of noncoding RNAs.

    PubMed

    Hannon, G J; Rivas, F V; Murchison, E P; Steitz, J A

    2006-01-01

    The 71st Cold Spring Harbor Symposium on Quantitative Biology celebrated the numerous and expanding roles of regulatory RNAs in systems ranging from bacteria to mammals. It was clearly evident that noncoding RNAs are undergoing a renaissance, with reports of their involvement in nearly every cellular process. Previously known classes of longer noncoding RNAs were shown to function by every possible means: acting catalytically, sensing physiological states through adoption of complex secondary and tertiary structures, or using their primary sequences for recognition of target sites. The many recently discovered classes of small noncoding RNAs, generally less than 35 nucleotides in length, most often exert their effects by guiding regulatory complexes to targets via base-pairing. With the ability to analyze the RNA products of the genome in ever greater depth, it has become clear that the universe of noncoding RNAs may extend far beyond the boundaries we had previously imagined. Thus, as much as the Symposium highlighted exciting progress in the field, it also revealed how much farther we must go to understand fully the biological impact of noncoding RNAs.

  10. Short intronic repeat sequences facilitate circular RNA production

    PubMed Central

    Liang, Dongming

    2014-01-01

    Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery “backsplices” and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3′ end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. PMID:25281217

  11. [Comparative organization and the origin of noncoding regulatory RNA genes from X-chromosome inactivation center of human and mouse].

    PubMed

    Kolesnikov, N N; Elisafenko, E A

    2010-10-01

    After the radiation of primates and rodents, the evolution of X-chromosome inactivation centers in human and mouse (XIC/Xic) followed two different directions. Human XIC followed the pathway towards transposon accumulation (the repeat proportion in the center constitutes 72%), especially LINEs, which prevail in the center. On the contrary, mouse Xic eliminated long repeats and accumulated species-specific SINEs (the repeat proportion in the center constitutes 35%). The mechanism underlying inactivation of one of the X chromosomes in female mammals appeared on the basis of transposons. The key gene of the inactivation process, XIST/Xist, similarly to other long noncoding RNA genes, like TSIX/Tsix, JPX/Jpx, and FTX/Ftx, was formed with the involvement of different transposon sequences. Furthermore, two clusters of microRNA genes from the inactivation center originated from L2 [1]. In mouse, one of such clusters has been preserved in the form of microRNA pseudogenes. Thus, long ncRNA genes and microRNAs appeared during the period of transposable elements expansion in this locus, 140 to 105 Myr ago, after the radiation of marsupials and placental mammal lineages.

  12. Function and regulation of AUTS2, a gene implicated in autism and human evolution.

    PubMed

    Oksenberg, Nir; Stevison, Laurie; Wall, Jeffrey D; Ahituv, Nadav

    2013-01-01

    Nucleotide changes in the AUTS2 locus, some of which affect only noncoding regions, are associated with autism and other neurological disorders, including attention deficit hyperactivity disorder, epilepsy, dyslexia, motor delay, language delay, visual impairment, microcephaly, and alcohol consumption. In addition, AUTS2 contains the most significantly accelerated genomic region differentiating humans from Neanderthals, which is primarily composed of noncoding variants. However, the function and regulation of this gene remain largely unknown. To characterize auts2 function, we knocked it down in zebrafish, leading to a smaller head size, neuronal reduction, and decreased mobility. To characterize AUTS2 regulatory elements, we tested sequences for enhancer activity in zebrafish and mice. We identified 23 functional zebrafish enhancers, 10 of which were active in the brain. Our mouse enhancer assays characterized three mouse brain enhancers that overlap an ASD-associated deletion and four mouse enhancers that reside in regions implicated in human evolution, two of which are active in the brain. Combined, our results show that AUTS2 is important for neurodevelopment and expose candidate enhancer sequences in which nucleotide variation could lead to neurological disease and human-specific traits.

  13. AP1 Keeps Chromatin Poised for Action | Center for Cancer Research

    Cancer.gov

    The human genome harbors gene-encoding DNA, the blueprint for building proteins that regulate cellular function. Embedded across the genome, in non-coding regions, are DNA elements to which regulatory factors bind. The interaction of regulatory factors with DNA at these sites modifies gene expression to modulate cell activity. In cells, DNA exists in a complex with proteins called chromatin that compacts the DNA in the nucleus, strongly restricting access to DNA sequences. As a result, regulatory factors only interact with a small subset of their potential binding elements in a given cell to regulate genes. How factors recognize and select sites in chromatin across the genome is not well understood -- but several discoveries in CCR’s Laboratory of Receptor Biology and Gene Expression (LRBGE) have shed light on the mechanisms that direct factors to DNA.

  14. Genetic therapy for the nervous system.

    PubMed

    Bowers, William J; Breakefield, Xandra O; Sena-Esteves, Miguel

    2011-04-15

    Genetic therapy is undergoing a renaissance with expansion of viral and synthetic vectors, use of oligonucleotides (RNA and DNA) and sequence-targeted regulatory molecules, as well as genetically modified cells, including induced pluripotent stem cells from the patients themselves. Several clinical trials for neurologic syndromes appear quite promising. This review covers genetic strategies to ameliorate neurologic syndromes of different etiologies, including lysosomal storage diseases, Alzheimer's disease and other amyloidopathies, Parkinson's disease, spinal muscular atrophy, amyotrophic lateral sclerosis and brain tumors. This field has been propelled by genetic technologies, including identifying disease genes and disruptive mutations, design of genomic interacting elements to regulate transcription and splicing of specific precursor mRNAs and use of novel non-coding regulatory RNAs. These versatile new tools for manipulation of genetic elements provide the ability to tailor the mode of genetic intervention to specific aspects of a disease state.

  15. Negative regulation of early polyomavirus expression in mouse embryonal carcinoma cells.

    PubMed Central

    Cremisi, C; Babinet, C

    1986-01-01

    Embryonal carcinoma cells are resistant to infection by polyomavirus (Py). We showed that this block was partially removed by inhibiting protein synthesis temporarily. The block was also partially removed when Py was coinfected with simian virus 40. Cycloheximide treatment of cells infected with Py mutants able to grow on PCC4 embryonal carcinoma cells led to 3- to 10-fold increases in the production of T-antigen-positive cells. At 31 degrees C, Py T-antigen expression was enhanced when the cells were treated with cycloheximide. We suggest that a negative labile regulatory protein(s) is synthesized in PCC4 cells, preventing the initiation of early Py transcription by binding to the noncoding sequence, especially the enhancer element B and perhaps also element A, and that the Py mutants retained a binding site(s). PMID:3016339

  16. Identification of miRNA from Bouteloua gracilis, a drought tolerant grass, by deep sequencing and their in silico analysis.

    PubMed

    Ordóñez-Baquera, Perla Lucía; González-Rodríguez, Everardo; Aguado-Santacruz, Gerardo Armando; Rascón-Cruz, Quintín; Conesa, Ana; Moreno-Brito, Verónica; Echavarria, Raquel; Dominguez-Viveros, Joel

    2017-02-01

    MicroRNAs (miRNAs) are small non-coding RNA molecules that regulate signal transduction, development, metabolism, and stress responses in plants through post-transcriptional degradation and/or translational repression of target mRNAs. Several studies have addressed the role of miRNAs in model plant species, but miRNA expression and function in economically important forage crops, such as Bouteloua gracilis (Poaceae), a high-quality and drought-resistant grass distributed in semiarid regions of the United States and northern Mexico remain unknown. We applied high-throughput sequencing technology and bioinformatics analysis and identified 31 conserved miRNA families and 53 novel putative miRNAs with different abundance of reads in chlorophyllic cell cultures derived from B. gracilis. Some conserved miRNA families were highly abundant and possessed predicted targets involved in metabolism, plant growth and development, and stress responses. We also predicted additional identified novel miRNAs with specific targets, including B. gracilis ESTs, which were detected under drought stress conditions. Here we report 31 conserved miRNA families and 53 putative novel miRNAs in B. gracilis. Our results suggested the presence of regulatory miRNAs involved in modulating physiological and stress responses in this grass species. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Expression analysis and in silico characterization of intronic long noncoding RNAs in renal cell carcinoma: emerging functional associations

    PubMed Central

    2013-01-01

    Background Intronic and intergenic long noncoding RNAs (lncRNAs) are emerging gene expression regulators. The molecular pathogenesis of renal cell carcinoma (RCC) is still poorly understood, and in particular, limited studies are available for intronic lncRNAs expressed in RCC. Methods Microarray experiments were performed with custom-designed arrays enriched with probes for lncRNAs mapping to intronic genomic regions. Samples from 18 primary RCC tumors and 11 nontumor adjacent matched tissues were analyzed. Meta-analyses were performed with microarray expression data from three additional human tissues (normal liver, prostate tumor and kidney nontumor samples), and with large-scale public data for epigenetic regulatory marks and for evolutionarily conserved sequences. Results A signature of 29 intronic lncRNAs differentially expressed between RCC and nontumor samples was obtained (false discovery rate (FDR) <5%). A signature of 26 intronic lncRNAs significantly correlated with the RCC five-year patient survival outcome was identified (FDR <5%, p-value ≤0.01). We identified 4303 intronic antisense lncRNAs expressed in RCC, of which 22% were significantly (p <0.05) cis correlated with the expression of the mRNA in the same locus across RCC and three other human tissues. Gene Ontology (GO) analysis of those loci pointed to 'regulation of biological processes' as the main enriched category. A module map analysis of the protein-coding genes significantly (p <0.05) trans correlated with the 20% most abundant lncRNAs identified 51 enriched GO terms (p <0.05). We determined that 60% of the expressed lncRNAs are evolutionarily conserved. At the genomic loci containing the intronic RCC-expressed lncRNAs, a strong association (p <0.001) was found between their transcription start sites and genomic marks such as CpG islands, RNA Pol II binding and histone methylation and acetylation. Conclusion Intronic antisense lncRNAs are widely expressed in RCC tumors. Some of them are significantly altered in RCC in comparison with nontumor samples. The majority of these lncRNAs is evolutionarily conserved and possibly modulated by epigenetic modifications. Our data suggest that these RCC lncRNAs may contribute to the complex network of regulatory RNAs playing a role in renal cell malignant transformation. PMID:24238219

  18. The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4).

    PubMed

    Huntemann, Marcel; Ivanova, Natalia N; Mavromatis, Konstantinos; Tripp, H James; Paez-Espino, David; Palaniappan, Krishnaveni; Szeto, Ernest; Pillay, Manoj; Chen, I-Min A; Pati, Amrita; Nielsen, Torben; Markowitz, Victor M; Kyrpides, Nikos C

    2015-01-01

    The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.

  19. Ftx is a non-coding RNA which affects Xist expression and chromatin structure within the X-inactivation center region.

    PubMed

    Chureau, Corinne; Chantalat, Sophie; Romito, Antonio; Galvani, Angélique; Duret, Laurent; Avner, Philip; Rougeulle, Claire

    2011-02-15

    X chromosome inactivation (XCI) is an essential epigenetic process which involves several non-coding RNAs (ncRNAs), including Xist, the master regulator of X-inactivation initiation. Xist is flanked in its 5' region by a large heterochromatic hotspot, which contains several transcription units including a gene of unknown function, Ftx (five prime to Xist). In this article, we describe the characterization and functional analysis of murine Ftx. We present evidence that Ftx produces a conserved functional long ncRNA, and additionally hosts microRNAs (miR) in its introns. Strikingly, Ftx partially escapes X-inactivation and is upregulated specifically in female ES cells at the onset of X-inactivation, an expression profile which closely follows that of Xist. We generated Ftx null ES cells to address the function of this gene. In these cells, only local changes in chromatin marks are detected within the hotspot, indicating that Ftx is not involved in the global maintenance of the heterochromatic structure of this region. The Ftx mutation, however, results in widespread alteration of transcript levels within the X-inactivation center (Xic) and particularly important decreases in Xist RNA levels, which were correlated with increased DNA methylation at the Xist CpG island. Altogether our results indicate that Ftx is a positive regulator of Xist and lead us to propose that Ftx is a novel ncRNA involved in XCI.

  20. Identification of 88 regulatory small RNAs in the TIGR4 strain of the human pathogen Streptococcus pneumoniae

    PubMed Central

    Acebo, Paloma; Martin-Galiano, Antonio J.; Navarro, Sara; Zaballos, Ángel; Amblar, Mónica

    2012-01-01

    Streptococcus pneumoniae is the main etiological agent of community-acquired pneumonia and a major cause of mortality and morbidity among children and the elderly. Genome sequencing of several pneumococcal strains revealed valuable information about the potential proteins and genetic diversity of this prevalent human pathogen. However, little is known about its transcriptional regulation and its small regulatory noncoding RNAs. In this study, we performed deep sequencing of the S. pneumoniae TIGR4 strain RNome to identify small regulatory RNA candidates expressed in this pathogen. We discovered 1047 potential small RNAs including intragenic, 5′- and/or 3′-overlapping RNAs and 88 small RNAs encoded in intergenic regions. With this approach, we recovered many of the previously identified intergenic small RNAs and identified 68 novel candidates, most of which are conserved in both sequence and genomic context in other S. pneumoniae strains. We confirmed the independent expression of 17 intergenic small RNAs and predicted putative mRNA targets for six of them using bioinformatics tools. Preliminary results suggest that one of these six is a key player in the regulation of competence development. This study is the biggest catalog of small noncoding RNAs reported to date in S. pneumoniae and provides a highly complete view of the small RNA network in this pathogen. PMID:22274957
