Sample records for non-coding DNA elements

  1. Exploring the read-write genome: mobile DNA and mammalian adaptation.

    PubMed

    Shapiro, James A

    2017-02-01

    The read-write genome idea predicts that mobile DNA elements will act in evolution to generate adaptive changes in organismal DNA. This prediction was examined in the context of mammalian adaptations involving regulatory non-coding RNAs, viviparous reproduction, early embryonic and stem cell development, the nervous system, and innate immunity. The evidence shows that mobile elements have played specific and sometimes major roles in mammalian adaptive evolution by generating regulatory sites in the DNA and providing interaction motifs in non-coding RNA. Endogenous retroviruses and retrotransposons have been the predominant mobile elements in mammalian adaptive evolution, with the notable exception of bats, where DNA transposons are the major agents of RW genome inscriptions. A few examples of independent but convergent exaptation of mobile DNA elements for similar regulatory rewiring functions are noted.

  2. Genome defense against exogenous nucleic acids in eukaryotes by non-coding DNA occurs through CRISPR-like mechanisms in the cytosol and the bodyguard protection in the nucleus.

    PubMed

    Qiu, Guo-Hua

    2016-01-01

    In this review, the protective function of the abundant non-coding DNA in the eukaryotic genome is discussed from the perspective of genome defense against exogenous nucleic acids. Peripheral non-coding DNA has been proposed to act as a bodyguard that protects the genome and the central protein-coding sequences from ionizing radiation-induced DNA damage. In the proposed mechanism of protection, the radicals generated by water radiolysis in the cytosol and IR energy are absorbed, blocked and/or reduced by peripheral heterochromatin; then, the DNA damage sites in the heterochromatin are removed and expelled from the nucleus to the cytoplasm through nuclear pore complexes, most likely through the formation of extrachromosomal circular DNA. To strengthen this hypothesis, this review summarizes the experimental evidence supporting the protective function of non-coding DNA against exogenous nucleic acids. Based on these data, I hypothesize herein about the presence of an additional line of defense formed by small RNAs in the cytosol in addition to their bodyguard protection mechanism in the nucleus. Therefore, exogenous nucleic acids may be initially inactivated in the cytosol by small RNAs generated from non-coding DNA via mechanisms similar to the prokaryotic CRISPR-Cas system. Exogenous nucleic acids may enter the nucleus, where some are absorbed and/or blocked by heterochromatin and others integrate into chromosomes. The integrated fragments and the sites of DNA damage are removed by repetitive non-coding DNA elements in the heterochromatin and excluded from the nucleus. Therefore, the normal eukaryotic genome and the central protein-coding sequences are triply protected by non-coding DNA against invasion by exogenous nucleic acids. This review provides evidence supporting the protective role of non-coding DNA in genome defense. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Altruistic functions for selfish DNA.

    PubMed

    Faulkner, Geoffrey J; Carninci, Piero

    2009-09-15

    Mammalian genomes are comprised of 30-50% transposed elements (TEs). The vast majority of these TEs are truncated and mutated fragments of retrotransposons that are no longer capable of transposition. Although initially regarded as important factors in the evolution of gene regulatory networks, TEs are now commonly perceived as neutrally evolving and non-functional genomic elements. In a major development, recent works have strongly contradicted this "selfish DNA" or "junk DNA" dogma by demonstrating that TEs use a host of novel promoters to generate RNA on a massive scale across most eukaryotic cells. This transcription frequently functions to control the expression of protein-coding genes via alternative promoters, cis regulatory non protein-coding RNAs and the formation of double stranded short RNAs. If considered in sum, these findings challenge the designation of TEs as selfish and neutrally evolving genomic elements. Here, we will expand upon these themes and discuss challenges in establishing novel TE functions in vivo.

  4. Interplay between DNA methylation, histone modification and chromatin remodeling in stem cells and during development.

    PubMed

    Ikegami, Kohta; Ohgane, Jun; Tanaka, Satoshi; Yagi, Shintaro; Shiota, Kunio

    2009-01-01

    Genes constitute only a small proportion of the mammalian genome, the majority of which is composed of non-genic repetitive elements including interspersed repeats and satellites. A unique feature of the mammalian genome is that there are numerous tissue-dependent, differentially methylated regions (T-DMRs) in the non-repetitive sequences, which include genes and their regulatory elements. The epigenetic status of T-DMRs varies from that of repetitive elements and constitutes the DNA methylation profile genome-wide. Since the DNA methylation profile is specific to each cell and tissue type, much like a fingerprint, it can be used as a means of identification. The formation of DNA methylation profiles is the basis for cell differentiation and development in mammals. The epigenetic status of each T-DMR is regulated by the interplay between DNA methyltransferases, histone modification enzymes, histone subtypes, non-histone nuclear proteins and non-coding RNAs. In this review, we will discuss how these epigenetic factors cooperate to establish cell- and tissue-specific DNA methylation profiles.

  5. Genome-scale deletion screening of human long non-coding RNAs using a paired-guide RNA CRISPR library

    PubMed Central

    Zhu, Shiyou; Li, Wei; Liu, Jingze; Chen, Chen-Hao; Liao, Qi; Xu, Ping; Xu, Han; Xiao, Tengfei; Cao, Zhongzheng; Peng, Jingyu; Yuan, Pengfei; Brown, Myles; Liu, Xiaole Shirley; Wei, Wensheng

    2017-01-01

    CRISPR/Cas9 screens have been widely adopted to analyse coding gene functions, but high throughput screening of non-coding elements using this method is more challenging, because indels caused by a single cut in non-coding regions are unlikely to produce a functional knockout. A high-throughput method to produce deletions of non-coding DNA is needed. Herein, we report a high throughput genomic deletion strategy to screen for functional long non-coding RNAs (lncRNAs) that is based on a lentiviral paired-guide RNA (pgRNA) library. Applying our screening method, we identified 51 lncRNAs that can positively or negatively regulate human cancer cell growth. We individually validated 9 lncRNAs using CRISPR/Cas9-mediated genomic deletion and functional rescue, CRISPR activation or inhibition, and gene expression profiling. Our high-throughput pgRNA genome deletion method should enable rapid identification of functional mammalian non-coding elements. PMID:27798563

  6. Junk DNA and the long non-coding RNA twist in cancer genetics

    PubMed Central

    Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

    2015-01-01

    The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839

  8. Long non-coding RNA produced by RNA polymerase V determines boundaries of heterochromatin

    PubMed Central

    Böhmdorfer, Gudrun; Sethuraman, Shriya; Rowley, M Jordan; Krzyszton, Michal; Rothi, M Hafiz; Bouzit, Lilia; Wierzbicki, Andrzej T

    2016-01-01

    RNA-mediated transcriptional gene silencing is a conserved process where small RNAs target transposons and other sequences for repression by establishing chromatin modifications. Central to this process are long non-coding RNAs (lncRNAs), which in Arabidopsis thaliana are produced by a specialized RNA polymerase known as Pol V. Here we show that non-coding transcription by Pol V is controlled by preexisting chromatin modifications located within the transcribed regions. Most Pol V transcripts are associated with AGO4 but are not sliced by AGO4. Pol V-dependent DNA methylation is established on both strands of DNA and is tightly restricted to Pol V-transcribed regions. This indicates that chromatin modifications are established in close proximity to Pol V. Finally, Pol V transcription is preferentially enriched on edges of silenced transposable elements, where Pol V transcribes into TEs. We propose that Pol V may play an important role in the determination of heterochromatin boundaries. DOI: http://dx.doi.org/10.7554/eLife.19092.001 PMID:27779094

  9. Highly conserved elements discovered in vertebrates are present in non-syntenic loci of tunicates, act as enhancers and can be transcribed during development

    PubMed Central

    Sanges, Remo; Hadzhiev, Yavor; Gueroult-Bellone, Marion; Roure, Agnes; Ferg, Marco; Meola, Nicola; Amore, Gabriele; Basu, Swaraj; Brown, Euan R.; De Simone, Marco; Petrera, Francesca; Licastro, Danilo; Strähle, Uwe; Banfi, Sandro; Lemaire, Patrick; Birney, Ewan; Müller, Ferenc; Stupka, Elia

    2013-01-01

    Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remains elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis-regulatory modules affirmed during chordate evolution, we have searched for conserved elements in tunicate and vertebrate (Olfactores) genomes. We identified, for the first time, 183 non-coding sequences that are highly conserved between the two groups. Our results show that all but one element are conserved in non-syntenic regions between vertebrate and tunicate genomes, while being syntenic among vertebrates. Nevertheless, in all the groups, they are significantly associated with transcription factors showing specific functions fundamental to animal development, such as multicellular organism development and sequence-specific DNA binding. The majority of these regions map onto ultraconserved elements and we demonstrate that they can act as functional enhancers within the organism of origin, as well as in cross-transgenesis experiments, and that they are transcribed in extant species of Olfactores. We refer to the elements as ‘Olfactores conserved non-coding elements’. PMID:23393190

  10. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.
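
    The conservation cutoff quoted in the abstract above (>100 bp, >70 percent identity in a human-mouse comparison) can be illustrated with a minimal sketch. It assumes the two sequences are already aligned and gap-padded to equal length, and it uses a fixed-size sliding window; the function names `percent_identity` and `conserved_windows` are hypothetical and are not taken from the study, whose actual procedure is not described here.

```python
def percent_identity(aln_a, aln_b):
    """Percent of alignment columns where both sequences carry the same base.

    Both inputs are rows of one pairwise alignment, so they must have equal
    length; '-' marks a gap and never counts as a match.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if not aln_a:
        return 0.0
    matches = sum(1 for a, b in zip(aln_a, aln_b) if a == b and a != "-")
    return 100.0 * matches / len(aln_a)


def conserved_windows(aln_a, aln_b, window=100, min_identity=70.0):
    """Return start offsets of windows whose identity exceeds the cutoff.

    The defaults mirror the thresholds quoted in the abstract; the fixed
    window size is an assumption made for this illustration.
    """
    starts = []
    for start in range(len(aln_a) - window + 1):
        a = aln_a[start:start + window]
        b = aln_b[start:start + window]
        if percent_identity(a, b) > min_identity:
            starts.append(start)
    return starts
```

    With a toy alignment and a 4-column window, `conserved_windows("ACGTACGT", "ACGTTTTT", window=4)` reports only the two leading windows, where at least three of four columns match.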

  11. Transposable elements at the center of the crossroads between embryogenesis, embryonic stem cells, reprogramming, and long non-coding RNAs.

    PubMed

    Hutchins, Andrew Paul; Pei, Duanqing

    Transposable elements (TEs) are mobile genomic sequences of DNA capable of autonomous and non-autonomous duplication. TEs have been highly successful, and nearly half of the human genome now consists of various families of TEs. Originally thought to be non-functional, these elements have been co-opted by animal genomes to perform a variety of physiological functions ranging from TE-derived proteins acting directly in normal biological functions, to innovations in transcription factor logic and influence on epigenetic control of gene expression. During embryonic development, when the genome is epigenetically reprogrammed and DNA-demethylated, TEs are released from repression and show embryonic stage-specific expression, and in human and mouse embryos, intact TE-derived endogenous viral particles can even be detected. A similar process occurs during the reprogramming of somatic cells to pluripotent cells: When the somatic DNA is demethylated, TEs are released from repression. In embryonic stem cells (ESCs), where DNA is hypomethylated, an elaborate system of epigenetic control is employed to suppress TEs, a system that often overlaps with normal epigenetic control of ESC gene expression. Finally, many long non-coding RNAs (lncRNAs) involved in normal ESC function and those assisting or impairing reprogramming contain multiple TEs in their RNA. These TEs may act as regulatory units to recruit RNA-binding proteins and epigenetic modifiers. This review covers how TEs are interlinked with the epigenetic machinery and lncRNAs, and how these links influence each other to modulate aspects of ESCs, embryogenesis, and somatic cell reprogramming.

  12. Living Organisms Author Their Read-Write Genomes in Evolution.

    PubMed

    Shapiro, James A

    2017-12-06

    Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with "non-coding" DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called "non-coding" RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations.

  13. AP1 Keeps Chromatin Poised for Action | Center for Cancer Research

    Cancer.gov

    The human genome harbors gene-encoding DNA, the blueprint for building proteins that regulate cellular function. Embedded across the genome, in non-coding regions, are DNA elements to which regulatory factors bind. The interaction of regulatory factors with DNA at these sites modifies gene expression to modulate cell activity. In cells, DNA exists in a complex with proteins called chromatin that compacts the DNA in the nucleus, strongly restricting access to DNA sequences. As a result, regulatory factors only interact with a small subset of their potential binding elements in a given cell to regulate genes. How factors recognize and select sites in chromatin across the genome is not well understood -- but several discoveries in CCR’s Laboratory of Receptor Biology and Gene Expression (LRBGE) have shed light on the mechanisms that direct factors to DNA.

  14. Open chromatin reveals the functional maize genome

    USDA-ARS's Scientific Manuscript database

    Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...

  15. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
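
    The detrended fluctuation analysis (DFA) named in the abstract above can be sketched as follows. This is a minimal first-order DFA under stated assumptions, not the authors' implementation: the purine/pyrimidine mapping in `dna_walk`, the non-overlapping windows, and the function names are all choices made for this illustration.

```python
import numpy as np


def dna_walk(seq):
    """Map bases to steps: +1 for purines (A, G), -1 for everything else.

    The purine/pyrimidine rule and its sign convention are assumptions.
    """
    purines = {"A", "G"}
    return np.array([1.0 if base in purines else -1.0 for base in seq.upper()])


def dfa(steps, window_sizes):
    """Return the detrended fluctuation F(n) for each window size n."""
    # Integrate the mean-centred steps to obtain the walk profile.
    profile = np.cumsum(steps - np.mean(steps))
    fluctuations = []
    for n in window_sizes:
        x = np.arange(n)
        squared_residuals = []
        # Split the profile into non-overlapping windows of length n.
        for w in range(len(profile) // n):
            segment = profile[w * n:(w + 1) * n]
            # Remove the local linear trend fitted by least squares.
            slope, intercept = np.polyfit(x, segment, 1)
            residual = segment - (slope * x + intercept)
            squared_residuals.append(np.mean(residual ** 2))
        fluctuations.append(np.sqrt(np.mean(squared_residuals)))
    return np.array(fluctuations)


def scaling_exponent(steps, window_sizes):
    """Slope of log F(n) against log n."""
    f = dfa(steps, window_sizes)
    alpha, _ = np.polyfit(np.log(window_sizes), np.log(f), 1)
    return alpha
```

    For an uncorrelated sequence the exponent is expected to lie near 0.5; values clearly above 0.5 across a wide range of window sizes are the signature of long-range correlation that the abstract reports for non-coding regions.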

  16. GATA: A graphic alignment tool for comparative sequence analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nix, David A.; Eisen, Michael B.

    2005-01-01

    Several problems exist with current methods used to align DNA sequences for comparative sequence analysis. Most dynamic programming algorithms assume that conserved sequence elements are collinear. This assumption appears valid when comparing orthologous protein coding sequences. Functional constraints on proteins provide strong selective pressure against sequence inversions, and minimize sequence duplications and feature shuffling. For non-coding sequences this collinearity assumption is often invalid. For example, enhancers contain clusters of transcription factor binding sites that change in number, orientation, and spacing during evolution yet the enhancer retains its activity. Dotplot analysis is often used to estimate non-coding sequence relatedness. Yet dot plots do not actually align sequences and thus cannot account well for base insertions or deletions. Moreover, they lack an adequate statistical framework for comparing sequence relatedness and are limited to pairwise comparisons. Lastly, dot plots and dynamic programming text outputs fail to provide an intuitive means for visualizing DNA alignments.
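
    To make the dot plot limitation described above concrete, here is a minimal word-match dot plot. It is an illustrative sketch only and is unrelated to the GATA tool itself; the function names `dot_plot` and `render` and the default word size are assumptions.

```python
def dot_plot(seq_a, seq_b, word=3):
    """Return the set of (i, j) pairs where identical words of length
    `word` start at position i in seq_a and position j in seq_b."""
    # Index every word of seq_b by its start positions.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    hits = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            hits.add((i, j))
    return hits


def render(seq_a, seq_b, hits):
    """Draw the plot as text: one row per seq_a position, '*' for a hit."""
    rows = []
    for i in range(len(seq_a)):
        rows.append("".join("*" if (i, j) in hits else "."
                            for j in range(len(seq_b))))
    return "\n".join(rows)
```

    Comparing `"ACGTTT"` with `"ACGATTT"` gives hits at (0, 0) and (3, 4): the single inserted base shifts the diagonal by one column, but the plot only shows the shift and does not produce an alignment across it.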

  17. Transposable elements and G-quadruplexes.

    PubMed

    Kejnovsky, Eduard; Tokan, Viktor; Lexa, Matej

    2015-09-01

    A significant part of eukaryotic genomes is formed by transposable elements (TEs) containing not only genes but also regulatory sequences. Some of the regulatory sequences located within TEs can form secondary structures like hairpins or three-stranded (triplex DNA) and four-stranded (quadruplex DNA) conformations. This review focuses on recent evidence showing that G-quadruplex-forming sequences in particular are often present in specific parts of TEs in plants and humans. We discuss the potential role of these structures in the TE life cycle as well as the impact of G-quadruplexes on replication, transcription, translation, chromatin status, and recombination. The aim of this review is to emphasize that TEs may serve as vehicles for the genomic spread of G-quadruplexes. These non-canonical DNA structures and their conformational switches may constitute another regulatory system that, together with small and long non-coding RNA molecules and proteins, contribute to the complex cellular network resulting in the large diversity of eukaryotes.
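
    As a rough illustration of what a "G-quadruplex-forming sequence" can look like at the sequence level, the sketch below scans for four runs of at least three guanines separated by loops of one to seven bases. This pattern is a commonly used heuristic and is an assumption here; the abstract above does not state which search method the reviewed studies used, and the names `G4_PATTERN` and `find_g4_candidates` are hypothetical.

```python
import re

# Heuristic motif (an assumption): four G-runs of >= 3 bases,
# separated by loops of 1 to 7 bases.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4_candidates(seq):
    """Return (start, end, text) for each non-overlapping candidate motif
    on the given strand; the complementary strand is not scanned."""
    return [(m.start(), m.end(), m.group())
            for m in G4_PATTERN.finditer(seq.upper())]
```

    A sequence-level hit is only a candidate: whether a quadruplex actually forms depends on conditions that this pattern cannot capture.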

  18. Satellite DNA Modulates Gene Expression in the Beetle Tribolium castaneum after Heat Stress

    PubMed Central

    Feliciello, Isidoro; Akrap, Ivana; Ugarković, Đurđica

    2015-01-01

    Non-coding repetitive DNAs have been proposed to perform a gene regulatory role; however, for tandemly repeated satellite DNA no such role has been defined until now. Here we provide the first evidence for a role of satellite DNA in the modulation of gene expression under specific environmental conditions. The major satellite DNA TCAST1 in the beetle Tribolium castaneum is preferentially located within pericentromeric heterochromatin but is also dispersed as single repeats or short arrays in the vicinity of protein-coding genes within euchromatin. Our results show enhanced suppression of activity of TCAST1-associated genes and slower recovery of their activity after long-term heat stress relative to the same genes without associated TCAST1 satellite DNA elements. The level of gene suppression is not influenced by the distance of TCAST1 elements from the associated genes up to 40 kb from the genes’ transcription start sites, but it does depend on the copy number of TCAST1 repeats within an element, being stronger for the higher number of copies. The enhanced gene suppression correlates with the enrichment of the repressive histone marks H3K9me2/3 at dispersed TCAST1 elements and their flanking regions as well as with increased expression of TCAST1 satellite DNA. The results reveal transient, RNAi based heterochromatin formation at dispersed TCAST1 repeats and their proximal regions as a mechanism responsible for enhanced silencing of TCAST1-associated genes. Differences in the pattern of distribution of TCAST1 elements contribute to gene expression diversity among T. castaneum strains after long-term heat stress and might have an impact on adaptation to different environmental conditions. PMID:26275223

  19. Identification of a Conserved Non-Protein-Coding Genomic Element that Plays an Essential Role in Alphabaculovirus Pathogenesis

    PubMed Central

    Kikhno, Irina

    2014-01-01

    Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153

  20. Stress induced gene expression drives transient DNA methylation changes at adjacent repetitive elements.

    PubMed

    Secco, David; Wang, Chuang; Shou, Huixia; Schultz, Matthew D; Chiarenza, Serge; Nussaume, Laurent; Ecker, Joseph R; Whelan, James; Lister, Ryan

    2015-07-21

    Cytosine DNA methylation (mC) is a genome modification that can regulate the expression of coding and non-coding genetic elements. However, little is known about the involvement of mC in response to environmental cues. Using whole genome bisulfite sequencing to assess the spatio-temporal dynamics of mC in rice grown under phosphate starvation and recovery conditions, we identified widespread phosphate starvation-induced changes in mC, preferentially localized in transposable elements (TEs) close to highly induced genes. These changes in mC occurred after changes in nearby gene transcription, were mostly DCL3a-independent, and could partially be propagated through mitosis, however no evidence of meiotic transmission was observed. Similar analyses performed in Arabidopsis revealed a very limited effect of phosphate starvation on mC, suggesting a species-specific mechanism. Overall, this suggests that TEs in proximity to environmentally induced genes are silenced via hypermethylation, and establishes the temporal hierarchy of transcriptional and epigenomic changes in response to stress.

  1. DNA rearrangements directed by non-coding RNAs in ciliates

    PubMed Central

    Mochizuki, Kazufumi

    2013-01-01

    Extensive programmed rearrangement of DNA, including DNA elimination, chromosome fragmentation, and DNA descrambling, takes place in the newly developed macronucleus during the sexual reproduction of ciliated protozoa. Recent studies have revealed that two distant classes of ciliates use distinct types of non-coding RNAs to regulate such DNA rearrangement events. DNA elimination in Tetrahymena is regulated by small non-coding RNAs that are produced and utilized in an RNAi-related process. It has been proposed that the small RNAs produced from the micronuclear genome are used to identify eliminated DNA sequences by whole-genome comparison between the parental macronucleus and the micronucleus. In contrast, DNA descrambling in Oxytricha is guided by long non-coding RNAs that are produced from the parental macronuclear genome. These long RNAs are proposed to act as templates for the direct descrambling events that occur in the developing macronucleus. Both cases provide useful examples to study epigenetic chromatin regulation by non-coding RNAs. PMID:21956937

  2. A family of long intergenic non-coding RNA genes in human chromosomal region 22q11.2 carry a DNA translocation breakpoint/AT-rich sequence

    PubMed Central

    2018-01-01

    FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13), is a member of the lincRNA gene family termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence duplicated in chr22, and are specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes. PMID:29668722

  3. Hundreds of conserved non-coding genomic regions are independently lost in mammals

    PubMed Central

    Hiller, Michael; Schaar, Bruce T.; Bejerano, Gill

    2012-01-01

    Conserved non-protein-coding DNA elements (CNEs) often encode cis-regulatory elements and are rarely lost during evolution. However, CNE losses that do occur can be associated with phenotypic changes, exemplified by pelvic spine loss in sticklebacks. Using a computational strategy to detect complete loss of CNEs in mammalian genomes while strictly controlling for artifacts, we find >600 CNEs that are independently lost in at least two mammalian lineages, including a spinal cord enhancer near GDF11. We observed several genomic regions where multiple independent CNE loss events happened; the most extreme is the DIAPH2 locus. We show that CNE losses often involve deletions and that CNE loss frequencies are non-uniform. Similar to less pleiotropic enhancers, we find that independently lost CNEs are shorter, slightly less constrained and evolutionarily younger than CNEs without detected losses. This suggests that independently lost CNEs are less pleiotropic and that pleiotropic constraints contribute to non-uniform CNE loss frequencies. We also detected 35 CNEs that are independently lost in the human lineage and in other mammals. Our study uncovers an interesting aspect of the evolution of functional DNA in mammalian genomes. Experiments are necessary to test if these independently lost CNEs are associated with parallel phenotype changes in mammals. PMID:23042682

  4. Crucial steps to life: From chemical reactions to code using agents.

    PubMed

    Witzany, Guenther

    2016-02-01

    The concepts of the origin of the genetic code and the definitions of life changed dramatically after the RNA world hypothesis. Main narratives in molecular biology and genetics such as the "central dogma," "one gene one protein" and "non-coding DNA is junk" have since been falsified. RNA moved from the transition intermediate molecule into centre stage. Additionally, the abundance of empirical data concerning non-random genetic change operators such as the variety of mobile genetic elements, persistent viruses and defectives does not fit with the dominant narrative of error replication events (mutations) as being the main driving forces creating genetic novelty and diversity. The reductionistic and mechanistic views on physico-chemical properties of the genetic code are no longer convincing as appropriate descriptions of the abundance of non-random genetic content operators which are active in natural genetic engineering and natural genome editing. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  5. A novel non prophage(-like) gene-intervening element within gerE that is reconstituted during sporulation in Bacillus cereus ATCC10987.

    PubMed

    Abe, Kimihiro; Shimizu, Shin-Ya; Tsuda, Shuhei; Sato, Tsutomu

    2017-09-12

    Gene rearrangement is a widely-shared phenomenon in spore forming bacteria, in which prophage(-like) elements interrupting sporulation-specific genes are excised from the host genome to reconstitute the intact gene. Here, we report a novel class of gene-intervening elements, named gin, inserted in the 225 bp gerE-coding region of the B. cereus ATCC10987 genome, which generates a sporulation-specific rearrangement. gin has no phage-related genes and possesses three site-specific recombinase genes; girA, girB, and girC. We demonstrated that the gerE rearrangement occurs at the middle stage of sporulation, in which site-specific DNA recombination took place within the 9 bp consensus sequence flanking the disrupted gerE segments. Deletion analysis of gin uncovered that GirC and an additional factor, GirX, are responsible for gerE reconstitution. Involvement of GirC and GirX in DNA recombination was confirmed by an in vitro recombination assay. These results broaden the definition of the sporulation-specific gene rearrangement phenomenon: gene-intervening elements are not limited to phage DNA but may include non-viral genetic elements that carry a developmentally-regulated site-specific recombination system.

  6. Stress induced gene expression drives transient DNA methylation changes at adjacent repetitive elements

    PubMed Central

    Secco, David; Wang, Chuang; Shou, Huixia; Schultz, Matthew D; Chiarenza, Serge; Nussaume, Laurent; Ecker, Joseph R; Whelan, James; Lister, Ryan

    2015-01-01

    Cytosine DNA methylation (mC) is a genome modification that can regulate the expression of coding and non-coding genetic elements. However, little is known about the involvement of mC in response to environmental cues. Using whole genome bisulfite sequencing to assess the spatio-temporal dynamics of mC in rice grown under phosphate starvation and recovery conditions, we identified widespread phosphate starvation-induced changes in mC, preferentially localized in transposable elements (TEs) close to highly induced genes. These changes in mC occurred after changes in nearby gene transcription, were mostly DCL3a-independent, and could partially be propagated through mitosis, however no evidence of meiotic transmission was observed. Similar analyses performed in Arabidopsis revealed a very limited effect of phosphate starvation on mC, suggesting a species-specific mechanism. Overall, this suggests that TEs in proximity to environmentally induced genes are silenced via hypermethylation, and establishes the temporal hierarchy of transcriptional and epigenomic changes in response to stress. DOI: http://dx.doi.org/10.7554/eLife.09343.001 PMID:26196146

  7. Plasticity of DNA methylation and gene expression under zinc deficiency in Arabidopsis roots.

    PubMed

    Chen, Xiaochao; Schönberger, Brigitte; Menz, Jochen; Ludewig, Uwe

    2018-05-25

    DNA methylation is a heritable chromatin modification that maintains chromosome stability, regulates transposon silencing and appears to be involved in gene expression in response to environmental conditions. Environmental stress alters DNA methylation patterns that are correlated with gene expression differences. Here, genome-wide differential DNA-methylation was identified upon prolonged Zn deficiency, leading to hypo- and hyper-methylated chromosomal regions. Preferential CpG methylation changes occurred in gene promoters and gene bodies, but did not overlap with transcriptional start sites. Methylation changes were also prominent in transposable elements. By contrast, non-CG methylation differences were exclusively found in promoters of protein coding genes and in transposable elements. Strongly Zn deficiency-induced genes and their promoters were mostly non-methylated, irrespective of Zn supply. Differential DNA methylation in the CpG and CHG, but not in the CHH context, was found close to a few up-regulated Zn-deficiency genes. However, the transcriptional Zn-deficiency response in roots appeared little correlated with associated DNA methylation changes in promoters or gene bodies. Furthermore, under Zn deficiency, developmental defects were identified in an Arabidopsis mutant lacking non-CpG methylation. The root methylome thus responds specifically to a micro-nutrient deficiency and is important for efficient Zn utilization at low availability, but the relationship of differential methylation and differentially expressed genes is surprisingly poor.

  8. The mitochondrial genome of Malus domestica and the import-driven hypothesis of mitochondrial genome expansion in seed plants.

    PubMed

    Goremykin, Vadim V; Lockhart, Peter J; Viola, Roberto; Velasco, Riccardo

    2012-08-01

    Mitochondrial genomes of spermatophytes are the largest of all organellar genomes. Their large size has been attributed to various factors; however, the relative contribution of these factors to mitochondrial DNA (mtDNA) expansion remains undetermined. We estimated their relative contribution in Malus domestica (apple). The mitochondrial genome of apple has a size of 396 947 bp and a one to nine ratio of coding to non-coding DNA, close to the corresponding average values for angiosperms. We determined that 71.5% of the apple mtDNA sequence was highly similar to sequences of its nuclear DNA. Using nuclear gene exons, nuclear transposable elements and chloroplast DNA as markers of promiscuous DNA content in mtDNA, we estimated that approximately 20% of the apple mtDNA consisted of DNA sequences imported from other cell compartments, mostly from the nucleus. Similar marker-based estimates of promiscuous DNA content in the mitochondrial genomes of other species ranged between 21.2 and 25.3% of the total mtDNA length for grape, between 23.1 and 38.6% for rice, and between 47.1 and 78.4% for maize. All these estimates are conservative, because they underestimate the import of non-functional DNA. We propose that the import of promiscuous DNA is a core mechanism for mtDNA size expansion in seed plants. In apple, maize and grape this mechanism contributed far more to genome expansion than did homologous recombination. In rice the estimated contribution of both mechanisms was found to be similar. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.

  9. Functional interrogation of non-coding DNA through CRISPR genome editing

    PubMed Central

    Canver, Matthew C.; Bauer, Daniel E.; Orkin, Stuart H.

    2017-01-01

    Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. PMID:28288828

  10. Resurrection of DNA Function In Vivo from an Extinct Genome

    PubMed Central

    Pask, Andrew J.; Behringer, Richard R.; Renfree, Marilyn B.

    2008-01-01

    There is a burgeoning repository of information available from ancient DNA that can be used to understand how genomes have evolved and to determine the genetic features that defined a particular species. To assess the functional consequences of changes to a genome, a variety of methods are needed to examine extinct DNA function. We isolated a transcriptional enhancer element from the genome of an extinct marsupial, the Tasmanian tiger (Thylacinus cynocephalus or thylacine), obtained from 100 year-old ethanol-fixed tissues from museum collections. We then examined the function of the enhancer in vivo. Using a transgenic approach, it was possible to resurrect DNA function in transgenic mice. The results demonstrate that the thylacine Col2A1 enhancer directed chondrocyte-specific expression in this extinct mammalian species in the same way as its orthologue does in mice. While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity. PMID:18493600

  11. DNA topoisomerase 1α promotes transcriptional silencing of transposable elements through DNA methylation and histone lysine 9 dimethylation in Arabidopsis.

    PubMed

    Dinh, Thanh Theresa; Gao, Lei; Liu, Xigang; Li, Dongming; Li, Shengben; Zhao, Yuanyuan; O'Leary, Michael; Le, Brandon; Schmitz, Robert J; Manavella, Pablo A; Manavella, Pablo; Li, Shaofang; Weigel, Detlef; Pontes, Olga; Ecker, Joseph R; Chen, Xuemei

    2014-07-01

    RNA-directed DNA methylation (RdDM) and histone H3 lysine 9 dimethylation (H3K9me2) are related transcriptional silencing mechanisms that target transposable elements (TEs) and repeats to maintain genome stability in plants. RdDM is mediated by small and long noncoding RNAs produced by the plant-specific RNA polymerases Pol IV and Pol V, respectively. Through a chemical genetics screen with a luciferase-based DNA methylation reporter, LUCL, we found that camptothecin, a compound with anti-cancer properties that targets DNA topoisomerase 1α (TOP1α) was able to de-repress LUCL by reducing its DNA methylation and H3K9me2 levels. Further studies with Arabidopsis top1α mutants showed that TOP1α silences endogenous RdDM loci by facilitating the production of Pol V-dependent long non-coding RNAs, AGONAUTE4 recruitment and H3K9me2 deposition at TEs and repeats. This study assigned a new role in epigenetic silencing to an enzyme that affects DNA topology.

  12. Functional interrogation of non-coding DNA through CRISPR genome editing.

    PubMed

    Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H

    2017-05-15

    Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Decoding the non-coding genome: elucidating genetic risk outside the coding genome.

    PubMed

    Barr, C L; Misener, V L

    2016-01-01

    Current evidence emerging from genome-wide association studies indicates that the genetic underpinnings of complex traits are likely attributable to genetic variation that changes gene expression, rather than (or in combination with) variation that changes protein-coding sequences. This is particularly compelling with respect to psychiatric disorders, as genetic changes in regulatory regions may result in differential transcriptional responses to developmental cues and environmental/psychosocial stressors. Until recently, however, the link between transcriptional regulation and psychiatric genetic risk has been understudied. Multiple obstacles have contributed to the paucity of research in this area, including challenges in identifying the positions of remote (distal from the promoter) regulatory elements (e.g. enhancers) and their target genes and the underrepresentation of neural cell types and brain tissues in epigenome projects - the availability of high-quality brain tissues for epigenetic and transcriptome profiling, particularly for the adolescent and developing brain, has been limited. Further challenges have arisen in the prediction and testing of the functional impact of DNA variation with respect to multiple aspects of transcriptional control, including regulatory-element interaction (e.g. between enhancers and promoters), transcription factor binding and DNA methylation. Further, the brain has uncommon DNA-methylation marks with unique genomic distributions not found in other tissues - current evidence suggests the involvement of non-CG methylation and 5-hydroxymethylation in neurodevelopmental processes but much remains unknown. We review here knowledge gaps as well as both technological and resource obstacles that will need to be overcome in order to elucidate the involvement of brain-relevant gene-regulatory variants in genetic risk for psychiatric disorders. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.

  14. Detecting the borders between coding and non-coding DNA regions in prokaryotes based on recursive segmentation and nucleotide doublets statistics

    PubMed Central

    2012-01-01

    Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in a genome sequence. However, the accuracy of previous entropy-segmentation methods for finding these borders still needs to be improved. Methods: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to obtain preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. Results: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences. Conclusions: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For three bacterial genomes, compared with the A12_JR method, our method raised the accuracy of finding the borders between protein-coding and non-coding regions in DNA sequences. PMID:23282225
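
    The recursive entropic segmentation idea summarized in this record can be illustrated with a much simpler stand-in. The sketch below is not the authors' method: it swaps in plain Jensen-Shannon divergence over the symbols of the input string (rather than Jensen-Rényi divergence over the 22-symbol alphabet), and it accepts a cut whenever the divergence exceeds a fixed cutoff instead of applying a significance test. The names best_cut and segment, and the parameters threshold and min_len, are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(counts, total):
    """Shannon entropy in bits of a table of symbol counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def best_cut(seq):
    """Return (position, divergence) for the single cut that maximizes the
    Jensen-Shannon divergence between the left and right subsequences.
    Position is None when no cut has positive divergence."""
    n = len(seq)
    total_counts = Counter(seq)
    h_all = shannon_entropy(total_counts, n)
    left = Counter()
    best_pos, best_div = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total_counts - left  # Counter subtraction keeps positive counts only
        div = (h_all
               - (i / n) * shannon_entropy(left, i)
               - ((n - i) / n) * shannon_entropy(right, n - i))
        if div > best_div:
            best_pos, best_div = i, div
    return best_pos, best_div


def segment(seq, offset=0, threshold=0.2, min_len=10):
    """Recursively split seq and return the sorted cut positions.
    threshold (bits) and min_len are illustrative assumptions, not values
    taken from the abstract."""
    if len(seq) < 2 * min_len:
        return []
    pos, div = best_cut(seq)
    if pos is None or div < threshold or pos < min_len or len(seq) - pos < min_len:
        return []
    return (segment(seq[:pos], offset, threshold, min_len)
            + [offset + pos]
            + segment(seq[pos:], offset + pos, threshold, min_len))
```

    On a toy sequence built from homogeneous blocks, for example "A" * 40 + "C" * 40 + "G" * 40, segment returns the block boundaries. Real genomes would need the richer alphabet and a statistical stopping rule of the kind the abstract describes.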

  15. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis.

    PubMed

    Spangler, Jacob B; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression.

  16. Conserved Non-Coding Sequences are Associated with Rates of mRNA Decay in Arabidopsis

    PubMed Central

    Spangler, Jacob B.; Feltus, Frank Alex

    2013-01-01

    Steady-state mRNA levels are tightly regulated through a combination of transcriptional and post-transcriptional control mechanisms. The discovery of cis-acting DNA elements that encode these control mechanisms is of high importance. We have investigated the influence of conserved non-coding sequences (CNSs), DNA patterns retained after an ancient whole genome duplication event, on the breadth of gene expression and the rates of mRNA decay in Arabidopsis thaliana. The absence of CNSs near α duplicate genes was associated with a decrease in breadth of gene expression and slower mRNA decay rates while the presence of CNSs near α duplicates was associated with an increase in breadth of gene expression and faster mRNA decay rates. The observed difference in mRNA decay rate was fastest in genes with CNSs in both non-transcribed and transcribed regions, albeit through an unknown mechanism. This study supports the notion that some Arabidopsis CNSs regulate the steady-state mRNA levels through post-transcriptional control mechanisms and that CNSs also play a role in controlling the breadth of gene expression. PMID:23675377

  17. Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing

    DTIC Science & Technology

    2008-01-01

    complements of one another and the DNA duplex formed is a Watson-Crick (WC) duplex. However, there are many instances when the formation of non-WC...that the user’s requirements for probe selection are met based on the Watson-Crick probe locality within a target. The second type, called...AFRL-RI-RS-TR-2007-288 Final Technical Report January 2008 SUPERIMPOSED CODE THEORETIC ANALYSIS OF DNA CODES AND DNA COMPUTING

  18. The Genome of the Western Clawed Frog Xenopus tropicalis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hellsten, Uffe; Harland, Richard M.; Gilchrist, Michael J.

    2009-10-01

    The western clawed frog Xenopus tropicalis is an important model for vertebrate development that combines experimental advantages of the African clawed frog Xenopus laevis with more tractable genetics. Here we present a draft genome sequence assembly of X. tropicalis. This genome encodes over 20,000 protein-coding genes, including orthologs of at least 1,700 human disease genes. Over a million expressed sequence tags validated the annotation. More than one-third of the genome consists of transposable elements, with unusually prevalent DNA transposons. Like other tetrapods, the genome contains gene deserts enriched for conserved non-coding elements. The genome exhibits remarkable shared synteny with human and chicken over major parts of large chromosomes, broken by lineage-specific chromosome fusions and fissions, mainly in the mammalian lineage.

  19. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

  20. Palindromic repetitive DNA elements with coding potential in Methanocaldococcus jannaschii.

    PubMed

    Suyama, Mikita; Lathe, Warren C; Bork, Peer

    2005-10-10

    We have identified 141 novel palindromic repetitive elements in the genome of the euryarchaeon Methanocaldococcus jannaschii. The total length of these elements is 14.3 kb, which corresponds to 0.9% of the total genomic sequence and 6.3% of all extragenic regions. The elements can be divided into three groups (MJRE1-3) based on sequence similarity. The low sequence identity within each of the groups suggests a rather old origin of these elements in M. jannaschii. Three MJRE2 elements were located within protein-coding regions without disrupting the coding potential of the host genes, indicating that insertion of repeats might be a widespread mechanism to enhance sequence diversity in coding regions.

  1. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    PubMed

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. Among the more important structures in a DNA sequence are those that are repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.

  2. Coding of DNA samples and data in the pharmaceutical industry: current practices and future directions--perspective of the I-PWG.

    PubMed

    Franc, M A; Cohen, N; Warner, A W; Shaw, P M; Groenen, P; Snapir, A

    2011-04-01

    DNA samples collected in clinical trials and stored for future research are valuable to pharmaceutical drug development. Given the perceived higher risk associated with genetic research, industry has implemented complex coding methods for DNA. Following years of experience with these methods and with addressing questions from institutional review boards (IRBs), ethics committees (ECs) and health authorities, the industry has started reexamining the extent of the added value offered by these methods. With the goal of harmonization, the Industry Pharmacogenomics Working Group (I-PWG) conducted a survey to gain an understanding of company practices for DNA coding and to solicit opinions on their effectiveness at protecting privacy. The results of the survey and the limitations of the coding methods are described. The I-PWG recommends dialogue with key stakeholders regarding coding practices such that equal standards are applied to DNA and non-DNA samples. The I-PWG believes that industry standards for privacy protection should provide adequate safeguards for DNA and non-DNA samples/data and suggests a need for more universal standards for samples stored for future research.

  3. Atypical epigenetic mark in an atypical location: cytosine methylation at asymmetric (CNN) sites within the body of a non-repetitive tomato gene.

    PubMed

    González, Rodrigo M; Ricardi, Martiniano M; Iusem, Norberto D

    2011-05-20

    Eukaryotic DNA methylation is one of the most studied epigenetic processes, as it results in a direct and heritable covalent modification triggered by external stimuli. In contrast to mammals, plant DNA methylation, which is stimulated by external cues exemplified by various abiotic types of stress, is often found not only at CG sites but also at CNG (N denoting A, C or T) and CNN (asymmetric) sites. A genome-wide analysis of DNA methylation in Arabidopsis has shown that CNN methylation is preferentially concentrated in transposon genes and non-coding repetitive elements. We are particularly interested in investigating the epigenetics of plant species with larger and more complex genomes than Arabidopsis, particularly with regard to the associated alterations elicited by abiotic stress. We describe the existence of CNN-methylated epialleles that span Asr1, a non-transposon, protein-coding gene from tomato plants that lacks an orthologous counterpart in Arabidopsis. In addition, to test the hypothesis of a link between epigenetic modifications and the adaptation of crop plants to abiotic stress, we exhaustively explored the cytosine methylation status in leaf Asr1 DNA, a model gene in our system, resulting from water-deficit stress conditions imposed on tomato plants. We found that drought conditions brought about removal of methyl marks at approximately 75 of the 110 asymmetric (CNN) sites analysed, concomitantly with a decrease of the repressive H3K27me3 epigenetic mark and a large induction of expression at the RNA level. When pinpointing those sites, we observed that demethylation occurred mostly in the intronic region. These results demonstrate a novel genomic distribution of CNN methylation, namely in the transcribed region of a protein-coding, non-repetitive gene, and the changes in those epigenetic marks that are caused by water stress. These findings may represent a general mechanism for the acquisition of new epialleles in somatic cells, which are pivotal for regulating gene expression in plants.
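
    The three sequence contexts named in this record (CG, CNG with N denoting A, C or T, and asymmetric CNN) can be told apart from the two bases downstream of a cytosine. The sketch below is only an illustration of that definition: it looks at the given strand only, assumes an unambiguous A/C/G/T sequence, and the function names cytosine_context and count_contexts are hypothetical rather than taken from the study.

```python
from collections import Counter


def cytosine_context(seq, i):
    """Classify the cytosine at index i of seq as 'CG', 'CNG' or 'CNN'.

    Returns None if seq[i] is not a cytosine, or if too few downstream
    bases remain to decide between CNG and CNN."""
    s = seq.upper()
    if i < 0 or i >= len(s) or s[i] != "C":
        return None
    downstream = s[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"   # symmetric CG site
    if len(downstream) < 2:
        return None   # sequence ends before the context is known
    # The next base is A, C or T here, so the third base decides the class.
    return "CNG" if downstream[1] == "G" else "CNN"


def count_contexts(seq):
    """Tally the context of every classifiable cytosine on the given strand."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)
```

    For example, count_contexts("CGCAGCTT") finds one site of each class. A full analysis would also scan the reverse-complement strand, which this sketch leaves out.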

  4. Transposable Elements in Human Cancer: Causes and Consequences of Deregulation.

    PubMed

    Anwar, Sumadi Lukman; Wulaningsih, Wahyu; Lehmann, Ulrich

    2017-05-04

    Transposable elements (TEs) comprise nearly half of the human genome and play an essential role in the maintenance of genomic stability, chromosomal architecture, and transcriptional regulation. TEs are repetitive sequences consisting of RNA transposons, DNA transposons, and endogenous retroviruses that can invade the human genome with a substantial contribution in human evolution and genomic diversity. TEs are therefore firmly regulated from early embryonic development and during the entire course of human life by epigenetic mechanisms, in particular DNA methylation and histone modifications. The deregulation of TEs has been reported in some developmental diseases, as well as for different types of human cancers. To date, the role of TEs, the mechanisms underlying TE reactivation, and the interplay with DNA methylation in human cancers remain largely unexplained. We reviewed the loss of epigenetic regulation and subsequent genomic instability, chromosomal aberrations, transcriptional deregulation, oncogenic activation, and aberrations of non-coding RNAs as the potential mechanisms underlying TE deregulation in human cancers.

  5. Transposable Elements in Human Cancer: Causes and Consequences of Deregulation

    PubMed Central

    Anwar, Sumadi Lukman; Wulaningsih, Wahyu; Lehmann, Ulrich

    2017-01-01

    Transposable elements (TEs) comprise nearly half of the human genome and play an essential role in the maintenance of genomic stability, chromosomal architecture, and transcriptional regulation. TEs are repetitive sequences consisting of RNA transposons, DNA transposons, and endogenous retroviruses that can invade the human genome with a substantial contribution in human evolution and genomic diversity. TEs are therefore firmly regulated from early embryonic development and during the entire course of human life by epigenetic mechanisms, in particular DNA methylation and histone modifications. The deregulation of TEs has been reported in some developmental diseases, as well as for different types of human cancers. To date, the role of TEs, the mechanisms underlying TE reactivation, and the interplay with DNA methylation in human cancers remain largely unexplained. We reviewed the loss of epigenetic regulation and subsequent genomic instability, chromosomal aberrations, transcriptional deregulation, oncogenic activation, and aberrations of non-coding RNAs as the potential mechanisms underlying TE deregulation in human cancers. PMID:28471386

  6. Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates

    PubMed Central

    McEwen, Gayle K.; Goode, Debbie K.; Parker, Hugo J.; Woolfe, Adam; Callaway, Heather; Elgar, Greg

    2009-01-01

    Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA. PMID:20011110

  7. Divergent genome evolution caused by regional variation in DNA gain and loss between human and mouse

    PubMed Central

    Kortschak, R. Daniel

    2018-01-01

    The forces driving the accumulation and removal of non-coding DNA and ultimately the evolution of genome size in complex organisms are intimately linked to genome structure and organisation. Our analysis provides a novel method for capturing the regional variation of lineage-specific DNA gain and loss events in their respective genomic contexts. To further understand this connection we used comparative genomics to identify genome-wide individual DNA gain and loss events in the human and mouse genomes. Focusing on the distribution of DNA gains and losses, relationships to important structural features and potential impact on biological processes, we found that in autosomes, DNA gains and losses both followed separate lineage-specific accumulation patterns. However, in both species chromosome X was particularly enriched for DNA gain, consistent with its high L1 retrotransposon content required for X inactivation. We found that DNA loss was associated with gene-rich open chromatin regions and DNA gain events with gene-poor closed chromatin regions. Additionally, we found that DNA loss events tended to be smaller than DNA gain events suggesting that they were able to accumulate in gene-rich open chromatin regions due to their reduced capacity to interrupt gene regulatory architecture. GO term enrichment showed that mouse loss hotspots were strongly enriched for terms related to developmental processes. However, these genes were also located in regions with a high density of conserved elements, suggesting that despite high levels of DNA loss, gene regulatory architecture remained conserved. This is consistent with a model in which DNA gain and loss results in turnover or “churning” in regulatory element dense regions of open chromatin, where interruption of regulatory elements is selected against. PMID:29677183

  8. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures

    PubMed Central

    Stark, Alexander; Lin, Michael F.; Kheradpour, Pouya; Pedersen, Jakob S.; Parts, Leopold; Carlson, Joseph W.; Crosby, Madeline A.; Rasmussen, Matthew D.; Roy, Sushmita; Deoras, Ameya N.; Ruby, J. Graham; Brennecke, Julius; Hodges, Emily; Hinrichs, Angie S.; Caspi, Anat; Paten, Benedict; Park, Seung-Won; Han, Mira V.; Maeder, Morgan L.; Polansky, Benjamin J.; Robson, Bryanne E.; Aerts, Stein; van Helden, Jacques; Hassan, Bassem; Gilbert, Donald G.; Eastman, Deborah A.; Rice, Michael; Weir, Michael; Hahn, Matthew W.; Park, Yongkyu; Dewey, Colin N.; Pachter, Lior; Kent, W. James; Haussler, David; Lai, Eric C.; Bartel, David P.; Hannon, Gregory J.; Kaufman, Thomas C.; Eisen, Michael B.; Clark, Andrew G.; Smith, Douglas; Celniker, Susan E.; Gelbart, William M.; Kellis, Manolis

    2008-01-01

    Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or ‘evolutionary signatures’, dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies. PMID:17994088

  9. A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region.

    PubMed

    Kress, W John; Erickson, David L

    2007-06-06

    A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination.

  10. The legumin gene family: structure of a B type gene of Vicia faba and a possible legumin gene specific regulatory element.

    PubMed Central

    Bäumlein, H; Wobus, U; Pustell, J; Kafatos, F C

    1986-01-01

    The field bean, Vicia faba L. var. minor, possesses two sub-families of 11 S legumin genes named A and B. We isolated from a genomic library a B-type gene (LeB4) and determined its primary DNA sequence. Gene LeB4 codes for a 484 amino acid residue prepropolypeptide, encompassing a signal peptide of 22 amino acid residues, an acidic, very hydrophilic alpha-chain of 281 residues and a basic, somewhat hydrophobic beta-chain of 181 residues. The latter two coding regions are immediately contiguous, but each is interrupted by a short intron. Type A legumin genes from soybean and pea are known to have introns in the same two positions, in addition to an extra intron (within the alpha-coding sequence). Sequence comparisons of legumin genes from these three plants revealed a highly conserved sequence element of at least 28 bp, centered at approximately 100 bp upstream of each cap site. The element is absent from the equivalent position of all non-legumin and other plant and fungal genes examined. We tentatively name this element "legumin box" and suggest that it may have a function in the regulation of legumin gene expression. PMID:3960730

  11. Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

    PubMed

    Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

    2016-12-01

    In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). The 5S rDNA repeats were initially considered to have evolved by concerted evolution. However, some recent reports, including those on teleost fishes, suggest that the evolution of 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved by a birth-and-death process.

  12. Comparative evolution history of SINEs in Arabidopsis thaliana and Brassica oleracea: evidence for a high rate of SINE loss.

    PubMed

    Lenoir, A; Pélissier, T; Bousquet-Antonelli, C; Deragon, J M

    2005-01-01

    Brassica oleracea and Arabidopsis thaliana belong to the Brassicaceae (Cruciferae) family and diverged 16 to 19 million years ago. Although the genome size of B. oleracea (approximately 600 million base pairs) is more than four times that of A. thaliana (approximately 130 million base pairs), their gene content is believed to be very similar with more than 85% sequence identity in the coding region. Therefore, this important difference in genome size is likely to reflect a different rate of non-coding DNA accumulation. Transposable elements (TEs) constitute a major fraction of non-coding DNA in plant species. A different rate in TE accumulation between two closely related species can result in significant genome size variations in a short evolutionary period. Short interspersed elements (SINEs) are non-autonomous retroposons that have invaded the genome of most eukaryote species. Several SINE families are present in B. oleracea and A. thaliana and we found that two of them (called RathE1 and RathE2) are present in both species. In this study, the tempo of evolution of RathE1 and RathE2 SINE families in both species was compared. We observed that most B. oleracea RathE2 SINEs are "young" (close to the consensus sequence) and abundant while elements from this family are more degenerated and much less abundant in A. thaliana. However, the situation is different for the RathE1 SINE family for which the youngest elements are found in A. thaliana. Surprisingly, no SINE was found to occupy the same (orthologous) genomic locus in both species suggesting that either these SINE families were not amplified at a significant rate in the common ancestor of the two species or that older elements were lost and only the recent (lineage-specific) insertions remain. To test this latter hypothesis, loci containing a recently inserted SINE in the A. thaliana col-0 ecotype were selected and characterized in several other A. thaliana ecotypes. In addition to the expected SINE containing allele and the pre-integrative allele (i.e. the "empty" allele), we observed in the different ecotypes, alleles with truncated portions of the SINE (up to the complete loss of the element) and of the immediate genomic flanking sequences. The absence of SINEs in orthologous positions between B. oleracea and A. thaliana and the presence in recently diverged A. thaliana ecotypes of alleles containing severely truncated SINEs suggest a very high rate of SINE loss in these species.

  13. Living Organisms Author Their Read-Write Genomes in Evolution

    PubMed Central

    2017-01-01

    Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with “non-coding” DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called “non-coding” RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations. PMID:29211049

  14. A Two-Locus Global DNA Barcode for Land Plants: The Coding rbcL Gene Complements the Non-Coding trnH-psbA Spacer Region

    PubMed Central

    Kress, W. John; Erickson, David L.

    2007-01-01

    Background: A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Methodology/Principal Findings: Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. Conclusions/Significance: A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination. PMID:17551588

  15. Prevalence of transcription promoters within archaeal operons and coding sequences

    PubMed Central

    Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S

    2009-01-01

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of ∼64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein–DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3′ ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes—events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements. PMID:19536208

  16. Prevalence of transcription promoters within archaeal operons and coding sequences.

    PubMed

    Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S

    2009-01-01

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of approximately 64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein-DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3' ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes, events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.

  17. Is a Genome a Codeword of an Error-Correcting Code?

    PubMed Central

    Kleinschmidt, João H.; Silva-Filho, Márcio C.; Bim, Edson; Herai, Roberto H.; Yamagishi, Michel E. B.; Palazzo, Reginaldo

    2012-01-01

    Since a genome is a discrete sequence, the elements of which belong to a set of four letters, the question as to whether or not there is an error-correcting code underlying DNA sequences is unavoidable. The most common approach to answering this question is to propose a methodology to verify the existence of such a code. However, none of the methodologies proposed so far, although quite clever, has achieved that goal. In a recent work, we showed that DNA sequences can be identified as codewords in a class of cyclic error-correcting codes known as Hamming codes. In this paper, we show that a complete intron-exon gene, and even a plasmid genome, can be identified as a Hamming code codeword as well. Although this does not constitute a definitive proof that there is an error-correcting code underlying DNA sequences, it is the first evidence in this direction. PMID:22649495

  18. Determination of the Optimal Chromosomal Location(s) for a DNA Element in Escherichia coli Using a Novel Transposon-mediated Approach.

    PubMed

    Frimodt-Møller, Jakob; Charbon, Godefroid; Krogfelt, Karen A; Løbner-Olesen, Anders

    2017-09-11

    The optimal chromosomal position(s) of a given DNA element was/were determined by transposon-mediated random insertion followed by fitness selection. In bacteria, the impact of the genetic context on the function of a genetic element can be difficult to assess. Several mechanisms, including topological effects, transcriptional interference from neighboring genes, and/or replication-associated gene dosage, may affect the function of a given genetic element. Here, we describe a method that permits the random integration of a DNA element into the chromosome of Escherichia coli and select the most favorable locations using a simple growth competition experiment. The method takes advantage of a well-described transposon-based system of random insertion, coupled with a selection of the fittest clone(s) by growth advantage, a procedure that is easily adjustable to experimental needs. The nature of the fittest clone(s) can be determined by whole-genome sequencing on a complex multi-clonal population or by easy gene walking for the rapid identification of selected clones. Here, the non-coding DNA region DARS2, which controls the initiation of chromosome replication in E. coli, was used as an example. The function of DARS2 is known to be affected by replication-associated gene dosage; the closer DARS2 gets to the origin of DNA replication, the more active it becomes. DARS2 was randomly inserted into the chromosome of a DARS2-deleted strain. The resultant clones containing individual insertions were pooled and competed against one another for hundreds of generations. Finally, the fittest clones were characterized and found to contain DARS2 inserted in close proximity to the original DARS2 location.

  19. FB-NOF is a non-autonomous transposable element, expressed in Drosophila melanogaster and present only in the melanogaster group.

    PubMed

    Badal, Martí; Xamena, Noel; Cabré, Oriol

    2013-09-10

    Most foldback elements are defective due to the lack of coding sequences but some are associated with coding sequences and may represent the entire element. This is the case of the NOF sequences found in the FB of Drosophila melanogaster, formerly considered as an autonomous TE and currently proposed as part of the so-called FB-NOF element, the transposon that would be complete and fully functional. NOF is always associated with FB and never seen apart from the FB inverted repeats (IR). This is the reason why the FB-NOF composite element can be considered the complete element. At least one of its ORFs encodes a protein that has always been considered its transposase, but no detailed studies have been carried out to verify this. In this work we test the hypothesis that FB-NOF is an active transposon nowadays. We search for its expression product, obtaining its cDNA, and propose the ORF and the sequence of its potential protein. We found that the NOF protein is not a transposase as it lacks any of the motifs of known transposases and also shows structural homology with hydrolases, therefore FB-NOF cannot belong to the superfamily MuDR/foldback, as up to now it has been classified, and can be considered as a non-autonomous transposable element. The alignment with the published genomes of 12 Drosophila species shows that NOF presence is restricted only to the 6 Drosophila species belonging to the melanogaster group. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. Transcription and DNA Damage: Holding Hands or Crossing Swords?

    PubMed

    D'Alessandro, Giuseppina; d'Adda di Fagagna, Fabrizio

    2017-10-27

    Transcription has classically been considered a potential threat to genome integrity. Collision between transcription and DNA replication machinery, and retention of DNA:RNA hybrids, may result in genome instability. On the other hand, it has been proposed that active genes repair faster and preferentially via homologous recombination. Moreover, while canonical transcription is inhibited in the proximity of DNA double-strand breaks, a growing body of evidence supports active non-canonical transcription at DNA damage sites. Small non-coding RNAs accumulate at DNA double-strand break sites in mammals and other organisms, and are involved in DNA damage signaling and repair. Furthermore, RNA binding proteins are recruited to DNA damage sites and participate in the DNA damage response. Here, we discuss the impact of transcription on genome stability, the role of RNA binding proteins at DNA damage sites, and the function of small non-coding RNAs generated upon damage in the signaling and repair of DNA lesions. Copyright © 2016 Elsevier Ltd. All rights reserved.

  1. Mechanisms of radiation-induced gene responses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Woloschak, G.E.; Paunesku, T.

    1996-10-01

    In the process of identifying genes differentially expressed in cells exposed to ultraviolet radiation, we have identified a transcript having a 26-bp region that is highly conserved in a variety of species including Bacillus circulans, yeast, pumpkin, Drosophila, mouse, and man. When in the 5′ region (flanking region or UTR) of a gene, the sequence is predominantly in +/+ orientation with respect to the coding DNA strand; while in the coding region and the 3′ region (UTR), the sequence is most frequently in the +/- orientation with respect to the coding DNA strand. In two genes, the element is split into two parts; however, in most cases, it is found only once but with a minimum of 11 consecutive nucleotides precisely depicting the original sequence. The element is found in a large number of different genes with diverse functions (from human ras p21 to B. circulans chitinase). Gel shift assays demonstrated the presence of a protein in HeLa cell extracts that binds to the sense and antisense single-stranded consensus oligomers, as well as to the double-stranded oligonucleotide. When double-stranded oligomer was used, the size shift demonstrated an additional protein-oligomer complex larger than the one bound to either sense or antisense single-stranded consensus oligomers alone. It is speculated either that this element binds to protein(s) important in maintaining DNA in a single-stranded orientation for transcription or, alternatively, that this element is important in the transcription-coupled DNA repair process.

  2. Short interspersed element (SINE) depletion and long interspersed element (LINE) abundance are not features universally required for imprinting.

    PubMed

    Cowley, Michael; de Burca, Anna; McCole, Ruth B; Chahal, Mandeep; Saadat, Ghazal; Oakey, Rebecca J; Schulz, Reiner

    2011-04-20

    Genomic imprinting is a form of gene dosage regulation in which a gene is expressed from only one of the alleles, in a manner dependent on the parent of origin. The mechanisms governing imprinted gene expression have been investigated in detail and have greatly contributed to our understanding of genome regulation in general. Both DNA sequence features, such as CpG islands, and epigenetic features, such as DNA methylation and non-coding RNAs, play important roles in achieving imprinted expression. However, the relative importance of these factors varies depending on the locus in question. Defining the minimal features that are absolutely required for imprinting would help us to understand how imprinting has evolved mechanistically. Imprinted retrogenes are a subset of imprinted loci that are relatively simple in their genomic organisation, being distinct from large imprinting clusters, and have the potential to be used as tools to address this question. Here, we compare the repeat element content of imprinted retrogene loci with non-imprinted controls that have a similar locus organisation. We observe no significant differences that are conserved between mouse and human, suggesting that the paucity of SINEs and relative abundance of LINEs at imprinted loci reported by others is not a sequence feature universally required for imprinting.

  3. Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design.

    PubMed

    Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A

    2016-03-01

    The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.

  4. RNA Helicase Associated with AU-rich Element (RHAU/DHX36) Interacts with the 3′-Tail of the Long Non-coding RNA BC200 (BCYRN1)

    PubMed Central

    Booy, Evan P.; McRae, Ewan K. S.; Howard, Ryan; Deo, Soumya R.; Ariyo, Emmanuel O.; Dzananovic, Edis; Meier, Markus; Stetefeld, Jörg; McKenna, Sean A.

    2016-01-01

    RNA helicase associated with AU-rich element (RHAU) is an ATP-dependent RNA helicase that demonstrates high affinity for quadruplex structures in DNA and RNA. To elucidate the significance of these quadruplex-RHAU interactions, we have performed RNA co-immunoprecipitation screens to identify novel RNAs bound to RHAU and characterize their function. In the course of this study, we have identified the non-coding RNA BC200 (BCYRN1) as specifically enriched upon RHAU immunoprecipitation. Although BC200 does not adopt a quadruplex structure and does not bind the quadruplex-interacting motif of RHAU, it has direct affinity for RHAU in vitro. Specifically designed BC200 truncations and RNase footprinting assays demonstrate that RHAU binds to an adenosine-rich region near the 3′-end of the RNA. RHAU truncations support binding that is dependent upon a region within the C terminus and is specific to RHAU isoform 1. Tests performed to assess whether BC200 interferes with RHAU helicase activity have demonstrated the ability of BC200 to act as an acceptor of unwound quadruplexes via a cytosine-rich region near the 3′-end of the RNA. Furthermore, an interaction between BC200 and the quadruplex-containing telomerase RNA was confirmed by pull-down assays of the endogenous RNAs. This leads to the possibility that RHAU may direct BC200 to bind and exert regulatory functions at quadruplex-containing RNA or DNA sequences. PMID:26740632

  5. Stable CoT-1 repeat RNA is abundant and associated with euchromatic interphase chromosomes

    PubMed Central

    Hall, Lisa L.; Carone, Dawn M.; Gomez, Alvin; Kolpa, Heather J.; Byron, Meg; Mehta, Nitish; Fackelmayer, Frank O.; Lawrence, Jeanne B.

    2014-01-01

    SUMMARY Recent studies recognize a vast diversity of non-coding RNAs with largely unknown functions, but few have examined interspersed repeat sequences, which constitute almost half our genome. RNA hybridization in situ using CoT-1 (highly repeated) DNA probes detects surprisingly abundant euchromatin-associated RNA comprised predominantly of repeat sequences (“CoT-1 RNA”), including LINE-1. CoT-1-hybridizing RNA strictly localizes to the interphase chromosome territory in cis, and remains stably associated with the chromosome territory following prolonged transcriptional inhibition. The CoT-1 RNA territory resists mechanical disruption and fractionates with the non-chromatin scaffold, but can be experimentally released. Loss of repeat-rich, stable nuclear RNAs from euchromatin corresponds to aberrant chromatin distribution and condensation. CoT-1 RNA has several properties similar to XIST chromosomal RNA, but is excluded from chromatin condensed by XIST. These findings impact two “black boxes” of genome science: the poorly understood diversity of non-coding RNA and the unexplained abundance of repetitive elements. PMID:24581492

  6. Nucleic Acid Chaperone Activity of the ORF1 Protein from the Mouse LINE-1 Retrotransposon

    PubMed Central

    Martin, Sandra L.; Bushman, Frederic D.

    2001-01-01

    Non-LTR retrotransposons such as L1 elements are major components of the mammalian genome, but their mechanism of replication is incompletely understood. Like retroviruses and LTR-containing retrotransposons, non-LTR retrotransposons replicate by reverse transcription of an RNA intermediate. The details of cDNA priming and integration, however, differ between these two classes. In retroviruses, the nucleocapsid (NC) protein has been shown to assist reverse transcription by acting as a “nucleic acid chaperone,” promoting the formation of the most stable duplexes between nucleic acid molecules. A protein-coding region with an NC-like sequence is present in most non-LTR retrotransposons, but no such sequence is evident in mammalian L1 elements or other members of its class. Here we investigated the ORF1 protein from mouse L1 and found that it does in fact display nucleic acid chaperone activities in vitro. L1 ORF1p (i) promoted annealing of complementary DNA strands, (ii) facilitated strand exchange to form the most stable hybrids in competitive displacement assays, and (iii) facilitated melting of an imperfect duplex but stabilized perfect duplexes. These findings suggest a role for L1 ORF1p in mediating nucleic acid strand transfer steps during L1 reverse transcription. PMID:11134335

  7. DNA transposons have colonized the genome of the giant virus Pandoravirus salinus.

    PubMed

    Sun, Cheng; Feschotte, Cédric; Wu, Zhiqiang; Mueller, Rachel Lockridge

    2015-06-12

    Transposable elements are mobile DNA sequences that are widely distributed in prokaryotic and eukaryotic genomes, where they represent a major force in genome evolution. However, transposable elements have rarely been documented in viruses, and their contribution to viral genome evolution remains largely unexplored. Pandoraviruses are recently described DNA viruses with genome sizes that exceed those of some prokaryotes, rivaling parasitic eukaryotes. These large genomes appear to include substantial noncoding intergenic spaces, which provide potential locations for transposable element insertions. However, no mobile genetic elements have yet been reported in pandoravirus genomes. Here, we report a family of miniature inverted-repeat transposable elements (MITEs) in the Pandoravirus salinus genome, representing the first description of a virus populated with a canonical transposable element family that proliferated by transposition within the viral genome. The MITE family, which we name Submariner, includes 30 copies with all the hallmarks of MITEs: short length, terminal inverted repeats, TA target site duplication, and no coding capacity. Submariner elements show signs of transposition and are undetectable in the genome of Pandoravirus dulcis, the closest known relative of Pandoravirus salinus. We identified a DNA transposon related to Submariner in the genome of Acanthamoeba castellanii, a species thought to host pandoraviruses, which contains remnants of coding sequence for a Tc1/mariner transposase. These observations suggest that the Submariner MITEs of P. salinus belong to the widespread Tc1/mariner superfamily and may have been mobilized by an amoebozoan host. Ten of the 30 MITEs in the P. salinus genome are located within coding regions of predicted genes, while others are close to genes, suggesting that these transposons may have contributed to viral genetic novelty. Our discovery highlights the remarkable ability of DNA transposons to colonize and shape genomes from all domains of life, as well as giant viruses. Our findings continue to blur the division between viral and cellular genomes, adhering to the emerging view that the content, dynamics, and evolution of the genomes of giant viruses do not substantially differ from those of cellular organisms.

  8. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    PubMed Central

    Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

    2011-01-01

    Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358

  9. Transcription of highly repetitive tandemly organized DNA in amphibians and birds: A historical overview and modern concepts.

    PubMed

    Trofimova, Irina; Krasikova, Alla

    2016-12-01

    Tandemly organized highly repetitive DNA sequences are crucial structural and functional elements of eukaryotic genomes. Despite extensive evidence, satellite DNA remains an enigmatic part of the eukaryotic genome, and the biological role and significance of tandem repeat transcripts remain rather obscure. Data on tandem repeat transcription in amphibian and avian model organisms are fragmentary even though their genomes have been thoroughly characterized. This review systematically covers historical and modern data on the transcription of amphibian and avian satellite DNA in somatic cells and during meiosis, when chromosomes acquire the special lampbrush form. We highlight how transcription of tandemly repetitive DNA sequences is organized in the interphase nucleus and on lampbrush chromosomes. We offer LTR-activation hypotheses of widespread satellite DNA transcription initiation during oogenesis. Recent explanations are provided for the significance of high-yield production of non-coding RNA derived from tandemly organized highly repetitive DNA. In many cases the data on the transcription of satellite DNA can be extrapolated from lampbrush chromosomes to interphase chromosomes. Lampbrush chromosomes, combined with novel technical approaches such as superresolution imaging, chromosome microdissection followed by high-throughput sequencing, and dynamic observation in life-like conditions, provide excellent opportunities for investigating the mechanisms of satellite DNA transcription.

  10. Transcription of highly repetitive tandemly organized DNA in amphibians and birds: A historical overview and modern concepts

    PubMed Central

    Krasikova, Alla

    2016-01-01

    ABSTRACT Tandemly organized highly repetitive DNA sequences are crucial structural and functional elements of eukaryotic genomes. Despite extensive evidence, satellite DNA remains an enigmatic part of the eukaryotic genome, and the biological role and significance of tandem repeat transcripts remain rather obscure. Data on tandem repeat transcription in amphibian and avian model organisms are fragmentary even though their genomes have been thoroughly characterized. This review systematically covers historical and modern data on the transcription of amphibian and avian satellite DNA in somatic cells and during meiosis, when chromosomes acquire the special lampbrush form. We highlight how transcription of tandemly repetitive DNA sequences is organized in the interphase nucleus and on lampbrush chromosomes. We offer LTR-activation hypotheses of widespread satellite DNA transcription initiation during oogenesis. Recent explanations are provided for the significance of high-yield production of non-coding RNA derived from tandemly organized highly repetitive DNA. In many cases the data on the transcription of satellite DNA can be extrapolated from lampbrush chromosomes to interphase chromosomes. Lampbrush chromosomes, combined with novel technical approaches such as superresolution imaging, chromosome microdissection followed by high-throughput sequencing, and dynamic observation in life-like conditions, provide excellent opportunities for investigating the mechanisms of satellite DNA transcription. PMID:27763817

  11. Design pattern mining using distributed learning automata and DNA sequence alignment.

    PubMed

    Esmaeilpour, Mansour; Naderifar, Vahideh; Shukur, Zarina

    2014-01-01

    Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, together with a related tool, DLA-DNA, which was applied to all implemented patterns and all projects used for evaluation. DLA-DNA is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, and it achieves acceptable precision and recall compared with the other evaluated tools. The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other section of the code for parameter passing and modular programming. The proposed model detects design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. The results demonstrate that, for source code built in both standard and non-standard ways with respect to design patterns, the result of the proposed method is close to that of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source codes and compared with other related models and available tools. On average, the precision and recall of the proposed method are 20% and 9.6% higher than those of Pinot, 27% and 31% higher than those of PTIDEJ, and 3.3% and 2% higher than those of DPJF, respectively. The method is organized in two steps: in the first step, elemental design patterns are identified; in the second step, they are composed to recognize actual design patterns.

  12. [The ENCODE project and functional genomics studies].

    PubMed

    Ding, Nan; Qu, Hongzhu; Fang, Xiangdong

    2014-03-01

    Upon the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it has identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated databases for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand the information embedded in gene and genome sequences, the function of regulatory elements, and the molecular mechanism underlying transcriptional regulation by noncoding regions, and they provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatics technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.

  13. A biological inspired fuzzy adaptive window median filter (FAWMF) for enhancing DNA signal processing.

    PubMed

    Ahmad, Muneer; Jung, Low Tan; Bhuiyan, Al-Amin

    2017-10-01

    Digital signal processing techniques commonly employ fixed-length window filters to process signal contents. DNA signals differ in characteristics from common digital signals since they carry nucleotides as contents. Nucleotides have a genetic code context and fuzzy behavior owing to their special structure and order in the DNA strand. Employing conventional fixed-length window filters for DNA signal processing produces spectral leakage and hence results in signal noise. A biological-context-aware adaptive window filter is required to process DNA signals. This paper introduces a biologically inspired fuzzy adaptive window median filter (FAWMF), which computes the fuzzy membership strength of nucleotides in each slide of the window and filters nucleotides based on median filtering with a combination of s-shaped and z-shaped filters. Since coding regions cause 3-base periodicity through an unbalanced nucleotide distribution that produces a relatively high bias in nucleotide usage, this fundamental characteristic of nucleotides has been exploited in FAWMF to suppress signal noise. Along with the adaptive response of FAWMF, a strong correlation between median nucleotides and the Π-shaped filter was observed, which produced enhanced discrimination between coding and non-coding regions compared with conventional fixed-length window filters. The proposed FAWMF attains a significant enhancement in coding region identification, i.e. 40% to 125%, compared with other conventional window filters tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. This study shows that conventional fixed-length window filters applied to DNA signals do not achieve significant results since the nucleotides carry genetic code context. The proposed FAWMF algorithm is adaptive and significantly outperforms them in processing DNA signal contents. The algorithm, applied to a variety of DNA datasets, produced noteworthy discrimination between coding and non-coding regions compared with conventional fixed-window-length filters. Copyright © 2017 Elsevier B.V. All rights reserved.
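
    The 3-base periodicity that the abstract above relies on can be illustrated with a minimal sketch. This is not the FAWMF algorithm; it only computes the discrete Fourier transform power at frequency 1/3 over the four binary nucleotide indicator sequences, a conventional baseline measure. The function name `period3_power` and both toy sequences are assumptions made for illustration.

```python
import cmath
import random


def period3_power(seq):
    """Return DFT power at frequency 1/3, summed over the A, C, G and T
    indicator sequences and normalized by the sequence length."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # Sum exp(-2*pi*i*pos/3) over the positions where `base` occurs.
        acc = sum(cmath.exp(-2j * cmath.pi * pos / 3)
                  for pos, ch in enumerate(seq) if ch == base)
        total += abs(acc) ** 2
    return total / n


# Hypothetical "coding-like" toy sequence with an extreme codon-position bias.
biased = "GCT" * 100

# Hypothetical "non-coding-like" toy sequence of uniformly random bases.
rng = random.Random(0)
unbiased = "".join(rng.choice("ACGT") for _ in range(300))

power_biased = period3_power(biased)
power_unbiased = period3_power(unbiased)
```

    For the perfectly periodic toy sequence the measure equals n/3 (here 100), whereas a sequence without codon-position bias gives a much smaller value. Real coding regions fall between these extremes, which is why a windowing and filtering strategy is needed to separate them from noise.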

  14. The 'dark matter' in the plant genomes: non-coding and unannotated DNA sequences associated with open chromatin.

    PubMed

    Jiang, Jiming

    2015-04-01

    Sequencing of complete plant genomes has become increasingly more routine since the advent of the next-generation sequencing technology. Identification and annotation of large amounts of noncoding but functional DNA sequences, including cis-regulatory DNA elements (CREs), have become a new frontier in plant genome research. Genomic regions containing active CREs bound to regulatory proteins are hypersensitive to DNase I digestion and are called DNase I hypersensitive sites (DHSs). Several recent DHS studies in plants illustrate that DHS datasets produced by DNase I digestion followed by next-generation sequencing (DNase-seq) are highly valuable for the identification and characterization of CREs associated with plant development and responses to environmental cues. DHS-based genomic profiling has opened a door to identify and annotate the 'dark matter' in sequenced plant genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. DNA methylation of miRNA coding sequences putatively associated with childhood obesity.

    PubMed

    Mansego, M L; Garcia-Lacarte, M; Milagro, F I; Marti, A; Martinez, J A

    2017-02-01

    Epigenetic mechanisms may be involved in obesity onset and its consequences. The aim of the present study was to evaluate whether DNA methylation status in microRNA (miRNA) coding regions is associated with childhood obesity. DNA isolated from white blood cells of 24 children (identification sample: 12 obese and 12 non-obese) from the Grupo Navarro de Obesidad Infantil study was hybridized in a 450 K methylation microarray. Several CpGs whose DNA methylation levels were statistically different between obese and non-obese were validated by MassArray® in 95 children (validation sample) from the same study. Microarray analysis identified 16 differentially methylated CpGs between both groups (6 hypermethylated and 10 hypomethylated). DNA methylation levels in miR-1203, miR-412 and miR-216A coding regions significantly correlated with body mass index standard deviation score (BMI-SDS) and explained up to 40% of the variation of BMI-SDS. The network analysis identified 19 well-defined obesity-relevant biological pathways from the KEGG database. MassArray® validation identified three regions located in or near miR-1203, miR-412 and miR-216A coding regions differentially methylated between obese and non-obese children. The current work identified three CpG sites located in coding regions of three miRNAs (miR-1203, miR-412 and miR-216A) that were differentially methylated between obese and non-obese children, suggesting a role of miRNA epigenetic regulation in childhood obesity. © 2016 World Obesity Federation.

  16. Epigenetic mechanisms in anti-cancer actions of bioactive food components – the implications in cancer prevention

    PubMed Central

    Stefanska, B; Karlic, H; Varga, F; Fabianowska-Majewska, K; Haslberger, AG

    2012-01-01

    The hallmarks of carcinogenesis are aberrations in gene expression and protein function caused by both genetic and epigenetic modifications. Epigenetics refers to the changes in gene expression programming that alter the phenotype in the absence of a change in DNA sequence. Epigenetic modifications, which include amongst others DNA methylation, covalent modifications of histone tails and regulation by non-coding RNAs, play a significant role in normal development and genome stability. The changes are dynamic and serve as an adaptation mechanism to a wide variety of environmental and social factors including diet. A number of studies have provided evidence that some natural bioactive compounds found in food and herbs can modulate gene expression by targeting different elements of the epigenetic machinery. Nutrients that are components of one-carbon metabolism, such as folate, riboflavin, pyridoxine, cobalamin, choline, betaine and methionine, affect DNA methylation by regulating the levels of S-adenosyl-L-methionine, a methyl group donor, and S-adenosyl-L-homocysteine, which is an inhibitor of enzymes catalyzing the DNA methylation reaction. Other natural compounds target histone modifications and levels of non-coding RNAs such as vitamin D, which recruits histone acetylases, or resveratrol, which activates the deacetylase sirtuin and regulates oncogenic and tumour suppressor micro-RNAs. As epigenetic abnormalities have been shown to be both causative and contributing factors in different health conditions including cancer, natural compounds that are direct or indirect regulators of the epigenome constitute an excellent approach in cancer prevention and potentially in anti-cancer therapy. PMID:22536923

  17. The agents of natural genome editing.

    PubMed

    Witzany, Guenther

    2011-06-01

    The DNA serves as a stable information storage medium and every protein which is needed by the cell is produced from this blueprint via an RNA intermediate code. More recently it was found that an abundance of various RNA elements cooperate in a variety of steps and substeps as regulatory and catalytic units with multiple competencies to act on RNA transcripts. Natural genome editing on one side is the competent agent-driven generation and integration of meaningful DNA nucleotide sequences into pre-existing genomic content arrangements, and the ability to (re-)combine and (re-)regulate them according to context-dependent (i.e. adaptational) purposes of the host organism. Natural genome editing on the other side designates the integration of all RNA activities acting on RNA transcripts without altering DNA-encoded genes. If we take the genetic code seriously as a natural code, there must be agents that are competent to act on this code because no natural code codes itself as no natural language speaks itself. As code editing agents, viral and subviral agents have been suggested because there are several indicators that demonstrate viruses competent in both RNA and DNA natural genome editing.

  18. RPS8—a New Informative DNA Marker for Phylogeny of Babesia and Theileria Parasites in China

    PubMed Central

    Tian, Zhan-Cheng; Liu, Guang-Yuan; Yin, Hong; Luo, Jian-Xun; Guan, Gui-Quan; Luo, Jin; Xie, Jun-Ren; Shen, Hui; Tian, Mei-Yuan; Zheng, Jin-feng; Yuan, Xiao-song; Wang, Fang-fang

    2013-01-01

    Piroplasmosis is a serious debilitating and sometimes fatal disease. Phylogenetic relationships within piroplasmida are complex and remain unclear. We compared the intron–exon structure and DNA sequences of the RPS8 gene from Babesia and Theileria spp. isolates in China. Similar to 18S rDNA, the 40S ribosomal protein S8 gene, RPS8, including both coding and non-coding regions is a useful and novel genetic marker for defining species boundaries and for inferring phylogenies because it tends to have little intra-specific variation but considerable inter-specific difference. However, more samples are needed to verify the usefulness of the RPS8 (coding and non-coding regions) gene as a marker for the phylogenetic position and detection of most Babesia and Theileria species, particularly for some closely related species. PMID:24244571

  19. Paramecium tetraurelia chromatin assembly factor-1-like protein PtCAF-1 is involved in RNA-mediated control of DNA elimination.

    PubMed

    Ignarski, Michael; Singh, Aditi; Swart, Estienne C; Arambasic, Miroslav; Sandoval, Pamela Y; Nowacki, Mariusz

    2014-10-29

    Genome-wide DNA remodelling in the ciliate Paramecium is ensured by RNA-mediated trans-nuclear crosstalk between the germline and the somatic genomes during sexual development. The rearrangements include elimination of transposable elements, minisatellites and tens of thousands of non-coding elements called internally eliminated sequences (IESs). The trans-nuclear genome comparison process employs a distinct class of germline small RNAs (scnRNAs) that are compared against the parental somatic genome to select the germline-specific subset of scnRNAs that subsequently target DNA elimination in the progeny genome. Only a handful of proteins involved in this process have been identified so far and the mechanism of DNA targeting is unknown. Here we describe chromatin assembly factor-1-like protein (PtCAF-1), which we show is required for the survival of sexual progeny and localizes first in the parental and later in the newly developing macronucleus. Gene silencing shows that PtCAF-1 is required for the elimination of transposable elements and a subset of IESs. PtCAF-1 depletion also impairs the selection of germline-specific scnRNAs during development. We identify specific histone modifications appearing during Paramecium development which are strongly reduced in PtCAF-1 depleted cells. Our results demonstrate the importance of PtCAF-1 for the epigenetic trans-nuclear cross-talk mechanism. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. WordCluster: detecting clusters of DNA words and genomic elements

    PubMed Central

    2011-01-01

    Background Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well established examples are the genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method in a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php, including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
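
    The idea of clustering by the distance between consecutive copies, as described in the abstract above, can be sketched as follows. This is not the WordCluster implementation and does not reproduce its significance test. As a stand-in, the distance threshold is taken from a simple geometric model of randomly placed copies, which is an assumption of this sketch; all function names and the toy sequence are hypothetical.

```python
import math


def find_positions(genome, word):
    """Return start positions of all (possibly overlapping) copies of `word`."""
    positions = []
    start = genome.find(word)
    while start != -1:
        positions.append(start)
        start = genome.find(word, start + 1)
    return positions


def distance_threshold(genome_length, n_copies, quantile=0.5):
    """Smallest distance d with P(D <= d) >= quantile under a geometric model.

    Assumption: copies occur independently with probability
    p = n_copies / genome_length at each position.
    """
    p = n_copies / genome_length
    if p >= 1.0:
        return 1
    return max(1, math.ceil(math.log(1.0 - quantile) / math.log(1.0 - p)))


def cluster_positions(positions, max_gap, min_copies=3):
    """Chain consecutive copies whose spacing is <= max_gap into clusters.

    Returns a list of (first_start, last_start, n_copies) tuples.
    """
    clusters, current = [], []
    for pos in positions:
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_copies:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_copies:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Hypothetical toy sequence: two CAG-rich stretches and one isolated copy.
toy_genome = "CAG" * 5 + "T" * 100 + "CAG" + "T" * 100 + "CAG" * 4
toy_positions = find_positions(toy_genome, "CAG")
toy_gap = distance_threshold(len(toy_genome), len(toy_positions))
toy_clusters = cluster_positions(toy_positions, toy_gap)
```

    On the toy sequence the two CAG-rich stretches are reported as clusters and the isolated copy in the middle is not. Deriving the threshold from the word's own density, rather than fixing it by hand, is one way to avoid the arbitrarily chosen distance thresholds that the abstract criticizes.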

  1. Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula

    PubMed Central

    Grzebelus, Dariusz; Lasota, Slawomir; Gambin, Tomasz; Kucherov, Gregory; Gambin, Anna

    2007-01-01

    Background Transposable elements constitute a significant fraction of plant genomes. The PIF/Harbinger superfamily includes DNA transposons (class II elements) carrying terminal inverted repeats and producing a 3 bp target site duplication upon insertion. The presence of an ORF coding for the DDE/DDD transposase, required for transposition, is characteristic of the autonomous PIF/Harbinger-like elements. Based on the above features, PIF/Harbinger-like elements were identified in several plant genomes and divided into several evolutionary lineages. Availability of a significant portion of Medicago truncatula genomic sequence allowed for mining PIF/Harbinger-like elements, starting from a single previously described element MtMaster. Results Twenty two putative autonomous, i.e. carrying an ORF coding for TPase and complete terminal inverted repeats, and 67 non-autonomous PIF/Harbinger-like elements were found in the genome of M. truncatula. They were divided into five families, MtPH-A5, MtPH-A6, MtPH-D, MtPH-E, and MtPH-M, corresponding to three previously identified and two new lineages. The largest families, MtPH-A6 and MtPH-M, were further divided into four and three subfamilies, respectively. Non-autonomous elements were usually direct deletion derivatives of the putative autonomous element; however, other types of rearrangements, including inversions and nested insertions, were also observed. An interesting structural characteristic, the presence of 60 bp tandem repeats, was observed in a group of elements of subfamily MtPH-A6-4. Some families could be related to miniature inverted repeat elements (MITEs). The presence of empty loci (RESites), paralogous to those flanking the identified transposable elements, both autonomous and non-autonomous, as well as the presence of transposon insertion related size polymorphisms, confirmed that some of the mined elements were capable of transposition. Conclusion The population of PIF/Harbinger-like elements in the genome of M. truncatula is diverse. A detailed intra-family comparison of the elements' structure proved that they proliferated in the genome generally following the model of abortive gap repair. However, the presence of tandem repeats facilitated more pronounced rearrangements of the element internal regions. The insertion polymorphism of the MtPH elements and related MITE families in different populations of M. truncatula, if further confirmed experimentally, could be used as a source of molecular markers complementary to other marker systems. PMID:17996080
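
    Two of the structural hallmarks named in the abstract above, terminal inverted repeats and a 3 bp target site duplication, can be checked with a short sketch. It is not the mining procedure used in the study; it only tests a candidate element whose coordinates are already known. The function names and the toy sequence are hypothetical, and only perfect (mismatch-free) repeats are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tir_length(element, max_len=50):
    """Length of the longest perfect terminal inverted repeat (up to max_len)."""
    limit = min(max_len, len(element) // 2)
    rc_tail = reverse_complement(element[-limit:]) if limit else ""
    length = 0
    # Compare the 5' end with the reverse complement of the 3' end.
    while length < limit and element[length] == rc_tail[length]:
        length += 1
    return length


def has_tsd(sequence, start, end, tsd_len=3):
    """True if the tsd_len bases before `start` equal those after `end`.

    `start` and `end` are 0-based, end-exclusive coordinates of the element
    within `sequence`.
    """
    if start < tsd_len or end + tsd_len > len(sequence):
        return False
    return sequence[start - tsd_len:start] == sequence[end:end + tsd_len]


# Hypothetical toy element: 8 bp inverted repeats around a 9 bp core,
# inserted between two copies of the 3 bp target site "TTA".
toy_tir = "GGCCTTGA"
toy_element = toy_tir + "AAACCCAAA" + reverse_complement(toy_tir)
toy_locus = "CCGTTA" + toy_element + "TTAGGC"
toy_start = 6
toy_end = toy_start + len(toy_element)
```

    In the toy locus, `tir_length(toy_element)` recovers the 8 bp repeat and `has_tsd(toy_locus, toy_start, toy_end)` confirms the flanking "TTA" duplication. Real elements usually carry imperfect repeats, so a practical search would tolerate mismatches, which this sketch does not.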

  2. Chromatin accessibility prediction via a hybrid deep convolutional neural network.

    PubMed

    Liu, Qiao; Xia, Fei; Yin, Qijin; Jiang, Rui

    2018-03-01

    A majority of known genetic variants associated with human-inherited diseases lie in non-coding regions that lack adequate interpretation, making it indispensable to systematically discover functional sites at the whole genome level and precisely decipher their implications in a comprehensive manner. Although computational approaches have been complementing high-throughput biological experiments towards the annotation of the human genome, it still remains a big challenge to accurately annotate regulatory elements in the context of a specific cell type via automatic learning of the DNA sequence code from large-scale sequencing data. Indeed, the development of an accurate and interpretable model to learn the DNA sequence signature and further enable the identification of causative genetic variants has become essential in both genomic and genetic studies. We proposed Deopen, a hybrid framework mainly based on a deep convolutional neural network, to automatically learn the regulatory code of DNA sequences and predict chromatin accessibility. In a series of comparisons with existing methods, we show the superior performance of our model in not only the classification of accessible regions against background sequences sampled at random, but also the regression of DNase-seq signals. We further visualize the convolutional kernels and show that the identified sequence signatures match known motifs. We finally demonstrate the sensitivity of our model in finding causative noncoding variants in the analysis of a breast cancer dataset. We expect to see wide applications of Deopen with either public or in-house chromatin accessibility data in the annotation of the human genome and the identification of non-coding variants associated with diseases. Deopen is freely available at https://github.com/kimmo1019/Deopen. © The Author (2017). Published by Oxford University Press.
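
    To make the idea of a convolutional kernel acting as a motif detector concrete, the following is a minimal sketch, not Deopen's architecture or code. It assumes a one-hot encoding over A, C, G, T and a single hand-written kernel; the function names `one_hot` and `conv_scan` and the toy "TATA" kernel are hypothetical illustrations.

```python
# Illustrative only: a single hand-made kernel, not a trained network.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def conv_scan(encoded, kernel):
    """Slide a (width x 4) kernel along the encoded sequence; return one score per window."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores

# Toy kernel (assumption): the one-hot pattern of the 4-mer "TATA".
kernel = one_hot("TATA")
scores = conv_scan(one_hot("GGTATAGG"), kernel)
# Index of the window whose content best matches the kernel.
best = max(range(len(scores)), key=scores.__getitem__)
```

    In a trained network the kernel weights are learned rather than written by hand, and comparing high-scoring windows with known motifs is one common way to interpret them.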

  3. Study characterizes long non-coding RNA’s response to DNA damage in colon cancer cells | Center for Cancer Research

    Cancer.gov

    Researchers led by Ashish Lal, Ph.D., Investigator in the Genetics Branch, have shown that when the DNA in human colon cancer cells is damaged, a long non-coding RNA (lncRNA) regulates the expression of genes that halt growth, which allows the cells to repair the damage and promote survival. Their findings suggest an important pro-survival function of a lncRNA in cancer cells.

  4. Pdsg1 and Pdsg2, Novel Proteins Involved in Developmental Genome Remodelling in Paramecium

    PubMed Central

    Hoehener, Cristina; Singh, Aditi; Swart, Estienne C.; Nowacki, Mariusz

    2014-01-01

    The epigenetic influence of maternal cells on the development of their progeny has long been studied in various eukaryotes. Multicellular organisms usually provide their zygotes not only with nutrients but also with functional elements required for proper development, such as coding and non-coding RNAs. These maternally deposited RNAs exhibit a variety of functions, from regulating gene expression to assuring genome integrity. In ciliates, such as Paramecium these RNAs participate in the programming of large-scale genome reorganization during development, distinguishing germline-limited DNA, which is excised, from somatic-destined DNA. Only a handful of proteins playing roles in this process have been identified so far, including typical RNAi-derived factors such as Dicer-like and Piwi proteins. Here we report and characterize two novel proteins, Pdsg1 and Pdsg2 (Paramecium protein involved in Development of the Somatic Genome 1 and 2), involved in Paramecium genome reorganization. We show that these proteins are necessary for the excision of germline-limited DNA during development and the survival of sexual progeny. Knockdown of PDSG1 and PDSG2 genes affects the populations of small RNAs known to be involved in the programming of DNA elimination (scanRNAs and iesRNAs) and chromatin modification patterns during development. Our results suggest an association between RNA-mediated trans-generational epigenetic signal and chromatin modifications in the process of Paramecium genome reorganization. PMID:25397898

  5. Pdsg1 and Pdsg2, novel proteins involved in developmental genome remodelling in Paramecium.

    PubMed

    Arambasic, Miroslav; Sandoval, Pamela Y; Hoehener, Cristina; Singh, Aditi; Swart, Estienne C; Nowacki, Mariusz

    2014-01-01

    The epigenetic influence of maternal cells on the development of their progeny has long been studied in various eukaryotes. Multicellular organisms usually provide their zygotes not only with nutrients but also with functional elements required for proper development, such as coding and non-coding RNAs. These maternally deposited RNAs exhibit a variety of functions, from regulating gene expression to assuring genome integrity. In ciliates, such as Paramecium these RNAs participate in the programming of large-scale genome reorganization during development, distinguishing germline-limited DNA, which is excised, from somatic-destined DNA. Only a handful of proteins playing roles in this process have been identified so far, including typical RNAi-derived factors such as Dicer-like and Piwi proteins. Here we report and characterize two novel proteins, Pdsg1 and Pdsg2 (Paramecium protein involved in Development of the Somatic Genome 1 and 2), involved in Paramecium genome reorganization. We show that these proteins are necessary for the excision of germline-limited DNA during development and the survival of sexual progeny. Knockdown of PDSG1 and PDSG2 genes affects the populations of small RNAs known to be involved in the programming of DNA elimination (scanRNAs and iesRNAs) and chromatin modification patterns during development. Our results suggest an association between RNA-mediated trans-generational epigenetic signal and chromatin modifications in the process of Paramecium genome reorganization.

  6. Noncoding sequence classification based on wavelet transform analysis: part I

    NASA Astrophysics Data System (ADS)

    Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

    2017-09-01

    DNA sequences in the human genome can be divided into coding and noncoding ones. Coding sequences are those that are read during transcription. The identification of coding sequences has been widely reported in the literature due to their much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among cells. However, noncoding sequences do not exhibit periodicities that correlate with their functions. The ENCODE (Encyclopedia of DNA Elements) and Epigenomic Roadmap projects have cataloged the human noncoding sequences into specific functions. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.
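
    As a rough illustration of what "wavelet analysis of genomic signals" can involve, the sketch below maps a DNA string to a numeric signal and computes the energy of Haar wavelet detail coefficients at successive scales. The abstract does not state which numeric mapping or wavelet the authors used; the purine/pyrimidine encoding, the Haar wavelet, and the function names are all assumptions made for this example.

```python
import math

def to_signal(seq):
    """Assumed encoding: purines (A, G) -> +1, pyrimidines (C, T) -> -1."""
    return [1.0 if b in "AG" else -1.0 for b in seq.upper()]

def haar_step(x):
    """One level of the orthonormal Haar transform; a trailing odd sample is dropped."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_energies(seq, levels):
    """Energy (sum of squared detail coefficients) at each decomposition level."""
    x = to_signal(seq)
    energies = []
    for _ in range(levels):
        if len(x) < 2:
            break
        x, detail = haar_step(x)
        energies.append(sum(d * d for d in detail))
    return energies

energies = haar_energies("AGCTAGCT", levels=2)
```

    For this toy input the signal alternates in blocks of two, so the first-level details vanish and the energy appears at the second level. Per-scale energies of this kind are one simple feature that could be compared across classes of sequences.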

  7. Facile and High-Throughput Synthesis of Functional Microparticles with Quick Response Codes.

    PubMed

    Ramirez, Lisa Marie S; He, Muhan; Mailloux, Shay; George, Justin; Wang, Jun

    2016-06-01

    Encoded microparticles are in high demand in multiplexed assays and labeling. However, the current methods for the synthesis and coding of microparticles either lack robustness and reliability, or possess limited coding capacity. Here, a massive coding of dissociated elements (MiCODE) technology based on innovation of a chemically reactive off-stoichiometry thiol-allyl photocurable polymer and standard lithography to produce a large number of quick response (QR) code microparticles is introduced. The coding process is performed by photobleaching the QR code patterns on microparticles when fluorophores are incorporated into the prepolymer formulation. The fabricated encoded microparticles can be released from a substrate without changing their features. Excess thiol functionality on the microparticle surface allows for grafting of amine groups and further DNA probes. A multiplexed assay is demonstrated using the DNA-grafted QR code microparticles. The MiCODE technology is further characterized by showing the incorporation of BODIPY-maleimide (BDP-M) and Nile Red fluorophores for coding and the use of microcontact printing for immobilizing DNA probes on microparticle surfaces. This versatile technology leverages mature lithography facilities for fabrication and thus is amenable to scale-up in the future, with potential applications in bioassays and in labeling consumer products. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Transcriptional role of androgen receptor in the expression of long non-coding RNA Sox2OT in neurogenesis

    PubMed Central

    Tosetti, Valentina; Sassone, Jenny; Ferri, Anna L. M.; Taiana, Michela; Bedini, Gloria; Nava, Sara; Brenna, Greta; Di Resta, Chiara; Pareyson, Davide; Di Giulio, Anna Maria; Carelli, Stephana

    2017-01-01

    The complex architecture of the adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at the transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR) is a transcription factor that is already expressed during early embryonic days. However, the role of AR in the regulation of gene expression at the early embryonic stage is yet to be determined. Long non-coding RNA (lncRNA) Sox2 overlapping transcript (Sox2OT) plays a crucial role in gene expression control during development, but its transcriptional regulation is still to be clearly defined. Here, using Bicalutamide to pharmacologically inactivate AR, we investigated whether AR participates in the regulation of the transcription of the lncRNA Sox2OT at the early embryonic stage. We identified a new DNA binding region upstream of the Sox2 locus containing three androgen response elements (ARE), and found that AR binds such a sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT. PMID:28704421
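
    Candidate response elements of this kind are often located by scanning a sequence for near-matches to a consensus motif. The sketch below is a hypothetical illustration, not the authors' procedure: it assumes the commonly cited canonical ARE consensus AGAACAnnnTGTTCT (the abstract does not give the sequences of the three sites), and the function names and mismatch threshold are invented for the example.

```python
# Assumed consensus: the commonly cited canonical ARE; "N" matches any base.
CONSENSUS = "AGAACANNNTGTTCT"

def mismatches(window, motif=CONSENSUS):
    """Count positions where the window differs from the motif, ignoring N positions."""
    return sum(1 for w, m in zip(window, motif) if m != "N" and w != m)

def scan(seq, max_mismatches=2, motif=CONSENSUS):
    """Return (start, window, mismatch_count) for every window within the mismatch budget."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mm = mismatches(window, motif)
        if mm <= max_mismatches:
            hits.append((i, window, mm))
    return hits

# Toy sequence (made up) with one perfect site starting at index 2.
hits = scan("CCAGAACATTTTGTTCTGG", max_mismatches=0)
```

    Hits from such a scan are only candidates; binding still has to be confirmed experimentally, as the abstract describes for embryonic neural stem cells and mouse embryonic brain.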

  9. Transcriptional role of androgen receptor in the expression of long non-coding RNA Sox2OT in neurogenesis.

    PubMed

    Tosetti, Valentina; Sassone, Jenny; Ferri, Anna L M; Taiana, Michela; Bedini, Gloria; Nava, Sara; Brenna, Greta; Di Resta, Chiara; Pareyson, Davide; Di Giulio, Anna Maria; Carelli, Stephana; Parati, Eugenio A; Gorio, Alfredo

    2017-01-01

    The complex architecture of the adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at the transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR) is a transcription factor that is already expressed during early embryonic days. However, the role of AR in the regulation of gene expression at the early embryonic stage is yet to be determined. Long non-coding RNA (lncRNA) Sox2 overlapping transcript (Sox2OT) plays a crucial role in gene expression control during development, but its transcriptional regulation is still to be clearly defined. Here, using Bicalutamide to pharmacologically inactivate AR, we investigated whether AR participates in the regulation of the transcription of the lncRNA Sox2OT at the early embryonic stage. We identified a new DNA binding region upstream of the Sox2 locus containing three androgen response elements (ARE), and found that AR binds such a sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT.

  10. Study characterizes long non-coding RNA’s response to DNA damage in colon cancer cells | Center for Cancer Research

    Cancer.gov

    Researchers led by Ashish Lal, Ph.D., Investigator in the Genetics Branch, have shown that when the DNA in human colon cancer cells is damaged, a long non-coding RNA (lncRNA) regulates the expression of genes that halt growth, which allows the cells to repair the damage and promote survival. Their findings suggest an important pro-survival function of a lncRNA in cancer cells.

  11. 41 CFR Appendix C to Chapter 301 - Standard Data Elements for Federal Travel [Traveler Identification

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... education, in scientific, professional, technical, mechanical, trade, clerical, fiscal, administrative, or... Data Elements for Federal Travel [Accounting & Certification] Group name Data elements Description Accounting Classification Accounting Code Agency accounting code. Non-Federal Source Indicator Per Diem...

  12. Microparticles: Facile and High-Throughput Synthesis of Functional Microparticles with Quick Response Codes (Small 24/2016).

    PubMed

    Ramirez, Lisa Marie S; He, Muhan; Mailloux, Shay; George, Justin; Wang, Jun

    2016-06-01

    Microparticles carrying quick response (QR) barcodes are fabricated by J. Wang and co-workers on page 3259, using a massive coding of dissociated elements (MiCODE) technology. Each microparticle can bear a special custom-designed QR code that enables encryption or tagging with unlimited multiplexity, and the QR code can be easily read by cellphone applications. The utility of MiCODE particles in multiplexed DNA detection and microtagging for anti-counterfeiting is explored. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Characterisation of cytoplasmic DNA complementary to non-retroviral RNA viruses in human cells

    PubMed Central

    Shimizu, Akira; Nakatani, Yoko; Nakamura, Takako; Jinno-Oue, Atsushi; Ishikawa, Osamu; Boeke, Jef D.; Takeuchi, Yasuhiro; Hoshino, Hiroo

    2014-01-01

    The synthesis and subsequent genomic integration of DNA that is complementary to the genomes of non-retroviral RNA viruses are rarely observed. However, upon infection of various human cell lines and primary fibroblasts with the vesicular stomatitis virus (VSV), we detected DNA complementary to the VSV RNA. The VSV DNA was detected in the cytoplasm as single-stranded DNA fully complementary to the viral mRNA from the poly(A) region to the 7-methyl guanosine cap. The formation of this DNA was cell-dependent. Experimentally, we found that the transduction of cells that do not produce VSV DNA with the long interspersed nuclear element 1 and their infection with VSV could lead to the formation of VSV DNA. Viral DNA complementary to other RNA viruses was also detected in the respective infected human cells. Thus, the genetic information of the non-retroviral RNA virus genome can flow into the DNA of mammalian cells expressing LINE-1-like elements. PMID:24875540

  14. Do repetitive DNA sequences have biological functions? (Haben repetitive DNA-Sequenzen biologische Funktionen?)

    NASA Astrophysics Data System (ADS)

    John, Maliyakal E.; Knöchel, Walter

    1983-05-01

    By DNA reassociation kinetics it is known that the eucaryotic genome consists of non-repetitive DNA, middle-repetitive DNA and highly repetitive DNA. Whereas the majority of protein-coding genes is located on non-repetitive DNA, repetitive DNA forms a constitutive part of eucaryotic DNA and its amount in most cases equals or even substantially exceeds that of non-repetitive DNA. During the past years a large body of data on repetitive DNA has accumulated and these have prompted speculations ranging from specific roles in the regulation of gene expression to that of a selfish entity with inconsequential functions. The following article summarizes recent findings on structural, transcriptional and evolutionary aspects and, although by no means being proven, some possible biological functions are discussed.

  15. Single nucleotide polymorphisms in common bean: their discovery and genotyping using a multiplex detection system

    USDA-ARS's Scientific Manuscript database

    Single-nucleotide Polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean comparing sequences from coding and non-coding regions obtained from Genbank and genomic DNA and to compare sequencing resu...

  16. Comprehensive analysis of single molecule sequencing-derived complete genome and whole transcriptome of Hyposidra talaca nuclear polyhedrosis virus.

    PubMed

    Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar

    2018-06-12

    We sequenced the Hyposidra talaca NPV (HytaNPV) double-stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV, which infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
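
    The two summary figures in this abstract, GC content and the ORF tally, are simple to compute. The sketch below shows one conventional way to calculate GC content from a sequence string and checks that the four ORF categories listed in the abstract add up to the stated total. The function name and the toy sequence are assumptions; the genome itself is not reproduced here.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0

# Toy sequence (made up): 4 of 6 bases are G or C.
toy_gc = gc_content("ATGCGC")

# ORF categories as listed in the abstract: core, lepidopteran-conserved,
# known in baculovirus, and unique to HytaNPV.
orf_total = 37 + 25 + 72 + 7
```

    Here `orf_total` evaluates to 141, which agrees with the number of ORFs reported.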

  17. A new family of polymerases related to superfamily A DNA polymerases and T7-like DNA-dependent RNA polymerases.

    PubMed

    Iyer, Lakshminarayan M; Abhiman, Saraswathi; Aravind, L

    2008-10-04

    Using sequence profile methods and structural comparisons we characterize a previously unknown family of nucleic acid polymerases in a group of mobile elements from genomes of diverse bacteria, an algal plastid and certain DNA viruses, including the recently reported Sputnik virus. Using contextual information from domain architectures and gene-neighborhoods we present evidence that they are likely to possess both primase and DNA polymerase activity, comparable to the previously reported prim-pol proteins. These newly identified polymerases help in defining the minimal functional core of superfamily A DNA polymerases and related RNA polymerases. Thus, they provide a framework to understand the emergence of both DNA and RNA polymerization activity in this class of enzymes. They also provide evidence that enigmatic DNA viruses, such as Sputnik, might have emerged from mobile elements coding these polymerases.

  18. A new family of polymerases related to superfamily A DNA polymerases and T7-like DNA-dependent RNA polymerases

    PubMed Central

    Iyer, Lakshminarayan M; Abhiman, Saraswathi; Aravind, L

    2008-01-01

    Using sequence profile methods and structural comparisons we characterize a previously unknown family of nucleic acid polymerases in a group of mobile elements from genomes of diverse bacteria, an algal plastid and certain DNA viruses, including the recently reported Sputnik virus. Using contextual information from domain architectures and gene-neighborhoods we present evidence that they are likely to possess both primase and DNA polymerase activity, comparable to the previously reported prim-pol proteins. These newly identified polymerases help in defining the minimal functional core of superfamily A DNA polymerases and related RNA polymerases. Thus, they provide a framework to understand the emergence of both DNA and RNA polymerization activity in this class of enzymes. They also provide evidence that enigmatic DNA viruses, such as Sputnik, might have emerged from mobile elements coding these polymerases. This article was reviewed by Eugene Koonin and Mark Ragan. PMID:18834537

  19. Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome

    PubMed Central

    Hamilton, Eileen P; Kapusta, Aurélie; Huvos, Piroska E; Bidwell, Shelby L; Zafar, Nikhat; Tang, Haibao; Hadjithomas, Michalis; Krishnakumar, Vivek; Badger, Jonathan H; Caler, Elisabet V; Russ, Carsten; Zeng, Qiandong; Fan, Lin; Levin, Joshua Z; Shea, Terrance; Young, Sarah K; Hegarty, Ryan; Daza, Riza; Gujja, Sharvari; Wortman, Jennifer R; Birren, Bruce W; Nusbaum, Chad; Thomas, Jainy; Carey, Clayton M; Pritham, Ellen J; Feschotte, Cédric; Noto, Tomoko; Mochizuki, Kazufumi; Papazyan, Romeo; Taverna, Sean D; Dear, Paul H; Cassidy-Hanley, Donna M; Xiong, Jie; Miao, Wei; Orias, Eduardo; Coyne, Robert S

    2016-01-01

    The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome have shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena’s germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum. DOI: http://dx.doi.org/10.7554/eLife.19090.001 PMID:27892853

  20. Design Pattern Mining Using Distributed Learning Automata and DNA Sequence Alignment

    PubMed Central

    Esmaeilpour, Mansour; Naderifar, Vahideh; Shukur, Zarina

    2014-01-01

    Context Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. Objective This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, together with a related tool, DLA-DNA, for all implemented patterns and all projects used for evaluation. DLA-DNA, which is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, achieves acceptable precision and recall compared with the other evaluated tools. Method The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other section of the code for parameter passing and modular programming. The proposed model can detect design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. Results The results demonstrate that, whether the source code is built in a standard or a non-standard way with respect to the design patterns, the result of the proposed method is close to that of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source codes and compared with other related models and available tools; the precision and recall of the proposed method are, on average, 20% and 9.6% higher than those of Pinot, 27% and 31% higher than those of PTIDEJ, and 3.3% and 2% higher than those of DPJF, respectively. Conclusion The primary idea of the proposed method is organized in two steps: in the first step, elemental design patterns are identified, and in the second step, these are composed to recognize actual design patterns. PMID:25243670
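
    The tool comparison above is stated in terms of precision and recall. As a reminder of how these two measures are usually defined for a detector, here is a minimal sketch; the function name and the toy sets of pattern instances are hypothetical and are not taken from the paper's evaluation.

```python
def precision_recall(detected, actual):
    """Precision = TP / detected, recall = TP / actual, for sets of pattern instances."""
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Toy example (made up): a tool reports four instances, two of which are real,
# while the code base actually contains three.
detected = {"Adapter@Foo", "Composite@Bar", "Decorator@Baz", "Proxy@Qux"}
actual = {"Adapter@Foo", "Composite@Bar", "Bridge@Quux"}
precision, recall = precision_recall(detected, actual)
```

    With these toy sets the precision is 0.5 and the recall is two thirds. The abstract's wording does not make clear whether its differences between tools are absolute percentage points or relative improvements.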

  1. The site-specific ribosomal insertion element type II of Bombyx mori (R2Bm) contains the coding sequence for a reverse transcriptase-like enzyme.

    PubMed Central

    Burke, W D; Calalang, C C; Eickbush, T H

    1987-01-01

    Two classes of DNA elements interrupt a fraction of the rRNA repeats of Bombyx mori. We have analyzed by genomic blotting and sequence analysis one class of these elements which we have named R2. These elements occupy approximately 9% of the rDNA units of B. mori and appear to be homologous to the type II rDNA insertions detected in Drosophila melanogaster. Approximately 25 copies of R2 exist within the B. mori genome, of which at least 20 are located at a precise location within otherwise typical rDNA units. Nucleotide sequence analysis has revealed that the 4.2-kilobase-pair R2 element has a single large open reading frame, occupying over 82% of the total length of the element. The central region of this 1,151-amino-acid open reading frame shows homology to the reverse transcriptase enzymes found in retroviruses and certain transposable elements. Amino acid homology of this region is highest to the mobile line 1 (L1) elements of mammals, followed by the mitochondrial type II introns of fungi, and the pol gene of retroviruses. Less homology exists with transposable elements of D. melanogaster and Saccharomyces cerevisiae. Two additional regions of sequence homology between L1 and R2 elements were also found outside the reverse transcriptase region. We suggest that the R2 elements are retrotransposons that are site specific in their insertion into the genome. Such mobility would enable these elements to occupy a small fraction of the rDNA units of B. mori despite their continual elimination from the rDNA locus by sequence turnover. PMID:2439905
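
    The "over 82%" figure can be checked from the other numbers in the abstract, since each amino acid corresponds to three nucleotides. The sketch below does that arithmetic. It treats the element as exactly 4,200 bp, which is an assumption because the abstract gives the length only as 4.2 kilobase pairs, and the function name is hypothetical.

```python
def orf_fraction(n_amino_acids, element_length_bp, include_stop=False):
    """Fraction of an element covered by an ORF of the given protein length."""
    orf_nt = 3 * n_amino_acids + (3 if include_stop else 0)
    return orf_nt / element_length_bp

# 1,151 codons = 3,453 nt; the element length is assumed to be exactly 4,200 bp.
fraction = orf_fraction(1151, 4200)
fraction_with_stop = orf_fraction(1151, 4200, include_stop=True)
```

    Both values come out slightly above 0.82, consistent with the abstract's statement.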

  2. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    PubMed

    Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong

    2012-01-01

    Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  3. Diversity of Prdm9 Zinc Finger Array in Wild Mice Unravels New Facets of the Evolutionary Turnover of this Coding Minisatellite

    PubMed Central

    Buard, Jérôme; Rivals, Eric; Dunoyer de Segonzac, Denis; Garres, Charlotte; Caminade, Pierre; de Massy, Bernard; Boursot, Pierre

    2014-01-01

    In humans and mice, meiotic recombination events cluster into narrow hotspots whose genomic positions are defined by the PRDM9 protein via its DNA binding domain constituted of an array of zinc fingers (ZnFs). High polymorphism and rapid divergence of the Prdm9 gene ZnF domain appear to involve positive selection at DNA-recognition amino-acid positions, but the nature of the underlying evolutionary pressures remains a puzzle. Here we explore the variability of the Prdm9 ZnF array in wild mice, and uncover a high allelic diversity of both ZnF copy number and identity with the characterization of 113 alleles. We analyze features of the diversity of ZnF identity, which is mostly due to non-synonymous changes at codons −1, 3 and 6 of each ZnF, corresponding to amino-acids involved in DNA binding. Using methods adapted to the minisatellite structure of the ZnF array, we infer a phylogenetic tree of these alleles. We find the sister species Mus spicilegus and M. macedonicus as well as the three house mouse (Mus musculus) subspecies to be polyphyletic. However, some sublineages have expanded independently in Mus musculus musculus and M. m. domesticus, the latter further showing phylogeographic substructure. Compared to random genomic regions and non-coding minisatellites, none of these patterns appears exceptional. In silico prediction of DNA binding sites for each allele, overlap of their alignments to the genome and relative coverage of the different families of interspersed repeated elements suggest a large diversity between PRDM9 variants with a potential for highly divergent distributions of recombination events in the genome with little correlation to evolutionary distance. By compiling PRDM9 ZnF protein sequences in Primates, Muridae and Equids, we find different diversity patterns among the three amino-acids most critical for the DNA-recognition function, suggesting different diversification timescales. PMID:24454780

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Helfenbein, Kevin G.; Brown, Wesley M.; Boore, Jeffrey L.

    We have sequenced the complete mitochondrial DNA (mtDNA) of the articulate brachiopod Terebratalia transversa. The circular genome is 14,291 bp in size, relatively small compared to other published metazoan mtDNAs. The 37 genes commonly found in animal mtDNA are present; the size decrease is due to the truncation of several tRNA, rRNA, and protein genes, to some nucleotide overlaps, and to a paucity of non-coding nucleotides. Although the gene arrangement differs radically from those reported for other metazoans, some gene junctions are shared with two other articulate brachiopods, Laqueus rubellus and Terebratulina retusa. All genes in the T. transversa mtDNA, unlike those in most metazoan mtDNAs reported, are encoded by the same strand. The A+T content (59.1 percent) is low for a metazoan mtDNA, and there is a high propensity for homopolymer runs and a strong base-compositional strand bias. The coding strand is quite G+T-rich, a skew that is shared by the confamilial (laqueid) species L. rubellus, but opposite to that found in T. retusa, a cancellothyridid. These compositional skews are strongly reflected in the codon usage patterns and the amino acid compositions of the mitochondrial proteins, with markedly different usage observed between T. retusa and the two laqueids. This observation, plus the similarity of the laqueid non-coding regions to the reverse complement of the non-coding region of the cancellothyridid, suggest that an inversion that resulted in a reversal in the direction of first-strand replication has occurred in one of the two lineages. In addition to the presence of one non-coding region in T. transversa that is comparable to those in the other brachiopod mtDNAs, there are two others with the potential to form secondary structures; one or both of these may be involved in the process of transcript cleavage.

  5. The developmental transcriptome of Drosophila melanogaster

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    University of Connecticut; Graveley, Brenton R.; Brooks, Angela N.

    Drosophila melanogaster is one of the most well studied genetic model organisms; nonetheless, its genome still contains unannotated coding and non-coding genes, transcripts, exons and RNA editing sites. Full discovery and annotation are pre-requisites for understanding how the regulation of transcription, splicing and RNA editing directs the development of this complex organism. Here we used RNA-Seq, tiling microarrays and cDNA sequencing to explore the transcriptome in 30 distinct developmental stages. We identified 111,195 new elements, including thousands of genes, coding and non-coding transcripts, exons, splicing and editing events, and inferred protein isoforms that previously eluded discovery using established experimental, prediction and conservation-based approaches. These data substantially expand the number of known transcribed elements in the Drosophila genome and provide a high-resolution view of transcriptome dynamics throughout development. Drosophila melanogaster is an important non-mammalian model system that has had a critical role in basic biological discoveries, such as identifying chromosomes as the carriers of genetic information and uncovering the role of genes in development. Because it shares a substantial genic content with humans, Drosophila is increasingly used as a translational model for human development, homeostasis and disease. High-quality maps are needed for all functional genomic elements. Previous studies demonstrated that a rich collection of genes is deployed during the life cycle of the fly. Although expression profiling using microarrays has revealed the expression of approximately 13,000 annotated genes, it is difficult to map splice junctions and individual base modifications generated by RNA editing using such approaches. Single-base resolution is essential to define precisely the elements that comprise the Drosophila transcriptome. 
Estimates of the number of transcript isoforms are less accurate than estimates of the number of genes. Whereas approximately 20% of Drosophila genes are annotated as encoding alternatively spliced pre-mRNAs, splice-junction microarray experiments indicate that this number is at least 40% (ref. 7). Determining the diversity of mRNAs generated by alternative promoters, alternative splicing and RNA editing will substantially increase the inferred protein repertoire. Non-coding RNA genes (ncRNAs) including short interfering RNAs (siRNAs) and microRNAs (miRNAs) (reviewed in ref. 10), and longer ncRNAs such as bxd (ref. 11) and rox (ref. 12), have important roles in gene regulation, whereas others such as small nucleolar RNAs (snoRNAs) and small nuclear RNAs (snRNAs) are important components of macromolecular machines such as the ribosome and spliceosome. The transcription and processing of these ncRNAs must also be fully documented and mapped. As part of the modENCODE project to annotate the functional elements of the D. melanogaster and Caenorhabditis elegans genomes, we used RNA-Seq and tiling microarrays to sample the Drosophila transcriptome at unprecedented depth throughout development from early embryo to ageing male and female adults. We report on a high-resolution view of the discovery, structure and dynamic expression of the D. melanogaster transcriptome.

  6. oriTfinder: a web-based tool for the identification of origin of transfers in DNA sequences of bacterial mobile genetic elements.

    PubMed

    Li, Xiaobin; Xie, Yingzhou; Liu, Meng; Tai, Cui; Sun, Jingyong; Deng, Zixin; Ou, Hong-Yu

    2018-05-04

    oriTfinder is a web server that facilitates the rapid identification of the origin of transfer site (oriT) of a conjugative plasmid or chromosome-borne integrative and conjugative element. The back-end database oriTDB was built upon more than one thousand known oriT regions of bacterial mobile genetic elements (MGEs) as well as the known MGE-encoding relaxases and type IV coupling proteins (T4CP). With a combination of similarity searches for the oriTDB-archived oriT nucleotide sequences and the co-localization of the flanking relaxase homologous genes, oriTfinder can predict the oriT region with high accuracy in the DNA sequence of a bacterial plasmid or chromosome in minutes. The server also detects the other transfer-related modules, including the potential relaxase gene, T4CP gene and the type IV secretion system gene cluster, and the putative genes coding for virulence factors and acquired antibiotic resistance determinants. oriTfinder may contribute to meeting the increasing demands of re-annotations for bacterial conjugative, mobilizable or non-transferable elements and aid in the rapid risk assessment of disease-relevant trait dissemination in pathogenic bacteria of interest. oriTfinder is freely available to all users without any login requirement at http://bioinfo-mml.sjtu.edu.cn/oriTfinder.

  7. The identification of cis-regulatory elements: A review from a machine learning perspective.

    PubMed

    Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W

    2015-12-01

    The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.

  8. Genetic therapy for the nervous system.

    PubMed

    Bowers, William J; Breakefield, Xandra O; Sena-Esteves, Miguel

    2011-04-15

    Genetic therapy is undergoing a renaissance with expansion of viral and synthetic vectors, use of oligonucleotides (RNA and DNA) and sequence-targeted regulatory molecules, as well as genetically modified cells, including induced pluripotent stem cells from the patients themselves. Several clinical trials for neurologic syndromes appear quite promising. This review covers genetic strategies to ameliorate neurologic syndromes of different etiologies, including lysosomal storage diseases, Alzheimer's disease and other amyloidopathies, Parkinson's disease, spinal muscular atrophy, amyotrophic lateral sclerosis and brain tumors. This field has been propelled by genetic technologies, including identifying disease genes and disruptive mutations, design of genomic interacting elements to regulate transcription and splicing of specific precursor mRNAs and use of novel non-coding regulatory RNAs. These versatile new tools for manipulation of genetic elements provide the ability to tailor the mode of genetic intervention to specific aspects of a disease state.

  9. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system.

    PubMed

    Kawano, Tomonori

    2013-03-01

    There have been a wide variety of approaches for handling the pieces of DNA as the "unplugged" tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, water marks and cryptography. In the present article, novel designs of artificial genes as the media for storing the digitally compressed data for images are proposed for bio-computing purpose while natural genes principally encode for proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given "passwords" and/or secret numbers using DNA sequences. The "passwords" of interest were decomposed into single letters and translated into the font image coded on the separate DNA chains with both the coding regions in which the images are encoded based on the novel run-length encoding rule, and the non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original "passwords." The latter processes require the molecular biological tools for digestion and ligation of the fragmented DNA molecules targeting at the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interests over the image-coding DNA are also discussed.
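
    The abstract does not spell out its run-length encoding rule, so the following is only an illustrative sketch of the general idea of storing a run-length-compressed bitmap row as nucleotides. The three-nucleotide layout (one flag base for the pixel value, two bases for the run length in base 4) is a hypothetical scheme invented here, not the rule proposed in the article.

```python
DIGITS = "ACGT"   # hypothetical base-4 digit alphabet: A=0, C=1, G=2, T=3
MAX_RUN = 16      # two base-4 digits can represent run lengths 1..16


def run_lengths(bits):
    """Collapse a row of 0/1 pixels into (value, length) runs, capped at MAX_RUN."""
    runs = []
    for b in bits:
        if b not in (0, 1):
            raise ValueError("pixels must be 0 or 1")
        if runs and runs[-1][0] == b and runs[-1][1] < MAX_RUN:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [(value, length) for value, length in runs]


def encode_row(bits):
    """Encode each run as flag base (A=0, T=1) plus two base-4 length digits."""
    words = []
    for value, length in run_lengths(bits):
        flag = "A" if value == 0 else "T"
        words.append(flag + DIGITS[(length - 1) // 4] + DIGITS[(length - 1) % 4])
    return "".join(words)


def decode_row(dna):
    """Invert encode_row: read three bases at a time and expand each run."""
    if len(dna) % 3:
        raise ValueError("encoded length must be a multiple of 3")
    bits = []
    for i in range(0, len(dna), 3):
        flag, hi, lo = dna[i:i + 3]
        length = DIGITS.index(hi) * 4 + DIGITS.index(lo) + 1
        bits.extend([0 if flag == "A" else 1] * length)
    return bits


if __name__ == "__main__":
    row = [0, 0, 0, 1, 1, 0]
    print(encode_row(row), decode_row(encode_row(row)) == row)
```

    A real design would also have to respect biochemical constraints (for example, avoiding long homopolymers or unintended restriction sites), which this sketch ignores.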

  10. Extension of CE/SE method to non-equilibrium dissociating flows

    NASA Astrophysics Data System (ADS)

    Wen, C. Y.; Saldivar Massimi, H.; Shen, H.

    2018-03-01

    In this study, the hypersonic non-equilibrium flows over rounded nose geometries are numerically investigated by a robust conservation element and solution element (CE/SE) code, which is based on hybrid meshes consisting of triangular and quadrilateral elements. The dissociating and recombination chemical reactions as well as the vibrational energy relaxation are taken into account. The stiff source terms are solved by an implicit trapezoidal method of integration. Comparisons with laboratory and numerical cases are provided to demonstrate the accuracy and reliability of the present CE/SE code in simulating hypersonic non-equilibrium flows.

  11. Foldback intercoil DNA and the mechanism of DNA transposition.

    PubMed

    Kim, Byung-Dong

    2014-09-01

    Foldback intercoil (FBI) DNA is formed by the folding back at one point of a non-helical parallel track of double-stranded DNA at as sharp as 180° and the intertwining of two double helixes within each other's major groove to form an intercoil with a diameter of 2.2 nm. FBI DNA has been suggested to mediate intra-molecular homologous recombination of a deletion and inversion. Inter-molecular homologous recombination, known as site-specific insertion, on the other hand, is mediated by the direct perpendicular approach of the FBI DNA tip, as the attP site, onto the target DNA, as the attB site. Transposition of DNA transposons involves the pairing of terminal inverted repeats and 5-7-bp tandem target duplication. FBI DNA configuration effectively explains simple as well as replicative transposition, along with the involvement of an enhancer element. The majority of diverse retrotransposable elements that employ a target site duplication mechanism is also suggested to follow the FBI DNA-mediated perpendicular insertion of the paired intercoil ends by non-homologous end-joining, together with gap filling. A genome-wide perspective of transposable elements in light of FBI DNA is discussed.

  12. Evolution in the block: common elements of 5S rDNA organization and evolutionary patterns in distant fish genera.

    PubMed

    Campo, Daniel; García-Vázquez, Eva

    2012-01-01

    The 5S rDNA is organized in the genome as tandemly repeated copies of a structural unit composed of a coding sequence plus a nontranscribed spacer (NTS). The coding region is highly conserved in evolution, whereas the NTS varies in both length and sequence. It has been proposed that 5S rRNA genes are members of a gene family that has arisen through concerted evolution. In this study, we describe the molecular organization and evolution of the 5S rDNA in the genera Lepidorhombus and Scophthalmus (Scophthalmidae) and compare it with the already known 5S rDNA of the very different genera Merluccius (Merluccidae) and Salmo (Salmoninae), to identify common structural elements or patterns for understanding 5S rDNA evolution in fish. High intra- and interspecific diversity within the 5S rDNA family in all the genera can be explained by a combination of duplications, deletions, and transposition events. Sequence blocks with high similarity in all the 5S rDNA members across species were identified for the four studied genera, with evidence of intense gene conversion within noncoding regions. We propose a model to explain the evolution of the 5S rDNA, in which the evolutionary units are blocks of nucleotides rather than the entire sequences or single nucleotides. This model implies a "two-speed" evolution: slow within blocks (homogenized by recombination) and fast within the gene family (diversified by duplications and deletions).

  13. An algebraic hypothesis about the primeval genetic code architecture.

    PubMed

    Sánchez, Robersy; Grau, Ricardo

    2009-09-01

    A plausible architecture of an ancient genetic code is derived from an extended base triplet vector space over the Galois field of the extended base alphabet {D,A,C,G,U}, where symbol D represents one or more hypothetical bases with unspecific pairings. We hypothesized that the high degeneration of a primeval genetic code with five bases and the gradual origin and improvement of a primeval DNA repair system could make possible the transition from ancient to modern genetic codes. Our results suggest that the Watson-Crick base pairings G≡C and A=U and the non-specific base pairing of the hypothetical ancestral base D used to define the sum and product operations are enough features to determine the coding constraints of the primeval and the modern genetic code, as well as the transition from the former to the latter. Geometrical and algebraic properties of this vector space reveal that the present codon assignment of the standard genetic code could be induced from a primeval codon assignment. Besides, the Fourier spectrum of the extended DNA genome sequences derived from the multiple sequence alignment suggests that the so-called period-3 property of the present coding DNA sequences could also exist in the ancient coding DNA sequences. The phylogenetic analyses achieved with metrics defined in the N-dimensional vector space (B^3)^N of DNA sequences and with the new evolutionary model presented here also suggest that an ancient DNA coding sequence with five or more bases does not contradict the expected evolutionary history.
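
    The period-3 property referred to in this abstract is usually checked by looking for a peak at frequency N/3 in the Fourier power spectrum of a sequence. The sketch below uses the standard four-base indicator-sequence spectrum rather than the five-base extended alphabet of the article; the function names and the peak-to-background ratio are assumptions made for illustration.

```python
import cmath


def spectral_power(seq: str, k: int) -> float:
    """Summed DFT power at frequency index k over the four base-indicator sequences."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        # DFT of the 0/1 indicator sequence for this base, evaluated at index k.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k * i / n)
            for i, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total


def period3_ratio(seq: str) -> float:
    """Power at frequency N/3 divided by the mean power at the other frequencies."""
    seq = seq.upper()
    n = len(seq)
    if n < 6 or n % 3:
        raise ValueError("length must be a multiple of 3 and at least 6")
    peak_index = n // 3
    peak = spectral_power(seq, peak_index)
    background = [
        spectral_power(seq, k) for k in range(1, n // 2 + 1) if k != peak_index
    ]
    mean_background = sum(background) / len(background)
    return peak / mean_background if mean_background else float("inf")


if __name__ == "__main__":
    print(period3_ratio("ATG" * 20))    # strongly codon-periodic toy sequence
    print(period3_ratio("ACGT" * 15))   # period-4 toy sequence, no N/3 peak
```

    On real data the ratio is typically computed in a sliding window, with coding regions tending to show a larger value than non-coding ones.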

  14. A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.

    PubMed

    Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor

    2017-08-30

    Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.

  15. Mind the gap; seven reasons to close fragmented genome assemblies.

    PubMed

    Thomma, Bart P H J; Seidl, Michael F; Shi-Kunne, Xiaoqian; Cook, David E; Bolton, Melvin D; van Kan, Jan A L; Faino, Luigi

    2016-05-01

    Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publicly available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Non-coding stem-bulge RNAs are required for cell proliferation and embryonic development in C. elegans

    PubMed Central

    Kowalski, Madzia P.; Baylis, Howard A.; Krude, Torsten

    2015-01-01

    ABSTRACT Stem bulge RNAs (sbRNAs) are a family of small non-coding stem-loop RNAs present in Caenorhabditis elegans and other nematodes, the function of which is unknown. Here, we report the first functional characterisation of nematode sbRNAs. We demonstrate that sbRNAs from a range of nematode species are able to reconstitute the initiation of chromosomal DNA replication in the presence of replication proteins in vitro, and that conserved nucleotide sequence motifs are essential for this function. By functionally inactivating sbRNAs with antisense morpholino oligonucleotides, we show that sbRNAs are required for S phase progression, early embryonic development and the viability of C. elegans in vivo. Thus, we demonstrate a new and essential role for sbRNAs during the early development of C. elegans. sbRNAs show limited nucleotide sequence similarity to vertebrate Y RNAs, which are also essential for the initiation of DNA replication. Our results therefore establish that the essential function of small non-coding stem-loop RNAs during DNA replication extends beyond vertebrates. PMID:25908866

  17. [DNA prints instead of plantar prints in neonatal identification].

    PubMed

    Rodríguez-Alarcón Gómez, J; Martínez de Pancorbo Gómez, M; Santillana Ferrer, L; Castro Espido, A; Melchor Maros, J C; Linares Uribe, M A; Fernández-Llebrez del Rey, L; Aranguren Dúo, G

    1996-06-22

    The aim was to assess the usefulness of studying DNA from dried blood spots collected on filter paper blotters for newborn identification. The study set out to establish: 1. the validity of the method for analysis; 2. the validity of all stored samples (such as those kept in clinical records); 3. a guarantee of non-intrusion into the genetic code; 4. an acceptable price and execution time. Forty anonymous 13-year-old samples from 20 subjects (2 per subject) were studied. DNA was extracted using Chelex resin, and short tandem repeats (STRs) of microsatellite DNA were studied using the polymerase chain reaction (PCR). Three non-coding DNA loci (CSF1PO, TPOX and TH01) were analyzed by multiplex amplification. It was possible to type 39 samples, which allowed all 20 cases to be matched (19 directly and one by exclusion). The complete procedure yielded results within 24 hours in all cases. The estimated final cost was a fifth of that of conventional maternity/paternity tests. It was not necessary to study coding areas of the DNA. The validity of the method for analyzing samples stored for 13 years without any special care was also demonstrated. The technique was fast, producing results within 24 hours, and reasonably priced.

  18. The evolutionary history of Saccharomyces species inferred from completed mitochondrial genomes and revision in the ‘yeast mitochondrial genetic code’

    PubMed Central

    Szabóová, Dana; Bielik, Peter; Poláková, Silvia; Šoltys, Katarína; Jatzová, Katarína; Szemes, Tomáš

    2017-01-01

    Saccharomyces yeasts are widely used to test ecological and evolutionary hypotheses. A large number of nuclear genomic DNA sequences are available, but mitochondrial genomic data are insufficient. We completed mitochondrial DNA (mtDNA) sequencing from Illumina MiSeq reads for all Saccharomyces species. All are circularly mapped molecules decreasing in size with phylogenetic distance from Saccharomyces cerevisiae but with similar gene content including regulatory and selfish elements like origins of replication, introns, free-standing open reading frames or GC clusters. Their most profound feature is species-specific alteration in gene order. The genetic code slightly differs from the well-established yeast mitochondrial code as GUG is used rarely as the translation start and CGA and CGC code for arginine. The multilocus phylogeny, inferred from mtDNA, does not correlate with the trees derived from nuclear genes. mtDNA data demonstrate that Saccharomyces cariocanus should be assigned as a separate species and Saccharomyces bayanus CBS 380T should not be considered as a distinct species due to mtDNA nearly identical to Saccharomyces uvarum mtDNA. Apparently, comparison of mtDNAs should not be neglected in genomic studies as it is an important tool to understand the origin and evolutionary history of some yeast species. PMID:28992063

  19. Comparative analysis of mitochondrial genomes between a wheat K-type cytoplasmic male sterility (CMS) line and its maintainer line.

    PubMed

    Liu, Huitao; Cui, Peng; Zhan, Kehui; Lin, Qiang; Zhuo, Guoyin; Guo, Xiaoli; Ding, Feng; Yang, Wenlong; Liu, Dongcheng; Hu, Songnian; Yu, Jun; Zhang, Aimin

    2011-03-29

    Plant mitochondria, semiautonomous organelles that function as manufacturers of cellular ATP, have their own genome that has a slow rate of evolution and rapid rearrangement. Cytoplasmic male sterility (CMS), a common phenotype in higher plants, is closely associated with rearrangements in mitochondrial DNA (mtDNA), and is widely used to produce F1 hybrid seeds in a variety of valuable crop species. Novel chimeric genes deduced from mtDNA rearrangements causing CMS have been identified in several plants, such as rice, sunflower, pepper, and rapeseed, but there are very few reports about mtDNA rearrangements in wheat. In the present work, we describe the mitochondrial genome of a wheat K-type CMS line and compare it with its maintainer line. The complete mtDNA sequence of a wheat K-type (with cytoplasm of Aegilops kotschyi) CMS line, Ks3, was assembled into a master circle (MC) molecule of 647,559 bp and found to harbor 34 known protein-coding genes, three rRNAs (18 S, 26 S, and 5 S rRNAs), and 16 different tRNAs. Compared to our previously published sequence of a K-type maintainer line, Km3, we detected Ks3-specific mtDNA (> 100 bp, 11.38%) and repeats (> 100 bp, 29 units) as well as genes that are unique to each line: rpl5 was missing in Ks3 and trnH was absent from Km3. We also defined 32 single nucleotide polymorphisms (SNPs) in 13 protein-coding, albeit functionally irrelevant, genes, and predicted 22 unique ORFs in Ks3, representing potential candidates for K-type CMS. All these sequence variations are candidates for involvement in CMS. A comparative analysis of the mtDNA of several angiosperms, including those from Ks3, Km3, rice, maize, Arabidopsis thaliana, and rapeseed, showed that non-coding sequences of higher plants had mostly divergent multiple reorganizations during the mtDNA evolution of higher plants. 
The complete mitochondrial genome of the wheat K-type CMS line Ks3 is very different from that of its maintainer line Km3, especially in non-coding sequences. Sequence rearrangement has produced novel chimeric ORFs, which may be candidate genes for CMS. Comparative analysis of several angiosperm mtDNAs indicated that non-coding sequences are the most frequently reorganized during mtDNA evolution in higher plants.

  20. Capturing the Biofuel Wellhead and Powerhouse: The Chloroplast and Mitochondrial Genomes of the Leguminous Feedstock Tree Pongamia pinnata

    PubMed Central

    Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.

    2012-01-01

    Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141

  1. Capturing the biofuel wellhead and powerhouse: the chloroplast and mitochondrial genomes of the leguminous feedstock tree Pongamia pinnata.

    PubMed

    Kazakoff, Stephen H; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T; Gresshoff, Peter M

    2012-01-01

    Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® 'Second Generation DNA Sequencing (2GS)' and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites.

  2. Run-length encoding graphic rules, biochemically editable designs and steganographical numeric data embedment for DNA-based cryptographical coding system

    PubMed Central

    Kawano, Tomonori

    2013-01-01

    There have been a wide variety of approaches for handling the pieces of DNA as the “unplugged” tools for digital information storage and processing, including a series of studies applied to the security-related area, such as DNA-based digital barcodes, water marks and cryptography. In the present article, novel designs of artificial genes as the media for storing the digitally compressed data for images are proposed for bio-computing purpose while natural genes principally encode for proteins. Furthermore, the proposed system allows cryptographical application of DNA through biochemically editable designs with capacity for steganographical numeric data embedment. As a model case of image-coding DNA technique application, numerically and biochemically combined protocols are employed for ciphering the given “passwords” and/or secret numbers using DNA sequences. The “passwords” of interest were decomposed into single letters and translated into the font image coded on the separate DNA chains with both the coding regions in which the images are encoded based on the novel run-length encoding rule, and the non-coding regions designed for biochemical editing and the remodeling processes revealing the hidden orientation of letters composing the original “passwords.” The latter processes require the molecular biological tools for digestion and ligation of the fragmented DNA molecules targeting at the polymerase chain reaction-engineered termini of the chains. Lastly, additional protocols for steganographical overwriting of the numeric data of interests over the image-coding DNA are also discussed. PMID:23750303

  3. On fuzzy semantic similarity measure for DNA coding.

    PubMed

    Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin

    2016-02-01

    A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides, i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons, have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification, i.e. 36-133%, as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Probability of coding of a DNA sequence: an algorithm to predict translated reading frames from their thermodynamic characteristics.

    PubMed Central

    Tramontano, A; Macchiato, M F

    1986-01-01

    An algorithm to determine the probability that a reading frame codifies for a protein is presented. It is based on the results of our previous studies on the thermodynamic characteristics of a translated reading frame. We also develop a prediction procedure to distinguish between coding and non-coding reading frames. The procedure is based on the characteristics of the putative product of the DNA sequence and not on periodicity characteristics of the sequence, so the prediction is not biased by the presence of overlapping translated reading frames or by the presence of translated reading frames on the complementary DNA strand. PMID:3753761

  5. qPMS9: An Efficient Algorithm for Quorum Planted Motif Search

    NASA Astrophysics Data System (ADS)

    Nicolae, Marius; Rajasekaran, Sanguthevar

    2015-01-01

    Discovering patterns in biological sequences is a crucial problem. For example, the identification of patterns in DNA sequences has resulted in the determination of open reading frames, identification of gene promoter elements, intron/exon splicing sites, and SH RNAs, location of RNA degradation signals, identification of alternative splicing sites, etc. In protein sequences, patterns have led to domain identification, location of protease cleavage sites, identification of signal peptides, protein interactions, determination of protein degradation elements, identification of protein trafficking elements, discovery of short functional motifs, etc. In this paper we focus on the identification of an important class of patterns, namely, motifs. We study the (l, d) motif search problem or Planted Motif Search (PMS). PMS receives as input n strings and two integers l and d. It returns all sequences M of length l that occur in each input string, where each occurrence differs from M in at most d positions. Another formulation is quorum PMS (qPMS), where the motif appears in at least q% of the strings. We introduce qPMS9, a parallel exact qPMS algorithm that offers significant runtime improvements on DNA and protein datasets. qPMS9 solves the challenging DNA (l, d)-instances (28, 12) and (30, 13). The source code is available at https://code.google.com/p/qpms9/.
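
    The quorum (l, d) motif search defined in this abstract can be illustrated with a brute-force sketch. This is not the qPMS9 algorithm, which is a parallel exact method with far better runtime; it simply enumerates every candidate of length l and is practical only for small l. The function names, the default alphabet, and the rounding of the quorum to a whole number of strings are assumptions made for illustration.

```python
import math
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, s, d):
    """True if some window of s differs from motif in at most d positions."""
    l = len(motif)
    return any(hamming(motif, s[i:i + l]) <= d for i in range(len(s) - l + 1))


def quorum_pms(strings, l, d, q=100, alphabet="ACGT"):
    """Return every length-l motif found (within d mismatches) in at least
    q percent of the input strings. Cost grows as len(alphabet) ** l."""
    need = math.ceil(len(strings) * q / 100)  # assumed rounding of the quorum
    motifs = []
    for candidate in product(alphabet, repeat=l):
        motif = "".join(candidate)
        hits = sum(occurs_within(motif, s, d) for s in strings)
        if hits >= need:
            motifs.append(motif)
    return motifs
```

    With q=100 this reduces to the plain PMS formulation, in which the motif must occur in every input string.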

  6. The fission yeast CENP-B protein Abp1 prevents pervasive transcription of repetitive DNA elements.

    PubMed

    Daulny, Anne; Mejía-Ramírez, Eva; Reina, Oscar; Rosado-Lugo, Jesus; Aguilar-Arnal, Lorena; Auer, Herbert; Zaratiegui, Mikel; Azorin, Fernando

    2016-10-01

    It is well established that eukaryotic genomes are pervasively transcribed producing cryptic unstable transcripts (CUTs). However, the mechanisms regulating pervasive transcription are not well understood. Here, we report that the fission yeast CENP-B homolog Abp1 plays an important role in preventing pervasive transcription. We show that loss of abp1 results in the accumulation of CUTs, which are targeted for degradation by the exosome pathway. These CUTs originate from different types of genomic features, but the highest increase corresponds to Tf2 retrotransposons and rDNA repeats, where they map along the entire elements. In the absence of abp1, increased RNAPII-Ser5P occupancy is observed throughout the Tf2 coding region and, unexpectedly, RNAPII-Ser5P is enriched at rDNA repeats. Loss of abp1 also results in Tf2 derepression and increased nucleolus size. Altogether these results suggest that Abp1 prevents pervasive RNAPII transcription of repetitive DNA elements (i.e., Tf2 and rDNA repeats) from internal cryptic sites. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

    PubMed

    Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

    2018-05-31

    In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates the potential of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.

  8. Genetic therapy for the nervous system

    PubMed Central

    Bowers, William J.; Breakefield, Xandra O.; Sena-Esteves, Miguel

    2011-01-01

    Genetic therapy is undergoing a renaissance with expansion of viral and synthetic vectors, use of oligonucleotides (RNA and DNA) and sequence-targeted regulatory molecules, as well as genetically modified cells, including induced pluripotent stem cells from the patients themselves. Several clinical trials for neurologic syndromes appear quite promising. This review covers genetic strategies to ameliorate neurologic syndromes of different etiologies, including lysosomal storage diseases, Alzheimer's disease and other amyloidopathies, Parkinson's disease, spinal muscular atrophy, amyotrophic lateral sclerosis and brain tumors. This field has been propelled by genetic technologies, including identifying disease genes and disruptive mutations, design of genomic interacting elements to regulate transcription and splicing of specific precursor mRNAs and use of novel non-coding regulatory RNAs. These versatile new tools for manipulation of genetic elements provide the ability to tailor the mode of genetic intervention to specific aspects of a disease state. PMID:21429918

  9. Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity

    NASA Astrophysics Data System (ADS)

    Mukherjee, Shashi Bajaj; Sen, Pradip Kumar

    2010-10-01

    Studying periodic patterns would be expected to be a standard line of attack for recognizing DNA sequences in gene identification and similar problems. Yet, peculiarly, very little significant work has been done in this direction. This paper studies statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
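
    The general idea of mapping a DNA sequence to numbers and inspecting its Fourier spectrum can be sketched as follows. The abstract does not say which mappings or periodicity parameters the authors used, so this example assumes the common binary indicator mapping (one 0/1 sequence per base) and measures only the spectral power at period 3, the periodicity usually associated with codon structure. The function name and the normalisation are illustrative choices.

```python
import cmath


def period3_power(seq):
    """Normalised spectral power at period 3, summed over the four
    binary indicator sequences (an assumed mapping, one per base)."""
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for base in "ACGT":
        # Discrete Fourier coefficient of the indicator sequence at frequency 1/3.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, ch in enumerate(seq.upper())
            if ch == base
        )
        total += abs(coeff) ** 2
    return total / (n * n)
```

    A sequence with strict three-base periodicity scores high, while a sequence whose period is not a multiple of three scores near zero; in practice the measure would be evaluated in sliding windows along a genome.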

  10. Hyperosmotic stress memory in Arabidopsis is mediated by distinct epigenetically labile sites in the genome and is restricted in the male germline by DNA glycosylase activity

    PubMed Central

    Wibowo, Anjar; Becker, Claude; Marconi, Gianpiero; Durr, Julius; Price, Jonathan; Hagmann, Jorg; Papareddy, Ranjith; Putra, Hadi; Kageyama, Jorge; Becker, Jorg; Weigel, Detlef; Gutierrez-Marcos, Jose

    2016-01-01

    Inducible epigenetic changes in eukaryotes are believed to enable rapid adaptation to environmental fluctuations. We have found distinct regions of the Arabidopsis genome that are susceptible to DNA (de)methylation in response to hyperosmotic stress. The stress-induced epigenetic changes are associated with conditionally heritable adaptive phenotypic stress responses. However, these stress responses are primarily transmitted to the next generation through the female lineage due to widespread DNA glycosylase activity in the male germline, and extensively reset in the absence of stress. Using the CNI1/ATL31 locus as an example, we demonstrate that epigenetically targeted sequences function as distantly-acting control elements of antisense long non-coding RNAs, which in turn regulate targeted gene expression in response to stress. Collectively, our findings reveal that plants use a highly dynamic maternal ‘short-term stress memory’ with which to respond to adverse external conditions. This transient memory relies on the DNA methylation machinery and associated transcriptional changes to extend the phenotypic plasticity accessible to the immediate offspring. DOI: http://dx.doi.org/10.7554/eLife.13546.001 PMID:27242129

  11. The annotation of repetitive elements in the genome of channel catfish (Ictalurus punctatus).

    PubMed

    Yuan, Zihao; Zhou, Tao; Bao, Lisui; Liu, Shikai; Shi, Huitong; Yang, Yujia; Gao, Dongya; Dunham, Rex; Waldbieser, Geoff; Liu, Zhanjiang

    2018-01-01

    Channel catfish (Ictalurus punctatus) is a highly adaptive species and has been used as a research model for comparative immunology, physiology, and toxicology among ectothermic vertebrates. It is also economically important for aquaculture. As such, its reference genome was generated and annotated with protein coding genes. However, the repetitive elements in the catfish genome are less well understood. In this study, over 417.8 Megabase (MB) of repetitive elements were identified and characterized in the channel catfish genome. Among them, the DNA/TcMar-Tc1 transposons are the most abundant type, making up ~20% of the total repetitive elements, followed by the microsatellites (14%). The prevalence of repetitive elements, especially the mobile elements, may have provided a driving force for the evolution of the catfish genome. A number of catfish-specific repetitive elements were identified including the previously reported Xba elements whose divergence rate was relatively low, slower than that in untranslated regions of genes but faster than the protein coding sequences, suggesting its evolutionary restrictions.

  12. The annotation of repetitive elements in the genome of channel catfish (Ictalurus punctatus)

    PubMed Central

    Yuan, Zihao; Zhou, Tao; Bao, Lisui; Liu, Shikai; Shi, Huitong; Yang, Yujia; Gao, Dongya; Dunham, Rex; Waldbieser, Geoff

    2018-01-01

    Channel catfish (Ictalurus punctatus) is a highly adaptive species and has been used as a research model for comparative immunology, physiology, and toxicology among ectothermic vertebrates. It is also economically important for aquaculture. As such, its reference genome was generated and annotated with protein coding genes. However, the repetitive elements in the catfish genome are less well understood. In this study, over 417.8 Megabase (MB) of repetitive elements were identified and characterized in the channel catfish genome. Among them, the DNA/TcMar-Tc1 transposons are the most abundant type, making up ~20% of the total repetitive elements, followed by the microsatellites (14%). The prevalence of repetitive elements, especially the mobile elements, may have provided a driving force for the evolution of the catfish genome. A number of catfish-specific repetitive elements were identified including the previously reported Xba elements whose divergence rate was relatively low, slower than that in untranslated regions of genes but faster than the protein coding sequences, suggesting its evolutionary restrictions. PMID:29763462

  13. Population-specific variation in haplotype composition and heterozygosity at the POLB locus.

    PubMed

    Yamtich, Jennifer; Speed, William C; Straka, Eva; Kidd, Judith R; Sweasy, Joann B; Kidd, Kenneth K

    2009-05-01

    DNA polymerase beta plays a central role in base excision repair (BER), which removes large numbers of endogenous DNA lesions from each cell on a daily basis. Little is currently known about germline polymorphisms within the POLB locus, making it difficult to study the association of variants at this locus with human diseases such as cancer. Yet, approximately thirty percent of human tumor types show variants of DNA polymerase beta. We have assessed the global frequency distributions of coding and common non-coding SNPs in and flanking the POLB gene for a total of 14 sites typed in approximately 2400 individuals from anthropologically defined human populations worldwide. We have found a marked difference between haplotype frequencies in African populations and in non-African populations.

  14. Identification of G-quadruplex forming sequences in three manatee papillomaviruses

    PubMed Central

    Zahin, Maryam; Dean, William L.; Ghim, Shin-je; Joh, Joongho; Gray, Robert D.; Khanal, Sujita; Bossart, Gregory D.; Mignucci-Giannoni, Antonio A.; Rouchka, Eric C.; Jenson, Alfred B.; Trent, John O.; Chaires, Jonathan B.

    2018-01-01

    The Florida manatee (Trichechus manatus latirostris) is a threatened aquatic mammal in United States coastal waters. Over the past decade, the appearance of papillomavirus-induced lesions and viral papillomatosis in manatees has been a concern for those involved in the management and rehabilitation of this species. To date, three manatee papillomaviruses (TmPVs) have been identified in Florida manatees, one forming cutaneous lesions (TmPV1) and two forming genital lesions (TmPV3 and TmPV4). We identified DNA sequences with the potential to form G-quadruplex structures (G4) across the three genomes. G4 were located on both DNA strands and across coding and non-coding regions on all TmPVs, offering multiple targets for viral control. Although G4 have been identified in several viral genomes, including human PVs, most research has focused on canonical structures comprised of three G-tetrads. In contrast, the vast majority of sequences we identified would allow the formation of non-canonical structures with only two G-tetrads. Our biophysical analysis confirmed the formation of G4 with parallel topology in three such sequences from the E2 region. Two of the structures appear comprised of multiple stacked two G-tetrad structures, perhaps serving to increase structural stability. Computational analysis demonstrated enrichment of G4 sequences on all TmPVs on the reverse strand in the E2/E4 region and on both strands in the L2 region. Several G4 sequences occurred at similar regional locations on all PVs, most notably on the reverse strand in the E2 region. In other cases, G4 were identified at similar regional locations only on PVs forming genital lesions. On all TmPVs, G4 sequences were located in the non-coding region near putative E2 binding sites. Together, these findings suggest that G4 are possible regulatory elements in TmPVs. PMID:29630682
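
    The distinction between canonical (three G-tetrad) and two G-tetrad candidates can be illustrated with a simple pattern search. The abstract does not name the software or the motif definition the authors used, so this sketch assumes the widely used pattern of four G-runs separated by loops of 1 to 7 bases; the function name, the loop limit, and the single-strand scan are all assumptions.

```python
import re


def find_g4(seq, min_tetrads=2, max_loop=7):
    """Return (start, matched_sequence) for non-overlapping putative
    G-quadruplex motifs: four runs of at least min_tetrads guanines
    separated by loops of 1..max_loop bases (assumed motif definition)."""
    run = "G{%d,}" % min_tetrads
    loop = "[ACGT]{1,%d}" % max_loop
    pattern = re.compile("%s(?:%s%s){3}" % (run, loop, run))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq.upper())]
```

    Calling the function with min_tetrads=3 restricts the search to canonical candidates, while min_tetrads=2 also admits the two-tetrad kind. Scanning the reverse strand would require running the same search on the reverse complement.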

  15. Rapid Mitochondrial Genome Evolution through Invasion of Mobile Elements in Two Closely Related Species of Arbuscular Mycorrhizal Fungi

    PubMed Central

    Beaudet, Denis; Nadimi, Maryam; Iffis, Bachir; Hijri, Mohamed

    2013-01-01

    Arbuscular mycorrhizal fungi (AMF) are common and important plant symbionts. They have coenocytic hyphae and form multinucleated spores. The nuclear genome of AMF is polymorphic and its organization is not well understood, which makes the development of reliable molecular markers challenging. In stark contrast, their mitochondrial genome (mtDNA) is homogeneous. To assess the intra- and inter-specific mitochondrial variability in closely related Glomus species, we performed 454 sequencing on total genomic DNA of Glomus sp. isolate DAOM-229456 and we compared its mtDNA with two G. irregulare isolates. We found that the mtDNA of Glomus sp. is homogeneous, identical in gene order and, with respect to the sequences of coding regions, almost identical to G. irregulare. However, certain genomic regions vary substantially, due to insertions/deletions of elements such as introns, mitochondrial plasmid-like DNA polymerase genes and mobile open reading frames. We found no evidence of mitochondrial or cytoplasmic plasmids in Glomus species, and mobile ORFs in Glomus are responsible for the formation of four gene hybrids in atp6, atp9, cox2, and nad3, which are most probably the result of horizontal gene transfer and are expressed at the mRNA level. We found evidence for substantial sequence variation in defined regions of mtDNA, even among closely related isolates with otherwise identical coding gene sequences. This variation makes it possible to design reliable intra- and inter-specific markers. PMID:23637766

  16. Rapid mitochondrial genome evolution through invasion of mobile elements in two closely related species of arbuscular mycorrhizal fungi.

    PubMed

    Beaudet, Denis; Nadimi, Maryam; Iffis, Bachir; Hijri, Mohamed

    2013-01-01

    Arbuscular mycorrhizal fungi (AMF) are common and important plant symbionts. They have coenocytic hyphae and form multinucleated spores. The nuclear genome of AMF is polymorphic and its organization is not well understood, which makes the development of reliable molecular markers challenging. In stark contrast, their mitochondrial genome (mtDNA) is homogeneous. To assess the intra- and inter-specific mitochondrial variability in closely related Glomus species, we performed 454 sequencing on total genomic DNA of Glomus sp. isolate DAOM-229456 and we compared its mtDNA with two G. irregulare isolates. We found that the mtDNA of Glomus sp. is homogeneous, identical in gene order and, with respect to the sequences of coding regions, almost identical to G. irregulare. However, certain genomic regions vary substantially, due to insertions/deletions of elements such as introns, mitochondrial plasmid-like DNA polymerase genes and mobile open reading frames. We found no evidence of mitochondrial or cytoplasmic plasmids in Glomus species, and mobile ORFs in Glomus are responsible for the formation of four gene hybrids in atp6, atp9, cox2, and nad3, which are most probably the result of horizontal gene transfer and are expressed at the mRNA level. We found evidence for substantial sequence variation in defined regions of mtDNA, even among closely related isolates with otherwise identical coding gene sequences. This variation makes it possible to design reliable intra- and inter-specific markers.

  17. Characterization of three active transposable elements recently inserted in three independent DFR-A alleles and one high-copy DNA transposon isolated from the Pink allele of the ANS gene in onion (Allium cepa L.).

    PubMed

    Kim, Sunggil; Park, Jee Young; Yang, Tae-Jin

    2015-06-01

    Intact retrotransposon and DNA transposons inserted in a single gene were characterized in onions (Allium cepa) and their transcription and copy numbers were estimated in this study. While analyzing diverse onion germplasm, large insertions in the DFR-A gene encoding dihydroflavonol 4-reductase (DFR) involved in the anthocyanin biosynthesis pathway were found in two accessions. A 5,070-bp long terminal repeat (LTR) retrotransposon inserted in the active DFR-A (R4) allele was identified from one of the large insertions and designated AcCOPIA1. An intact ORF encoded typical domains of copia-like LTR retrotransposons. However, AcCOPIA1 contained atypical 'TG' and 'TA' dinucleotides at the ends of the LTRs. A 4,615-bp DNA transposon was identified in the other large insertion. This DNA transposon, designated AcCACTA1, contained an ORF coding for a transposase showing homology with the CACTA superfamily transposable elements (TEs). Another 5,073-bp DNA transposon was identified from the DFR-A (TRN) allele. This DNA transposon, designated AchAT1, belonged to the hAT superfamily with short 4-bp terminal inverted repeats (TIRs). Finally, a 6,258-bp non-autonomous DNA transposon, designated AcPINK, was identified in the ANS-p allele encoding anthocyanidin synthase, the next downstream enzyme to DFR in the anthocyanin biosynthesis pathway. AcPINK also possessed very short 3-bp TIRs. Active transcription of AcCOPIA1, AcCACTA1, and AchAT1 was observed through RNA-Seq analysis and RT-PCR. The copy numbers of AcPINK estimated by mapping the genomic DNA reads produced by NextSeq 500 were predominantly high compared with the other TEs. A series of evidence indicated that these TEs might have transposed in these onion genes very recently, providing a stepping stone for elucidation of enormously large-sized onion genome structure.

  18. Molecular analysis of two genes between let-653 and let-56 in the unc-22(IV) region of Caenorhabditis elegans.

    PubMed

    Marra, M A; Prasad, S S; Baillie, D L

    1993-01-01

    A previous study of genomic organization described the identification of nine potential coding regions in 150 kb of genomic DNA from the unc-22(IV) region of Caenorhabditis elegans. In this study, we focus on the genomic organization of a small interval of 0.1 map unit bordered on the right by unc-22 and on the left by the left-hand breakpoints of the deficiencies sDf9, sDf19 and sDf65. This small interval at present contains a single mutagenically defined locus, the essential gene let-56. The cosmid C11F2 has previously been used to rescue let-56. Therefore, at least some of C11F2 must reside in the interval. In this paper, we report the characterization of two coding elements that reside on C11F2. Analysis of nucleotide sequence data obtained from cDNAs and cosmid subclones revealed that one of the coding elements closely resembles aromatic amino acid decarboxylases from several species. The other of these coding elements was found to closely resemble a human growth factor activatable Na+/H+ antiporter. Pairs of oligonucleotide primers, predicted from both coding elements, have been used in PCR experiments to position these coding elements between the left breakpoint of sDf19 and the left breakpoint of sDf65, between the essential genes let-653 and let-56.

  19. Assessing information content and interactive relationships of subgenomic DNA sequences of the MHC using complexity theory approaches based on the non-extensive statistical mechanics

    NASA Astrophysics Data System (ADS)

    Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.

    2018-09-01

    This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.

  20. Genomic patterns associated with paternal/maternal distribution of transposable elements

    NASA Astrophysics Data System (ADS)

    Jurka, Jerzy

    2003-03-01

    Transposable elements (TEs) are specialized DNA or RNA fragments capable of surviving in intragenomic niches. They are commonly, perhaps unjustifiably, referred to as "selfish" or "parasitic" elements. TEs can be divided into two major classes: retroelements and DNA transposons. The former include non-LTR retrotransposons and retrovirus-like elements, using reverse transcriptase for their reproduction prior to integration into host DNA. The latter depend mostly on host DNA replication, with the possible exception of rolling-circle transposons recently discovered by our team. I will review basic information on TEs, with emphasis on human Alu and L1 retroelements discussed in the context of genomic organization. TEs are non-randomly distributed in chromosomal DNA. In particular, human Alu elements tend to prefer GC-rich regions, whereas L1 accumulate in AT-rich regions. Current explanations of this phenomenon focus on the so-called "target effects" and post-insertional selection. However, the proposed models appear to be unsatisfactory and alternative explanations invoking "channeling" to different chromosomal regions will be a major focus of my presentation. Transposable elements (TEs) can be expressed and integrated into host DNA in the male or female germlines, or both. Different models of expression and integration imply different proportions of TEs on sex chromosomes and autosomes. The density of recently retroposed human Alu elements is around three times higher on chromosome Y than on chromosome X, and over two times higher than the average density for all human autosomes. This implies Alu activity in paternal germlines. Analogous inter-chromosomal proportions for other repeat families should determine their compatibility with one of the three basic models describing the inheritance of TEs. Published evidence indicates that maternally and paternally imprinted genes roughly correspond to GC-rich and AT-rich DNA. This may explain the observed chromosomal distribution of Alu and L1 elements. Finally, paternal models of inheritance predict rapid accumulation of active TEs on chromosome Y. I will discuss potential implications of this phenomenon for evolution of chromosome Y and transposable elements.

  1. Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease

    PubMed Central

    Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao

    2018-01-01

    Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNAs-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to complete the differentially methylated lncRNAs identification and visualization with local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. PMID:29069510

  2. The presence, role and clinical use of spermatozoal RNAs

    PubMed Central

    Jodar, Meritxell; Selvaraju, Sellappan; Sendler, Edward; Diamond, Michael P.; Krawetz, Stephen A.

    2013-01-01

    BACKGROUND Spermatozoa are highly differentiated, transcriptionally inert cells characterized by a compact nucleus with minimal cytoplasm. Nevertheless they contain a suite of unique RNAs that are delivered to oocyte upon fertilization. They are likely integrated as part of many different processes including genome recognition, consolidation-confrontation, early embryonic development and epigenetic transgenerational inherence. Spermatozoal RNAs also provide a window into the developmental history of each sperm thereby providing biomarkers of fertility and pregnancy outcome which are being intensely studied. METHODS Literature searches were performed to review the majority of spermatozoal RNA studies that described potential functions and clinical applications with emphasis on Next-Generation Sequencing. Human, mouse, bovine and stallion were compared as their distribution and composition of spermatozoal RNAs, using these techniques, have been described. RESULTS Comparisons highlighted the complexity of the population of spermatozoal RNAs that comprises rRNA, mRNA and both large and small non-coding RNAs. RNA-seq analysis has revealed that only a fraction of the larger RNAs retain their structure. While rRNAs are the most abundant and are highly fragmented, ensuring a translationally quiescent state, other RNAs including some mRNAs retain their functional potential, thereby increasing the opportunity for regulatory interactions. Abundant small non-coding RNAs retained in spermatozoa include miRNAs and piRNAs. Some, like miR-34c are essential to the early embryo development required for the first cellular division. Others like the piRNAs are likely part of the genomic dance of confrontation and consolidation. Other non-coding spermatozoal RNAs include transposable elements, annotated lnc-RNAs, intronic retained elements, exonic elements, chromatin-associated RNAs, small-nuclear ILF3/NF30 associated RNAs, quiescent RNAs, mse-tRNAs and YRNAs. 
Some non-coding RNAs are known to act as epigenetic modifiers, inducing histone modifications and DNA methylation, perhaps playing a role in transgenerational epigenetic inheritance. Transcript profiling holds considerable potential for the discovery of fertility biomarkers for both agriculture and human medicine. Comparing the differential RNA profiles of infertile and fertile individuals, as well as assessing species similarities, should resolve the regulatory pathways contributing to male factor infertility. CONCLUSIONS Dad delivers a complex population of RNAs to the oocyte at fertilization that likely influences fertilization, embryo development, the phenotype of the offspring and possibly future generations. Development is continuing on the use of spermatozoal RNA profiles as phenotypic markers of male factor status for use as clinical diagnostics of the father's contribution to the birth of a healthy child. PMID:23856356

  3. Microsatellites in the Eukaryotic DNA Mismatch Repair Genes as Modulators of Evolutionary Mutation Rate

    NASA Technical Reports Server (NTRS)

    Chang, Dong Kyung; Metzgar, David; Wills, Christopher; Boland, C. Richard

    2003-01-01

    All "minor" components of the human DNA mismatch repair (MMR) system (MSH3, MSH6, PMS2, and the recently discovered MLH3) contain mononucleotide microsatellites in their coding sequences. This intriguing finding contrasts with the situation found in the major components of the DNA MMR system (MSH2 and MLH1) and, in fact, in most human genes. Although eukaryotic genomes are rich in microsatellites, non-triplet microsatellites are rare in coding regions. The recurring presence of exonic mononucleotide repeat sequences within a single family of human genes would therefore be considered exceptional.
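    The kind of scan implied by this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' method: the function name, the minimum run length of 8 and the toy sequence are all assumptions made here for demonstration. It reports runs of a single repeated base (mononucleotide microsatellites) in a sequence.

```python
import re


def find_mononucleotide_runs(seq, min_len=8):
    """Return (start, base, length) for every run of one base >= min_len long.

    min_len=8 is an arbitrary illustrative threshold, not a value from the abstract.
    """
    seq = seq.upper()
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), m.end() - m.start())
            for m in pattern.finditer(seq)]


# Made-up toy sequence (not a real gene): an A8 run and a T10 run.
toy = "ATGGC" + "A" * 8 + "GTCC" + "T" * 10 + "GA"
for start, base, length in find_mononucleotide_runs(toy):
    print(start, base, length)
```

    Such runs matter in coding sequence because a length change that is not a multiple of three shifts the reading frame.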

  4. HEXIM1 and NEAT1 Long Non-coding RNA Form a Multi-subunit Complex that Regulates DNA-Mediated Innate Immune Response.

    PubMed

    Morchikh, Mehdi; Cribier, Alexandra; Raffel, Raoul; Amraoui, Sonia; Cau, Julien; Severac, Dany; Dubois, Emeric; Schwartz, Olivier; Bennasser, Yamina; Benkirane, Monsef

    2017-08-03

    The DNA-mediated innate immune response underpins anti-microbial defenses and certain autoimmune diseases. Here we used immunoprecipitation, mass spectrometry, and RNA sequencing to identify a ribonuclear complex built around HEXIM1 and the long non-coding RNA NEAT1 that we dubbed the HEXIM1-DNA-PK-paraspeckle components-ribonucleoprotein complex (HDP-RNP). The HDP-RNP contains DNA-PK subunits (DNAPKc, Ku70, and Ku80) and paraspeckle proteins (SFPQ, NONO, PSPC1, RBM14, and MATRIN3). We show that binding of HEXIM1 to NEAT1 is required for its assembly. We further demonstrate that the HDP-RNP is required for the innate immune response to foreign DNA, through the cGAS-STING-IRF3 pathway. The HDP-RNP interacts with cGAS and its partner PQBP1, and their interaction is remodeled by foreign DNA. Remodeling leads to the release of paraspeckle proteins, recruitment of STING, and activation of DNAPKc and IRF3. Our study establishes the HDP-RNP as a key nuclear regulator of DNA-mediated activation of innate immune response through the cGAS-STING pathway. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses: Keys to Rice Transcriptome Analysis

    NASA Astrophysics Data System (ADS)

    Kikuchi, Shoshi

    2009-02-01

    Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.

  6. Free Energy Gap and Statistical Thermodynamic Fidelity of DNA Codes

    DTIC Science & Technology

    2007-10-01

    reverse-complement unless otherwise stated. For strand x, let Nx denote its complement. A (perfect) Watson-Crick duplex is the joining of complement...is possible for complementary sequences to form a non-perfectly aligned duplex, we will call any x W Nx duplex a Watson-Crick (WC) duplex. Two...
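    The excerpt above takes "complement" to mean reverse-complement. As a minimal sketch of that one definition, assuming a plain A/C/G/T alphabet, the snippet below computes the reverse-complement of a strand and checks whether two strands could form a perfectly aligned Watson-Crick duplex. The function names are hypothetical and are not taken from the report, and nothing here models the free-energy calculations the report's title refers to.

```python
# Translation table pairing each base with its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the reverse-complement of a DNA strand written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def is_perfect_wc_pair(x, y):
    """True if y is exactly the reverse-complement of x (same length, fully paired)."""
    return len(x) == len(y) and reverse_complement(x) == y.upper()


print(reverse_complement("AACGT"))
print(is_perfect_wc_pair("AACGT", "ACGTT"))
```

    Applying reverse_complement twice returns the original strand, which is a quick sanity check on the table.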

  7. Free Energy Gap and Statistical Thermodynamic Fidelity of DNA Codes (Postprint)

    DTIC Science & Technology

    2007-01-01

    reverse-complement unless otherwise stated. For strand x, let Nx denote its complement. A (perfect) Watson-Crick duplex is the joining of complement...is possible for complementary sequences to form a non-perfectly aligned duplex, we will call any x W Nx duplex a Watson-Crick (WC) duplex. Two...

  8. Early demethylation of non-CpG, CpC-rich, elements in the myogenin 5′-flanking region

    PubMed Central

    Fuso, Andrea; Ferraguti, Giampiero; Grandoni, Francesco; Ruggeri, Raffaella; Scarpa, Sigfrido; Strom, Roberto

    2010-01-01

    The dynamic changes and structural patterns of DNA methylation of genes without CpG islands are poorly characterized. The relevance of CpG to the non-CpG methylation equilibrium in transcriptional repression is unknown. In this work, we analyzed the DNA methylation pattern of the 5′-flanking region of the myogenin gene, a positive regulator of muscle differentiation with no CpG island and low CpG density, in both C2C12 muscle satellite cells and embryonic muscle. Embryonic brain was studied as a non-expressing tissue. High levels of both CpG and non-CpG methylation were observed in non-expressing experimental conditions. Both CpG and non-CpG methylation rapidly dropped during muscle differentiation and myogenin transcriptional activation with active demethylation dynamics. Non-CpG demethylation occurred more rapidly than CpG demethylation. Demethylation spread from initially highly methylated short CpC-rich elements to a virtually unmethylated status. These short elements have a high CpC content and density, share some motifs and largely coincide with putative recognition sequences of some differentiation-related transcription factors. Our findings point to a dynamically controlled equilibrium between CpG and non-CpG active demethylation in the transcriptional control of tissue-specific genes. The short CpC-rich elements are new structural features of the methylation machinery, whose functions may include priming the complete demethylation of a transcriptionally crucial DNA region. PMID:20935518

  9. Small RNAs, big impact: small RNA pathways in transposon control and their effect on the host stress response.

    PubMed

    Wheeler, Bayly S

    2013-12-01

    Transposons are mobile genetic elements that are a major constituent of most genomes. Organisms regulate transposable element expression, transposition, and insertion site preference, mitigating the genome instability caused by uncontrolled transposition. A recent burst of research has demonstrated the critical role of small non-coding RNAs in regulating transposition in fungi, plants, and animals. While mechanistically distinct, these pathways work through a conserved paradigm. The presence of a transposon is communicated by the presence of its RNA or by its integration into specific genomic loci. These signals are then translated into small non-coding RNAs that guide epigenetic modifications and gene silencing back to the transposon. In addition to being regulated by the host, transposable elements are themselves capable of influencing host gene expression. Transposon expression is responsive to environmental signals, and many transposons are activated by various cellular stresses. TEs can confer local gene regulation by acting as enhancers and can also confer global gene regulation through their non-coding RNAs. Thus, transposable elements can act as stress-responsive regulators that control host gene expression in cis and trans.

  10. Detection of non-coding RNA in bacteria and archaea using the DETR'PROK Galaxy pipeline.

    PubMed

    Toffano-Nioche, Claire; Luo, Yufei; Kuchly, Claire; Wallon, Claire; Steinbach, Delphine; Zytnicki, Matthias; Jacq, Annick; Gautheret, Daniel

    2013-09-01

    RNA-seq experiments are now routinely used for the large scale sequencing of transcripts. In bacteria or archaea, such deep sequencing experiments typically produce 10-50 million fragments that cover most of the genome, including intergenic regions. In this context, the precise delineation of the non-coding elements is challenging. Non-coding elements include untranslated regions (UTRs) of mRNAs, independent small RNA genes (sRNAs) and transcripts produced from the antisense strand of genes (asRNA). Here we present a computational pipeline (DETR'PROK: detection of ncRNAs in prokaryotes) based on the Galaxy framework that takes as input a mapping of deep sequencing reads and performs successive steps of clustering, comparison with existing annotation and identification of transcribed non-coding fragments classified into putative 5' UTRs, sRNAs and asRNAs. We provide a step-by-step description of the protocol using real-life example data sets from Vibrio splendidus and Escherichia coli. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  11. In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome

    PubMed Central

    2013-01-01

    Background Only a few transposable elements are known to exhibit site-specific insertion patterns, including the well-studied R-element retrotransposons that insert into specific sites within the multigene rDNA. The only known rDNA-specific DNA transposon, Pokey (superfamily: piggyBac) is found in the freshwater microcrustacean, Daphnia pulex. Here, we present a genome-wide analysis of Pokey based on the recently completed whole genome sequencing project for D. pulex. Results Phylogenetic analysis of Pokey elements recovered from the genome sequence revealed the presence of four lineages corresponding to two divergent autonomous families and two related lineages of non-autonomous miniature inverted repeat transposable elements (MITEs). The MITEs are also found at the same 28S rRNA gene insertion site as the Pokey elements, and appear to have arisen as deletion derivatives of autonomous elements. Several copies of the full-length Pokey elements may be capable of producing an active transposase. Surprisingly, both families of Pokey possess a series of 200 bp repeats upstream of the transposase that is derived from the rDNA intergenic spacer (IGS). The IGS sequences within the Pokey elements appear to be evolving in concert with the rDNA units. Finally, analysis of the insertion sites of Pokey elements outside of rDNA showed a target preference for sites similar to the specific sequence that is targeted within rDNA. Conclusions Based on the target site preference of Pokey elements and the concerted evolution of a segment of the element with the rDNA unit, we propose an evolutionary path by which the ancestors of Pokey elements have invaded the rDNA niche. We discuss how specificity for the rDNA unit may have evolved and how this specificity has played a role in the long-term survival of these elements in the subgenus Daphnia. PMID:24059783

  12. DNA transposon activity is associated with increased mutation rates in genes of rice and other grasses

    PubMed Central

    Wicker, Thomas; Yu, Yeisoo; Haberer, Georg; Mayer, Klaus F. X.; Marri, Pradeep Reddy; Rounsley, Steve; Chen, Mingsheng; Zuccolo, Andrea; Panaud, Olivier; Wing, Rod A.; Roffler, Stefan

    2016-01-01

    DNA (class 2) transposons are mobile genetic elements which move within their 'host' genome through excising and re-inserting elsewhere. Although the rice genome contains tens of thousands of such elements, their actual role in evolution is still unclear. Analysing over 650 transposon polymorphisms in the rice species Oryza sativa and Oryza glaberrima, we find that DNA repair following transposon excisions is associated with an increased number of mutations in the sequences neighbouring the transposon. Indeed, the 3,000 bp flanking the excised transposons can contain over 10 times more mutations than the genome-wide average. Since DNA transposons preferably insert near genes, this is correlated with increases in mutation rates in coding sequences and regulatory regions. Most importantly, we find this phenomenon also in maize, wheat and barley. Thus, these findings suggest that DNA transposon activity is a major evolutionary force in grasses which provide the basis of most food consumed by humankind. PMID:27599761

  13. Probing the Potential Role of Non-B DNA Structures at Yeast Meiosis-Specific DNA Double-Strand Breaks.

    PubMed

    Kshirsagar, Rucha; Khan, Krishnendu; Joshi, Mamata V; Hosur, Ramakrishna V; Muniyappa, K

    2017-05-23

    A plethora of evidence suggests that different types of DNA quadruplexes are widely present in the genome of all organisms. The existence of a growing number of proteins that selectively bind and/or process these structures underscores their biological relevance. Moreover, G-quadruplex DNA has been implicated in the alignment of four sister chromatids by forming parallel guanine quadruplexes during meiosis; however, the underlying mechanism is not well defined. Here we show that a G/C-rich motif associated with a meiosis-specific DNA double-strand break (DSB) in Saccharomyces cerevisiae folds into G-quadruplex, and the C-rich sequence complementary to the G-rich sequence forms an i-motif. The presence of G-quadruplex or i-motif structures upstream of the green fluorescent protein-coding sequence markedly reduces the levels of gfp mRNA expression in S. cerevisiae cells, with a concomitant decrease in green fluorescent protein abundance, and blocks primer extension by DNA polymerase, thereby demonstrating the functional significance of these structures. Surprisingly, although S. cerevisiae Hop1, a component of synaptonemal complex axial/lateral elements, exhibits strong affinity to G-quadruplex DNA, it displays a much weaker affinity for the i-motif structure. However, the Hop1 C-terminal but not the N-terminal domain possesses strong i-motif binding activity, implying that the C-terminal domain has a distinct substrate specificity. Additionally, we found that Hop1 promotes intermolecular pairing between G/C-rich DNA segments associated with a meiosis-specific DSB site. Our results support the idea that the G/C-rich motifs associated with meiosis-specific DSBs fold into intramolecular G-quadruplex and i-motif structures, both in vitro and in vivo, thus revealing an important link between non-B form DNA structures and Hop1 in meiotic chromosome synapsis and recombination. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  14. Evolutionary Dynamics of Microsatellite Distribution in Plants: Insight from the Comparison of Sequenced Brassica, Arabidopsis and Other Angiosperm Species

    PubMed Central

    Shi, Jiaqin; Huang, Shunmou; Fu, Donghui; Yu, Jinyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2013-01-01

    Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than those for their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable elements contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences). The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number) of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type) the angiosperm species (aside from a few species) all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. 
Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite distribution with respect to motif length, type and repeat number. Interestingly, several microsatellite characteristics seemed to be constant in plant evolution, which can be well explained by the general biological rules. PMID:23555856

  15. The Mitochondrial Cytochrome Oxidase Subunit I Gene Occurs on a Minichromosome with Extensive Heteroplasmy in Two Species of Chewing Lice, Geomydoecus aurei and Thomomydoecus minor

    PubMed Central

    Pietan, Lucas L.; Spradling, Theresa A.

    2016-01-01

    In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589

  16. Lactase non-persistence is directed by DNA variation-dependent epigenetic aging

    PubMed Central

    Labrie, Viviane; Buske, Orion J; Oh, Edward; Jeremian, Richie; Ptak, Carolyn; Gasiūnas, Giedrius; Maleckas, Almantas; Petereit, Rūta; Žvirbliene, Aida; Adamonis, Kęstutis; Kriukienė, Edita; Koncevičius, Karolis; Gordevičius, Juozas; Nair, Akhil; Zhang, Aiping; Ebrahimi, Sasha; Oh, Gabriel; Šikšnys, Virginijus; Kupčinskas, Limas; Brudno, Michael; Petronis, Arturas

    2016-01-01

    Inability to digest lactose due to lactase non-persistence is a common trait in adult mammals, with the exception of certain human populations that exhibit lactase persistence. It is not clear how the lactase gene can be dramatically downregulated with age in most individuals, but remains active in some. We performed a comprehensive epigenetic study of the human and mouse intestine using chromosome-wide DNA modification profiling and targeted bisulfite sequencing. Epigenetically-controlled regulatory elements were found to account for the differences in lactase mRNA levels between individuals, intestinal cell types and species. The importance of these regulatory elements in modulating lactase mRNA levels was confirmed by CRISPR-Cas9-induced deletions. Genetic factors contribute to epigenetic changes occurring with age at the regulatory elements, as lactase persistence- and non-persistence-DNA haplotypes demonstrated markedly different epigenetic aging. Thus, genetic factors facilitate a gradual accumulation of epigenetic changes with age to affect phenotypic outcome. PMID:27159559

  17. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    Summary The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  18. Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer.

    PubMed

    Palazzo, Antonio; Lovero, Domenica; D'Addabbo, Pietro; Caizzi, Ruggiero; Marsano, René Massimiliano

    2016-01-01

    Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the evolutionary dynamics of Bari transposons and increases our understanding of the biology of Tc1-mariner elements.

  19. Regulatory activities of transposable elements: from conflicts to benefits

    PubMed Central

    Chuong, Edward B.; Elde, Nels C.; Feschotte, Cédric

    2017-01-01

    Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor binding sites and non-coding RNAs. A wealth of recent studies reinvigorates the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalyzed the evolution of gene regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic impact of regulatory activities encoded by TEs in health and disease. PMID:27867194

  20. Improved design of special boundary elements for T-shaped reinforced concrete walls

    NASA Astrophysics Data System (ADS)

    Ji, Xiaodong; Liu, Dan; Qian, Jiaru

    2017-01-01

    This study examines the design provisions of the Chinese GB 50011-2010 code for seismic design of buildings for the special boundary elements of T-shaped reinforced concrete walls and proposes an improved design method. Comparison of the design provisions of the GB 50011-2010 code and those of the American code ACI 318-14 indicates a possible deficiency in the T-shaped wall design provisions in GB 50011-2010. A case study of a typical T-shaped wall designed in accordance with GB 50011-2010 also indicates the insufficient extent of the boundary element at the non-flange end and overly conservative design of the flange end boundary element. Improved designs for special boundary elements of T-shaped walls are developed using a displacement-based method. The proposed design formulas produce a longer boundary element at the non-flange end and a shorter boundary element at the flange end, relative to those of the GB 50011-2010 provisions. Extensive numerical analysis indicates that T-shaped walls designed using the proposed formulas develop inelastic drift of 0.01 for both cases of the flange in compression and in tension.

  1. Lnc2Meth: a manually curated database of regulatory relationships between long non-coding RNAs and DNA methylation associated with human disease.

    PubMed

    Zhi, Hui; Li, Xin; Wang, Peng; Gao, Yue; Gao, Baoqing; Zhou, Dianshuang; Zhang, Yan; Guo, Maoni; Yue, Ming; Shen, Weitao; Ning, Shangwei; Jin, Lianhong; Li, Xia

    2018-01-04

    Lnc2Meth (http://www.bio-bigdata.com/Lnc2Meth/), an interactive resource to identify regulatory relationships between human long non-coding RNAs (lncRNAs) and DNA methylation, is not only a manually curated collection and annotation of experimentally supported lncRNA-DNA methylation associations but also a platform that effectively integrates tools for calculating and identifying the differentially methylated lncRNAs and protein-coding genes (PCGs) in diverse human diseases. The resource provides: (i) advanced search possibilities, e.g. retrieval of the database by searching the lncRNA symbol of interest, DNA methylation patterns, regulatory mechanisms and disease types; (ii) abundant computationally calculated DNA methylation array profiles for the lncRNAs and PCGs; (iii) the prognostic values for each hit transcript calculated from the patients' clinical data; (iv) a genome browser to display the DNA methylation landscape of the lncRNA transcripts for a specific type of disease; (v) tools to re-annotate probes to lncRNA loci and identify the differential methylation patterns for lncRNAs and PCGs with user-supplied external datasets; (vi) an R package (LncDM) to identify and visualize differentially methylated lncRNAs on local computers. Lnc2Meth provides a timely and valuable resource that can be applied to significantly expand our understanding of the regulatory relationships between lncRNAs and DNA methylation in various human diseases. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

    PubMed Central

    Hall, L; Laird, J E; Craig, R K

    1984-01-01

    Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375

  3. Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin

    ERIC Educational Resources Information Center

    Offner, Susan

    2010-01-01

    The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.

  4. Epigenetics of Peripheral B-Cell Differentiation and the Antibody Response

    PubMed Central

    Zan, Hong; Casali, Paolo

    2015-01-01

    Epigenetic modifications, such as histone post-translational modifications, DNA methylation, and alteration of gene expression by non-coding RNAs, including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), are heritable changes that are independent from the genomic DNA sequence. These regulate gene activities and, therefore, cellular functions. Epigenetic modifications act in concert with transcription factors and play critical roles in B cell development and differentiation, thereby modulating antibody responses to foreign- and self-antigens. Upon antigen encounter by mature B cells in the periphery, alterations of these lymphocytes' epigenetic landscape are induced by the same stimuli that drive the antibody response. Such alterations instruct B cells to undergo immunoglobulin (Ig) class switch DNA recombination (CSR) and somatic hypermutation (SHM), as well as differentiation to memory B cells or long-lived plasma cells for the immune memory. Inducible histone modifications, together with DNA methylation and miRNAs, modulate the transcriptome, particularly the expression of activation-induced cytidine deaminase, which is essential for CSR and SHM, and factors central to plasma cell differentiation, such as B lymphocyte-induced maturation protein-1. These inducible B cell-intrinsic epigenetic marks guide the maturation of antibody responses. Combinatorial histone modifications also function as histone codes to target CSR and, possibly, SHM machinery to the Ig loci by recruiting specific adaptors that can stabilize CSR/SHM factors. In addition, lncRNAs, such as the recently reported lncRNA-CSR and an lncRNA generated through transcription of the S region that forms G-quadruplex structures, are also important for CSR targeting.
Epigenetic dysregulation in B cells, including the aberrant expression of non-coding RNAs and alterations of histone modifications and DNA methylation, can result in aberrant antibody responses to foreign antigens, such as those on microbial pathogens, and generation of pathogenic autoantibodies, IgE in allergic reactions, as well as B cell neoplasia. Epigenetic marks would be attractive targets for new therapeutics for autoimmune and allergic diseases, and B cell malignancies. PMID:26697022

  5. Differential DNA methylation profiles of coding and non-coding genes define hippocampal sclerosis in human temporal lobe epilepsy

    PubMed Central

    Miller-Delaney, Suzanne F.C.; Bryan, Kenneth; Das, Sudipto; McKiernan, Ross C.; Bray, Isabella M.; Reynolds, James P.; Gwinn, Ryder; Stallings, Raymond L.

    2015-01-01

    Temporal lobe epilepsy is associated with large-scale, wide-ranging changes in gene expression in the hippocampus. Epigenetic changes to DNA are attractive mechanisms to explain the sustained hyperexcitability of chronic epilepsy. Here, through methylation analysis of all annotated C-phosphate-G islands and promoter regions in the human genome, we report a pilot study of the methylation profiles of temporal lobe epilepsy with or without hippocampal sclerosis. Furthermore, by comparative analysis of expression and promoter methylation, we identify methylation-sensitive non-coding RNA in human temporal lobe epilepsy. A total of 146 protein-coding genes exhibited altered DNA methylation in temporal lobe epilepsy hippocampus (n = 9) when compared to control (n = 5), with 81.5% of the promoters of these genes displaying hypermethylation. Unique methylation profiles were evident in temporal lobe epilepsy with or without hippocampal sclerosis, in addition to a common methylation profile regardless of pathology grade. Gene ontology terms associated with development, neuron remodelling and neuron maturation were over-represented in the methylation profile of Watson Grade 1 samples (mild hippocampal sclerosis). In addition to genes associated with neuronal, neurotransmitter/synaptic transmission and cell death functions, differential hypermethylation of genes associated with transcriptional regulation was evident in temporal lobe epilepsy, but overall few genes previously associated with epilepsy were among the differentially methylated. Finally, a panel of 13 methylation-sensitive microRNAs was identified in temporal lobe epilepsy, including MIR27A, miR-193a-5p (MIR193A) and miR-876-3p (MIR876), and the differential methylation of long non-coding RNA was documented for the first time. The present study therefore reports select, genome-wide DNA methylation changes in human temporal lobe epilepsy that may contribute to the molecular architecture of the epileptic brain. PMID:25552301

  6. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle

    DOE PAGES

    Walworth, Nathan G.; Pfreundt, Ulrike; Nelson, William C.; ...

    2015-04-07

    Understanding the evolution of the free-living, cyanobacterial, diazotroph Trichodesmium is of great importance due to its critical role in oceanic biogeochemistry and primary production. Unlike the other >150 available genomes of free-living cyanobacteria, only 63.8% of the Trichodesmium erythraeum (strain IMS101) genome is predicted to encode protein, which is 20-25% less than the average for other cyanobacteria and non-pathogenic, free-living bacteria. We use distinctive isolates and metagenomic data to show that low coding density observed in IMS101 is a common feature of the Trichodesmium genus both in culture and in situ. Transcriptome analysis indicates that 86% of the non-coding space is expressed, although the function of these transcripts is unclear. The density of noncoding, possible regulatory elements predicted in Trichodesmium, when normalized per intergenic kilobase, was comparable and two fold higher than that found in the gene dense genomes of the sympatric cyanobacterial genera Synechococcus and Prochlorococcus, respectively. Conserved Trichodesmium ncRNA secondary structures were predicted between most culture and metagenomic sequences lending support to the structural conservation. Conservation of these intergenic regions in spatiotemporally separated Trichodesmium populations suggests possible genus-wide selection for their maintenance. These large intergenic spacers may have developed during intervals of strong genetic drift caused by periodic blooms of a subset of genotypes, which may have reduced effective population size. Our data suggest that transposition of selfish DNA, low effective population size, and high fidelity replication allowed the unusual ‘inflation’ of noncoding sequence observed in Trichodesmium despite its oligotrophic lifestyle.
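
Coding density, the quantity at the centre of this record, is the fraction of a genome covered by protein-coding sequence. The sketch below shows one common way to compute it from gene coordinates. It is an illustration under stated assumptions: the function name `coding_density`, the 0-based end-exclusive coordinate convention, and the toy intervals are all hypothetical, and the abstract does not say how the authors computed their figure.

```python
def coding_density(genes, genome_length):
    """Return the fraction of the genome covered by at least one gene interval.

    genes: iterable of (start, end) pairs, 0-based and end-exclusive.
    Overlapping or adjacent intervals are merged so no base is counted twice.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_start = current_end = None
    for start, end in sorted(genes):
        if current_end is None or start > current_end:
            # Close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Interval overlaps or touches the open one: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Hypothetical toy genome of 1000 bp with three annotated genes,
# two of which overlap. The numbers are invented for illustration.
toy_genes = [(0, 300), (250, 500), (700, 838)]
toy_genome_length = 1000

density = coding_density(toy_genes, toy_genome_length)
print(f"coding density: {density:.1%}")
```

Merging the intervals before summing matters for compact bacterial genomes, where neighbouring genes often overlap by a few bases.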

  7. MitoAge: a database for comparative analysis of mitochondrial DNA, with a special focus on animal longevity.

    PubMed

    Toren, Dmitri; Barzilay, Thomer; Tacutu, Robi; Lehmann, Gilad; Muradian, Khachik K; Fraifeld, Vadim E

    2016-01-04

    Mitochondria are the only organelles in animal cells that have their own genome. Due to a key role in energy production, generation of damaging factors (ROS, heat), and apoptosis, mitochondria and mtDNA in particular have long been considered one of the major players in the mechanisms of aging, longevity and age-related diseases. The rapidly increasing number of species with fully sequenced mtDNA, together with accumulated data on longevity records, provides a fascinating new basis for comparative analysis of the links between mtDNA features and animal longevity. To facilitate such analyses and to support the scientific community in carrying these out, we developed the MitoAge database containing calculated mtDNA compositional features of the entire mitochondrial genome, mtDNA coding (tRNA, rRNA, protein-coding genes) and non-coding (D-loop) regions, and codon usage/amino acids frequency for each protein-coding gene. MitoAge includes 922 species with fully sequenced mtDNA and maximum lifespan records. The database is available through the MitoAge website (www.mitoage.org or www.mitoage.info), which provides the necessary tools for searching, browsing, comparing and downloading the data sets of interest for selected taxonomic groups across the Kingdom Animalia. The MitoAge website assists in statistical analysis of different features of the mtDNA and their correlative links to longevity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Divergent evolutionary rates in vertebrate and mammalian specific conserved non-coding elements (CNEs) in echolocating mammals.

    PubMed

    Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J

    2014-12-19

    The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. Specifically, within one family of echolocating bats that utilise frequency-modulated echolocation calls varying widely in frequency and intensity, high levels of sequence divergence were found. Levels of selective constraint acting on CNEs differed both across genomic locations and taxa, with observed variation in substitution rates of CNEs among bat species. More work is needed to determine whether this variation can be linked to echolocation, and wider taxonomic sampling is necessary to fully document levels of conservation in CNEs across diverse taxa.

  9. The site-specific ribosomal DNA insertion element R1Bm belongs to a class of non-long-terminal-repeat retrotransposons.

    PubMed Central

    Xiong, Y; Eickbush, T H

    1988-01-01

    Two types of insertion elements, R1 and R2 (previously called type I and type II), are known to interrupt the 28S ribosomal genes of several insect species. In the silkmoth, Bombyx mori, each element occupies approximately 10% of the estimated 240 ribosomal DNA units, while at most only a few copies are located outside the ribosomal DNA units. We present here the complete nucleotide sequence of an R1 insertion from B. mori (R1Bm). This 5.1-kilobase element contains two overlapping open reading frames (ORFs) which together occupy 88% of its length. ORF1 is 461 amino acids in length and exhibits characteristics of retroviral gag genes. ORF2 is 1,051 amino acids in length and contains homology to reverse transcriptase-like enzymes. The analysis of 3' and 5' ends of independent isolates from the ribosomal locus supports the suggestion that R1 is still functioning as a transposable element. The precise location of the element within the genome implies that its transposition must occur with remarkable insertion sequence specificity. Comparison of the deduced amino acid sequences from six retrotransposons, R1 and R2 of B. mori, I factor and F element of Drosophila melanogaster, L1 of Mus domesticus, and Ingi of Trypanosoma brucei, reveals a relatively high level of sequence homology in the reverse transcriptase region. Like R1, these elements lack long terminal repeats. We have therefore named this class of related elements the non-long-terminal-repeat (non-LTR) retrotransposons. PMID:2447482

  10. Drosophila Muller F elements maintain a distinct set of genomic properties over 40 million years of evolution.

    PubMed

    Leung, Wilson; Shaffer, Christopher D; Reed, Laura K; Smith, Sheryl T; Barshop, William; Dirkes, William; Dothager, Matthew; Lee, Paul; Wong, Jeannette; Xiong, David; Yuan, Han; Bedard, James E J; Machone, Joshua F; Patterson, Seantay D; Price, Amber L; Turner, Bryce A; Robic, Srebrenka; Luippold, Erin K; McCartha, Shannon R; Walji, Tezin A; Walker, Chelsea A; Saville, Kenneth; Abrams, Marita K; Armstrong, Andrew R; Armstrong, William; Bailey, Robert J; Barberi, Chelsea R; Beck, Lauren R; Blaker, Amanda L; Blunden, Christopher E; Brand, Jordan P; Brock, Ethan J; Brooks, Dana W; Brown, Marie; Butzler, Sarah C; Clark, Eric M; Clark, Nicole B; Collins, Ashley A; Cotteleer, Rebecca J; Cullimore, Peterson R; Dawson, Seth G; Docking, Carter T; Dorsett, Sasha L; Dougherty, Grace A; Downey, Kaitlyn A; Drake, Andrew P; Earl, Erica K; Floyd, Trevor G; Forsyth, Joshua D; Foust, Jonathan D; Franchi, Spencer L; Geary, James F; Hanson, Cynthia K; Harding, Taylor S; Harris, Cameron B; Heckman, Jonathan M; Holderness, Heather L; Howey, Nicole A; Jacobs, Dontae A; Jewell, Elizabeth S; Kaisler, Maria; Karaska, Elizabeth A; Kehoe, James L; Koaches, Hannah C; Koehler, Jessica; Koenig, Dana; Kujawski, Alexander J; Kus, Jordan E; Lammers, Jennifer A; Leads, Rachel R; Leatherman, Emily C; Lippert, Rachel N; Messenger, Gregory S; Morrow, Adam T; Newcomb, Victoria; Plasman, Haley J; Potocny, Stephanie J; Powers, Michelle K; Reem, Rachel M; Rennhack, Jonathan P; Reynolds, Katherine R; Reynolds, Lyndsey A; Rhee, Dong K; Rivard, Allyson B; Ronk, Adam J; Rooney, Meghan B; Rubin, Lainey S; Salbert, Luke R; Saluja, Rasleen K; Schauder, Taylor; Schneiter, Allison R; Schulz, Robert W; Smith, Karl E; Spencer, Sarah; Swanson, Bryant R; Tache, Melissa A; Tewilliager, Ashley A; Tilot, Amanda K; VanEck, Eve; Villerot, Matthew M; Vylonis, Megan B; Watson, David T; Wurzler, Juliana A; Wysocki, Lauren M; Yalamanchili, Monica; Zaborowicz, Matthew A; Emerson, Julia A; Ortiz, Carlos; Deuschle, Frederic 
J; DiLorenzo, Lauren A; Goeller, Katie L; Macchi, Christopher R; Muller, Sarah E; Pasierb, Brittany D; Sable, Joseph E; Tucci, Jessica M; Tynon, Marykathryn; Dunbar, David A; Beken, Levent H; Conturso, Alaina C; Danner, Benjamin L; DeMichele, Gabriella A; Gonzales, Justin A; Hammond, Maureen S; Kelley, Colleen V; Kelly, Elisabeth A; Kulich, Danielle; Mageeney, Catherine M; McCabe, Nikie L; Newman, Alyssa M; Spaeder, Lindsay A; Tumminello, Richard A; Revie, Dennis; Benson, Jonathon M; Cristostomo, Michael C; DaSilva, Paolo A; Harker, Katherine S; Jarrell, Jenifer N; Jimenez, Luis A; Katz, Brandon M; Kennedy, William R; Kolibas, Kimberly S; LeBlanc, Mark T; Nguyen, Trung T; Nicolas, Daniel S; Patao, Melissa D; Patao, Shane M; Rupley, Bryan J; Sessions, Bridget J; Weaver, Jennifer A; Goodman, Anya L; Alvendia, Erica L; Baldassari, Shana M; Brown, Ashley S; Chase, Ian O; Chen, Maida; Chiang, Scott; Cromwell, Avery B; Custer, Ashley F; DiTommaso, Tia M; El-Adaimi, Jad; Goscinski, Nora C; Grove, Ryan A; Gutierrez, Nestor; Harnoto, Raechel S; Hedeen, Heather; Hong, Emily L; Hopkins, Barbara L; Huerta, Vilma F; Khoshabian, Colin; LaForge, Kristin M; Lee, Cassidy T; Lewis, Benjamin M; Lydon, Anniken M; Maniaci, Brian J; Mitchell, Ryan D; Morlock, Elaine V; Morris, William M; Naik, Priyanka; Olson, Nicole C; Osterloh, Jeannette M; Perez, Marcos A; Presley, Jonathan D; Randazzo, Matt J; Regan, Melanie K; Rossi, Franca G; Smith, Melanie A; Soliterman, Eugenia A; Sparks, Ciani J; Tran, Danny L; Wan, Tiffany; Welker, Anne A; Wong, Jeremy N; Sreenivasan, Aparna; Youngblom, Jim; Adams, Andrew; Alldredge, Justin; Bryant, Ashley; Carranza, David; Cifelli, Alyssa; Coulson, Kevin; Debow, Calise; Delacruz, Noelle; Emerson, Charlene; Farrar, Cassandra; Foret, Don; Garibay, Edgar; Gooch, John; Heslop, Michelle; Kaur, Sukhjit; Khan, Ambreen; Kim, Van; Lamb, Travis; Lindbeck, Peter; Lucas, Gabi; Macias, Elizabeth; Martiniuc, Daniela; Mayorga, Lissett; Medina, Joseph; Membreno, Nelson; 
Messiah, Shady; Neufeld, Lacey; Nguyen, San Francisco; Nichols, Zachary; Odisho, George; Peterson, Daymon; Rodela, Laura; Rodriguez, Priscilla; Rodriguez, Vanessa; Ruiz, Jorge; Sherrill, Will; Silva, Valeria; Sparks, Jeri; Statton, Geeta; Townsend, Ashley; Valdez, Isabel; Waters, Mary; Westphal, Kyle; Winkler, Stacey; Zumkehr, Joannee; DeJong, Randall J; Hoogewerf, Arlene J; Ackerman, Cheri M; Armistead, Isaac O; Baatenburg, Lara; Borr, Matthew J; Brouwer, Lindsay K; Burkhart, Brandon J; Bushhouse, Kelsey T; Cesko, Lejla; Choi, Tiffany Y Y; Cohen, Heather; Damsteegt, Amanda M; Darusz, Jess M; Dauphin, Cory M; Davis, Yelena P; Diekema, Emily J; Drewry, Melissa; Eisen, Michelle E M; Faber, Hayley M; Faber, Katherine J; Feenstra, Elizabeth; Felzer-Kim, Isabella T; Hammond, Brandy L; Hendriksma, Jesse; Herrold, Milton R; Hilbrands, Julia A; Howell, Emily J; Jelgerhuis, Sarah A; Jelsema, Timothy R; Johnson, Benjamin K; Jones, Kelly K; Kim, Anna; Kooienga, Ross D; Menyes, Erika E; Nollet, Eric A; Plescher, Brittany E; Rios, Lindsay; Rose, Jenny L; Schepers, Allison J; Scott, Geoff; Smith, Joshua R; Sterling, Allison M; Tenney, Jenna C; Uitvlugt, Chris; VanDyken, Rachel E; VanderVennen, Marielle; Vue, Samantha; Kokan, Nighat P; Agbley, Kwabea; Boham, Sampson K; Broomfield, Daniel; Chapman, Kayla; Dobbe, Ali; Dobbe, Ian; Harrington, William; Ibrahem, Marwan; Kennedy, Andre; Koplinsky, Chad A; Kubricky, Cassandra; Ladzekpo, Danielle; Pattison, Claire; Ramirez, Roman E; Wande, Lucia; Woehlke, Sarah; Wawersik, Matthew; Kiernan, Elizabeth; Thompson, Jeffrey S; Banker, Roxanne; Bartling, Justina R; Bhatiya, Chinmoy I; Boudoures, Anna L; Christiansen, Lena; Fosselman, Daniel S; French, Kristin M; Gill, Ishwar S; Havill, Jessen T; Johnson, Jaelyn L; Keny, Lauren J; Kerber, John M; Klett, Bethany M; Kufel, Christina N; May, Francis J; Mecoli, Jonathan P; Merry, Callie R; Meyer, Lauren R; Miller, Emily G; Mullen, Gregory J; Palozola, Katherine C; Pfeil, Jacob J; Thomas, Jessica G; 
Verbofsky, Evan M; Spana, Eric P; Agarwalla, Anant; Chapman, Julia; Chlebina, Ben; Chong, Insun; Falk, I N; Fitzgibbons, John D; Friedman, Harrison; Ighile, Osagie; Kim, Andrew J; Knouse, Kristin A; Kung, Faith; Mammo, Danny; Ng, Chun Leung; Nikam, Vinayak S; Norton, Diana; Pham, Philip; Polk, Jessica W; Prasad, Shreya; Rankin, Helen; Ratliff, Camille D; Scala, Victoria; Schwartz, Nicholas U; Shuen, Jessica A; Xu, Amy; Xu, Thomas Q; Zhang, Yi; Rosenwald, Anne G; Burg, Martin G; Adams, Stephanie J; Baker, Morgan; Botsford, Bobbi; Brinkley, Briana; Brown, Carter; Emiah, Shadie; Enoch, Erica; Gier, Chad; Greenwell, Alyson; Hoogenboom, Lindsay; Matthews, Jordan E; McDonald, Mitchell; Mercer, Amanda; Monsma, Nicholaus; Ostby, Kristine; Ramic, Alen; Shallman, Devon; Simon, Matthew; Spencer, Eric; Tomkins, Trisha; Wendland, Pete; Wylie, Anna; Wolyniak, Michael J; Robertson, Gregory M; Smith, Samuel I; DiAngelo, Justin R; Sassu, Eric D; Bhalla, Satish C; Sharif, Karim A; Choeying, Tenzin; Macias, Jason S; Sanusi, Fareed; Torchon, Karvyn; Bednarski, April E; Alvarez, Consuelo J; Davis, Kristen C; Dunham, Carrie A; Grantham, Alaina J; Hare, Amber N; Schottler, Jennifer; Scott, Zackary W; Kuleck, Gary A; Yu, Nicole S; Kaehler, Marian M; Jipp, Jacob; Overvoorde, Paul J; Shoop, Elizabeth; Cyrankowski, Olivia; Hoover, Betsy; Kusner, Matt; Lin, Devry; Martinov, Tijana; Misch, Jonathan; Salzman, Garrett; Schiedermayer, Holly; Snavely, Michael; Zarrasola, Stephanie; Parrish, Susan; Baker, Atlee; Beckett, Alissa; Belella, Carissa; Bryant, Julie; Conrad, Turner; Fearnow, Adam; Gomez, Carolina; Herbstsomer, Robert A; Hirsch, Sarah; Johnson, Christen; Jones, Melissa; Kabaso, Rita; Lemmon, Eric; Vieira, Carolina Marques Dos Santos; McFarland, Darryl; McLaughlin, Christopher; Morgan, Abbie; Musokotwane, Sepo; Neutzling, William; Nietmann, Jana; Paluskievicz, Christina; Penn, Jessica; Peoples, Emily; Pozmanter, Caitlin; Reed, Emily; Rigby, Nichole; Schmidt, Lasse; Shelton, Micah; Shuford, 
Rebecca; Tirasawasdichai, Tiara; Undem, Blair; Urick, Damian; Vondy, Kayla; Yarrington, Bryan; Eckdahl, Todd T; Poet, Jeffrey L; Allen, Alica B; Anderson, John E; Barnett, Jason M; Baumgardner, Jordan S; Brown, Adam D; Carney, Jordan E; Chavez, Ramiro A; Christgen, Shelbi L; Christie, Jordan S; Clary, Andrea N; Conn, Michel A; Cooper, Kristen M; Crowley, Matt J; Crowley, Samuel T; Doty, Jennifer S; Dow, Brian A; Edwards, Curtis R; Elder, Darcie D; Fanning, John P; Janssen, Bridget M; Lambright, Anthony K; Lane, Curtiss E; Limle, Austin B; Mazur, Tammy; McCracken, Marly R; McDonough, Alexa M; Melton, Amy D; Minnick, Phillip J; Musick, Adam E; Newhart, William H; Noynaert, Joseph W; Ogden, Bradley J; Sandusky, Michael W; Schmuecker, Samantha M; Shipman, Anna L; Smith, Anna L; Thomsen, Kristen M; Unzicker, Matthew R; Vernon, William B; Winn, Wesley W; Woyski, Dustin S; Zhu, Xiao; Du, Chunguang; Ament, Caitlin; Aso, Soham; Bisogno, Laura Simone; Caronna, Jason; Fefelova, Nadezhda; Lopez, Lenin; Malkowitz, Lorraine; Marra, Jonathan; Menillo, Daniella; Obiorah, Ifeanyi; Onsarigo, Eric Nyabeta; Primus, Shekerah; Soos, Mahdi; Tare, Archana; Zidan, Ameer; Jones, Christopher J; Aronhalt, Todd; Bellush, James M; Burke, Christa; DeFazio, Steve; Does, Benjamin R; Johnson, Todd D; Keysock, Nicholas; Knudsen, Nelson H; Messler, James; Myirski, Kevin; Rekai, Jade Lea; Rempe, Ryan Michael; Salgado, Michael S; Stagaard, Erica; Starcher, Justin R; Waggoner, Andrew W; Yemelyanova, Anastasia K; Hark, Amy T; Bertolet, Anne; Kuschner, Cyrus E; Parry, Kesley; Quach, Michael; Shantzer, Lindsey; Shaw, Mary E; Smith, Mary A; Glenn, Omolara; Mason, Portia; Williams, Charlotte; Key, S Catherine Silver; Henry, Tyneshia C P; Johnson, Ashlee G; White, Jackie X; Haberman, Adam; Asinof, Sam; Drumm, Kelly; Freeburg, Trip; Safa, Nadia; Schultz, Darrin; Shevin, Yakov; Svoronos, Petros; Vuong, Tam; Wellinghoff, Jules; Hoopes, Laura L M; Chau, Kim M; Ward, Alyssa; Regisford, E Gloria C; Augustine, 
LaJerald; Davis-Reyes, Brionna; Echendu, Vivienne; Hales, Jasmine; Ibarra, Sharon; Johnson, Lauriaun; Ovu, Steven; Braverman, John M; Bahr, Thomas J; Caesar, Nicole M; Campana, Christopher; Cassidy, Daniel W; Cognetti, Peter A; English, Johnathan D; Fadus, Matthew C; Fick, Cameron N; Freda, Philip J; Hennessy, Bryan M; Hockenberger, Kelsey; Jones, Jennifer K; King, Jessica E; Knob, Christopher R; Kraftmann, Karen J; Li, Linghui; Lupey, Lena N; Minniti, Carl J; Minton, Thomas F; Moran, Joseph V; Mudumbi, Krishna; Nordman, Elizabeth C; Puetz, William J; Robinson, Lauren M; Rose, Thomas J; Sweeney, Edward P; Timko, Ashley S; Paetkau, Don W; Eisler, Heather L; Aldrup, Megan E; Bodenberg, Jessica M; Cole, Mara G; Deranek, Kelly M; DeShetler, Megan; Dowd, Rose M; Eckardt, Alexandra K; Ehret, Sharon C; Fese, Jessica; Garrett, Amanda D; Kammrath, Anna; Kappes, Michelle L; Light, Morgan R; Meier, Anne C; O'Rouke, Allison; Perella, Mallory; Ramsey, Kimberley; Ramthun, Jennifer R; Reilly, Mary T; Robinett, Deirdre; Rossi, Nadine L; Schueler, Mary Grace; Shoemaker, Emma; Starkey, Kristin M; Vetor, Ashley; Vrable, Abby; Chandrasekaran, Vidya; Beck, Christopher; Hatfield, Kristen R; Herrick, Douglas A; Khoury, Christopher B; Lea, Charlotte; Louie, Christopher A; Lowell, Shannon M; Reynolds, Thomas J; Schibler, Jeanine; Scoma, Alexandra H; Smith-Gee, Maxwell T; Tuberty, Sarah; Smith, Christopher D; Lopilato, Jane E; Hauke, Jeanette; Roecklein-Canfield, Jennifer A; Corrielus, Maureen; Gilman, Hannah; Intriago, Stephanie; Maffa, Amanda; Rauf, Sabya A; Thistle, Katrina; Trieu, Melissa; Winters, Jenifer; Yang, Bib; Hauser, Charles R; Abusheikh, Tariq; Ashrawi, Yara; Benitez, Pedro; Boudreaux, Lauren R; Bourland, Megan; Chavez, Miranda; Cruz, Samantha; Elliott, GiNell; Farek, Jesse R; Flohr, Sarah; Flores, Amanda H; Friedrichs, Chelsey; Fusco, Zach; Goodwin, Zane; Helmreich, Eric; Kiley, John; Knepper, John Mark; Langner, Christine; Martinez, Megan; Mendoza, Carlos; Naik, Monal; 
Ochoa, Andrea; Ragland, Nicolas; Raimey, England; Rathore, Sunil; Reza, Evangelina; Sadovsky, Griffin; Seydoux, Marie-Isabelle B; Smith, Jonathan E; Unruh, Anna K; Velasquez, Vicente; Wolski, Matthew W; Gosser, Yuying; Govind, Shubha; Clarke-Medley, Nicole; Guadron, Leslie; Lau, Dawn; Lu, Alvin; Mazzeo, Cheryl; Meghdari, Mariam; Ng, Simon; Pamnani, Brad; Plante, Olivia; Shum, Yuki Kwan Wa; Song, Roy; Johnson, Diana E; Abdelnabi, Mai; Archambault, Alexi; Chamma, Norma; Gaur, Shailly; Hammett, Deborah; Kandahari, Adrese; Khayrullina, Guzal; Kumar, Sonali; Lawrence, Samantha; Madden, Nigel; Mandelbaum, Max; Milnthorp, Heather; Mohini, Shiv; Patel, Roshni; Peacock, Sarah J; Perling, Emily; Quintana, Amber; Rahimi, Michael; Ramirez, Kristen; Singhal, Rishi; Weeks, Corinne; Wong, Tiffany; Gillis, Aubree T; Moore, Zachary D; Savell, Christopher D; Watson, Reece; Mel, Stephanie F; Anilkumar, Arjun A; Bilinski, Paul; Castillo, Rostislav; Closser, Michael; Cruz, Nathalia M; Dai, Tiffany; Garbagnati, Giancarlo F; Horton, Lanor S; Kim, Dongyeon; Lau, Joyce H; Liu, James Z; Mach, Sandy D; Phan, Thu A; Ren, Yi; Stapleton, Kenneth E; Strelitz, Jean M; Sunjed, Ray; Stamm, Joyce; Anderson, Morgan C; Bonifield, Bethany Grace; Coomes, Daniel; Dillman, Adam; Durchholz, Elaine J; Fafara-Thompson, Antoinette E; Gross, Meleah J; Gygi, Amber M; Jackson, Lesley E; Johnson, Amy; Kocsisova, Zuzana; Manghelli, Joshua L; McNeil, Kylie; Murillo, Michael; Naylor, Kierstin L; Neely, Jessica; Ogawa, Emmy E; Rich, Ashley; Rogers, Anna; Spencer, J Devin; Stemler, Kristina M; Throm, Allison A; Van Camp, Matt; Weihbrecht, Katie; Wiles, T Aaron; Williams, Mallory A; Williams, Matthew; Zoll, Kyle; Bailey, Cheryl; Zhou, Leming; Balthaser, Darla M; Bashiri, Azita; Bower, Mindy E; Florian, Kayla A; Ghavam, Nazanin; Greiner-Sosanko, Elizabeth S; Karim, Helmet; Mullen, Victor W; Pelchen, Carly E; Yenerall, Paul M; Zhang, Jiayu; Rubin, Michael R; Arias-Mejias, Suzette M; Bermudez-Capo, Armando G; Bernal-Vega, 
Gabriela V; Colon-Vazquez, Mariela; Flores-Vazquez, Arelys; Gines-Rosario, Mariela; Llavona-Cartagena, Ivan G; Martinez-Rodriguez, Javier O; Ortiz-Fuentes, Lionel; Perez-Colomba, Eliezer O; Perez-Otero, Joseph; Rivera, Elisandra; Rodriguez-Giron, Luke J; Santiago-Sanabria, Arnaldo J; Senquiz-Gonzalez, Andrea M; delValle, Frank R Soto; Vargas-Franco, Dorianmarie; Velázquez-Soto, Karla I; Zambrana-Burgos, Joan D; Martinez-Cruzado, Juan Carlos; Asencio-Zayas, Lillyann; Babilonia-Figueroa, Kevin; Beauchamp-Pérez, Francis D; Belén-Rodríguez, Juliana; Bracero-Quiñones, Luciann; Burgos-Bula, Andrea P; Collado-Méndez, Xavier A; Colón-Cruz, Luis R; Correa-Muller, Ana I; Crooke-Rosado, Jonathan L; Cruz-García, José M; Defendini-Ávila, Marianna; Delgado-Peraza, Francheska M; Feliciano-Cancela, Alex J; Gónzalez-Pérez, Valerie M; Guiblet, Wilfried; Heredia-Negrón, Aldo; Hernández-Muñiz, Jennifer; Irizarry-González, Lourdes N; Laboy-Corales, Ángel L; Llaurador-Caraballo, Gabriela A; Marín-Maldonado, Frances; Marrero-Llerena, Ulises; Martell-Martínez, Héctor A; Martínez-Traverso, Idaliz M; Medina-Ortega, Kiara N; Méndez-Castellanos, Sonya G; Menéndez-Serrano, Krizia C; Morales-Caraballo, Carol I; Ortiz-DeChoudens, Saryleine; Ortiz-Ortiz, Patricia; Pagán-Torres, Hendrick; Pérez-Afanador, Diana; Quintana-Torres, Enid M; Ramírez-Aponte, Edwin G; Riascos-Cuero, Carolina; Rivera-Llovet, Michelle S; Rivera-Pagán, Ingrid T; Rivera-Vicéns, Ramón E; Robles-Juarbe, Fabiola; Rodríguez-Bonilla, Lorraine; Rodríguez-Echevarría, Brian O; Rodríguez-García, Priscila M; Rodríguez-Laboy, Abneris E; Rodríguez-Santiago, Susana; Rojas-Vargas, Michael L; Rubio-Marrero, Eva N; Santiago-Colón, Albeliz; Santiago-Ortiz, Jorge L; Santos-Ramos, Carlos E; Serrano-González, Joseline; Tamayo-Figueroa, Alina M; Tascón-Peñaranda, Edna P; Torres-Castillo, José L; Valentín-Feliciano, Nelson A; Valentín-Feliciano, Yashira M; Vargas-Barreto, Nadyan M; Vélez-Vázquez, Miguel; Vilanova-Vélez, Luis R; 
Zambrana-Echevarría, Cristina; MacKinnon, Christy; Chung, Hui-Min; Kay, Chris; Pinto, Anthony; Kopp, Olga R; Burkhardt, Joshua; Harward, Chris; Allen, Robert; Bhat, Pavan; Chang, Jimmy Hsiang-Chun; Chen, York; Chesley, Christopher; Cohn, Dara; DuPuis, David; Fasano, Michael; Fazzio, Nicholas; Gavinski, Katherine; Gebreyesus, Heran; Giarla, Thomas; Gostelow, Marcus; Greenstein, Rachel; Gunasinghe, Hashini; Hanson, Casey; Hay, Amanda; He, Tao Jian; Homa, Katie; Howe, Ruth; Howenstein, Jeff; Huang, Henry; Khatri, Aaditya; Kim, Young Lu; Knowles, Olivia; Kong, Sarah; Krock, Rebecca; Kroll, Matt; Kuhn, Julia; Kwong, Matthew; Lee, Brandon; Lee, Ryan; Levine, Kevin; Li, Yedda; Liu, Bo; Liu, Lucy; Liu, Max; Lousararian, Adam; Ma, Jimmy; Mallya, Allyson; Manchee, Charlie; Marcus, Joseph; McDaniel, Stephen; Miller, Michelle L; Molleston, Jerome M; Diez, Cristina Montero; Ng, Patrick; Ngai, Natalie; Nguyen, Hien; Nylander, Andrew; Pollack, Jason; Rastogi, Suchita; Reddy, Himabindu; Regenold, Nathaniel; Sarezky, Jon; Schultz, Michael; Shim, Jien; Skorupa, Tara; Smith, Kenneth; Spencer, Sarah J; Srikanth, Priya; Stancu, Gabriel; Stein, Andrew P; Strother, Marshall; Sudmeier, Lisa; Sun, Mengyang; Sundaram, Varun; Tazudeen, Noor; Tseng, Alan; Tzeng, Albert; Venkat, Rohit; Venkataram, Sandeep; Waldman, Leah; Wang, Tracy; Yang, Hao; Yu, Jack Y; Zheng, Yin; Preuss, Mary L; Garcia, Angelica; Juergens, Matt; Morris, Robert W; Nagengast, Alexis A; Azarewicz, Julie; Carr, Thomas J; Chichearo, Nicole; Colgan, Mike; Donegan, Megan; Gardner, Bob; Kolba, Nik; Krumm, Janice L; Lytle, Stacey; MacMillian, Laurell; Miller, Mary; Montgomery, Andrew; Moretti, Alysha; Offenbacker, Brittney; Polen, Mike; Toth, John; Woytanowski, John; Kadlec, Lisa; Crawford, Justin; Spratt, Mary L; Adams, Ashley L; Barnard, Brianna K; Cheramie, Martin N; Eime, Anne M; Golden, Kathryn L; Hawkins, Allyson P; Hill, Jessica E; Kampmeier, Jessica A; Kern, Cody D; Magnuson, Emily E; Miller, Ashley R; Morrow, Cody M; 
Peairs, Julia C; Pickett, Gentry L; Popelka, Sarah A; Scott, Alexis J; Teepe, Emily J; TerMeer, Katie A; Watchinski, Carmen A; Watson, Lucas A; Weber, Rachel E; Woodard, Kate A; Barnard, Daron C; Appiah, Isaac; Giddens, Michelle M; McNeil, Gerard P; Adebayo, Adeola; Bagaeva, Kate; Chinwong, Justina; Dol, Chrystel; George, Eunice; Haltaufderhyde, Kirk; Haye, Joanna; Kaur, Manpreet; Semon, Max; Serjanov, Dmitri; Toorie, Anika; Wilson, Christopher; Riddle, Nicole C; Buhler, Jeremy; Mardis, Elaine R; Elgin, Sarah C R

    2015-03-04

    The Muller F element (4.2 Mb, ~80 protein-coding genes) is an unusual autosome of Drosophila melanogaster; it is mostly heterochromatic with a low recombination rate. To investigate how these properties impact the evolution of repeats and genes, we manually improved the sequence and annotated the genes on the D. erecta, D. mojavensis, and D. grimshawi F elements and euchromatic domains from the Muller D element. We find that F elements have greater transposon density (25-50%) than euchromatic reference regions (3-11%). Among the F elements, D. grimshawi has the lowest transposon density (particularly DINE-1: 2% vs. 11-27%). F element genes have larger coding spans, more coding exons, larger introns, and lower codon bias. Comparison of the Effective Number of Codons with the Codon Adaptation Index shows that, in contrast to the other species, codon bias in D. grimshawi F element genes can be attributed primarily to selection instead of mutational biases, suggesting that density and types of transposons affect the degree of local heterochromatin formation. F element genes have lower estimated DNA melting temperatures than D element genes, potentially facilitating transcription through heterochromatin. Most F element genes (~90%) have remained on that element, but the F element has smaller syntenic blocks than genome averages (3.4-3.6 vs. 8.4-8.8 genes per block), indicating greater rates of inversion despite lower rates of recombination. Overall, the F element has maintained characteristics that are distinct from other autosomes in the Drosophila lineage, illuminating the constraints imposed by a heterochromatic milieu. Copyright © 2015 Leung et al.

  11. Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer

    PubMed Central

    D’Addabbo, Pietro; Caizzi, Ruggiero

    2016-01-01

    Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the Bari transposon’s evolutionary dynamics and increases our understanding of the Tc1-mariner elements’ biology. PMID:27213270

  12. Finite Element Analysis of Tube Hydroforming in Non-Symmetrical Dies

    NASA Astrophysics Data System (ADS)

    Nulkar, Abhishek V.; Gu, Randy; Murty, Pilaka

    2011-08-01

    Tube hydroforming has been studied intensively using commercial finite element programs. A great deal of the investigations dealt with models with symmetric cross-sections. It is known that additional constraints due to symmetry may be imposed on the model so that it is properly supported. For a non-symmetric model, these constraints become invalid and the model does not have sufficient support, resulting in a singular finite element system. The majority of commercial codes have a limited capability in solving models with insufficient supports. Recently, new algorithms using penalty variable and air-like contact element (ALCE) have been developed to solve positive semi-definite finite element systems such as those in contact mechanics. In this study the ALCE algorithm is first validated by comparing its result against a commercial code using a symmetric model in which a circular tube is formed to polygonal dies with symmetric shapes. Then, the study investigates the accuracy and efficiency of using ALCE in analyzing hydroforming of tubes with various cross-sections in non-symmetrical dies in 2-D finite element settings.

  13. Differential DNA methylation at conserved non-genic elements and evidence for transgenerational inheritance following developmental exposure to mono(2-ethylhexyl) phthalate and 5-azacytidine in zebrafish.

    PubMed

    Kamstra, Jorke H; Sales, Liana Bastos; Aleström, Peter; Legler, Juliette

    2017-01-01

    Exposure to environmental stressors during development may lead to latent and transgenerational adverse health effects. To understand the role of DNA methylation in these effects, we used zebrafish as a vertebrate model to investigate heritable changes in DNA methylation following chemical-induced stress during early development. We exposed zebrafish embryos to non-embryotoxic concentrations of the biologically active phthalate metabolite mono(2-ethylhexyl) phthalate (MEHP, 30 µM) and the DNA methyltransferase 1 inhibitor 5-azacytidine (5AC, 10 µM). Direct, latent and transgenerational effects on DNA methylation were assessed using global, genome-wide and locus-specific DNA methylation analyses. Following direct exposure in zebrafish embryos from 0 to 6 days post-fertilization, genome-wide analysis revealed a multitude of differentially methylated regions, strongly enriched at conserved non-genic elements for both compounds. Pathways involved in adipogenesis were enriched with the putative obesogenic compound MEHP. Exposure to 5AC resulted in enrichment of pathways involved in embryonic development and transgenerational effects on larval body length. Locus-specific methylation analysis of 10 differentially methylated sites revealed six of these loci differentially methylated in sperm sampled from adult zebrafish exposed during development to 5AC, and in first and second generation larvae. With MEHP, consistent changes were found at 2 specific loci in first and second generation larvae. Our results suggest a functional role for DNA methylation on cis-regulatory conserved elements following developmental exposure to compounds. Effects on these regions are potentially transferred to subsequent generations.

  14. DDM1 represses noncoding RNA expression and RNA-directed DNA methylation in heterochromatin.

    PubMed

    Tan, Feng; Lu, Yue; Jiang, Wei; Zhao, Yu; Wu, Tian; Zhang, Ruoyu; Zhou, Dao-Xiu

    2018-05-24

    Cytosine methylation of DNA, which occurs at CG, CHG, and CHH (H=A, C, or T) sequences in plants, is a hallmark for epigenetic repression of repetitive sequences. The chromatin remodeling factor DECREASE IN DNA METHYLATION1 (DDM1) is essential for DNA methylation, especially at CG and CHG sequences. However, its potential role in RNA-directed DNA methylation (RdDM) and in chromatin function is not completely understood in rice (Oryza sativa). In this work, we used high-throughput approaches to study the function of rice DDM1 (OsDDM1) in RdDM and the expression of non-coding RNA (ncRNA). We show that loss of function of OsDDM1 results in ectopic CHH methylation of transposable elements and repeats. The ectopic CHH methylation was dependent on rice DOMAINS REARRANGED METHYLTRANSFERASE2 (OsDRM2), a DNA methyltransferase involved in RdDM. Mutations in OsDDM1 lead to decreases of histone H3K9me2 and increases in the levels of heterochromatic small RNA (sRNA) and long noncoding RNA (lncRNA). In particular, OsDDM1 was found to be essential to repress transcription of the two repetitive sequences, Centromeric Retrotransposons of Rice1 (CRR1) and the dominant centromeric CentO repeats. These results suggest that OsDDM1 antagonizes RdDM at heterochromatin and represses tissue-specific expression of ncRNA from repetitive sequences in the rice genome. © 2018 American Society of Plant Biologists. All rights reserved.

  15. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
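
    The Detrended Fluctuation Analysis named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the pyrimidine/purine step mapping, the box sizes, and the function names (dna_walk, dfa_fluctuation, dfa_exponent) are assumptions chosen for the example. The sequence is turned into a walk, the walk is integrated, a linear trend is removed inside each box of length n, and the scaling exponent is read off as the slope of log F(n) against log n. An exponent near 0.5 indicates no long-range correlation.

```python
import math
import random


def dna_walk(seq):
    """Map bases to steps: +1 for pyrimidines (C, T), -1 for purines (A, G).

    This mapping is an assumption made for the sketch.
    """
    return [1.0 if base in "CT" else -1.0 for base in seq.upper() if base in "ACGT"]


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(steps, box):
    """Root-mean-square fluctuation F(box) of the locally detrended profile."""
    mean_step = sum(steps) / len(steps)
    profile, running = [], 0.0
    for step in steps:
        running += step - mean_step  # integrate the mean-centred walk
        profile.append(running)
    n_boxes = len(profile) // box
    xs = list(range(box))
    squared = 0.0
    for b in range(n_boxes):
        segment = profile[b * box:(b + 1) * box]
        slope, intercept = linear_fit(xs, segment)  # local linear trend
        squared += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment))
    return math.sqrt(squared / (n_boxes * box))


def dfa_exponent(steps, boxes):
    """Slope of log F(n) versus log n over the given box sizes."""
    log_n = [math.log(b) for b in boxes]
    log_f = [math.log(dfa_fluctuation(steps, b)) for b in boxes]
    slope, _ = linear_fit(log_n, log_f)
    return slope


# Synthetic, uncorrelated sequence (not GenBank data): exponent should sit near 0.5.
random.seed(0)
sequence = "".join(random.choice("ACGT") for _ in range(4096))
print(dfa_exponent(dna_walk(sequence), [8, 16, 32, 64, 128]))
```

    On real sequences the choice of box sizes and the handling of ambiguous bases would need more care than this sketch gives them.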

  16. MD simulations of papillomavirus DNA-E2 protein complexes hints at a protein structural code for DNA deformation.

    PubMed

    Falconi, M; Oteri, F; Eliseo, T; Cicero, D O; Desideri, A

    2008-08-01

    The structural dynamics of the DNA binding domains of the human papillomavirus strain 16 and the bovine papillomavirus strain 1, complexed with their DNA targets, has been investigated by modeling, molecular dynamics simulations, and nuclear magnetic resonance analysis. The simulations underline different dynamical features of the protein scaffolds and a different mechanical interaction of the two proteins with DNA. The two protein structures, although very similar, show differences in the relative mobility of secondary structure elements. Protein structural analyses, principal component analysis, and geometrical and energetic DNA analyses indicate that the two transcription factors utilize a different strategy in DNA recognition and deformation. Results show that the protein indirect DNA readout is not only addressable to the DNA molecule flexibility but it is finely tuned by the mechanical and dynamical properties of the protein scaffold involved in the interaction.

  17. Regulation of the yeast RAD2 gene: DNA damage-dependent induction correlates with protein binding to regulatory sequences and their deletion influences survival.

    PubMed

    Siede, W; Friedberg, E C

    1992-03-01

    In the yeast Saccharomyces cerevisiae the RAD2 gene is absolutely required for damage-specific incision of DNA during nucleotide excision repair and is inducible by DNA-damaging agents. In the present study we correlated sensitivity to killing by DNA-damaging agents with the deletion of previously defined specific promoter elements. Deletion of the element DRE2 increased the UV sensitivity of cells in both the G1/early S and S/G2 phases of the cell cycle as well as in stationary phase. On the other hand, increased UV sensitivity associated with deletion of the sequence-related element DRE1 was restricted to cells irradiated in G1/S. Specific binding of protein(s) to the promoter elements DRE1 and DRE2 was observed under non-inducing conditions using gel retardation assays. Exposure of cells to DNA-damaging agents resulted in increased protein binding that was dependent on de novo protein synthesis.

  18. Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

    PubMed

    Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

    1985-09-01

    The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this.

  19. The development of non-coding RNA ontology.

    PubMed

    Huang, Jingshan; Eilbeck, Karen; Smith, Barry; Blake, Judith A; Dou, Dejing; Huang, Weili; Natale, Darren A; Ruttenberg, Alan; Huan, Jun; Zimmermann, Michael T; Jiang, Guoqian; Lin, Yu; Wu, Bin; Strachan, Harrison J; de Silva, Nisansa; Kasukurthi, Mohan Vamsi; Jha, Vikash Kumar; He, Yongqun; Zhang, Shaojie; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming

    2016-01-01

    Identification of non-coding RNAs (ncRNAs) has been significantly improved over the past decade. On the other hand, semantic annotation of ncRNA data is facing critical challenges due to the lack of a comprehensive ontology to serve as common data elements and data exchange standards in the field. We developed the Non-Coding RNA Ontology (NCRO) to handle this situation. By providing a formally defined ncRNA controlled vocabulary, the NCRO aims to fill a specific and highly needed niche in semantic annotation of large amounts of ncRNA biological and clinical data.

  20. Characterization of a species-specific repetitive DNA from a highly endangered wild animal, Rhinoceros unicornis, and assessment of genetic polymorphism by microsatellite associated sequence amplification (MASA).

    PubMed

    Ali, S; Azfer, M A; Bashamboo, A; Mathur, P K; Malik, P K; Mathur, V B; Raha, A K; Ansari, S

    1999-03-04

    We have cloned and sequenced a 906bp EcoRI repeat DNA fraction from Rhinoceros unicornis genome. The contig pSS(R)2 is AT rich with 340 A (37.53%), 187 C (20.64%), 173 G (19.09%) and 206 T (22.74%). The sequence contains MALT box, NF-E1, Poly-A signal, lariat consensus sequences, TATA box, translational initiation sequences and several stop codons. Translation of the contig showed seven different types of protein motifs, among which, EGF-like domain cysteine pattern signatures and Bowman-Birk serine protease inhibitor family signatures were prominent. The presence of eukaryotic transcriptional elements, protein signatures and analysis of subset sequences in the 5' region from 1 to 165nt indicating coding potential (test code value=0.97) suggest possible regulatory and/or functional role(s) of these sequences in the rhino genome. Translation of the complementary strand from 906 to 706nt and 190 to 2nt showed proteins of more than 7kDa rich in non-polar residues. This suggests that pSS(R)2 is either a part of, or adjacent to, a functional gene. The contig contains mostly non-consecutive simple repeat units from 2 to 17nt with varying frequencies, of which four base motifs were found to be predominant. Zoo-blot hybridization revealed that pSS(R)2 sequences are unique to R. unicornis genome because they do not cross-hybridize, even with the genomic DNA of South African black rhino Diceros bicornis. Southern blot analysis of R. unicornis genomic DNA with pSS(R)2 and other synthetic oligo probes revealed a high level of genetic homogeneity, which was also substantiated by microsatellite associated sequence amplification (MASA). Owing to its uniqueness, the pSS(R)2 probe has a potential application in the area of conservation biology for unequivocal identification of horn or other body tissues of R. unicornis. The evolutionary aspect of this repeat fraction in the context of comparative genome analysis is discussed.
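
    The base composition quoted in this abstract can be checked with a few lines of arithmetic. The sketch below only re-derives the percentages from the four counts given above; the function name base_composition is a hypothetical helper, not something from the paper.

```python
def base_composition(counts):
    """Return (total length, percentage per base) for a dict of base counts."""
    total = sum(counts.values())
    percent = {base: 100.0 * n / total for base, n in counts.items()}
    return total, percent


# Counts as reported for the pSS(R)2 contig.
counts = {"A": 340, "C": 187, "G": 173, "T": 206}
total, percent = base_composition(counts)
at_content = percent["A"] + percent["T"]

print(total)  # 906, matching the stated contig length
for base in "ACGT":
    print(base, round(percent[base], 2))
print("A+T", round(at_content, 2))
```

    The counts sum to the stated 906 bp, and A plus T comes to roughly 60% of the contig, which is consistent with the description of the sequence as AT rich.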

  1. Sequence-dependent modelling of local DNA bending phenomena: curvature prediction and vibrational analysis.

    PubMed

    Vlahovicek, K; Munteanu, M G; Pongor, S

    1999-01-01

    Bending is a local conformational micropolymorphism of DNA in which the original B-DNA structure is only distorted but not extensively modified. Bending can be predicted by simple static geometry models as well as by a recently developed elastic model that incorporates sequence dependent anisotropic bendability (SDAB). The SDAB model qualitatively explains phenomena including affinity of protein binding, kinking, as well as sequence-dependent vibrational properties of DNA. The vibrational properties of DNA segments can be studied by finite element analysis of a model subjected to an initial bending moment. The frequency spectrum is obtained by applying Fourier analysis to the displacement values in the time domain. This analysis shows that the spectrum of the bending vibrations quite sensitively depends on the sequence, for example the spectrum of a curved sequence is characteristically different from the spectrum of straight sequence motifs of identical basepair composition. Curvature distributions are genome-specific, and pronounced differences are found between protein-coding and regulatory regions, respectively, that is, sites of extreme curvature and/or bendability are less frequent in protein-coding regions. A WWW server is set up for the prediction of curvature and generation of 3D models from DNA sequences (http://www.icgeb.trieste.it/dna).
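
    The step of obtaining a frequency spectrum from time-domain displacement values can be illustrated with a plain discrete Fourier transform. The sketch below is only a generic example: the displacement signal is synthetic (two sinusoids with made-up frequencies and amplitudes in arbitrary units), it does not come from the SDAB model or any finite element run, and the function name amplitude_spectrum is hypothetical.

```python
import cmath
import math


def amplitude_spectrum(samples, dt):
    """Naive one-sided DFT: returns (frequencies, amplitudes) for real samples.

    dt is the sampling interval; frequencies are in cycles per unit of dt.
    """
    n = len(samples)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        freqs.append(k / (n * dt))
        amps.append(abs(coeff) / n)
    return freqs, amps


# Synthetic displacement trace: a strong 5-unit mode plus a weaker 12-unit mode.
dt = 0.01
n_samples = 200
displacement = [
    1.0 * math.sin(2 * math.pi * 5.0 * i * dt)
    + 0.4 * math.sin(2 * math.pi * 12.0 * i * dt)
    for i in range(n_samples)
]

freqs, amps = amplitude_spectrum(displacement, dt)
peak = max(range(len(amps)), key=lambda k: amps[k])
print("dominant frequency:", freqs[peak])
```

    Comparing such spectra for two traces, for instance from a curved and a straight segment, would then amount to comparing the positions and heights of the peaks. For long traces a fast Fourier transform routine would replace the quadratic-time loop used here.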

  2. RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”

    PubMed Central

    Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu

    2012-01-01

    Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
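
    The word "disproportionate" in this abstract can be made concrete with a back-of-the-envelope calculation. The sketch below is not the authors' analysis. It assumes that sRNAs would otherwise fall independently and uniformly along the genome, and it turns the approximate figures in the abstract (94 sRNAs, about 30% of them in a region covering about 18% of the genome) into counts by rounding. The helper binomial_tail is a hypothetical name.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_srna = 94                      # sRNAs reported in the abstract
region_fraction = 0.18           # approximate share of the genome unique to strain 2336
observed = round(0.30 * n_srna)  # about 30% of the sRNAs; rounding is an assumption
expected = n_srna * region_fraction

p_value = binomial_tail(observed, n_srna, region_fraction)
print("observed in unique region:", observed)
print("expected under uniform placement:", expected)
print("one-sided binomial tail probability:", p_value)
```

    Under these simplifying assumptions the observed count sits well above the uniform expectation of about 17, which is in line with the abstract's wording. A proper test would use the exact counts and account for gene density.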

  3. RNA-seq based transcriptional map of bovine respiratory disease pathogen "Histophilus somni 2336".

    PubMed

    Kumar, Ranjit; Lawrence, Mark L; Watt, James; Cooksey, Amanda M; Burgess, Shane C; Nanduri, Bindu

    2012-01-01

    Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify "novel" genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations.

  4. Light element opacities of astrophysical interest from ATOMIC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Colgan, J.; Kilcrease, D. P.; Magee, N. H. Jr.

    We present new calculations of local-thermodynamic-equilibrium (LTE) light element opacities from the Los Alamos ATOMIC code for systems of astrophysical interest. ATOMIC is a multi-purpose code that can generate LTE or non-LTE quantities of interest at various levels of approximation. Our calculations, which include fine-structure detail, represent a systematic improvement over previous Los Alamos opacity calculations using the LEDCOP legacy code. The ATOMIC code uses ab-initio atomic structure data computed from the CATS code, which is based on Cowan's atomic structure codes, and photoionization cross section data computed from the Los Alamos ionization code GIPPER. ATOMIC also incorporates a new equation-of-state (EOS) model based on the chemical picture. ATOMIC incorporates some physics packages from LEDCOP and also includes additional physical processes, such as improved free-free cross sections and additional scattering mechanisms. Our new calculations are made for elements of astrophysical interest and for a wide range of temperatures and densities.

  5. Genome-wide DNA methylation patterns in LSH mutant reveals de-repression of repeat elements and redundant epigenetic silencing pathways

    PubMed Central

    Yu, Weishi; McIntosh, Carl; Lister, Ryan; Zhu, Iris; Han, Yixing; Ren, Jianke; Landsman, David; Lee, Eunice; Briones, Victorino; Terashima, Minoru; Leighty, Robert; Ecker, Joseph R.

    2014-01-01

    Cytosine methylation is critical in mammalian development and plays a role in diverse biologic processes such as genomic imprinting, X chromosome inactivation, and silencing of repeat elements. Several factors regulate DNA methylation in early embryogenesis, but their precise role in the establishment of DNA methylation at a given site remains unclear. We have generated a comprehensive methylation map in fibroblasts derived from the murine DNA methylation mutant Hells−/− (helicase, lymphoid specific, also known as LSH). It has been previously shown that HELLS can influence de novo methylation of retroviral sequences and endogenous genes. Here, we describe that HELLS controls cytosine methylation in a nuclear compartment that is in part defined by lamin B1 attachment regions. Despite widespread loss of cytosine methylation at regulatory sequences, including promoter regions of protein-coding genes and noncoding RNA genes, overall relative transcript abundance levels in the absence of HELLS are similar to those in wild-type cells. A subset of promoter regions shows increases of the histone modification H3K27me3, suggesting redundancy of epigenetic silencing mechanisms. Furthermore, HELLS modulates CG methylation at all classes of repeat elements and is critical for repression of a subset of repeat elements. Overall, we provide a detailed analysis of gene expression changes in relation to DNA methylation alterations, which contributes to our understanding of the biological role of cytosine methylation. PMID:25170028

  6. Finite element methods in a simulation code for offshore wind turbines

    NASA Astrophysics Data System (ADS)

    Kurz, Wolfgang

    1994-06-01

    Offshore installation of wind turbines will become important for electricity supply in the future. Wind conditions above the sea are more favorable than on land, and appropriate locations on land are limited and restricted. The dynamic behavior of advanced wind turbines is investigated with digital simulations to reduce time and cost in the development and design phase. A wind turbine can be described and simulated as a multi-body system containing rigid and flexible bodies. Simulation of the non-linear motion of such a mechanical system using a multi-body system code is much faster than using a finite element code. However, a modal representation of the deformation field has to be incorporated in the multi-body system approach. The equations of motion of flexible bodies due to deformation are generated by finite element calculations. At Delft University of Technology the simulation code DUWECS has been developed, which simulates the non-linear behavior of wind turbines in the time domain. The wind turbine is divided into subcomponents which are represented by modules (e.g. rotor, tower etc.).

  7. Characterization of Non-coding DNA Satellites Associated with Sweepoviruses (Genus Begomovirus, Geminiviridae) – Definition of a Distinct Class of Begomovirus-Associated Satellites

    PubMed Central

    Lozano, Gloria; Trenado, Helena P.; Fiallo-Olivé, Elvira; Chirinos, Dorys; Geraud-Pouey, Francis; Briddon, Rob W.; Navas-Castillo, Jesús

    2016-01-01

    Begomoviruses (family Geminiviridae) are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas). Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus-satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus), with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem–loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem–loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed. PMID:26925037

  8. Non-coding RNAs in lung cancer

    PubMed Central

    Ricciuti, Biagio; Mecca, Carmen; Crinò, Lucio; Baglivo, Sara; Cenci, Matteo; Metro, Giulio

    2014-01-01

    The discovery that protein-coding genes represent less than 2% of the human genome, and the evidence that more than 90% of it is actively transcribed, changed the classical view of the central dogma of molecular biology, which had always assumed that RNA functions mainly as an intermediate bridge between DNA sequences and the protein synthesis machinery. Accumulating data indicate that non-coding RNAs are involved in different physiological processes, providing for the maintenance of cellular homeostasis. They are important regulators of gene expression, cellular differentiation, proliferation, migration, apoptosis, and stem cell maintenance. Alterations and disruptions of their expression or activity have increasingly been associated with pathological changes of cancer cells. This evidence, together with the prospect of using these molecules as diagnostic markers and therapeutic targets, currently makes non-coding RNAs among the most relevant molecules in cancer research. In this paper we provide an overview of non-coding RNA function and disruption in lung cancer biology, also focusing on their potential as diagnostic, prognostic and predictive biomarkers. PMID:25593996

  9. VaDiR: an integrated approach to Variant Detection in RNA.

    PubMed

    Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

    2018-02-01

    Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered, but its cost currently limits its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
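
    The precision and recall trade-off described in this abstract can be illustrated with set operations on call sets. The sketch below does not use VaDiR or any of the three callers. All variants are invented placeholders, the "truth" set stands in for DNA-validated mutations, and reading Tier 1 as the three-way intersection and the broader tier as agreement between MuTect2 and SNPiR is one possible interpretation of the abstract, not a confirmed definition.

```python
def precision_recall(called, truth):
    """Precision and recall of a set of calls against a validated truth set."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical variants as (chromosome, position, ref, alt) tuples.
v1, v2, v3, v4, v5 = [("chr1", pos, "A", "G") for pos in (100, 200, 300, 400, 500)]
x1, x2, x3 = [("chr2", pos, "C", "T") for pos in (10, 20, 30)]  # false positives

truth = {v1, v2, v3, v4, v5}           # stand-in for DNA-validated mutations
snpir_calls = {v1, v2, v3, x1, x3}     # invented output for caller 1
rvboost_calls = {v1, v2, x1, x2}       # invented output for caller 2
mutect2_calls = {v1, v2, v3, v4, x2, x3}  # invented output for caller 3

# Assumed reading: Tier 1 = variants reported by all three callers.
tier1 = snpir_calls & rvboost_calls & mutect2_calls
# Assumed reading: broader tier = variants reported by both MuTect2 and SNPiR.
two_caller = mutect2_calls & snpir_calls

print("Tier 1:", precision_recall(tier1, truth))
print("Two-caller agreement:", precision_recall(two_caller, truth))
```

    In this toy data the three-way consensus gives the higher precision and the two-caller agreement the higher recall, which mirrors the qualitative pattern the abstract reports. The actual numbers carry no meaning.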

  10. Alteration of gene expression in human hepatocellular carcinoma with integrated hepatitis B virus DNA.

    PubMed

    Tamori, Akihiro; Yamanishi, Yoshihiro; Kawashima, Shuichi; Kanehisa, Minoru; Enomoto, Masaru; Tanaka, Hiromu; Kubo, Shoji; Shiomi, Susumu; Nishiguchi, Shuhei

    2005-08-15

    Integration of hepatitis B virus (HBV) DNA into the human genome is one of the most important steps in HBV-related carcinogenesis. This study attempted to find the link between HBV DNA, the adjoining cellular sequence, and altered gene expression in hepatocellular carcinoma (HCC) with integrated HBV DNA. We examined 15 cases of HCC infected with HBV by cassette ligation-mediated PCR. The human DNA adjacent to the integrated HBV DNA was sequenced. Protein coding sequences were searched for in the human sequence. In five cases with HBV DNA integration, from which good quality RNA was extracted, gene expression was examined by cDNA microarray analysis. The human DNA sequence successive to integrated HBV DNA was determined in the 15 HCCs. Eight protein-coding regions were involved: ras-responsive element binding protein 1, calmodulin 1, mixed lineage leukemia 2 (MLL2), FLJ333655, LOC220272, LOC255345, LOC220220, and LOC168991. The MLL2 gene was expressed in three cases with HBV DNA integrated into exon 3 of MLL2 and in one case with HBV DNA integrated into intron 3 of MLL2. Gene expression analysis suggested that two HCCs with HBV integrated into MLL2 had similar patterns of gene expression compared with three HCCs with HBV integrated into other loci of human chromosomes. HBV DNA was integrated at random sites of human DNA, and the MLL2 gene was one of the targets for integration. Our results suggest that HBV DNA might modulate human genes near integration sites, followed by integration site-specific expression of such genes during hepatocarcinogenesis.

  11. Code-assisted discovery of TAL effector targets in bacterial leaf streak of rice reveals contrast with bacterial blight and a novel susceptibility gene

    USDA-ARS's Scientific Manuscript database

    Transcription activator-like (TAL) effectors found in Xanthomonas spp. promote bacterial growth and plant susceptibility by binding specific DNA sequences, or effector-binding elements (EBEs), and inducing host gene expression. In this study, we have found substantially different transcriptional pro...

  12. Different domains of the murine RNA polymerase I-specific termination factor mTTF-I serve distinct functions in transcription termination.

    PubMed

    Evers, R; Smid, A; Rudloff, U; Lottspeich, F; Grummt, I

    1995-03-15

    Termination of mouse ribosomal gene transcription by RNA polymerase I (Pol I) requires the specific interaction of a DNA binding protein, mTTF-I, with an 18 bp sequence element located downstream of the rRNA coding region. Here we describe the molecular cloning and functional characterization of the cDNA encoding this transcription termination factor. Recombinant mTTF-I binds specifically to the murine terminator elements and terminates Pol I transcription in a reconstituted in vitro system. Deletion analysis has defined a modular structure of mTTF-I comprising a dispensable N-terminal half, a large C-terminal DNA binding region and an internal domain which is required for transcription termination. Significantly, the C-terminal region of mTTF-I reveals striking homology to the DNA binding domains of the proto-oncogene c-Myb and the yeast transcription factor Reb1p. Site-directed mutagenesis of one of the tryptophan residues that is conserved in the homology region of c-Myb, Reb1p and mTTF-I abolishes specific DNA binding, a finding which underscores the functional relevance of these residues in DNA-protein interactions.

  13. Different domains of the murine RNA polymerase I-specific termination factor mTTF-I serve distinct functions in transcription termination.

    PubMed Central

    Evers, R; Smid, A; Rudloff, U; Lottspeich, F; Grummt, I

    1995-01-01

    Termination of mouse ribosomal gene transcription by RNA polymerase I (Pol I) requires the specific interaction of a DNA binding protein, mTTF-I, with an 18 bp sequence element located downstream of the rRNA coding region. Here we describe the molecular cloning and functional characterization of the cDNA encoding this transcription termination factor. Recombinant mTTF-I binds specifically to the murine terminator elements and terminates Pol I transcription in a reconstituted in vitro system. Deletion analysis has defined a modular structure of mTTF-I comprising a dispensable N-terminal half, a large C-terminal DNA binding region and an internal domain which is required for transcription termination. Significantly, the C-terminal region of mTTF-I reveals striking homology to the DNA binding domains of the proto-oncogene c-Myb and the yeast transcription factor Reb1p. Site-directed mutagenesis of one of the tryptophan residues that is conserved in the homology region of c-Myb, Reb1p and mTTF-I abolishes specific DNA binding, a finding which underscores the functional relevance of these residues in DNA-protein interactions. PMID:7720715

  14. Repetitive DNA loci and their modulation by the non-canonical nucleic acid structures R-loops and G-quadruplexes

    PubMed Central

    Hall, Amanda C.; Ostrowski, Lauren A.; Mekhail, Karim

    2017-01-01

    Cells have evolved intricate mechanisms to maintain genome stability despite allowing mutational changes to drive evolutionary adaptation. Repetitive DNA sequences, which represent the bulk of most genomes, are a major threat to genome stability, often driving chromosome rearrangements and disease. The major source of repetitive DNA sequences and thus the most vulnerable constituents of the genome are the ribosomal DNA (rDNA) repeats, telomeres, and transposable elements. Maintaining the stability of these loci is critical to overall cellular fitness and lifespan. Therefore, cells have evolved mechanisms to regulate rDNA copy number, telomere length and transposon activity, as well as DNA repair at these loci. In addition, non-canonical structure-forming DNA motifs can also modulate the function of these repetitive DNA loci by impacting their transcription, replication, and stability. Here, we discuss key mechanisms that maintain rDNA repeats, telomeres, and transposons in yeast and human before highlighting emerging roles for non-canonical DNA structures at these repetitive loci. PMID:28406751

  15. Assessment of the in vivo genotoxicity of cadmium chloride, chloroform, and D,L-menthol as coded test chemicals using the alkaline comet assay.

    PubMed

    Wada, Kunio; Fukuyama, Tomoki; Nakashima, Nobuaki; Matsumoto, Kyomu

    2015-07-01

    As part of the Japanese Center for the Validation of Alternative Methods (JaCVAM) international validation study of in vivo rat alkaline comet assays, we examined cadmium chloride, chloroform, and D,L-menthol under blind conditions as coded chemicals in the liver and stomach of Sprague-Dawley rats after 3 days of administration. Cadmium chloride showed equivocal responses in the liver and stomach, supporting previous reports of its poor mutagenic potential and non-carcinogenic effects in these organs. Treatment with chloroform, which is a non-genotoxic carcinogen, did not induce DNA damage in the liver or stomach. Some histopathological changes, such as necrosis and degeneration, were observed in the liver; however, they did not affect the comet assay results. D,L-Menthol, a non-genotoxic non-carcinogen, did not induce liver or stomach DNA damage. These results indicate that the comet assay can reflect genotoxic properties under blind conditions. Copyright © 2015 Elsevier B.V. All rights reserved.

  16. Drosophila Muller F Elements Maintain a Distinct Set of Genomic Properties Over 40 Million Years of Evolution

    PubMed Central

    Leung, Wilson; Shaffer, Christopher D.; Reed, Laura K.; Smith, Sheryl T.; Barshop, William; Dirkes, William; Dothager, Matthew; Lee, Paul; Wong, Jeannette; Xiong, David; Yuan, Han; Bedard, James E. J.; Machone, Joshua F.; Patterson, Seantay D.; Price, Amber L.; Turner, Bryce A.; Robic, Srebrenka; Luippold, Erin K.; McCartha, Shannon R.; Walji, Tezin A.; Walker, Chelsea A.; Saville, Kenneth; Abrams, Marita K.; Armstrong, Andrew R.; Armstrong, William; Bailey, Robert J.; Barberi, Chelsea R.; Beck, Lauren R.; Blaker, Amanda L.; Blunden, Christopher E.; Brand, Jordan P.; Brock, Ethan J.; Brooks, Dana W.; Brown, Marie; Butzler, Sarah C.; Clark, Eric M.; Clark, Nicole B.; Collins, Ashley A.; Cotteleer, Rebecca J.; Cullimore, Peterson R.; Dawson, Seth G.; Docking, Carter T.; Dorsett, Sasha L.; Dougherty, Grace A.; Downey, Kaitlyn A.; Drake, Andrew P.; Earl, Erica K.; Floyd, Trevor G.; Forsyth, Joshua D.; Foust, Jonathan D.; Franchi, Spencer L.; Geary, James F.; Hanson, Cynthia K.; Harding, Taylor S.; Harris, Cameron B.; Heckman, Jonathan M.; Holderness, Heather L.; Howey, Nicole A.; Jacobs, Dontae A.; Jewell, Elizabeth S.; Kaisler, Maria; Karaska, Elizabeth A.; Kehoe, James L.; Koaches, Hannah C.; Koehler, Jessica; Koenig, Dana; Kujawski, Alexander J.; Kus, Jordan E.; Lammers, Jennifer A.; Leads, Rachel R.; Leatherman, Emily C.; Lippert, Rachel N.; Messenger, Gregory S.; Morrow, Adam T.; Newcomb, Victoria; Plasman, Haley J.; Potocny, Stephanie J.; Powers, Michelle K.; Reem, Rachel M.; Rennhack, Jonathan P.; Reynolds, Katherine R.; Reynolds, Lyndsey A.; Rhee, Dong K.; Rivard, Allyson B.; Ronk, Adam J.; Rooney, Meghan B.; Rubin, Lainey S.; Salbert, Luke R.; Saluja, Rasleen K.; Schauder, Taylor; Schneiter, Allison R.; Schulz, Robert W.; Smith, Karl E.; Spencer, Sarah; Swanson, Bryant R.; Tache, Melissa A.; Tewilliager, Ashley A.; Tilot, Amanda K.; VanEck, Eve; Villerot, Matthew M.; Vylonis, Megan B.; Watson, David T.; Wurzler, Juliana A.; Wysocki, Lauren M.; 
Yalamanchili, Monica; Zaborowicz, Matthew A.; Emerson, Julia A.; Ortiz, Carlos; Deuschle, Frederic J.; DiLorenzo, Lauren A.; Goeller, Katie L.; Macchi, Christopher R.; Muller, Sarah E.; Pasierb, Brittany D.; Sable, Joseph E.; Tucci, Jessica M.; Tynon, Marykathryn; Dunbar, David A.; Beken, Levent H.; Conturso, Alaina C.; Danner, Benjamin L.; DeMichele, Gabriella A.; Gonzales, Justin A.; Hammond, Maureen S.; Kelley, Colleen V.; Kelly, Elisabeth A.; Kulich, Danielle; Mageeney, Catherine M.; McCabe, Nikie L.; Newman, Alyssa M.; Spaeder, Lindsay A.; Tumminello, Richard A.; Revie, Dennis; Benson, Jonathon M.; Cristostomo, Michael C.; DaSilva, Paolo A.; Harker, Katherine S.; Jarrell, Jenifer N.; Jimenez, Luis A.; Katz, Brandon M.; Kennedy, William R.; Kolibas, Kimberly S.; LeBlanc, Mark T.; Nguyen, Trung T.; Nicolas, Daniel S.; Patao, Melissa D.; Patao, Shane M.; Rupley, Bryan J.; Sessions, Bridget J.; Weaver, Jennifer A.; Goodman, Anya L.; Alvendia, Erica L.; Baldassari, Shana M.; Brown, Ashley S.; Chase, Ian O.; Chen, Maida; Chiang, Scott; Cromwell, Avery B.; Custer, Ashley F.; DiTommaso, Tia M.; El-Adaimi, Jad; Goscinski, Nora C.; Grove, Ryan A.; Gutierrez, Nestor; Harnoto, Raechel S.; Hedeen, Heather; Hong, Emily L.; Hopkins, Barbara L.; Huerta, Vilma F.; Khoshabian, Colin; LaForge, Kristin M.; Lee, Cassidy T.; Lewis, Benjamin M.; Lydon, Anniken M.; Maniaci, Brian J.; Mitchell, Ryan D.; Morlock, Elaine V.; Morris, William M.; Naik, Priyanka; Olson, Nicole C.; Osterloh, Jeannette M.; Perez, Marcos A.; Presley, Jonathan D.; Randazzo, Matt J.; Regan, Melanie K.; Rossi, Franca G.; Smith, Melanie A.; Soliterman, Eugenia A.; Sparks, Ciani J.; Tran, Danny L.; Wan, Tiffany; Welker, Anne A.; Wong, Jeremy N.; Sreenivasan, Aparna; Youngblom, Jim; Adams, Andrew; Alldredge, Justin; Bryant, Ashley; Carranza, David; Cifelli, Alyssa; Coulson, Kevin; Debow, Calise; Delacruz, Noelle; Emerson, Charlene; Farrar, Cassandra; Foret, Don; Garibay, Edgar; Gooch, John; Heslop, Michelle; Kaur, 
Sukhjit; Khan, Ambreen; Kim, Van; Lamb, Travis; Lindbeck, Peter; Lucas, Gabi; Macias, Elizabeth; Martiniuc, Daniela; Mayorga, Lissett; Medina, Joseph; Membreno, Nelson; Messiah, Shady; Neufeld, Lacey; Nguyen, San Francisco; Nichols, Zachary; Odisho, George; Peterson, Daymon; Rodela, Laura; Rodriguez, Priscilla; Rodriguez, Vanessa; Ruiz, Jorge; Sherrill, Will; Silva, Valeria; Sparks, Jeri; Statton, Geeta; Townsend, Ashley; Valdez, Isabel; Waters, Mary; Westphal, Kyle; Winkler, Stacey; Zumkehr, Joannee; DeJong, Randall J.; Hoogewerf, Arlene J.; Ackerman, Cheri M.; Armistead, Isaac O.; Baatenburg, Lara; Borr, Matthew J.; Brouwer, Lindsay K.; Burkhart, Brandon J.; Bushhouse, Kelsey T.; Cesko, Lejla; Choi, Tiffany Y. Y.; Cohen, Heather; Damsteegt, Amanda M.; Darusz, Jess M.; Dauphin, Cory M.; Davis, Yelena P.; Diekema, Emily J.; Drewry, Melissa; Eisen, Michelle E. M.; Faber, Hayley M.; Faber, Katherine J.; Feenstra, Elizabeth; Felzer-Kim, Isabella T.; Hammond, Brandy L.; Hendriksma, Jesse; Herrold, Milton R.; Hilbrands, Julia A.; Howell, Emily J.; Jelgerhuis, Sarah A.; Jelsema, Timothy R.; Johnson, Benjamin K.; Jones, Kelly K.; Kim, Anna; Kooienga, Ross D.; Menyes, Erika E.; Nollet, Eric A.; Plescher, Brittany E.; Rios, Lindsay; Rose, Jenny L.; Schepers, Allison J.; Scott, Geoff; Smith, Joshua R.; Sterling, Allison M.; Tenney, Jenna C.; Uitvlugt, Chris; VanDyken, Rachel E.; VanderVennen, Marielle; Vue, Samantha; Kokan, Nighat P.; Agbley, Kwabea; Boham, Sampson K.; Broomfield, Daniel; Chapman, Kayla; Dobbe, Ali; Dobbe, Ian; Harrington, William; Ibrahem, Marwan; Kennedy, Andre; Koplinsky, Chad A.; Kubricky, Cassandra; Ladzekpo, Danielle; Pattison, Claire; Ramirez, Roman E.; Wande, Lucia; Woehlke, Sarah; Wawersik, Matthew; Kiernan, Elizabeth; Thompson, Jeffrey S.; Banker, Roxanne; Bartling, Justina R.; Bhatiya, Chinmoy I.; Boudoures, Anna L.; Christiansen, Lena; Fosselman, Daniel S.; French, Kristin M.; Gill, Ishwar S.; Havill, Jessen T.; Johnson, Jaelyn L.; Keny, Lauren 
J.; Kerber, John M.; Klett, Bethany M.; Kufel, Christina N.; May, Francis J.; Mecoli, Jonathan P.; Merry, Callie R.; Meyer, Lauren R.; Miller, Emily G.; Mullen, Gregory J.; Palozola, Katherine C.; Pfeil, Jacob J.; Thomas, Jessica G.; Verbofsky, Evan M.; Spana, Eric P.; Agarwalla, Anant; Chapman, Julia; Chlebina, Ben; Chong, Insun; Falk, I.N.; Fitzgibbons, John D.; Friedman, Harrison; Ighile, Osagie; Kim, Andrew J.; Knouse, Kristin A.; Kung, Faith; Mammo, Danny; Ng, Chun Leung; Nikam, Vinayak S.; Norton, Diana; Pham, Philip; Polk, Jessica W.; Prasad, Shreya; Rankin, Helen; Ratliff, Camille D.; Scala, Victoria; Schwartz, Nicholas U.; Shuen, Jessica A.; Xu, Amy; Xu, Thomas Q.; Zhang, Yi; Rosenwald, Anne G.; Burg, Martin G.; Adams, Stephanie J.; Baker, Morgan; Botsford, Bobbi; Brinkley, Briana; Brown, Carter; Emiah, Shadie; Enoch, Erica; Gier, Chad; Greenwell, Alyson; Hoogenboom, Lindsay; Matthews, Jordan E.; McDonald, Mitchell; Mercer, Amanda; Monsma, Nicholaus; Ostby, Kristine; Ramic, Alen; Shallman, Devon; Simon, Matthew; Spencer, Eric; Tomkins, Trisha; Wendland, Pete; Wylie, Anna; Wolyniak, Michael J.; Robertson, Gregory M.; Smith, Samuel I.; DiAngelo, Justin R.; Sassu, Eric D.; Bhalla, Satish C.; Sharif, Karim A.; Choeying, Tenzin; Macias, Jason S.; Sanusi, Fareed; Torchon, Karvyn; Bednarski, April E.; Alvarez, Consuelo J.; Davis, Kristen C.; Dunham, Carrie A.; Grantham, Alaina J.; Hare, Amber N.; Schottler, Jennifer; Scott, Zackary W.; Kuleck, Gary A.; Yu, Nicole S.; Kaehler, Marian M.; Jipp, Jacob; Overvoorde, Paul J.; Shoop, Elizabeth; Cyrankowski, Olivia; Hoover, Betsy; Kusner, Matt; Lin, Devry; Martinov, Tijana; Misch, Jonathan; Salzman, Garrett; Schiedermayer, Holly; Snavely, Michael; Zarrasola, Stephanie; Parrish, Susan; Baker, Atlee; Beckett, Alissa; Belella, Carissa; Bryant, Julie; Conrad, Turner; Fearnow, Adam; Gomez, Carolina; Herbstsomer, Robert A.; Hirsch, Sarah; Johnson, Christen; Jones, Melissa; Kabaso, Rita; Lemmon, Eric; Vieira, Carolina Marques 
dos Santos; McFarland, Darryl; McLaughlin, Christopher; Morgan, Abbie; Musokotwane, Sepo; Neutzling, William; Nietmann, Jana; Paluskievicz, Christina; Penn, Jessica; Peoples, Emily; Pozmanter, Caitlin; Reed, Emily; Rigby, Nichole; Schmidt, Lasse; Shelton, Micah; Shuford, Rebecca; Tirasawasdichai, Tiara; Undem, Blair; Urick, Damian; Vondy, Kayla; Yarrington, Bryan; Eckdahl, Todd T.; Poet, Jeffrey L.; Allen, Alica B.; Anderson, John E.; Barnett, Jason M.; Baumgardner, Jordan S.; Brown, Adam D.; Carney, Jordan E.; Chavez, Ramiro A.; Christgen, Shelbi L.; Christie, Jordan S.; Clary, Andrea N.; Conn, Michel A.; Cooper, Kristen M.; Crowley, Matt J.; Crowley, Samuel T.; Doty, Jennifer S.; Dow, Brian A.; Edwards, Curtis R.; Elder, Darcie D.; Fanning, John P.; Janssen, Bridget M.; Lambright, Anthony K.; Lane, Curtiss E.; Limle, Austin B.; Mazur, Tammy; McCracken, Marly R.; McDonough, Alexa M.; Melton, Amy D.; Minnick, Phillip J.; Musick, Adam E.; Newhart, William H.; Noynaert, Joseph W.; Ogden, Bradley J.; Sandusky, Michael W.; Schmuecker, Samantha M.; Shipman, Anna L.; Smith, Anna L.; Thomsen, Kristen M.; Unzicker, Matthew R.; Vernon, William B.; Winn, Wesley W.; Woyski, Dustin S.; Zhu, Xiao; Du, Chunguang; Ament, Caitlin; Aso, Soham; Bisogno, Laura Simone; Caronna, Jason; Fefelova, Nadezhda; Lopez, Lenin; Malkowitz, Lorraine; Marra, Jonathan; Menillo, Daniella; Obiorah, Ifeanyi; Onsarigo, Eric Nyabeta; Primus, Shekerah; Soos, Mahdi; Tare, Archana; Zidan, Ameer; Jones, Christopher J.; Aronhalt, Todd; Bellush, James M.; Burke, Christa; DeFazio, Steve; Does, Benjamin R.; Johnson, Todd D.; Keysock, Nicholas; Knudsen, Nelson H.; Messler, James; Myirski, Kevin; Rekai, Jade Lea; Rempe, Ryan Michael; Salgado, Michael S.; Stagaard, Erica; Starcher, Justin R.; Waggoner, Andrew W.; Yemelyanova, Anastasia K.; Hark, Amy T.; Bertolet, Anne; Kuschner, Cyrus E.; Parry, Kesley; Quach, Michael; Shantzer, Lindsey; Shaw, Mary E.; Smith, Mary A.; Glenn, Omolara; Mason, Portia; Williams, 
Charlotte; Key, S. Catherine Silver; Henry, Tyneshia C. P.; Johnson, Ashlee G.; White, Jackie X.; Haberman, Adam; Asinof, Sam; Drumm, Kelly; Freeburg, Trip; Safa, Nadia; Schultz, Darrin; Shevin, Yakov; Svoronos, Petros; Vuong, Tam; Wellinghoff, Jules; Hoopes, Laura L. M.; Chau, Kim M.; Ward, Alyssa; Regisford, E. Gloria C.; Augustine, LaJerald; Davis-Reyes, Brionna; Echendu, Vivienne; Hales, Jasmine; Ibarra, Sharon; Johnson, Lauriaun; Ovu, Steven; Braverman, John M.; Bahr, Thomas J.; Caesar, Nicole M.; Campana, Christopher; Cassidy, Daniel W.; Cognetti, Peter A.; English, Johnathan D.; Fadus, Matthew C.; Fick, Cameron N.; Freda, Philip J.; Hennessy, Bryan M.; Hockenberger, Kelsey; Jones, Jennifer K.; King, Jessica E.; Knob, Christopher R.; Kraftmann, Karen J.; Li, Linghui; Lupey, Lena N.; Minniti, Carl J.; Minton, Thomas F.; Moran, Joseph V.; Mudumbi, Krishna; Nordman, Elizabeth C.; Puetz, William J.; Robinson, Lauren M.; Rose, Thomas J.; Sweeney, Edward P.; Timko, Ashley S.; Paetkau, Don W.; Eisler, Heather L.; Aldrup, Megan E.; Bodenberg, Jessica M.; Cole, Mara G.; Deranek, Kelly M.; DeShetler, Megan; Dowd, Rose M.; Eckardt, Alexandra K.; Ehret, Sharon C.; Fese, Jessica; Garrett, Amanda D.; Kammrath, Anna; Kappes, Michelle L.; Light, Morgan R.; Meier, Anne C.; O’Rouke, Allison; Perella, Mallory; Ramsey, Kimberley; Ramthun, Jennifer R.; Reilly, Mary T.; Robinett, Deirdre; Rossi, Nadine L.; Schueler, Mary Grace; Shoemaker, Emma; Starkey, Kristin M.; Vetor, Ashley; Vrable, Abby; Chandrasekaran, Vidya; Beck, Christopher; Hatfield, Kristen R.; Herrick, Douglas A.; Khoury, Christopher B.; Lea, Charlotte; Louie, Christopher A.; Lowell, Shannon M.; Reynolds, Thomas J.; Schibler, Jeanine; Scoma, Alexandra H.; Smith-Gee, Maxwell T.; Tuberty, Sarah; Smith, Christopher D.; Lopilato, Jane E.; Hauke, Jeanette; Roecklein-Canfield, Jennifer A.; Corrielus, Maureen; Gilman, Hannah; Intriago, Stephanie; Maffa, Amanda; Rauf, Sabya A.; Thistle, Katrina; Trieu, Melissa; Winters, 
Jenifer; Yang, Bib; Hauser, Charles R.; Abusheikh, Tariq; Ashrawi, Yara; Benitez, Pedro; Boudreaux, Lauren R.; Bourland, Megan; Chavez, Miranda; Cruz, Samantha; Elliott, GiNell; Farek, Jesse R.; Flohr, Sarah; Flores, Amanda H.; Friedrichs, Chelsey; Fusco, Zach; Goodwin, Zane; Helmreich, Eric; Kiley, John; Knepper, John Mark; Langner, Christine; Martinez, Megan; Mendoza, Carlos; Naik, Monal; Ochoa, Andrea; Ragland, Nicolas; Raimey, England; Rathore, Sunil; Reza, Evangelina; Sadovsky, Griffin; Seydoux, Marie-Isabelle B.; Smith, Jonathan E.; Unruh, Anna K.; Velasquez, Vicente; Wolski, Matthew W.; Gosser, Yuying; Govind, Shubha; Clarke-Medley, Nicole; Guadron, Leslie; Lau, Dawn; Lu, Alvin; Mazzeo, Cheryl; Meghdari, Mariam; Ng, Simon; Pamnani, Brad; Plante, Olivia; Shum, Yuki Kwan Wa; Song, Roy; Johnson, Diana E.; Abdelnabi, Mai; Archambault, Alexi; Chamma, Norma; Gaur, Shailly; Hammett, Deborah; Kandahari, Adrese; Khayrullina, Guzal; Kumar, Sonali; Lawrence, Samantha; Madden, Nigel; Mandelbaum, Max; Milnthorp, Heather; Mohini, Shiv; Patel, Roshni; Peacock, Sarah J.; Perling, Emily; Quintana, Amber; Rahimi, Michael; Ramirez, Kristen; Singhal, Rishi; Weeks, Corinne; Wong, Tiffany; Gillis, Aubree T.; Moore, Zachary D.; Savell, Christopher D.; Watson, Reece; Mel, Stephanie F.; Anilkumar, Arjun A.; Bilinski, Paul; Castillo, Rostislav; Closser, Michael; Cruz, Nathalia M.; Dai, Tiffany; Garbagnati, Giancarlo F.; Horton, Lanor S.; Kim, Dongyeon; Lau, Joyce H.; Liu, James Z.; Mach, Sandy D.; Phan, Thu A.; Ren, Yi; Stapleton, Kenneth E.; Strelitz, Jean M.; Sunjed, Ray; Stamm, Joyce; Anderson, Morgan C.; Bonifield, Bethany Grace; Coomes, Daniel; Dillman, Adam; Durchholz, Elaine J.; Fafara-Thompson, Antoinette E.; Gross, Meleah J.; Gygi, Amber M.; Jackson, Lesley E.; Johnson, Amy; Kocsisova, Zuzana; Manghelli, Joshua L.; McNeil, Kylie; Murillo, Michael; Naylor, Kierstin L.; Neely, Jessica; Ogawa, Emmy E.; Rich, Ashley; Rogers, Anna; Spencer, J. 
Devin; Stemler, Kristina M.; Throm, Allison A.; Van Camp, Matt; Weihbrecht, Katie; Wiles, T. Aaron; Williams, Mallory A.; Williams, Matthew; Zoll, Kyle; Bailey, Cheryl; Zhou, Leming; Balthaser, Darla M.; Bashiri, Azita; Bower, Mindy E.; Florian, Kayla A.; Ghavam, Nazanin; Greiner-Sosanko, Elizabeth S.; Karim, Helmet; Mullen, Victor W.; Pelchen, Carly E.; Yenerall, Paul M.; Zhang, Jiayu; Rubin, Michael R.; Arias-Mejias, Suzette M.; Bermudez-Capo, Armando G.; Bernal-Vega, Gabriela V.; Colon-Vazquez, Mariela; Flores-Vazquez, Arelys; Gines-Rosario, Mariela; Llavona-Cartagena, Ivan G.; Martinez-Rodriguez, Javier O.; Ortiz-Fuentes, Lionel; Perez-Colomba, Eliezer O.; Perez-Otero, Joseph; Rivera, Elisandra; Rodriguez-Giron, Luke J.; Santiago-Sanabria, Arnaldo J.; Senquiz-Gonzalez, Andrea M.; delValle, Frank R. Soto; Vargas-Franco, Dorianmarie; Velázquez-Soto, Karla I.; Zambrana-Burgos, Joan D.; Martinez-Cruzado, Juan Carlos; Asencio-Zayas, Lillyann; Babilonia-Figueroa, Kevin; Beauchamp-Pérez, Francis D.; Belén-Rodríguez, Juliana; Bracero-Quiñones, Luciann; Burgos-Bula, Andrea P.; Collado-Méndez, Xavier A.; Colón-Cruz, Luis R.; Correa-Muller, Ana I.; Crooke-Rosado, Jonathan L.; Cruz-García, José M.; Defendini-Ávila, Marianna; Delgado-Peraza, Francheska M.; Feliciano-Cancela, Alex J.; Gónzalez-Pérez, Valerie M.; Guiblet, Wilfried; Heredia-Negrón, Aldo; Hernández-Muñiz, Jennifer; Irizarry-González, Lourdes N.; Laboy-Corales, Ángel L.; Llaurador-Caraballo, Gabriela A.; Marín-Maldonado, Frances; Marrero-Llerena, Ulises; Martell-Martínez, Héctor A.; Martínez-Traverso, Idaliz M.; Medina-Ortega, Kiara N.; Méndez-Castellanos, Sonya G.; Menéndez-Serrano, Krizia C.; Morales-Caraballo, Carol I.; Ortiz-DeChoudens, Saryleine; Ortiz-Ortiz, Patricia; Pagán-Torres, Hendrick; Pérez-Afanador, Diana; Quintana-Torres, Enid M.; Ramírez-Aponte, Edwin G.; Riascos-Cuero, Carolina; Rivera-Llovet, Michelle S.; Rivera-Pagán, Ingrid T.; Rivera-Vicéns, Ramón E.; Robles-Juarbe, Fabiola; 
Rodríguez-Bonilla, Lorraine; Rodríguez-Echevarría, Brian O.; Rodríguez-García, Priscila M.; Rodríguez-Laboy, Abneris E.; Rodríguez-Santiago, Susana; Rojas-Vargas, Michael L.; Rubio-Marrero, Eva N.; Santiago-Colón, Albeliz; Santiago-Ortiz, Jorge L.; Santos-Ramos, Carlos E.; Serrano-González, Joseline; Tamayo-Figueroa, Alina M.; Tascón-Peñaranda, Edna P.; Torres-Castillo, José L.; Valentín-Feliciano, Nelson A.; Valentín-Feliciano, Yashira M.; Vargas-Barreto, Nadyan M.; Vélez-Vázquez, Miguel; Vilanova-Vélez, Luis R.; Zambrana-Echevarría, Cristina; MacKinnon, Christy; Chung, Hui-Min; Kay, Chris; Pinto, Anthony; Kopp, Olga R.; Burkhardt, Joshua; Harward, Chris; Allen, Robert; Bhat, Pavan; Chang, Jimmy Hsiang-Chun; Chen, York; Chesley, Christopher; Cohn, Dara; DuPuis, David; Fasano, Michael; Fazzio, Nicholas; Gavinski, Katherine; Gebreyesus, Heran; Giarla, Thomas; Gostelow, Marcus; Greenstein, Rachel; Gunasinghe, Hashini; Hanson, Casey; Hay, Amanda; He, Tao Jian; Homa, Katie; Howe, Ruth; Howenstein, Jeff; Huang, Henry; Khatri, Aaditya; Kim, Young Lu; Knowles, Olivia; Kong, Sarah; Krock, Rebecca; Kroll, Matt; Kuhn, Julia; Kwong, Matthew; Lee, Brandon; Lee, Ryan; Levine, Kevin; Li, Yedda; Liu, Bo; Liu, Lucy; Liu, Max; Lousararian, Adam; Ma, Jimmy; Mallya, Allyson; Manchee, Charlie; Marcus, Joseph; McDaniel, Stephen; Miller, Michelle L.; Molleston, Jerome M.; Diez, Cristina Montero; Ng, Patrick; Ngai, Natalie; Nguyen, Hien; Nylander, Andrew; Pollack, Jason; Rastogi, Suchita; Reddy, Himabindu; Regenold, Nathaniel; Sarezky, Jon; Schultz, Michael; Shim, Jien; Skorupa, Tara; Smith, Kenneth; Spencer, Sarah J.; Srikanth, Priya; Stancu, Gabriel; Stein, Andrew P.; Strother, Marshall; Sudmeier, Lisa; Sun, Mengyang; Sundaram, Varun; Tazudeen, Noor; Tseng, Alan; Tzeng, Albert; Venkat, Rohit; Venkataram, Sandeep; Waldman, Leah; Wang, Tracy; Yang, Hao; Yu, Jack Y.; Zheng, Yin; Preuss, Mary L.; Garcia, Angelica; Juergens, Matt; Morris, Robert W.; Nagengast, Alexis A.; Azarewicz, Julie; 
Carr, Thomas J.; Chichearo, Nicole; Colgan, Mike; Donegan, Megan; Gardner, Bob; Kolba, Nik; Krumm, Janice L.; Lytle, Stacey; MacMillian, Laurell; Miller, Mary; Montgomery, Andrew; Moretti, Alysha; Offenbacker, Brittney; Polen, Mike; Toth, John; Woytanowski, John; Kadlec, Lisa; Crawford, Justin; Spratt, Mary L.; Adams, Ashley L.; Barnard, Brianna K.; Cheramie, Martin N.; Eime, Anne M.; Golden, Kathryn L.; Hawkins, Allyson P.; Hill, Jessica E.; Kampmeier, Jessica A.; Kern, Cody D.; Magnuson, Emily E.; Miller, Ashley R.; Morrow, Cody M.; Peairs, Julia C.; Pickett, Gentry L.; Popelka, Sarah A.; Scott, Alexis J.; Teepe, Emily J.; TerMeer, Katie A.; Watchinski, Carmen A.; Watson, Lucas A.; Weber, Rachel E.; Woodard, Kate A.; Barnard, Daron C.; Appiah, Isaac; Giddens, Michelle M.; McNeil, Gerard P.; Adebayo, Adeola; Bagaeva, Kate; Chinwong, Justina; Dol, Chrystel; George, Eunice; Haltaufderhyde, Kirk; Haye, Joanna; Kaur, Manpreet; Semon, Max; Serjanov, Dmitri; Toorie, Anika; Wilson, Christopher; Riddle, Nicole C.; Buhler, Jeremy; Mardis, Elaine R.

    2015-01-01

    The Muller F element (4.2 Mb, ~80 protein-coding genes) is an unusual autosome of Drosophila melanogaster; it is mostly heterochromatic with a low recombination rate. To investigate how these properties impact the evolution of repeats and genes, we manually improved the sequence and annotated the genes on the D. erecta, D. mojavensis, and D. grimshawi F elements and euchromatic domains from the Muller D element. We find that F elements have greater transposon density (25–50%) than euchromatic reference regions (3–11%). Among the F elements, D. grimshawi has the lowest transposon density (particularly DINE-1: 2% vs. 11–27%). F element genes have larger coding spans, more coding exons, larger introns, and lower codon bias. Comparison of the Effective Number of Codons with the Codon Adaptation Index shows that, in contrast to the other species, codon bias in D. grimshawi F element genes can be attributed primarily to selection instead of mutational biases, suggesting that density and types of transposons affect the degree of local heterochromatin formation. F element genes have lower estimated DNA melting temperatures than D element genes, potentially facilitating transcription through heterochromatin. Most F element genes (~90%) have remained on that element, but the F element has smaller syntenic blocks than genome averages (3.4–3.6 vs. 8.4–8.8 genes per block), indicating greater rates of inversion despite lower rates of recombination. Overall, the F element has maintained characteristics that are distinct from other autosomes in the Drosophila lineage, illuminating the constraints imposed by a heterochromatic milieu. PMID:25740935

  17. Colonization of heterochromatic genes by transposable elements in Drosophila.

    PubMed

    Dimitri, Patrizio; Junakovic, Nikolaj; Arcà, Bruno

    2003-04-01

    As a further step toward understanding transposable element-host genome interactions, we investigated the molecular anatomy of introns from five heterochromatic and 22 euchromatic protein-coding genes of Drosophila melanogaster. A total of 79 kb of intronic sequences from heterochromatic genes and 355 kb of intronic sequences from euchromatic genes have been used in Blast searches against Drosophila transposable elements (TEs). The results show that TE-homologous sequences belonging to 19 different families represent about 50% of intronic DNA from heterochromatic genes. In contrast, only 0.1% of the euchromatic intron DNA exhibits homology to known TEs. Intraspecific and interspecific size polymorphisms of introns were found, which are likely to be associated with changes in TE-related sequences. Together, the enrichment in TEs and the apparent dynamic state of heterochromatic introns suggest that TEs contribute significantly to the evolution of genes located in heterochromatin.

  18. Homing endonucleases from mobile group I introns: discovery to genome engineering

    PubMed Central

    2014-01-01

    Homing endonucleases are highly specific DNA cleaving enzymes that are encoded within genomes of all forms of microbial life including phage and eukaryotic organelles. These proteins drive the mobility and persistence of their own reading frames. The genes that encode homing endonucleases are often embedded within self-splicing elements such as group I introns, group II introns and inteins. This combination of molecular functions is mutually advantageous: the endonuclease activity allows surrounding introns and inteins to act as invasive DNA elements, while the splicing activity allows the endonuclease gene to invade a coding sequence without disrupting its product. Crystallographic analyses of representatives from all known homing endonuclease families have illustrated both their mechanisms of action and their evolutionary relationships to a wide range of host proteins. Several homing endonucleases have been completely redesigned and used for a variety of genome engineering applications. Recent efforts to augment homing endonucleases with auxiliary DNA recognition elements and/or nucleic acid processing factors have further accelerated their use for applications that demand exceptionally high specificity and activity. PMID:24589358

  19. Detecting long tandem duplications in genomic sequences.

    PubMed

    Audemard, Eric; Schiex, Thomas; Faraut, Thomas

    2012-05-08

    Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. In this paper, we introduce ReD Tandem, a software tool using a flow-based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non-coding elements such as pseudogenes or RNA genes. ReD Tandem makes it possible to identify large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein gene based approach, which ignores duplications involving non-coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also make it possible to improve existing annotations.
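    As a rough illustration of what "detecting tandem duplications directly at the DNA level" means, the sketch below scans a sequence for exact, back-to-back copies of a fixed-length unit. This is not ReD Tandem's flow-based chaining algorithm, which tolerates weak local similarity and longer regions; it is a deliberately naive exact-match stand-in. The function name `find_tandem_arrays` and its parameters are hypothetical.

```python
def find_tandem_arrays(seq, unit_len, min_copies=2):
    """Naive exact-match scan (hypothetical helper, not ReD Tandem).

    Returns a list of (start, unit, copies) for every run in which a
    unit of length `unit_len` is repeated back to back at least
    `min_copies` times.
    """
    hits = []
    i = 0
    n = len(seq)
    while i + unit_len <= n:
        unit = seq[i:i + unit_len]
        copies = 1
        j = i + unit_len
        # Extend the array while the next window equals the unit exactly.
        while seq[j:j + unit_len] == unit:
            copies += 1
            j += unit_len
        if copies >= min_copies:
            hits.append((i, unit, copies))
            i = j  # skip past the whole array
        else:
            i += 1
    return hits


# Toy input: four copies of "ACT" flanked by unrelated bases.
example = "GG" + "ACT" * 4 + "TT"
arrays = find_tandem_arrays(example, unit_len=3)
```

    Because the scan requires exact identity and a known unit length, it would miss the diverged duplications that the abstract says ReD Tandem is designed to recover.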

  20. Complete Sequence of the mitochondrial genome of the tapeworm Hymenolepis diminuta: Gene arrangements indicate that platyhelminths are eutrochozoans

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.

    2001-01-01

    Using "long-PCR" we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.

  1. The complete mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae).

    PubMed

    Zhou, Xuming; Chen, Yu; Zhu, Shanliang; Xu, Haigen; Liu, Yan; Chen, Lian

    2016-01-01

    The mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae) is the first complete mtDNA sequence reported in the genus Pomacea. The total length of the mtDNA is 15,707 bp, containing 13 protein-coding genes, 2 ribosomal RNAs, 22 transfer RNAs, and a 359 bp non-coding region. The overall A + T content of the H-strand is 71.7% (T: 41%, C: 12.7%, A: 30.7%, G: 15.6%). The ATP6, ATP8, CO1, CO2, ND1-3, ND5, ND6, ND4L and Cyt b genes begin with ATG as the start codon; CO3 and ND4 begin with ATA. The ATP8, CO2-3, ND4L, ND2-6 and Cyt b genes terminate with TAA as the stop codon; ATP6, ND1, and CO1 end with TAG. A long non-coding region is found, in which a 23 bp repeat unit is repeated 11 times.
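    The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the helper names are hypothetical. The comparison of the repeat array with the 359 bp region assumes that the "long non-coding region" is the 359 bp region, which the abstract does not state explicitly.

```python
# Base composition of the H-strand as reported in the abstract (percent).
composition = {"T": 41.0, "C": 12.7, "A": 30.7, "G": 15.6}


def at_content(comp):
    """Combined A + T percentage, rounded to one decimal place."""
    return round(comp["A"] + comp["T"], 1)


def repeat_span(unit_bp, copies):
    """Length in bp covered by `copies` back-to-back repeat units."""
    return unit_bp * copies


total = round(sum(composition.values()), 1)  # the four bases should sum to 100%
at = at_content(composition)                 # compare with the reported A + T figure
span = repeat_span(23, 11)                   # 23 bp unit repeated 11 times

# Assumption: the repeats sit inside the 359 bp non-coding region.
fits_in_noncoding_region = span <= 359
```

    Under that assumption, the repeat array would occupy 253 bp of the 359 bp non-coding region, and the reported base percentages are internally consistent with the stated 71.7% A + T content.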

  2. Epigenetics and obesity cardiomyopathy: From pathophysiology to prevention and management.

    PubMed

    Zhang, Yingmei; Ren, Jun

    2016-05-01

    Uncorrected obesity has been associated with cardiac hypertrophy and contractile dysfunction. Several mechanisms for this cardiomyopathy have been identified, including oxidative stress, autophagy, adrenergic and renin-angiotensin aldosterone overflow. Another process that may regulate effects of obesity is epigenetics, which refers to the heritable alterations in gene expression or cellular phenotype that are not encoded on the DNA sequence. Advances in epigenome profiling have greatly improved the understanding of the epigenome in obesity, where environmental exposures during early life result in an increased health risk later on in life. Several mechanisms, including histone modification, DNA methylation and non-coding RNAs, have been reported in obesity and can cause transcriptional suppression or activation, depending on the location within the gene, contributing to obesity-induced complications. Through epigenetic modifications, the fetus may be prone to detrimental insults, leading to cardiac sequelae later in life. Important links between epigenetics and obesity include nutrition, exercise, adiposity, inflammation, insulin sensitivity and hepatic steatosis. Genome-wide studies have identified altered DNA methylation patterns in pancreatic islets, skeletal muscle and adipose tissues from obese subjects compared with non-obese controls. In addition, aging and intrauterine environment are associated with differential DNA methylation. Given the intense research on the molecular mechanisms of the etiology of obesity and its complications, this review will provide insights into the current understanding of epigenetics and pharmacological and non-pharmacological (such as exercise) interventions targeting epigenetics as they relate to treatment of obesity and its complications. Particular focus will be on DNA methylation, histone modification and non-coding RNAs. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. The ENCODE Project at UC Santa Cruz.

    PubMed

    Thomas, Daryl J; Rosenbloom, Kate R; Clawson, Hiram; Hinrichs, Angie S; Trumbower, Heather; Raney, Brian J; Karolchik, Donna; Barber, Galt P; Harte, Rachel A; Hillman-Jackson, Jennifer; Kuhn, Robert M; Rhead, Brooke L; Smith, Kayla E; Thakkapallayil, Archana; Zweig, Ann S; Haussler, David; Kent, W James

    2007-01-01

    The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.

  4. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    PubMed

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  5. Horizontal Transfer of Non-LTR Retrotransposons from Arthropods to Flowering Plants.

    PubMed

    Gao, Dongying; Chu, Ye; Xia, Han; Xu, Chunming; Heyduk, Karolina; Abernathy, Brian; Ozias-Akins, Peggy; Leebens-Mack, James H; Jackson, Scott A

    2018-02-01

    Even though lateral movements of transposons across families and even phyla within multicellular eukaryotic kingdoms have been found, little is known about transposon transfer between the kingdoms Animalia and Plantae. We discovered a novel non-LTR retrotransposon, AdLINE3, in a wild peanut species. Sequence comparisons and phylogenetic analyses indicated that AdLINE3 is a member of the RTE clade, originally identified in a nematode and rarely reported in plants. We identified RTE elements in 82 plants, spanning angiosperms to algae, including recently active elements in some flowering plants. RTE elements in flowering plants were likely derived from a single family we refer to as An-RTE. Interestingly, An-RTEs show significant DNA sequence identity with non-LTR retroelements from 42 animals belonging to four phyla. Moreover, the sequence identity of RTEs between two arthropods and two plants was higher than that of homologous genes. Phylogenetic and evolutionary analyses of RTEs from both animals and plants suggest that the An-RTE family was likely transferred horizontally into angiosperms from an ancient aphid(s) or ancestral arthropod(s). Notably, some An-RTEs were recruited as coding sequences of functional genes participating in metabolic or other biochemical processes in plants. This is the first potential example of horizontal transfer of transposons between animals and flowering plants. Our findings help to understand exchanges of genetic material between the kingdom Animalia and Plantae and suggest arthropods likely impacted on plant genome evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Development of non-linear finite element computer code

    NASA Technical Reports Server (NTRS)

    Becker, E. B.; Miller, T.

    1985-01-01

    Recent work has shown that the use of separable symmetric functions of the principal stretches can adequately describe the response of certain propellant materials and, further, that a data reduction scheme gives a convenient way of obtaining the values of the functions from experimental data. Based on representation of the energy, a computational scheme was developed that allows finite element analysis of boundary value problems of arbitrary shape and loading. The computational procedure was implemented in a three-dimensional finite element code, TEXLESP-S, which is documented herein.

  7. Complete sequences of the highly rearranged molluscan mitochondrial genomes of the scaphopod Graptacme eborea and the bivalve Mytilus edulis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boore, Jeffrey L.; Medina, Monica; Rosenberg, Lewis A.

    2004-01-31

    We have determined the complete sequence of the mitochondrial genome of the scaphopod mollusk Graptacme eborea (Conrad, 1846) (14,492 nts) and completed the sequence of the mitochondrial genome of the bivalve mollusk Mytilus edulis Linnaeus, 1758 (16,740 nts). (The name Graptacme eborea is a revision of the species formerly known as Dentalium eboreum.) G. eborea mtDNA contains the 37 genes that are typically found and has the genes divided about evenly between the two strands, but M. edulis contains an extra trnM and is missing atp8, and has all genes on the same strand. Each has a highly rearranged gene order relative to each other and to all other studied mtDNAs. G. eborea mtDNA has almost no strand skew, but the coding strand of M. edulis mtDNA is very rich in G and T. This is reflected in differential codon usage patterns and even in amino acid compositions. G. eborea mtDNA has fewer non-coding nucleotides than any other mtDNA studied to date, with the largest non-coding region being only 24 nt long. Phylogenetic analysis using 2,420 aligned amino acid positions of concatenated proteins weakly supports an association of the scaphopod with gastropods to the exclusion of Bivalvia, Cephalopoda, and Polyplacophora, but is generally unable to convincingly resolve the relationships among major groups of the Lophotrochozoa, in contrast to the good resolution seen for several other major metazoan groups.

  8. The chloroplast tRNALys(UUU) gene from mustard (Sinapis alba) contains a class II intron potentially coding for a maturase-related polypeptide.

    PubMed

    Neuhaus, H; Link, G

    1987-01-01

    The trnK gene encoding the tRNALys(UUU) has been located on mustard (Sinapis alba) chloroplast DNA, 263 bp upstream of the psbA gene on the same strand. The nucleotide sequence of the trnK gene and its flanking regions as well as the putative transcription start and termination sites are shown. The 5' end of the transcript lies 121 bp upstream of the 5' tRNA coding region and is preceded by procaryotic-type "-10" and "-35" sequence elements, while the 3' end maps 2.77 kb downstream to a DNA region with possible stem-loop secondary structure. The anticodon loop of the tRNALys is interrupted by a 2,574 bp intron containing a long open reading frame, which codes for 524 amino acids. Based on conserved stem and loop structures, this intron has characteristic features of a class II intron. A region near the carboxyl terminus of the derived polypeptide appears structurally related to maturases.

  9. QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks.

    PubMed

    Thibodeau, Asa; Márquez, Eladio J; Luo, Oscar; Ruan, Yijun; Menghi, Francesca; Shin, Dong-Guk; Stitzel, Michael L; Vera-Licona, Paola; Ucar, Duygu

    2016-06-01

    Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. QuIN's web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available under the GPLv3 license on GitHub: https://github.com/UcarLab/QuIN/.

  10. Deciphering the transcriptional cis-regulatory code.

    PubMed

    Yáñez-Cuna, J Omar; Kvon, Evgeny Z; Stark, Alexander

    2013-01-01

    Information about developmental gene expression resides in defined regulatory elements, called enhancers, in the non-coding part of the genome. Although cells reliably utilize enhancers to orchestrate gene expression, a cis-regulatory code that would allow their interpretation has remained one of the greatest challenges of modern biology. In this review, we summarize studies from the past three decades that describe progress towards revealing the properties of enhancers and discuss how recent approaches are providing unprecedented insights into regulatory elements in animal genomes. Over the next years, we believe that the functional characterization of regulatory sequences in entire genomes, combined with recent computational methods, will provide a comprehensive view of genomic regulatory elements and their building blocks and will enable researchers to begin to understand the sequence basis of the cis-regulatory code. Copyright © 2012 Elsevier Ltd. All rights reserved.

  11. A tumor-promoting mechanism mediated by retrotransposon-encoded reverse transcriptase is active in human transformed cell lines

    PubMed Central

    Sciamanna, Ilaria; Gualtieri, Alberto; Cossetti, Cristina; Osimo, Emanuele Felice; Ferracin, Manuela; Macchia, Gianfranco; Aricò, Eleonora; Prosseda, Gianni; Vitullo, Patrizia; Misteli, Tom; Spadafora, Corrado

    2013-01-01

    LINE-1 elements make up the most abundant retrotransposon family in the human genome. Full-length LINE-1 elements encode a reverse transcriptase (RT) activity required for their own retrotransposition as well as that of non-autonomous Alu elements. LINE-1 are poorly expressed in normal cells and abundantly in cancer cells. Decreasing RT activity in cancer cells, by either LINE-1-specific RNA interference, or by RT inhibitory drugs, was previously found to reduce proliferation and promote differentiation and to antagonize tumor growth in animal models. Here we have investigated how RT exerts these global regulatory functions. We report that the RT inhibitor efavirenz (EFV) selectively downregulates proliferation of transformed cell lines, while exerting only mild effects on non-transformed cells; this differential sensitivity matches a differential RT abundance, which is high in the former and undetectable in the latter. Using CsCl density gradients, we selectively identify Alu and LINE-1 containing DNA:RNA hybrid molecules in cancer but not in normal cells. Remarkably, hybrid molecules fail to form in tumor cells treated with EFV under the same conditions that repress proliferation and induce the reprogramming of expression profiles of coding genes, microRNAs (miRNAs) and ultraconserved regions (UCRs). The RT-sensitive miRNAs and UCRs are significantly associated with Alu sequences. The results suggest that LINE-1-encoded RT governs the balance between single-stranded and double-stranded RNA production. In cancer cells the abundant RT reverse-transcribes retroelement-derived mRNAs forming RNA:DNA hybrids. We propose that this impairs the formation of double-stranded RNAs and the ensuing production of small regulatory RNAs, with a direct impact on gene expression. RT inhibition restores the ‘normal’ small RNA profile and the regulatory networks that depend on them. Thus, the retrotransposon-encoded RT drives a previously unrecognized mechanism crucial to the transformed state in tumor cells. PMID:24345856

  12. A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements

    PubMed Central

    Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.

    2008-01-01

    X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally, we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625

  13. An expanding universe of the non-coding genome in cancer biology.

    PubMed

    Xue, Bin; He, Lin

    2014-06-01

    Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of the cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genomes consist of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  14. Long non-coding RNAs as novel expression signatures modulate DNA damage and repair in cadmium toxicology

    NASA Astrophysics Data System (ADS)

    Zhou, Zhiheng; Liu, Haibai; Wang, Caixia; Lu, Qian; Huang, Qinhai; Zheng, Chanjiao; Lei, Yixiong

    2015-10-01

    Increasing evidence suggests that long non-coding RNAs (lncRNAs) are involved in a variety of physiological and pathophysiological processes. Our study investigated whether lncRNAs, as novel expression signatures, are able to modulate DNA damage and repair in cadmium (Cd) toxicity. There were aberrant expression profiles of lncRNAs in 35th Cd-induced cells as compared to untreated 16HBE cells. siRNA-mediated knockdown of ENST00000414355 inhibited the growth of DNA-damaged cells and decreased the expression of DNA-damage related genes (ATM, ATR and ATRIP), while increasing the expression of DNA-repair related genes (DDB1, DDB2, OGG1, ERCC1, MSH2, RAD50, XRCC1 and BARD1). Cadmium increased ENST00000414355 expression in the lung of Cd-exposed rats in a dose-dependent manner. A significant positive correlation was observed between blood ENST00000414355 expression and urinary/blood Cd concentrations, and there were significant correlations of lncRNA-ENST00000414355 expression with the expression of target genes in the lung of Cd-exposed rats and the blood of Cd-exposed workers. These results indicate that some lncRNAs are aberrantly expressed in Cd-treated 16HBE cells. lncRNA-ENST00000414355 may serve as a signature for DNA damage and repair related to the epigenetic mechanisms underlying cadmium toxicity and become a novel biomarker of cadmium toxicity.

  15. BcMF11, a novel non-coding RNA gene from Brassica campestris, is required for pollen development and male fertility.

    PubMed

    Song, Jiang-Hua; Cao, Jia-Shu; Wang, Cheng-Gang

    2013-01-01

    KEY MESSAGE: BcMF11 as a non-coding RNA gene has an essential role in pollen development, and might be useful for regulating the pollen fertility of crops by antisense RNA technology. We previously identified an 828-bp full-length cDNA of BcMF11, a novel pollen-specific non-coding mRNA-like gene from Chinese cabbage (Brassica campestris L. ssp. chinensis Makino). However, little is known about the function of BcMF11 in pollen development. To investigate its exact biological roles in pollen development, the BcMF11 cDNA was antisense inhibited in transgenic Chinese cabbage under the control of a tapetum-specific promoter BcA9 and a constitutive promoter CaMV 35S. Antisense RNA transgenic plants displayed decreased expression of BcMF11 and showed distinct morphological defects. Pollen germination tests in vitro and in vivo of the transgenic plants suggested that inhibition of BcMF11 decreased pollen germination efficiency and delayed the pollen tubes' extension in the style. Under scanning electron microscopy, many shrunken and collapsed pollen grains were detected in the antisense BcMF11 transgenic Chinese cabbage. Further cytological observation revealed an abnormal pollen development process in transgenic plants, including delayed degradation of tapetum, asynchronous separation of microspores, and aborted development of pollen grains. These results suggest that BcMF11, as a non-coding RNA, plays an essential role in pollen development and male fertility.

  16. Cancer-linked satellite 2 DNA hypomethylation does not regulate Sat2 non-coding RNA expression and is initiated by heat shock pathway activation.

    PubMed

    Tilman, Gaëlle; Arnoult, Nausica; Lenglez, Sandrine; Van Beneden, Amandine; Loriot, Axelle; De Smet, Charles; Decottignies, Anabelle

    2012-08-01

    Epigenetic dysfunctions, including DNA methylation alterations, play major roles in cancer initiation and progression. Although it is well established that gene promoter demethylation activates transcription, it remains unclear whether hypomethylation of repetitive heterochromatin similarly affects expression of non-coding RNA from these loci. Understanding how repetitive non-coding RNAs are transcriptionally regulated is important given that their established upregulation by the heat shock (HS) pathway suggests important functions in cellular response to stress, possibly by promoting heterochromatin reconstruction. We found that, although pericentromeric satellite 2 (Sat2) DNA hypomethylation is detected in a majority of cancer cell lines of various origins, DNA methylation loss does not constitutively hyperactivate Sat2 expression, and also does not facilitate Sat2 transcriptional induction upon heat shock. In melanoma tumor samples, our analysis revealed that the HS response, frequently upregulated in tumors, is probably the main determinant of Sat2 RNA expression in vivo. Next, we tested whether HS pathway hyperactivation may drive Sat2 demethylation. Strikingly, we found that both hyperthermia and hyperactivated RasV12 oncogene, another potent inducer of the HS pathway, reduced Sat2 methylation levels by up to 27% in human fibroblasts recovering from stress. Demethylation occurred locally on Sat2 repeats, resulting in a demethylation signature that was also detected in cancer cell lines with moderate genome-wide hypomethylation. We therefore propose that upregulation of Sat2 transcription in response to HS pathway hyperactivation during tumorigenesis may promote localized demethylation of the locus. This, in turn, may contribute to tumorigenesis, as demethylation of Sat2 was previously reported to favor chromosomal rearrangements.

  17. Trace elements are associated with urinary 8-hydroxy-2'-deoxyguanosine level: a case study of college students in Guangzhou, China.

    PubMed

    Lu, Shaoyou; Ren, Lu; Fang, Jianzhang; Ji, Jiajia; Liu, Guihua; Zhang, Jianqing; Zhang, Huimin; Luo, Ruorong; Lin, Kai; Fan, Ruifang

    2016-05-01

    Many trace heavy elements are carcinogenic and increase the incidence of cancer. However, a comprehensive study of the correlation between multiple trace elements and DNA oxidative damage is still lacking. The aim of this study is to investigate the relationships between the body burden of multiple trace elements and DNA oxidative stress in college students in Guangzhou, China. Seventeen trace elements in urine samples were determined by inductively coupled plasma-mass spectrometry (ICP-MS). Urinary 8-hydroxy-2'-deoxyguanosine (8-OHdG), a biomarker of DNA oxidative stress, was also measured using liquid chromatography tandem mass spectrometer (LC-MS/MS). The concentrations of six essential elements including manganese (Mn), copper (Cu), nickel (Ni), selenium (Se), strontium (Sr), and molybdenum (Mo), and five non-essential elements including arsenic (As), cadmium (Cd), aluminum (Al), stibium (Sb), and thallium (Tl), were found to be significantly correlated with urinary 8-OHdG levels. Moreover, urinary levels of Ni, Se, Mo, As, Sr, and Tl were strongly significantly correlated with 8-OHdG (P < 0.01) concentration. Environmental exposure and dietary intake of these trace elements may play important roles in DNA oxidative damage in the population of Guangzhou, China.

  18. G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch.

    PubMed

    Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman

    2016-11-02

    Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with little established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Both V(D)J Coding Ends but Neither Signal End Can Recombine at the bcl-2 Major Breakpoint Region, and the Rejoining Is Ligase IV Dependent

    PubMed Central

    Raghavan, Sathees C.; Hsieh, Chih-Lin; Lieber, Michael R.

    2005-01-01

    The t(14;18) chromosomal translocation is the most common translocation in human cancer, and it occurs in all follicular lymphomas. The 150-bp bcl-2 major breakpoint region (Mbr) on chromosome 18 is a fragile site, because it adopts a non-B DNA conformation that can be cleaved by the RAG complex. The non-B DNA structure and the chromosomal translocation can be recapitulated on intracellular human minichromosomes where immunoglobulin 12- and 23-signals are positioned downstream of the bcl-2 Mbr. Here we show that either of the two coding ends in these V(D)J recombination reactions can recombine with either of the two broken ends of the bcl-2 Mbr but that neither signal end can recombine with the Mbr. Moreover, we show that the rejoining is fully dependent on DNA ligase IV, indicating that the rejoining phase relies on the nonhomologous DNA end-joining pathway. These results permit us to formulate a complete model for the order and types of cleavage and rejoining events in the t(14;18) translocation. PMID:16024785

  20. Tissue plasminogen activator (tPA) as a reporter gene in transient gene expression.

    PubMed

    Cheng, S M; Lee, S G; Kalyan, N K; McCloud, S; Levner, M; Hung, P P

    1987-01-01

    Using the gene coding for tissue plasminogen activator (tPA) as a reporter gene, a transient gene expression system has been established. Vectors containing the full-length cDNA of tPA with its signal sequences were introduced into mammalian recipient cells by a modified gene transfer procedure. Thirty hours after transfection, the secreted tPA was found in serum-free medium and measured by a fibrin-agarose plate assay (FAPA). In this assay, tPA converts plasminogen into plasmin which then degrades high-Mr fibrin to produce cleared zones. The sizes of these zones correspond to quantities of tPA. The combination of transient tPA expression system and the FAPA provides a quick, sensitive, quantitative and non-destructive method to examine the strength of eukaryotic regulatory elements in tissue-culture cells.

  1. Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

    PubMed Central

    Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

    1985-01-01

    The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this. PMID:3016521

  2. Neurotoxic Doses of Chronic Methamphetamine Trigger Retrotransposition of the Identifier Element in Rat Dorsal Dentate Gyrus

    PubMed Central

    Moszczynska, Anna; Burghardt, Kyle J.; Yu, Dongyue

    2017-01-01

    Short interspersed elements (SINEs) are typically silenced by DNA hypermethylation in somatic cells, but can retrotranspose in proliferating cells during adult neurogenesis. Hypomethylation caused by disease pathology or genotoxic stress leads to genomic instability of SINEs. The goal of the present investigation was to determine whether neurotoxic doses of binge or chronic methamphetamine (METH) trigger retrotransposition of the identifier (ID) element, a member of the rat SINE family, in the dentate gyrus genomic DNA. Adult male Sprague-Dawley rats were treated with saline or high doses of binge or chronic METH and sacrificed at three different time points thereafter. DNA methylation analysis, immunohistochemistry and next-generation sequencing (NGS) were performed on the dorsal dentate gyrus samples. Binge METH triggered hypomethylation, while chronic METH triggered hypermethylation of the CpG-2 site. Both METH regimens were associated with increased intensities in poly(A)-binding protein 1 (PABP1, a SINE regulatory protein)-like immunohistochemical staining in the dentate gyrus. The amplification of several ID element sequences was significantly higher in the chronic METH group than in the control group a week after METH, and they mapped to genes coding for proteins regulating cell growth and proliferation, transcription, protein function as well as for a variety of transporters. The results suggest that chronic METH induces ID element retrotransposition in the dorsal dentate gyrus and may affect hippocampal neurogenesis. PMID:28272323

  3. [Structural organization of 5S ribosomal DNA of Rosa rugosa].

    PubMed

    Tynkevych, Iu O; Volkov, R A

    2014-01-01

    In order to clarify the molecular organization of the genomic region encoding 5S rRNA in the diploid species Rosa rugosa, several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active, is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of the coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found, indicating comparatively recent divergence of these species.

  4. Evolution of Genome Size and Complexity in Pinus

    PubMed Central

    Morse, Alison M.; Peterson, Daniel G.; Islam-Faridi, M. Nurul; Smith, Katherine E.; Magbanua, Zenaida; Garcia, Saul A.; Kubisiak, Thomas L.; Amerson, Henry V.; Carlson, John E.; Nelson, C. Dana; Davis, John M.

    2009-01-01

    Background: Genome evolution in the gymnosperm lineage of seed plants has given rise to many of the most complex and largest plant genomes; however, the elements involved are poorly understood. Methodology/Principal Findings: Gymny is a previously undescribed retrotransposon family in Pinus that is related to Athila elements in Arabidopsis. Gymny elements are dispersed throughout the modern Pinus genome and occupy a physical space at least the size of the Arabidopsis thaliana genome. In contrast to previously described retroelements in Pinus, the Gymny family was amplified or introduced after the divergence of pine and spruce (Picea). If retrotransposon expansions are responsible for genome size differences within the Pinaceae, as they are in angiosperms, then they have yet to be identified. In contrast, molecular divergence of Gymny retrotransposons together with other families of retrotransposons can account for the large genome complexity of pines along with protein-coding genic DNA, as revealed by massively parallel DNA sequence analysis of Cot fractionated genomic DNA. Conclusions/Significance: Most of the enormous genome complexity of pines can be explained by divergence of retrotransposons; however, the elements responsible for genome size variation are yet to be identified. Genomic resources for Pinus including those reported here should assist in further defining whether and how the roles of retrotransposons differ in the evolution of angiosperm and gymnosperm genomes. PMID:19194510

  5. A decade of human genome project conclusion: Scientific diffusion about our genome knowledge.

    PubMed

    Moraes, Fernanda; Góes, Andréa

    2016-05-06

    The Human Genome Project (HGP) was initiated in 1990 and completed in 2003. It aimed to sequence the whole human genome. Although it represented an advance in understanding the human genome and its complexity, many questions remained unanswered. Other projects were launched in order to unravel the mysteries of our genome, including the ENCyclopedia of DNA Elements (ENCODE). This review aims to analyze the evolution of scientific knowledge related to both the HGP and ENCODE projects. Data were retrieved from scientific articles published in 1990-2014, a period comprising the development and the 10 years following the HGP completion. The fact that only 20,000 genes are protein and RNA-coding is one of the most striking HGP results. A new concept about the organization of genome arose. The ENCODE project was initiated in 2003 and targeted to map the functional elements of the human genome. This project revealed that the human genome is pervasively transcribed. Therefore, it was determined that a large part of the non-protein coding regions are functional. Finally, a more sophisticated view of chromatin structure emerged. The mechanistic functioning of the genome has been redrafted, revealing a much more complex picture. Besides, a gene-centric conception of the organism has to be reviewed. A number of criticisms have emerged against the ENCODE project approaches, raising the question of whether non-conserved but biochemically active regions are truly functional. Thus, HGP and ENCODE projects accomplished a great map of the human genome, but the data generated still requires further in depth analysis. © 2016 by The International Union of Biochemistry and Molecular Biology, 44:215-223, 2016.

  6. Loss of p53-inducible long non-coding RNA LINC01021 increases chemosensitivity

    PubMed Central

    Kaller, Markus; Götz, Ursula; Hermeking, Heiko

    2017-01-01

    We have previously identified the long non-coding RNA LINC01021 as a direct p53 target (Hünten et al. Mol Cell Proteomics. 2015; 14:2609-2629). Here, we show that LINC01021 is up-regulated in colorectal cancer (CRC) cell lines upon various p53-activating treatments. The LINC01021 promoter and the p53 binding site lie within a MER61C LTR, which originated from insertion of endogenous retrovirus 1 (ERV1) sequences. Deletion of this MER61C element by a CRISPR/Cas9 approach, as well as siRNA-mediated knockdown of LINC01021 RNA significantly enhanced the sensitivity of the CRC cell line HCT116 towards the chemotherapeutic drugs doxorubicin and 5-FU, suggesting that LINC01021 is an integral part of the p53-mediated response to DNA damage. Inactivation of LINC01021 and also its ectopic expression did not affect p53 protein expression and transcriptional activity, implying that LINC01021 does not feedback to p53. Furthermore, in CRC patient samples LINC01021 expression positively correlated with a wild-type p53-associated gene expression signature. LINC01021 expression was increased in primary colorectal tumors and displayed a bimodal distribution that was particularly pronounced in the mesenchymal CMS4 consensus molecular subtype of CRCs. CMS4 tumors with low LINC01021 expression were associated with poor patient survival. Our results suggest that the genomic redistribution of ERV1-derived p53 response elements and generation of novel p53-inducible lncRNA-encoding genes was selected for during primate evolution as integral part of the cellular response to various forms of genotoxic stress. PMID:29262524

  7. Position specific variation in the rate of evolution in transcription factor binding sites

    PubMed Central

    Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B

    2003-01-01

    Background: The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results: Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion: As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282

  8. Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence.

    PubMed

    Benko, Sabina; Fantes, Judy A; Amiel, Jeanne; Kleinjan, Dirk-Jan; Thomas, Sophie; Ramsay, Jacqueline; Jamshidi, Negar; Essafi, Abdelkader; Heaney, Simon; Gordon, Christopher T; McBride, David; Golzio, Christelle; Fisher, Malcolm; Perry, Paul; Abadie, Véronique; Ayuso, Carmen; Holder-Espinasse, Muriel; Kilpatrick, Nicky; Lees, Melissa M; Picard, Arnaud; Temple, I Karen; Thomas, Paul; Vazquez, Marie-Paule; Vekemans, Michel; Roest Crollius, Hugues; Hastie, Nicholas D; Munnich, Arnold; Etchevers, Heather C; Pelet, Anna; Farlie, Peter G; Fitzpatrick, David R; Lyonnet, Stanislas

    2009-03-01

    Pierre Robin sequence (PRS) is an important subgroup of cleft palate. We report several lines of evidence for the existence of a 17q24 locus underlying PRS, including linkage analysis results, a clustering of translocation breakpoints 1.06-1.23 Mb upstream of SOX9, and microdeletions both approximately 1.5 Mb centromeric and approximately 1.5 Mb telomeric of SOX9. We have also identified a heterozygous point mutation in an evolutionarily conserved region of DNA with in vitro and in vivo features of a developmental enhancer. This enhancer is centromeric to the breakpoint cluster and maps within one of the microdeletion regions. The mutation abrogates the in vitro enhancer function and alters binding of the transcription factor MSX1 as compared to the wild-type sequence. In the developing mouse mandible, the 3-Mb region bounded by the microdeletions shows a regionally specific chromatin decompaction in cells expressing Sox9. Some cases of PRS may thus result from developmental misexpression of SOX9 due to disruption of very-long-range cis-regulatory elements.

  9. Evolutionary dynamics of selfish DNA explains the abundance distribution of genomic subsequences

    PubMed Central

    Sheinman, Michael; Ramisch, Anna; Massip, Florian; Arndt, Peter F.

    2016-01-01

    Since the sequencing of large genomes, many statistical features of their sequences have been found. One intriguing feature is that certain subsequences are much more abundant than others. In fact, abundances of subsequences of a given length are distributed with a scale-free power-law tail, resembling properties of human texts, such as Zipf’s law. Despite recent efforts, the understanding of this phenomenon is still lacking. Here we find that selfish DNA elements, such as those belonging to the Alu family of repeats, dominate the power-law tail. Interestingly, for the Alu elements the power-law exponent increases with the length of the considered subsequences. Motivated by these observations, we develop a model of selfish DNA expansion. The predictions of this model qualitatively and quantitatively agree with the empirical observations. This allows us to estimate parameters for the process of selfish DNA spreading in a genome during its evolution. The obtained results shed light on how evolution of selfish DNA elements shapes non-trivial statistical properties of genomes. PMID:27488939
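
    The abundance distribution described in this abstract can be illustrated with a minimal Python sketch that counts overlapping subsequences of a fixed length and tabulates how many distinct subsequences share each abundance. The function names and the toy sequence are hypothetical and chosen only for illustration; this is not the authors' model of selfish DNA expansion, only the counting step that precedes any fit of a power-law tail.

```python
from collections import Counter
from typing import Dict


def kmer_abundances(sequence: str, k: int) -> Counter:
    """Count every overlapping subsequence of length k in a DNA string."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def abundance_histogram(counts: Counter) -> Dict[int, int]:
    """Map each abundance value to the number of distinct k-mers that have it."""
    return dict(sorted(Counter(counts.values()).items()))


# Hypothetical toy sequence: a short repeated element between unique flanks.
toy_genome = "ACGT" + "GGCCGGGC" * 5 + "TTACGA"
counts = kmer_abundances(toy_genome, k=4)
histogram = abundance_histogram(counts)

for abundance, n_kmers in histogram.items():
    print(f"{n_kmers} distinct 4-mers occur {abundance} time(s)")
```

    On a real genome the histogram would typically be plotted on log-log axes, where a power-law tail appears as a roughly straight line; the repeated element in the toy sequence plays the role of the abundant subsequences.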

  10. Ectopic recombination between Ty elements in Saccharomyces cerevisiae is not induced by DNA damage.

    PubMed

    Parket, A; Kupiec, M

    1992-10-01

    Mitotic recombination is increased when cells are treated with a variety of physical and chemical agents that cause damage to their DNA. We show here, using Saccharomyces cerevisiae strains that carry marked Ty elements, that recombination between members of this family of retrotransposons is not increased by UV irradiation or by treatment with the radiomimetic drug methyl methanesulfonate. Both ectopic recombination and mutation events were elevated by these agents for non-Ty sequences in the same strain. We discuss possible mechanisms that can prevent the induction of recombination between Ty elements.

  11. Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

    PubMed Central

    Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

    2010-01-01

    Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877

  12. Transcription initiation complex structures elucidate DNA opening.

    PubMed

    Plaschka, C; Hantsche, M; Dienemann, C; Burzinski, C; Plitzko, J; Cramer, P

    2016-05-19

    Transcription of eukaryotic protein-coding genes begins with assembly of the RNA polymerase (Pol) II initiation complex and promoter DNA opening. Here we report cryo-electron microscopy (cryo-EM) structures of yeast initiation complexes containing closed and open DNA at resolutions of 8.8 Å and 3.6 Å, respectively. DNA is positioned and retained over the Pol II cleft by a network of interactions between the TATA-box-binding protein TBP and transcription factors TFIIA, TFIIB, TFIIE, and TFIIF. DNA opening occurs around the tip of the Pol II clamp and the TFIIE 'extended winged helix' domain, and can occur in the absence of TFIIH. Loading of the DNA template strand into the active centre may be facilitated by movements of obstructing protein elements triggered by allosteric binding of the TFIIE 'E-ribbon' domain. The results suggest a unified model for transcription initiation with a key event, the trapping of open promoter DNA by extended protein-protein and protein-DNA contacts.

  13. San Diego Supercomputer Center

    Science.gov Websites

    Variants in Non-Coding DNA Contribute to Inherited Autism Risk. Gene mutations appearing for the first time contribute to approximately one-third of cases of autism spectrum

  14. Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT

    PubMed Central

    Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong

    2006-01-01

    Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT, that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417

  15. ADAPTION OF NONSTANDARD PIPING COMPONENTS INTO PRESENT DAY SEISMIC CODES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    D. T. Clark; M. J. Russell; R. E. Spears

    2009-07-01

    With spiraling energy demand and flat energy supply, there is a need to extend the life of older nuclear reactors. This sometimes requires that existing systems be evaluated to present day seismic codes. Older reactors built in the 1960s and early 1970s often used fabricated piping components that were code compliant during their initial construction time period, but are outside the standard parameters of present-day piping codes. There are several approaches available to the analyst in evaluating these non-standard components to modern codes. The simplest approach is to use the flexibility factors and stress indices for similar standard components with the assumption that the non-standard component’s flexibility factors and stress indices will be very similar. This approach can require significant engineering judgment. A more rational approach available in Section III of the ASME Boiler and Pressure Vessel Code, which is the subject of this paper, involves calculation of flexibility factors using finite element analysis of the non-standard component. Such analysis allows modeling of geometric and material nonlinearities. Flexibility factors based on these analyses are sensitive to the load magnitudes used in their calculation, load magnitudes that need to be consistent with those produced by the linear system analyses where the flexibility factors are applied. This can lead to iteration, since the magnitude of the loads produced by the linear system analysis depend on the magnitude of the flexibility factors. After the loading applied to the nonstandard component finite element model has been matched to loads produced by the associated linear system model, the component finite element model can then be used to evaluate the performance of the component under the loads with the nonlinear analysis provisions of the Code, should the load levels lead to calculated stresses in excess of Allowable stresses.
    This paper details the application of component-level finite element modeling to account for geometric and material nonlinear component behavior in a linear elastic piping system model. Note that this technique can be applied to the analysis of B31 piping systems.

  16. Mechanisms generating long range correlation in nucleotide composition of the Borrelia Burgdorferi genome

    NASA Astrophysics Data System (ADS)

    Mackiewicz, P.; Gierlik, A.; Kowalczuk, M.; Szczepanik, D.; Dudek, M. R.; Cebrat, S.

    1999-12-01

    We have analysed protein coding and intergenic sequences in the Borrelia burgdorferi (the Lyme disease bacterium) genome using different kinds of DNA walks. Genes occupying the leading strand of DNA have significantly different nucleotide composition from genes occupying the lagging strand. Nucleotide compositional bias of the two DNA strands reflects the aminoacid composition of proteins. 96% of genes coding for ribosomal proteins lie on the leading DNA strand, which suggests that the positions of these as well as other genes are non-random. In the B. burgdorferi genome, the asymmetry in intergenic DNA sequences is lower than the asymmetry in the third positions in codons. All these characters of the B. burgdorferi genome suggest that both replication-associated mutational pressure and recombination mechanisms have established the specific structure of the genome and now any recombination leading to inversion of a gene in respect to the direction of replication is forbidden. This property of the genome allows us to assume that it is in a steady state, which enables us to fix some parameters for simulations of DNA evolution.
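
    As an illustration of what a DNA walk is, the following minimal Python sketch steps up by one for each G and down by one for each C, so that a sustained drift of the walk reveals a compositional bias along the sequence. The abstract does not say which kinds of walks the authors used, so this G/C walk, the function name and the toy sequence are assumptions made purely for illustration.

```python
from typing import List


def dna_walk(sequence: str, up: str = "G", down: str = "C") -> List[int]:
    """Return the cumulative walk: +1 for each `up` base, -1 for each `down` base.

    Other bases leave the position unchanged. The choice of G versus C is an
    illustrative assumption, not the walk definition used in the cited study.
    """
    position = 0
    walk = []
    for base in sequence.upper():
        if base == up:
            position += 1
        elif base == down:
            position -= 1
        walk.append(position)
    return walk


# Hypothetical toy sequence: G-rich first half, C-rich second half.
toy = "GGAGGTGG" + "CCACCTCC"
print(dna_walk(toy))
```

    In a walk of this kind computed along a whole chromosome, a change in the direction of drift marks a point where the strand bias flips, which is how such plots are commonly used to contrast leading and lagging strands.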

  17. Identification of common, unique and polymorphic microsatellites among 73 cyanobacterial genomes.

    PubMed

    Kabra, Ritika; Kapil, Aditi; Attarwala, Kherunnisa; Rai, Piyush Kant; Shanker, Asheesh

    2016-04-01

    Microsatellites, also known as Simple Sequence Repeats, are short tandem repeats of 1-6 nucleotides. These repeats are found in coding as well as non-coding regions of both prokaryotic and eukaryotic genomes and play a significant role in the study of gene regulation, genetic mapping, DNA fingerprinting and evolutionary studies. The availability of 73 complete genome sequences of cyanobacteria enabled us to mine and statistically analyze microsatellites in these genomes. The cyanobacterial microsatellites identified through bioinformatics analysis were stored in a user-friendly database named CyanoSat, which is an efficient data representation and query system designed using ASP.net. The information in CyanoSat comprises perfect, imperfect and compound microsatellites found in coding, non-coding and coding-non-coding regions. Moreover, it contains PCR primers with 200-nucleotide-long flanking regions. The mined cyanobacterial microsatellites can be freely accessed at www.compubio.in/CyanoSat/home.aspx. In addition, 82 polymorphic, 13,866 unique and 2390 common microsatellites were detected. These microsatellites will be useful in strain identification and genetic diversity studies of cyanobacteria.
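
    The definition of a microsatellite given in this abstract (a tandem repeat of a 1-6 nucleotide motif) can be sketched with a regular expression. The Python example below finds perfect repeats only; the function name and the minimum repeat count are assumptions for illustration, and it is not the mining pipeline behind CyanoSat, which the abstract does not describe.

```python
import re
from typing import List, Tuple


def find_perfect_microsatellites(sequence: str, min_repeats: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The motif is 1-6 nucleotides long and must occur at least `min_repeats`
    times in a row. The threshold is an illustrative assumption.
    """
    # Lazy {1,6}? makes the regex try the shortest motif first, so "AAAAAA"
    # is reported as motif "A" rather than "AA" or "AAA".
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


print(find_perfect_microsatellites("TTGACACACACGGT"))
```

    Imperfect and compound microsatellites, which the abstract also mentions, need a more tolerant search than a single back-reference and are outside this sketch.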

  18. Mavericks, a novel class of giant transposable elements widespread in eukaryotes and related to DNA viruses.

    PubMed

    Pritham, Ellen J; Putliwala, Tasneem; Feschotte, Cédric

    2007-04-01

    We previously identified a group of atypical mobile elements designated Mavericks from the nematodes Caenorhabditis elegans and C. briggsae and the zebrafish Danio rerio. Here we present the results of comprehensive database searches of the genome sequences available, which reveal that Mavericks are widespread in invertebrates and non-mammalian vertebrates but show a patchy distribution in non-animal species, being present in the fungi Glomus intraradices and Phakopsora pachyrhizi and in several single-celled eukaryotes such as the ciliate Tetrahymena thermophila, the stramenopile Phytophthora infestans and the trichomonad Trichomonas vaginalis, but not detectable in plants. This distribution, together with comparative and phylogenetic analyses of Maverick-encoded proteins, is suggestive of an ancient origin of these elements in eukaryotes followed by lineage-specific losses and/or recurrent episodes of horizontal transmission. In addition, we report that Maverick elements have amplified recently to high copy numbers in T. vaginalis where they now occupy as much as 30% of the genome. Sequence analysis confirms that most Mavericks encode a retroviral-like integrase, but lack other open reading frames typically found in retroelements. Nevertheless, the length and conservation of the target site duplication created upon Maverick insertion (5- or 6-bp) is consistent with a role of the integrase-like protein in the integration of a double-stranded DNA transposition intermediate. Mavericks also display long terminal-inverted repeats but do not contain ORFs similar to proteins encoded by DNA transposons. Instead, Mavericks encode a conserved set of 5 to 9 genes (in addition to the integrase) that are predicted to encode proteins with homology to replication and packaging proteins of some bacteriophages and diverse eukaryotic double-stranded DNA viruses, including a DNA polymerase B homolog and putative capsid proteins. 
    Based on these and other structural similarities, we speculate that Mavericks represent an evolutionary missing link between seemingly disparate invasive DNA elements that include bacteriophages, adenoviruses and eukaryotic linear plasmids.

  19. Trichomonas vaginalis ribosomal RNA: identification and characterisation of the transcription promoter and terminator sequences.

    PubMed

    Franco, Bernardo; Hernández, Roberto; López-Villaseñor, Imelda

    2012-09-01

    Trichomonas vaginalis is a parasitic protozoan of both medical and biological relevance. Transcriptional studies in this organism have focused mainly on type II pol promoters, whereas the elements necessary for transcription by polI or polIII have not been investigated. Here, with the aid of a transient transcription system, we characterised the rDNA intergenic region, defining both the promoter and the terminator sequences required for transcription. We defined the promoter as a compact region of approximately 180 bp. We also identified a potential upstream control element (UCE) that was located 80 bp upstream of the transcription start point (TSP). A transcription termination element was identified within a 34 bp region that was located immediately downstream of the 28S coding sequence. The function of this element depends upon polarity and the presence of both a stretch of uridine residues (U's) and a hairpin structure in the transcript. Our observations provide a strong basis for the study of DNA recognition by the polI transcriptional machinery in this early divergent organism. Copyright © 2012 Elsevier B.V. All rights reserved.

  20. Regulated Formation of lncRNA-DNA Hybrids Enables Faster Transcriptional Induction and Environmental Adaptation.

    PubMed

    Cloutier, Sara C; Wang, Siwen; Ma, Wai Kit; Al Husini, Nadra; Dhoondia, Zuzer; Ansari, Athar; Pascuzzi, Pete E; Tran, Elizabeth J

    2016-02-04

    Long non-coding (lnc)RNAs, once thought to merely represent noise from imprecise transcription initiation, have now emerged as major regulatory entities in all eukaryotes. In contrast to the rapidly expanding identification of individual lncRNAs, mechanistic characterization has lagged behind. Here we provide evidence that the GAL lncRNAs in the budding yeast S. cerevisiae promote transcriptional induction in trans by formation of lncRNA-DNA hybrids or R-loops. The evolutionarily conserved RNA helicase Dbp2 regulates formation of these R-loops as genomic deletion or nuclear depletion results in accumulation of these structures across the GAL cluster gene promoters and coding regions. Enhanced transcriptional induction is manifested by lncRNA-dependent displacement of the Cyc8 co-repressor and subsequent gene looping, suggesting that these lncRNAs promote induction by altering chromatin architecture. Moreover, the GAL lncRNAs confer a competitive fitness advantage to yeast cells because expression of these non-coding molecules correlates with faster adaptation in response to an environmental switch. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. In Vivo Control of CpG and Non-CpG DNA Methylation by DNA Methyltransferases

    PubMed Central

    Arand, Julia; Spieler, David; Karius, Tommy; Branco, Miguel R.; Meilinger, Daniela; Meissner, Alexander; Jenuwein, Thomas; Xu, Guoliang; Leonhardt, Heinrich; Wolf, Verena; Walter, Jörn

    2012-01-01

    The enzymatic control of the setting and maintenance of symmetric and non-symmetric DNA methylation patterns in a particular genome context is not well understood. Here, we describe a comprehensive analysis of DNA methylation patterns generated by high resolution sequencing of hairpin-bisulfite amplicons of selected single copy genes and repetitive elements (LINE1, B1, IAP-LTR-retrotransposons, and major satellites). The analysis unambiguously identifies a substantial amount of regional incomplete methylation maintenance, i.e. hemimethylated CpG positions, with variant degrees among cell types. Moreover, non-CpG cytosine methylation is confined to ESCs and exclusively catalysed by Dnmt3a and Dnmt3b. This sequence position–, cell type–, and region-dependent non-CpG methylation is strongly linked to neighboring CpG methylation and requires the presence of Dnmt3L. The generation of a comprehensive data set of 146,000 CpG dyads was used to apply and develop parameter estimated hidden Markov models (HMM) to calculate the relative contribution of DNA methyltransferases (Dnmts) for de novo and maintenance DNA methylation. The comparative modelling included wild-type ESCs and mutant ESCs deficient for Dnmt1, Dnmt3a, Dnmt3b, or Dnmt3a/3b, respectively. The HMM analysis identifies a considerable de novo methylation activity for Dnmt1 at certain repetitive elements and single copy sequences. Dnmt3a and Dnmt3b contribute de novo function. However, both enzymes are also essential to maintain symmetrical CpG methylation at distinct repetitive and single copy sequences in ESCs. PMID:22761581

  2. Modular assembly of chimeric phi29 packaging RNAs that support DNA packaging.

    PubMed

    Fang, Yun; Shu, Dan; Xiao, Feng; Guo, Peixuan; Qin, Peter Z

    2008-08-08

    The bacteriophage phi29 DNA packaging motor is a protein/RNA complex that can produce strong force to condense the linear-double-stranded DNA genome into a pre-formed protein capsid. The RNA component, called the packaging RNA (pRNA), utilizes magnesium-dependent inter-molecular base-pairing interactions to form ring-shaped complexes. The pRNA is a class of non-coding RNA, interacting with phi29 motor proteins to enable DNA packaging. Here, we report a two-piece chimeric pRNA construct that is fully competent in interacting with partner pRNA to form ring-shaped complexes, in packaging DNA via the motor, and in assembling infectious phi29 virions in vitro. This is the first example of a fully functional pRNA assembled using two non-covalently interacting fragments. The results support the notion of modular pRNA architecture in the phi29 packaging motor.

  3. Modular assembly of chimeric phi29 packaging RNAs that support DNA packaging

    PubMed Central

    Fang, Yun; Shu, Dan; Xiao, Feng; Guo, Peixuan; Qin, Peter Z.

    2008-01-01

    The bacteriophage phi29 DNA packaging motor is a protein/RNA complex that can produce strong force to condense the linear-double stranded DNA genome into a pre-formed protein capsid. The RNA component, called the packaging RNA (pRNA), utilizes magnesium-dependent intermolecular base-pairing interactions to form ring-shaped complexes. The pRNA is a class of non-coding RNA, interacting with phi29 motor proteins to enable DNA packaging. Here, we report a 2-piece chimeric pRNA construct that is fully competent in interacting with partner pRNA to form ring-shaped complexes, in packaging DNA via the motor, and in assembling infectious phi29 virions in vitro. This is the first example of a fully functional pRNA assembled using two non-covalently interacting fragments. The results support the notion of modular pRNA architecture in the phi29 packaging motor. PMID:18514064

  4. DOUAR: A new three-dimensional creeping flow numerical model for the solution of geological problems

    NASA Astrophysics Data System (ADS)

    Braun, Jean; Thieulot, Cédric; Fullsack, Philippe; DeKool, Marthijn; Beaumont, Christopher; Huismans, Ritske

    2008-12-01

    We present a new finite element code for the solution of the Stokes and energy (or heat transport) equations that has been purposely designed to address crustal-scale to mantle-scale flow problems in three dimensions. Although it is based on an Eulerian description of deformation and flow, the code, which we named DOUAR ('Earth' in the Breton language), has the ability to track interfaces and, in particular, the free surface, by using a dual representation based on a set of particles placed on the interface and the computation of a level set function on the nodes of the finite element grid, thus ensuring accuracy and efficiency. The code also makes use of a new method to compute the dynamic Delaunay triangulation connecting the particles based on a non-Euclidean, curvilinear measure of distance, ensuring that the density of particles remains uniform and/or dynamically adapted to the curvature of the interface. The finite element discretization is based on a non-uniform, yet regular octree division of space within a unit cube that allows efficient adaptation of the finite element discretization, i.e. in regions of strong velocity gradient or high interface curvature. The finite elements are cubes (the leaves of the octree) in which a Q1-P0 interpolation scheme is used. Nodal incompatibilities across faces separating elements of differing size are dealt with by introducing linear constraints among nodal degrees of freedom. Discontinuities in material properties across the interfaces are accommodated by the use of a novel method (which we called divFEM) to integrate the finite element equations in which the elemental volume is divided by a local octree to an appropriate depth (resolution). A variety of rheologies have been implemented including linear, non-linear and thermally activated creep and brittle (or plastic) frictional deformation. A simple smoothing operator has been defined to avoid checkerboard oscillations in pressure that tend to develop when using a highly irregular octree discretization and the tri-linear (or Q1-P0) finite element. A three-dimensional cloud of particles is used to track material properties that depend on the integrated history of deformation (the integrated strain, for example); its density is variable and dynamically adapted to the computed flow. The large system of algebraic equations that results from the finite element discretization and linearization of the basic partial differential equations is solved using a multi-frontal massively parallel direct solver that can efficiently factorize poorly conditioned systems resulting from the highly non-linear rheology and the presence of the free surface. The code is almost entirely parallelized. We present example results including the onset of a Rayleigh-Taylor instability, the indentation of a rigid-plastic material and the formation of a fold beneath a free eroding surface, that demonstrate the accuracy, efficiency and appropriateness of the new code to solve complex geodynamical problems in three dimensions.

  5. An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

    PubMed Central

    Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

    1999-01-01

    A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it. Milne 1926 PMID:10471707
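
    The density and percentage figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the 2.9-Mb region length given in the record title (the abstract itself says only "nearly 3 Mb"); the function and variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract; region length assumed from the record title (2.9 Mb).
REGION_BP = 2_900_000
N_GENES = 218          # predicted protein-coding genes
N_TE = 17              # transposable element sequences


def kb_per_feature(region_bp: int, count: int) -> float:
    """Average spacing, in kilobases, between features spread over a region."""
    return region_bp / count / 1000


gene_spacing_kb = kb_per_feature(REGION_BP, N_GENES)
te_spacing_kb = kb_per_feature(REGION_BP, N_TE)

# Shares of the 218 known and predicted genes reported in the abstract.
est_match_pct = 100 * 95 / N_GENES
homolog_pct = 100 * 144 / N_GENES

print(f"one gene every {gene_spacing_kb:.1f} kb")
print(f"one transposable element every {te_spacing_kb:.1f} kb")
print(f"EST matches: {est_match_pct:.1f}%, cross-species similarity: {homolog_pct:.1f}%")
```

    Rounded to whole numbers, these values agree with the 13 kb, 171 kb, 44% and 66% stated in the abstract.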

  6. MicroRNAs in large herpesvirus DNA genomes: recent advances.

    PubMed

    Sorel, Océane; Dewals, Benjamin G

    2016-08-01

    MicroRNAs (miRNAs) are small non-coding RNAs (ncRNAs) that regulate gene expression. They alter mRNA translation through base-pair complementarity, leading to regulation of genes during both physiological and pathological processes. Viruses have evolved mechanisms to take advantage of the host cells to multiply and/or persist over the lifetime of the host. Herpesviridae are a large family of double-stranded DNA viruses that are associated with a number of important diseases, including lymphoproliferative diseases. Herpesviruses establish lifelong latent infections through modulation of the interface between the virus and its host. A number of reports have identified miRNAs in a very large number of human and animal herpesviruses suggesting that these short non-coding transcripts could play essential roles in herpesvirus biology. This review will specifically focus on the recent advances on the functions of herpesvirus miRNAs in infection and pathogenesis.

  7. Selfish DNA in protein-coding genes of Rickettsia.

    PubMed

    Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M

    2000-10-13

    Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.

  8. Non-homeodomain regions of Hox proteins mediate activation versus repression of Six2 via a single enhancer site in vivo

    PubMed Central

    Yallowitz, Alisha R.; Gong, Ke-Qin; Swinehart, Ilea T.; Nelson, Lisa T.; Wellik, Deneen M.

    2009-01-01

    Summary Hox genes control many developmental events along the AP axis, but few target genes have been identified. Whether target genes are activated or repressed, what enhancer elements are required for regulation, and how different domains of the Hox proteins contribute to regulatory specificity is poorly understood. Six2 is genetically downstream of both the Hox11 paralogous genes in the developing mammalian kidney and Hoxa2 in branchial arch and facial mesenchyme. Loss-of-function of Hox11 leads to loss of Six2 expression and loss-of-function of Hoxa2 leads to expanded Six2 expression. Herein we demonstrate that a single enhancer site upstream of the Six2 coding sequence is responsible for both activation by Hox11 proteins in the kidney and repression by Hoxa2 in the branchial arch and facial mesenchyme in vivo. DNA binding activity is required for both activation and repression, but differential activity is not controlled by differences in the homeodomains. Rather, protein domains N- and C-terminal to the homeodomain confer activation versus repression activity. These data support a model in which the DNA binding specificity of Hox proteins in vivo may be similar, consistent with accumulated in vitro data, and that unique functions result mainly from differential interactions mediated by non-homeodomain regions of Hox proteins. PMID:19716816

  9. Rare pseudoautosomal copy-number variations involving SHOX and/or its flanking regions in individuals with and without short stature.

    PubMed

    Fukami, Maki; Naiki, Yasuhiro; Muroya, Koji; Hamajima, Takashi; Soneda, Shun; Horikawa, Reiko; Jinno, Tomoko; Katsumi, Momori; Nakamura, Akie; Asakura, Yumi; Adachi, Masanori; Ogata, Tsutomu; Kanzaki, Susumu

    2015-09-01

    Pseudoautosomal region 1 (PAR1) contains SHOX, in addition to seven highly conserved non-coding DNA elements (CNEs) with cis-regulatory activity. Microdeletions involving SHOX exons 1-6a and/or the CNEs result in idiopathic short stature (ISS) and Leri-Weill dyschondrosteosis (LWD). Here, we report six rare copy-number variations (CNVs) in PAR1 identified through copy-number analyses of 245 ISS/LWD patients and 15 unaffected individuals. The six CNVs consisted of three microduplications encompassing SHOX and some of the CNEs, two microduplications in the SHOX 3'-region affecting one or four of the downstream CNEs, and a microdeletion involving SHOX exon 6b and its neighboring CNE. The amplified DNA fragments of two SHOX-containing duplications were detected at chromosomal regions adjacent to the original positions. The breakpoints of a SHOX-containing duplication resided within Alu repeats. A microduplication encompassing four downstream CNEs was identified in an unaffected father-daughter pair, whereas the other five CNVs were detected in ISS patients. These results suggest that microduplications involving SHOX cause ISS by disrupting the cis-regulatory machinery of this gene and that at least some of the microduplications in PAR1 arise from Alu-mediated non-allelic homologous recombination. The pathogenicity of other rare PAR1-linked CNVs, such as CNE-containing microduplications and exon 6b-flanking microdeletions, merits further investigation.

  10. Detecting authorized and unauthorized genetically modified organisms containing vip3A by real-time PCR and next-generation sequencing.

    PubMed

    Liang, Chanjuan; van Dijk, Jeroen P; Scholtens, Ingrid M J; Staats, Martijn; Prins, Theo W; Voorhuijzen, Marleen M; da Silva, Andrea M; Arisi, Ana Carolina Maisonnave; den Dunnen, Johan T; Kok, Esther J

    2014-04-01

    The growing number of biotech crops with novel genetic elements increasingly complicates the detection of genetically modified organisms (GMOs) in food and feed samples using conventional screening methods. Unauthorized GMOs (UGMOs) in food and feed are currently identified through combining GMO element screening with sequencing the DNA flanking these elements. In this study, a specific and sensitive qPCR assay was developed for vip3A element detection based on the vip3Aa20 coding sequences of the recently marketed MIR162 maize and COT102 cotton. Furthermore, SiteFinding-PCR in combination with Sanger, Illumina or Pacific BioSciences (PacBio) sequencing was performed targeting the flanking DNA of the vip3Aa20 element in MIR162. De novo assembly and Basic Local Alignment Search Tool searches were used to mimic UGMO identification. PacBio data resulted in relatively long contigs in the upstream (1,326 nucleotides (nt); 95 % identity) and downstream (1,135 nt; 92 % identity) regions, whereas Illumina data resulted in two smaller contigs of 858 and 1,038 nt with higher sequence identity (>99 % identity). Both approaches outperformed Sanger sequencing, underlining the potential for next-generation sequencing in UGMO identification.

  11. Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs

    PubMed Central

    2014-01-01

    Background The genome is pervasively transcribed but most transcripts do not code for proteins, constituting non-protein-coding RNAs. Despite increasing numbers of functional reports of individual long non-coding RNAs (lncRNAs), assessing the extent of functionality among the non-coding transcriptional output of mammalian cells remains intricate. In the protein-coding world, transcripts differentially expressed in the context of processes essential for the survival of multicellular organisms have been instrumental in the discovery of functionally relevant proteins and their deregulation is frequently associated with diseases. We therefore systematically identified lncRNAs expressed differentially in response to oncologically relevant processes and cell-cycle, p53 and STAT3 pathways, using tiling arrays. Results We found that up to 80% of the pathway-triggered transcriptional responses are non-coding. Among these we identified very large macroRNAs with pathway-specific expression patterns and demonstrated that these are likely continuous transcripts. MacroRNAs contain elements conserved in mammals and sauropsids, which in part exhibit conserved RNA secondary structure. Comparing evolutionary rates of a macroRNA to adjacent protein-coding genes suggests a local action of the transcript. Finally, in different grades of astrocytoma, a tumor disease unrelated to the initially used cell lines, macroRNAs are differentially expressed. Conclusions It has been shown previously that the majority of expressed non-ribosomal transcripts are non-coding. We now conclude that differential expression triggered by signaling pathways gives rise to a similar abundance of non-coding content. It is thus unlikely that the prevalence of non-coding transcripts in the cell is a trivial consequence of leaky or random transcription events. PMID:24594072

  12. Chromatin-Specific Regulation of Mammalian rDNA Transcription by Clustered TTF-I Binding Sites

    PubMed Central

    Diermeier, Sarah D.; Németh, Attila; Rehli, Michael; Grummt, Ingrid; Längst, Gernot

    2013-01-01

    Enhancers and promoters often contain multiple binding sites for the same transcription factor, suggesting that homotypic clustering of binding sites may serve a role in transcription regulation. Here we show that clustering of binding sites for the transcription termination factor TTF-I downstream of the pre-rRNA coding region specifies transcription termination, increases the efficiency of transcription initiation and affects the three-dimensional structure of rRNA genes. On chromatin templates, but not on free rDNA, clustered binding sites promote cooperative binding of TTF-I, loading TTF-I to the downstream terminators before it binds to the rDNA promoter. Interaction of TTF-I with target sites upstream and downstream of the rDNA transcription unit connects these distal DNA elements by forming a chromatin loop between the rDNA promoter and the terminators. The results imply that clustered binding sites increase the binding affinity of transcription factors in chromatin, thus influencing the timing and strength of DNA-dependent processes. PMID:24068958

  13. Computational DNA hole spectroscopy: A new tool to predict mutation hotspots, critical base pairs, and disease ‘driver’ mutations

    PubMed Central

    Villagrán, Martha Y. Suárez; Miller, John H.

    2015-01-01

    We report on a new technique, computational DNA hole spectroscopy, which creates spectra of electron hole probabilities vs. nucleotide position. A hole is a site of positive charge created when an electron is removed. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies of mitochondrial DNA reveal a correlation between L-strand hole spectrum peaks and spikes in the human mutation spectrum. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with disease-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential disease ‘driver’ mutations. Such integration of DNA hole and variance spectra could ultimately prove invaluable for pinpointing critical regions of the vast non-protein-coding genome. An observed asymmetry in correlations, between the spectrum of human mtDNA variations and the L- and H-strand hole spectra, is attributed to asymmetric DNA replication processes that occur for the leading and lagging strands. PMID:26310834

  14. Computational DNA hole spectroscopy: A new tool to predict mutation hotspots, critical base pairs, and disease 'driver' mutations.

    PubMed

    Villagrán, Martha Y Suárez; Miller, John H

    2015-08-27

    We report on a new technique, computational DNA hole spectroscopy, which creates spectra of electron hole probabilities vs. nucleotide position. A hole is a site of positive charge created when an electron is removed. Peaks in the hole spectrum depict sites where holes tend to localize and potentially trigger a base pair mismatch during replication. Our studies of mitochondrial DNA reveal a correlation between L-strand hole spectrum peaks and spikes in the human mutation spectrum. Importantly, we also find that hole peak positions that do not coincide with large variant frequencies often coincide with disease-implicated mutations and/or (for coding DNA) encoded conserved amino acids. This enables combining hole spectra with variant data to identify critical base pairs and potential disease 'driver' mutations. Such integration of DNA hole and variance spectra could ultimately prove invaluable for pinpointing critical regions of the vast non-protein-coding genome. An observed asymmetry in correlations, between the spectrum of human mtDNA variations and the L- and H-strand hole spectra, is attributed to asymmetric DNA replication processes that occur for the leading and lagging strands.

  15. DNA barcode goes two-dimensions: DNA QR code web server.

    PubMed

    Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

    2012-01-01

    The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers (ITS2, rbcL, matK, psbA-trnH, and CO1) was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and a relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.

  16. Twisting Right to Left: A…A Mismatch in a CAG Trinucleotide Repeat Overexpansion Provokes Left-Handed Z-DNA Conformation

    PubMed Central

    2015-01-01

    Conformational polymorphism of DNA is a major causative factor behind several incurable trinucleotide repeat expansion disorders that arise from overexpansion of trinucleotide repeats located in coding/non-coding regions of specific genes. Hairpin DNA structures that are formed due to overexpansion of CAG repeat lead to Huntington’s disorder and spinocerebellar ataxias. Nonetheless, DNA hairpin stem structure that generally embraces B-form with canonical base pairs is poorly understood in the context of periodic noncanonical A…A mismatch as found in CAG repeat overexpansion. Molecular dynamics simulations on DNA hairpin stems containing A…A mismatches in a CAG repeat overexpansion show that A…A dictates local Z-form irrespective of starting glycosyl conformation, in sharp contrast to canonical DNA duplex. Transition from B-to-Z is due to the mechanistic effect that originates from its pronounced nonisostericity with flanking canonical base pairs facilitated by base extrusion, backbone and/or base flipping. Based on these structural insights we envisage that such an unusual DNA structure of the CAG hairpin stem may have a role in disease pathogenesis. As this is the first study that delineates the influence of a single A…A mismatch in reversing DNA helicity, it would further have an impact on understanding DNA mismatch repair. PMID:25876062

  17. A Tandemly Arranged Pattern of Two 5S rDNA Arrays in Amolops mantzorum (Anura, Ranidae).

    PubMed

    Liu, Ting; Song, Menghuan; Xia, Yun; Zeng, Xiaomao

    2017-01-01

    In an attempt to extend the knowledge of the 5S rDNA organization in anurans, the 5S rDNA sequences of Amolops mantzorum were isolated, characterized, and mapped by FISH. Two forms of 5S rDNA, type I (209 bp) and type II (about 870 bp), were found in specimens investigated from various populations. Both of them contained a 118-bp coding sequence and were readily differentiated by their non-transcribed spacer (NTS) sizes and compositions. Four probes (the 5S rDNA coding sequences, the type I NTS, the type II NTS, and the entire type II 5S rDNA sequences) were respectively labeled with TAMRA or digoxigenin to hybridize with mitotic chromosomes for samples of all localities. All probes showed the same signals, which appeared in every centromeric region and in the telomeric regions of chromosome 5, without differences within or between populations. Both type I and type II of the 5S rDNA arrays were thus arranged in tandem, in contrast with other frogs or fishes recorded to date. More interestingly, all the probes detected centromeric regions in all karyotypes, suggesting the presence of a satellite DNA family derived from 5S rDNA. © 2017 S. Karger AG, Basel.

  18. Dead Element Replicating: Degenerate R2 Element Replication and rDNA Genomic Turnover in the Bacillus rossius Stick Insect (Insecta: Phasmida)

    PubMed Central

    Martoni, Francesco; Eickbush, Danna G.; Scavariello, Claudia; Luchetti, Andrea; Mantovani, Barbara

    2015-01-01

    R2 is an extensively investigated non-LTR retrotransposon that specifically inserts into the 28S rRNA gene sequences of a wide range of metazoans, disrupting its functionality. During R2 integration, first strand synthesis can be incomplete so that 5' end deleted copies are occasionally inserted. While active R2 copies repopulate the locus by retrotransposing, the non-functional truncated elements should frequently be eliminated by molecular drive processes leading to the concerted evolution of the rDNA array(s). Although multiple R2 lineages have been discovered in the genome of many animals, the rDNA of the stick insect Bacillus rossius exhibits a peculiar situation: it harbors both a canonical, functional R2 element (R2Brfun) as well as a full-length but degenerate element (R2Brdeg). An intensive sequencing survey in the present study reveals that all truncated variants in stick insects are present in multiple copies, suggesting they were duplicated by unequal recombination. Sequencing results also demonstrate that all R2Brdeg copies are full-length, i.e. they have no associated 5' end deletions, and functional assays indicate they have lost the active ribozyme necessary for R2 RNA maturation. Although it cannot be completely ruled out, it seems unlikely that the degenerate elements replicate via reverse transcription, exploiting the R2Brfun element enzymatic machinery, but rather via genomic amplification of inserted 28S by unequal recombination. That inactive copies (either R2Brdeg or 5'-truncated elements) are not eliminated in the short term in stick insects contrasts with findings for the Drosophila R2, suggesting a widely different management of rDNA loci and a lower efficiency of the molecular drive while achieving the concerted evolution. PMID:25799008

  19. Extensive Evolutionary Changes in Regulatory Element Activity during Human Origins Are Associated with Altered Gene Expression and Positive Selection

    PubMed Central

    Fedrigo, Olivier; Babbitt, Courtney C.; Wortham, Matthew; Tewari, Alok K.; London, Darin; Song, Lingyun; Lee, Bum-Kyu; Iyer, Vishwanath R.; Parker, Stephen C. J.; Margulies, Elliott H.; Wray, Gregory A.; Furey, Terrence S.; Crawford, Gregory E.

    2012-01-01

    Understanding the molecular basis for phenotypic differences between humans and other primates remains an outstanding challenge. Mutations in non-coding regulatory DNA that alter gene expression have been hypothesized as a key driver of these phenotypic differences. This has been supported by differential gene expression analyses in general, but not by the identification of specific regulatory elements responsible for changes in transcription and phenotype. To identify the genetic source of regulatory differences, we mapped DNaseI hypersensitive (DHS) sites, which mark all types of active gene regulatory elements, genome-wide in the same cell type isolated from human, chimpanzee, and macaque. Most DHS sites were conserved among all three species, as expected based on their central role in regulating transcription. However, we found evidence that several hundred DHS sites were gained or lost on the lineages leading to modern human and chimpanzee. Species-specific DHS site gains are enriched near differentially expressed genes, are positively correlated with increased transcription, show evidence of branch-specific positive selection, and overlap with active chromatin marks. Species-specific sequence differences in transcription factor motifs found within these DHS sites are linked with species-specific changes in chromatin accessibility. Together, these indicate that the regulatory elements identified here are genetic contributors to transcriptional and phenotypic differences among primate species. PMID:22761590

  20. The full mitochondrial genome sequence of Raillietina tetragona from chicken (Cestoda: Davaineidae).

    PubMed

    Liang, Jian-Ying; Lin, Rui-Qing

    2016-11-01

    In the present study, the complete mitochondrial DNA (mtDNA) sequence of Raillietina tetragona was sequenced and its gene content and genome organization were compared with those of other tapeworms. The complete mt genome sequence of R. tetragona is 14,444 bp in length. It contains 12 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and two non-coding regions. All genes are transcribed in the same direction and have a nucleotide composition high in A and T. The A + T content of the complete mt genome is 71.4% for R. tetragona. The R. tetragona mt genome sequence provides a novel mtDNA marker for studying the molecular epidemiology and population genetics of Raillietina and has implications for the molecular diagnosis of chicken cestodosis caused by Raillietina.
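
    An A + T content such as the 71.4% reported above is simply the share of A and T bases among all unambiguous bases in the sequence. The sketch below illustrates the calculation on a made-up toy sequence; it is not the R. tetragona genome, and the function name is hypothetical.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among the unambiguous bases of seq.

    Characters other than A, C, G and T (for example N or gaps) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(1 for b in bases if b in "AT")
    return 100 * at / len(bases)


# Toy sequence for illustration only (hypothetical, not real mtDNA).
toy = "ATTAGCATTANNTTAAGC"
print(f"A + T content: {at_content(toy):.1f}%")
```

    For a real genome the sequence would typically be read from a FASTA file first; that step is omitted here to keep the sketch self-contained.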

  1. Epigenetic conservation at gene regulatory elements revealed by non-methylated DNA profiling in seven vertebrates

    PubMed Central

    Long, Hannah K; Sims, David; Heger, Andreas; Blackledge, Neil P; Kutter, Claudia; Wright, Megan L; Grützner, Frank; Odom, Duncan T; Patient, Roger; Ponting, Chris P; Klose, Robert J

    2013-01-01

    Two-thirds of gene promoters in mammals are associated with regions of non-methylated DNA, called CpG islands (CGIs), which counteract the repressive effects of DNA methylation on chromatin. In cold-blooded vertebrates, computational CGI predictions often reside away from gene promoters, suggesting a major divergence in gene promoter architecture across vertebrates. By experimentally identifying non-methylated DNA in the genomes of seven diverse vertebrates, we instead reveal that non-methylated islands (NMIs) of DNA are a central feature of vertebrate gene promoters. Furthermore, NMIs are present at orthologous genes across vast evolutionary distances, revealing a surprising level of conservation in this epigenetic feature. By profiling NMIs in different tissues and developmental stages we uncover a unifying set of features that are central to the function of NMIs in vertebrates. Together these findings demonstrate an ancient logic for NMI usage at gene promoters and reveal an unprecedented level of epigenetic conservation across vertebrate evolution. DOI: http://dx.doi.org/10.7554/eLife.00348.001 PMID:23467541

  2. Characterizing the strand-specific distribution of non-CpG methylation in human pluripotent cells.

    PubMed

    Guo, Weilong; Chung, Wen-Yu; Qian, Minping; Pellegrini, Matteo; Zhang, Michael Q

    2014-03-01

    DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.
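
    The CpG, CHG and CHH contexts discussed above are defined by the two bases that follow a cytosine on its own strand, where H stands for A, C or T. The sketch below shows one way to label every cytosine on both strands of a short sequence. It is an illustration of the standard context definitions only, not the authors' divide-and-compare strategy, and all names in it are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def classify(tri: str):
    """Classify a cytosine from the (up to) three bases starting at it, read 5'->3'."""
    if len(tri) >= 2 and tri[1] == "G":
        return "CG"
    if len(tri) == 3 and tri[1] in "ACT" and tri[2] in "ACGT":
        return "CHG" if tri[2] == "G" else "CHH"
    return None  # too close to the sequence end, or ambiguous bases


def cytosine_contexts(seq: str):
    """Return (forward_position, strand, context) for each classifiable cytosine.

    Positions are 0-based coordinates on the forward strand; a '-' strand
    cytosine is reported at the position of the forward-strand G it pairs with.
    """
    seq = seq.upper()
    n = len(seq)
    calls = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for i, base in enumerate(s):
            if base != "C":
                continue
            ctx = classify(s[i:i + 3])
            if ctx is None:
                continue
            pos = i if strand == "+" else n - 1 - i
            calls.append((pos, strand, ctx))
    return calls


for call in cytosine_contexts("CGCAGCTT"):
    print(call)
```

    Keeping the strand alongside each call is what makes a strand-specific comparison possible: methylation counts can then be tallied separately for '+' and '-' cytosines within introns or repeat families.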

  3. Transfer RNA gene-targeted integration: an adaptation of retrotransposable elements to survive in the compact Dictyostelium discoideum genome.

    PubMed

    Winckler, T; Szafranski, K; Glöckner, G

    2005-01-01

    Almost every organism carries along a multitude of molecular parasites known as transposable elements (TEs). TEs influence their host genomes in many ways by expanding genome size and complexity, rearranging genomic DNA, mutagenizing host genes, and altering transcription levels of nearby genes. The eukaryotic microorganism Dictyostelium discoideum is attractive for the study of fundamental biological phenomena such as intercellular communication, formation of multicellularity, cell differentiation, and morphogenesis. D. discoideum has a highly compacted, haploid genome with less than 1 kb of genomic DNA separating coding regions. Nevertheless, the D. discoideum genome is loaded with 10% of TEs that managed to settle and survive in this inhospitable environment. In depth analysis of D. discoideum genome project data has provided intriguing insights into the evolutionary challenges that mobile elements face when they invade compact genomes. Two different mechanisms are used by D. discoideum TEs to avoid disruption of host genes upon retrotransposition. Several TEs have invented the specific targeting of tRNA gene-flanking regions as a means to avoid integration into coding regions. These elements have been dispersed on all chromosomes, closely following the distribution of tRNA genes. By contrast, TEs that lack bona fide integration specificities show a strong bias to nested integration, thus forming large TE clusters at certain chromosomal loci that are hardly resolved by bioinformatics approaches. We summarize our current view of D. discoideum TEs and present new data from the analysis of the complete sequences of D. discoideum chromosomes 1 and 2, which comprise more than one third of the total genome.

  4. RNA from the 5' end of the R2 retrotransposon controls R2 protein binding to and cleavage of its DNA target site.

    PubMed

    Christensen, Shawn M; Ye, Junqiang; Eickbush, Thomas H

    2006-11-21

    Non-LTR retrotransposons insert into eukaryotic genomes by target-primed reverse transcription (TPRT), a process in which cleaved DNA targets are used to prime reverse transcription of the element's RNA transcript. Many of the steps in the integration pathway of these elements can be characterized in vitro for the R2 element because of the rigid sequence specificity of R2 for both its DNA target and its RNA template. R2 retrotransposition involves identical subunits of the R2 protein bound to different DNA sequences upstream and downstream of the insertion site. The key determinant regulating which DNA-binding conformation the protein adopts was found to be a 320-nt RNA sequence from near the 5' end of the R2 element. In the absence of this 5' RNA the R2 protein binds DNA sequences upstream of the insertion site, cleaves the first DNA strand, and conducts TPRT when RNA containing the 3' untranslated region of the R2 transcript is present. In the presence of the 320-nt 5' RNA, the R2 protein binds DNA sequences downstream of the insertion site. Cleavage of the second DNA strand by the downstream subunit does not appear to occur until after the 5' RNA is removed from this subunit. We postulate that the removal of the 5' RNA normally occurs during reverse transcription, and thus provides a critical temporal link to first- and second-strand DNA cleavage in the R2 retrotransposition reaction.

  5. Digital Coding and the Self-Proving Message

    ERIC Educational Resources Information Center

    Dettering, Richard

    1971-01-01

    Author suggests that "digital communication", which relies on arbitrary coding elements "like the phones of speech," overshadows the importance of the analogic symbolism people use more extensively than realized. Non-verbal messages can be more convincing than verbal and can be used to predict patterns of future behavior. (Author/PD)

  6. The DNA Methylome of Human Peripheral Blood Mononuclear Cells

    PubMed Central

    Ye, Mingzhi; Zheng, Hancheng; Yu, Jian; Wu, Honglong; Sun, Jihua; Zhang, Hongyu; Chen, Quan; Luo, Ruibang; Chen, Minfeng; He, Yinghua; Jin, Xin; Zhang, Qinghui; Yu, Chang; Zhou, Guangyu; Sun, Jinfeng; Huang, Yebo; Zheng, Huisong; Cao, Hongzhi; Zhou, Xiaoyu; Guo, Shicheng; Hu, Xueda; Li, Xin; Kristiansen, Karsten; Bolund, Lars; Xu, Jiujin; Wang, Wen; Yang, Huanming; Wang, Jian; Li, Ruiqiang; Beck, Stephan; Wang, Jun; Zhang, Xiuqing

    2010-01-01

    DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold per strand), we report a comprehensive (92.62%) methylome and analysis of the unique sequences in human peripheral blood mononuclear cells (PBMC) from the same Asian individual whose genome was deciphered in the YH project. PBMC constitute an important source for clinical blood tests world-wide. We found that 68.4% of CpG sites and <0.2% of non-CpG sites were methylated, demonstrating that non-CpG cytosine methylation is minor in human PBMC. Analysis of the PBMC methylome revealed a rich epigenomic landscape for 20 distinct genomic features, including regulatory, protein-coding, non-coding, RNA-coding, and repeat sequences. Integration of our methylome data with the YH genome sequence enabled a first comprehensive assessment of allele-specific methylation (ASM) between the two haploid methylomes of any individual and allowed the identification of 599 haploid differentially methylated regions (hDMRs) covering 287 genes. Of these, 76 genes had hDMRs within 2 kb of their transcriptional start sites of which >80% displayed allele-specific expression (ASE). These data demonstrate that ASM is a recurrent phenomenon and is highly correlated with ASE in human PBMCs. Together with recently reported similar studies, our study provides a comprehensive resource for future epigenomic research and confirms new sequencing technology as a paradigm for large-scale epigenomics studies. PMID:21085693
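
    The per-site arithmetic behind a bisulfite methylome can be sketched as follows. In bisulfite sequencing, unmethylated cytosines are read as T while methylated cytosines remain C, so the methylation level at a site is the fraction of C reads. This is a minimal illustration, not the authors' pipeline: the function names and the 0.5 calling threshold are assumptions made for the example.

```python
def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one cytosine.

    c_reads: reads showing C (protected from conversion, i.e. methylated)
    t_reads: reads showing T (converted, i.e. unmethylated)
    Returns None when the site has no coverage.
    """
    total = c_reads + t_reads
    if total == 0:
        return None
    return c_reads / total


def fraction_methylated(sites, threshold=0.5):
    """Share of covered sites whose methylation level reaches the threshold.

    sites: iterable of (c_reads, t_reads) pairs.
    threshold: hypothetical cutoff for calling a site methylated.
    """
    levels = [methylation_level(c, t) for c, t in sites]
    covered = [lv for lv in levels if lv is not None]
    if not covered:
        return 0.0
    return sum(lv >= threshold for lv in covered) / len(covered)
```

    A real analysis would usually replace the fixed threshold with a statistical test that accounts for coverage and incomplete bisulfite conversion.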

  7. Non-coding-regulatory regions of human brain genes delineated by bacterial artificial chromosome knock-in mice.

    PubMed

    Schmouth, Jean-François; Castellarin, Mauro; Laprise, Stéphanie; Banks, Kathleen G; Bonaguro, Russell J; McInerny, Simone C; Borretta, Lisa; Amirabbasi, Mahsa; Korecki, Andrea J; Portales-Casamar, Elodie; Wilson, Gary; Dreolini, Lisa; Jones, Steven J M; Wasserman, Wyeth W; Goldowitz, Daniel; Holt, Robert A; Simpson, Elizabeth M

    2013-10-14

    The next big challenge in human genetics is understanding the 98% of the genome that comprises non-coding DNA. Hidden in this DNA are sequences critical for gene regulation, and new experimental strategies are needed to understand the functional role of gene-regulation sequences in health and disease. In this study, we build upon our HuGX ('high-throughput human genes on the X chromosome') strategy to expand our understanding of human gene regulation in vivo. In all, ten human genes known to express in therapeutically important brain regions were chosen for study. For eight of these genes, human bacterial artificial chromosome clones were identified, retrofitted with a reporter, knocked single-copy into the Hprt locus in mouse embryonic stem cells, and mouse strains derived. Five of these human genes expressed in mouse, and all expressed in the adult brain region for which they were chosen. This defined the boundaries of the genomic DNA sufficient for brain expression, and refined our knowledge regarding the complexity of gene regulation. We also characterized for the first time the expression of human MAOA and NR2F2, two genes for which the mouse homologs have been extensively studied in the central nervous system (CNS), and AMOTL1 and NOV, for which roles in CNS have been unclear. We have demonstrated the use of the HuGX strategy to functionally delineate non-coding-regulatory regions of therapeutically important human brain genes. Our results also show that a careful investigation, using publicly available resources and bioinformatics, can lead to accurate predictions of gene expression.

  8. Insights into the strategies used by related group II introns to adapt successfully for the colonisation of a bacterial genome

    PubMed Central

    Martínez-Rodríguez, Laura; García-Rodríguez, Fernando M; Molina-Sánchez, María Dolores; Toro, Nicolás; Martínez-Abarca, Francisco

    2014-01-01

    Group II introns are self-splicing RNAs and site-specific mobile retroelements found in bacterial and organellar genomes. The group II intron RmInt1 is present at high copy number in Sinorhizobium meliloti species, and has a multifunctional intron-encoded protein (IEP) with reverse transcriptase/maturase activities, but lacking the DNA-binding and endonuclease domains. We characterized two RmInt1-related group II introns RmInt2 from S. meliloti strain GR4 and Sr.md.I1 from S. medicae strain WSM419 in terms of splicing and mobility activities. We used both wild-type and engineered intron-donor constructs based on ribozyme ΔORF-coding sequence derivatives, and we determined the DNA target requirements for RmInt2, the element most distantly related to RmInt1. The excision and mobility patterns of intron-donor constructs expressing different combinations of IEP and intron RNA provided experimental evidence for the co-operation of IEPs and intron RNAs from related elements in intron splicing and, in some cases, in intron homing. We were also able to identify the DNA target regions recognized by these IEPs lacking the DNA endonuclease domain. Our results provide new insight into the versatility of related group II introns and the possible co-operation between these elements to facilitate the colonization of bacterial genomes. PMID:25482895

  9. Insights into the strategies used by related group II introns to adapt successfully for the colonisation of a bacterial genome.

    PubMed

    Martínez-Rodríguez, Laura; García-Rodríguez, Fernando M; Molina-Sánchez, María Dolores; Toro, Nicolás; Martínez-Abarca, Francisco

    2014-01-01

    Group II introns are self-splicing RNAs and site-specific mobile retroelements found in bacterial and organellar genomes. The group II intron RmInt1 is present at high copy number in Sinorhizobium meliloti species, and has a multifunctional intron-encoded protein (IEP) with reverse transcriptase/maturase activities, but lacking the DNA-binding and endonuclease domains. We characterized two RmInt1-related group II introns RmInt2 from S. meliloti strain GR4 and Sr.md.I1 from S. medicae strain WSM419 in terms of splicing and mobility activities. We used both wild-type and engineered intron-donor constructs based on ribozyme ΔORF-coding sequence derivatives, and we determined the DNA target requirements for RmInt2, the element most distantly related to RmInt1. The excision and mobility patterns of intron-donor constructs expressing different combinations of IEP and intron RNA provided experimental evidence for the co-operation of IEPs and intron RNAs from related elements in intron splicing and, in some cases, in intron homing. We were also able to identify the DNA target regions recognized by these IEPs lacking the DNA endonuclease domain. Our results provide new insight into the versatility of related group II introns and the possible co-operation between these elements to facilitate the colonization of bacterial genomes.

  10. Elevated Rate of Fixation of Endogenous Retroviral Elements in Haplorhini TRIM5 and TRIM22 Genomic Sequences: Impact on Transcriptional Regulation

    PubMed Central

    Diehl, William E.; Johnson, Welkin E.; Hunter, Eric

    2013-01-01

    All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions. PMID:23516500

  11. Transcription of a protein-coding gene on B chromosomes of the Siberian roe deer (Capreolus pygargus)

    PubMed Central

    2013-01-01

    Background: Most eukaryotic species represent stable karyotypes with a particular diploid number. B chromosomes are additional to standard karyotypes and may vary in size, number and morphology even between cells of the same individual. For many years it was generally believed that B chromosomes found in some plant, animal and fungi species lacked active genes. Recently, molecular cytogenetic studies showed the presence of additional copies of protein-coding genes on B chromosomes. However, the transcriptional activity of these genes remained elusive. We studied karyotypes of the Siberian roe deer (Capreolus pygargus) that possess up to 14 B chromosomes to investigate the presence and expression of genes on supernumerary chromosomes. Results: Here, we describe a 2 Mbp region homologous to cattle chromosome 3 and containing TNNI3K (partial), FPGT, LRRIQ3 and a large gene-sparse segment on B chromosomes of the Siberian roe deer. The presence of the copy of the autosomal region was demonstrated by B-specific cDNA analysis, PCR assisted mapping, cattle bacterial artificial chromosome (BAC) clone localization and quantitative polymerase chain reaction (qPCR). By comparative analysis of B-specific and non-B chromosomal sequences we discovered some B chromosome-specific mutations in protein-coding genes, which further enabled the detection of a FPGT-TNNI3K transcript expressed from duplicated genes located on B chromosomes in roe deer fibroblasts. Conclusions: Discovery of a large autosomal segment in all B chromosomes of the Siberian roe deer further corroborates the view of an autosomal origin for these elements. Detection of a B-derived transcript in fibroblasts implies that the protein coding sequences located on Bs are not fully inactivated. The origin, evolution and effect on host of B chromosomal genes seem to be similar to autosomal segmental duplications, which reinforces the view that supernumerary chromosomal elements might play an important role in genome evolution. PMID:23915065

  12. A finite element code for modelling tracer transport in a non-isothermal two-phase flow system for CO2 geological storage characterization

    NASA Astrophysics Data System (ADS)

    Tong, F.; Niemi, A. P.; Yang, Z.; Fagerlund, F.; Licha, T.; Sauter, M.

    2011-12-01

    This paper presents a new finite element method (FEM) code for modeling tracer transport in a non-isothermal two-phase flow system. The main intended application is simulation of the movement of so-called novel tracers for the purpose of characterization of geologically stored CO2 and its phase partitioning and migration in deep saline formations. The governing equations are based on the conservation of mass and energy. Among the phenomena accounted for are liquid-phase flow, gas flow, heat transport and the movement of the novel tracers. The movement of tracers includes diffusion and the advection associated with the gas and liquid flow. The temperature, gas pressure, suction, concentration of tracer in liquid phase and concentration of tracer in gas phase are chosen as the five primary variables. Parameters such as the density, viscosity, and thermal expansion coefficient are expressed in terms of the primary variables. The governing equations are discretized in space using the Galerkin finite element formulation, and are discretized in time by a one-dimensional finite difference scheme. This leads to an ill-conditioned FEM equation that has many small entries along the diagonal of the non-symmetric coefficient matrix. In order to deal with the problem of the non-symmetric ill-conditioned matrix equation, special techniques are introduced. Firstly, only nonzero elements of the matrix need to be stored. Secondly, directly solving the whole large matrix is avoided. Thirdly, a strategy has been used to keep the diversity of solution methods in the calculation process. Additionally, an efficient adaptive mesh technique is included in the code in order to track the wetting front. The code has been validated against several classical analytical solutions, and will be applied for simulating the CO2 injection experiment to be carried out at the Heletz site, Israel, as part of the EU FP7 project MUSTANG.
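
    The first technique named in the abstract, storing only the nonzero elements of the coefficient matrix, can be illustrated with a small sketch. The abstract does not say which storage scheme the code uses; compressed sparse row (CSR) is assumed here purely as a common example, and the function names are hypothetical.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero values, their column
    indices, and for each row the offset of its first stored value.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a vector, touching only stored entries."""
    y = []
    for i in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * x[col_idx[k]]
        y.append(total)
    return y
```

    Because the format keeps no symmetry assumption, it suits a non-symmetric matrix such as the one described, and a matrix-vector product of this kind is the basic operation of iterative solvers that avoid factorizing the whole matrix.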

  13. DNA as a Binary Code: How the Physical Structure of Nucleotide Bases Carries Information

    ERIC Educational Resources Information Center

    McCallister, Gary

    2005-01-01

    The DNA triplet code also functions as a binary code. Because double-ring compounds cannot bind to double-ring compounds in the DNA code, the sequence of bases classified simply as purines or pyrimidines can encode for smaller groups of possible amino acids. This is an intuitive approach to teaching the DNA code. (Contains 6 figures.)
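
    The purine/pyrimidine reading of a sequence can be sketched in a few lines. The choice of 1 for purines (double-ring A and G) and 0 for pyrimidines (single-ring C, T and U) is an arbitrary labeling made for this illustration, and the function names are hypothetical. Grouping codons by their three-bit pattern shows how the binary view partitions the 64 codons into 8 classes.

```python
PURINES = {"A", "G"}           # double-ring bases
PYRIMIDINES = {"C", "T", "U"}  # single-ring bases


def to_binary(seq):
    """Encode a nucleotide sequence as a string of 1s (purine) and 0s (pyrimidine)."""
    bits = []
    for base in seq.upper():
        if base in PURINES:
            bits.append("1")
        elif base in PYRIMIDINES:
            bits.append("0")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(bits)


def codons_by_pattern(codons):
    """Group codons that share the same purine/pyrimidine pattern."""
    groups = {}
    for codon in codons:
        groups.setdefault(to_binary(codon), []).append(codon)
    return groups
```

    For example, to_binary("ATGC") gives "1010", and ATG and GCA fall into the same "101" group.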

  14. A fully decompressed synthetic bacteriophage øX174 genome assembled and archived in yeast.

    PubMed

    Jaschke, Paul R; Lieberman, Erica K; Rodriguez, Jon; Sierra, Adrian; Endy, Drew

    2012-12-20

    The 5386 nucleotide bacteriophage øX174 genome has a complicated architecture that encodes 11 gene products via overlapping protein coding sequences spanning multiple reading frames. We designed a 6302 nucleotide synthetic surrogate, øX174.1, that fully separates all primary phage protein coding sequences along with cognate translation control elements. To specify øX174.1f, a decompressed genome the same length as wild type, we truncated the gene F coding sequence. We synthesized DNA encoding fragments of øX174.1f and used a combination of in vitro- and yeast-based assembly to produce yeast vectors encoding natural or designer bacteriophage genomes. We isolated clonal preparations of yeast plasmid DNA and transfected E. coli C strains. We recovered viable øX174 particles containing the øX174.1f genome from E. coli C strains that independently express full-length gene F. We expect that yeast can serve as a genomic 'drydock' within which to maintain and manipulate clonal lineages of other obligate lytic phage. Copyright © 2012 Elsevier Inc. All rights reserved.

  15. The HUSH complex cooperates with TRIM28 to repress young retrotransposons and new genes.

    PubMed

    Robbez-Masson, Luisa; Tie, Christopher H C; Conde, Lucia; Tunbak, Hale; Husovsky, Connor; Tchasovnikarova, Iva A; Timms, Richard T; Herrero, Javier; Lehner, Paul J; Rowe, Helen M

    2018-05-04

    Retrotransposons encompass half of the human genome and contribute to the formation of heterochromatin, which provides nuclear structure and regulates gene expression. Here, we asked if the human silencing hub (HUSH) complex is necessary to silence retrotransposons and whether it collaborates with TRIM28 and the chromatin remodeler ATRX at specific genomic loci. We show that the HUSH complex contributes to de novo repression and DNA methylation of a SVA retrotransposon reporter. By using naïve vs. primed mouse pluripotent stem cells, we reveal a critical role for the HUSH complex in naïve cells, implicating it in programming epigenetic marks in development. While the HUSH component FAM208A binds to endogenous retroviruses (ERVs) and long interspersed element-1s (LINE-1s or L1s), it is mainly required to repress evolutionarily young L1s (mouse-specific lineages less than 5 million years old). TRIM28, in contrast, is necessary to repress both ERVs and young L1s. Genes co-repressed by TRIM28 and FAM208A are evolutionarily young, or exhibit tissue-specific expression, are enriched in young L1s and display evidence for regulation through LTR promoters. Finally, we demonstrate that the HUSH complex is also required to repress L1 elements in human cells. Overall, these data indicate that the HUSH complex and TRIM28 co-repress young retrotransposons and new genes rewired by retrotransposon non-coding DNA. Published by Cold Spring Harbor Laboratory Press.

  16. QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks

    PubMed Central

    Thibodeau, Asa; Márquez, Eladio J.; Luo, Oscar; Ruan, Yijun; Shin, Dong-Guk; Stitzel, Michael L.; Ucar, Duygu

    2016-01-01

    Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. AVAILABILITY: QuIN’s web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available under the GPLv3 license on GitHub: https://github.com/UcarLab/QuIN/. PMID:27336171
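
    The network view of chromatin interactions described above can be sketched generically. The following is not QuIN's code (QuIN is written in Java and JavaScript); it is a minimal Python illustration, with hypothetical function and element names, of building an undirected network from interaction pairs, finding direct and indirect partners of an element, and ranking elements by degree as one simple network-based measure.

```python
from collections import defaultdict, deque


def build_network(interactions):
    """Build an undirected adjacency map from (element_a, element_b) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def reachable(graph, start, max_hops):
    """Return {element: hop_count} for elements within max_hops of start.

    One hop is a direct interaction; more hops are indirect interactions.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen


def rank_by_degree(graph):
    """Order elements by number of interaction partners, ties by name."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))
```

    Degree is only one possible prioritization measure; other centrality scores could be substituted in the same structure.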

  17. Advances in RNA Structure Determination | Center for Cancer Research

    Cancer.gov

    The recent years have witnessed a revolution in the field of RNA structure and function. Until recently the main contribution of RNA in cellular and disease functions was considered to be a role defined by the central dogma, namely DNA codes for mRNAs, which in turn encode for proteins, a notion facilitated by non-coding ribosomal RNA and tRNA. It was also assumed at the time

  18. Detection and characterization of miniature inverted-repeat transposable elements in “Candidatus Liberibacter asiaticus”

    USDA-ARS's Scientific Manuscript database

    Miniature inverted-repeat transposable elements (MITEs) are non-autonomous transposons (devoid of a transposase gene, tps) involving insertion/deletion of genomic DNA in bacterial genomes influencing gene functions. No transposon has yet been reported in “Candidatus Liberibacter asiaticus”, an alpha-pr...

  19. GRID-seq reveals the global RNA-chromatin interactome

    PubMed Central

    Li, Xiao; Zhou, Bing; Chen, Liang; Gou, Lan-Tao; Li, Hairi; Fu, Xiang-Dong

    2017-01-01

    Higher eukaryotic genomes are bound by a large number of coding and non-coding RNAs, but approaches to comprehensively map the identity and binding sites of these RNAs are lacking. Here we report a method to in situ capture global RNA interactions with DNA by deep sequencing (GRID-seq), which enables the comprehensive identification of the entire repertoire of chromatin-interacting RNAs and their respective binding sites. In human, mouse and Drosophila cells, we detected a large set of tissue-specific coding and non-coding RNAs that are bound to active promoters and enhancers, especially super-enhancers. Assuming that most mRNA-chromatin interactions indicate the physical proximity of a promoter and an enhancer, we constructed a three-dimensional global connectivity map of promoters and enhancers, revealing transcription activity-linked genomic interactions in the nucleus. PMID:28922346

  20. [Long non-coding RNAs in the pathophysiology of atherosclerosis].

    PubMed

    Novak, Jan; Vašků, Julie Bienertová; Souček, Miroslav

    2018-01-01

    The human genome contains about 22 000 protein-coding genes that are transcribed to an even larger number of messenger RNAs (mRNA). Interestingly, the results of the ENCODE project from 2012 show that, despite up to 90% of our genome being actively transcribed, protein-coding mRNAs make up only 2-3% of the total amount of the transcribed RNA. The rest of the RNA transcripts are not translated to proteins, which is why they are referred to as "non-coding RNAs". Earlier, non-coding RNA was considered "the dark matter of the genome", or "the junk", whose genes have accumulated in our DNA during the course of evolution. Today we already know that non-coding RNAs fulfil a variety of regulatory functions in our body: they intervene in epigenetic processes from chromatin remodelling to histone methylation, in the transcription process itself, or even in post-transcription processes. Long non-coding RNAs (lncRNA) are one of the classes of non-coding RNAs and are more than 200 nucleotides in length (non-coding RNAs less than 200 nucleotides in length are called small non-coding RNAs). lncRNAs represent a widely varied and large group of molecules with diverse regulatory functions. We can identify them in all conceivable cell types or tissues, or even in the extracellular space, which includes blood, specifically plasma. Their levels change during the course of organogenesis, they are specific to different tissues and their changes also occur along with the development of different illnesses, including atherosclerosis. This review article aims to present the topic of lncRNAs in general and then focuses on some of their specific representatives in relation to the process of atherosclerosis (i.e. we describe lncRNA involvement in the biology of endothelial cells, vascular smooth muscle cells or immune cells), and we further describe the possible clinical potential of lncRNA, whether in diagnostics or therapy of atherosclerosis and its clinical manifestations. Key words: atherosclerosis - lincRNA - lncRNA - MALAT - MIAT.

  1. DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server

    PubMed Central

    Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

    2012-01-01

    The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications. PMID:22574113

  2. Parallel evolution of chordate cis-regulatory code for development.

    PubMed

    Doglio, Laura; Goode, Debbie K; Pelleri, Maria C; Pauls, Stefan; Frabetti, Flavia; Shimeld, Sebastian M; Vavouri, Tanya; Elgar, Greg

    2013-11-01

    Urochordates are the closest relatives of vertebrates and at the larval stage, possess a characteristic bilateral chordate body plan. In vertebrates, the genes that orchestrate embryonic patterning are in part regulated by highly conserved non-coding elements (CNEs), yet these elements have not been identified in urochordate genomes. Consequently the evolution of the cis-regulatory code for urochordate development remains largely uncharacterised. Here, we use genome-wide comparisons between C. intestinalis and C. savignyi to identify putative urochordate cis-regulatory sequences. Ciona conserved non-coding elements (ciCNEs) are associated with largely the same key regulatory genes as vertebrate CNEs. Furthermore, some of the tested ciCNEs are able to activate reporter gene expression in both zebrafish and Ciona embryos, in a pattern that at least partially overlaps that of the gene they associate with, despite the absence of sequence identity. We also show that the ability of a ciCNE to up-regulate gene expression in vertebrate embryos can in some cases be localised to short sub-sequences, suggesting that functional cross-talk may be defined by small regions of ancestral regulatory logic, although functional sub-sequences may also be dispersed across the whole element. We conclude that the structure and organisation of cis-regulatory modules is very different between vertebrates and urochordates, reflecting their separate evolutionary histories. However, functional cross-talk still exists because the same repertoire of transcription factors has likely guided their parallel evolution, exploiting similar sets of binding sites but in different combinations.

  3. [Long non-coding RNAs in plants].

    PubMed

    Xiaoqing, Huang; Dandan, Li; Juan, Wu

    2015-04-01

    Long non-coding RNAs (lncRNAs), which are longer than 200 nucleotides in length, widely exist in organisms and function in a variety of biological processes. Currently, most of lncRNAs found in plants are transcribed by RNA polymerase Ⅱ and mediate gene expression through multiple mechanisms, such as target mimicry, transcription interference, histone methylation and DNA methylation, and play important roles in flowering, male sterility, nutrition metabolism, biotic and abiotic stress and other biological processes as regulators in plants. In this review, we summarize the databases, prediction methods, and possible functions of plant lncRNAs discovered in recent years.

  4. Prediction of plant lncRNA by ensemble machine learning classifiers.

    PubMed

    Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian

    2018-05-02

    In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.

  5. It’s Time for An Epigenomics Roadmap of Heart Failure

    PubMed Central

    Papait, Roberto; Corrado, Nadia; Rusconi, Francesca; Serio, Simone; Latronico, Michael V.G.

    2015-01-01

    The post-genomic era has completed its first decade. During this time, we have seen an attempt to understand life not just through the study of individual isolated processes, but through the appreciation of the amalgam of complex networks, within which each process can influence others. Greatly benefiting this view has been the study of the epigenome, the set of DNA and histone protein modifications that regulate gene expression and the function of regulatory non-coding RNAs without altering the DNA sequence itself. Indeed, the availability of reference genome assemblies of many species has led to the development of methodologies such as ChIP-Seq and RNA-Seq that have allowed us to define with high resolution the genomic distribution of several epigenetic elements and to better comprehend how they are interconnected for the regulation of gene expression. In the last few years, the use of these methodologies in the cardiovascular field has contributed to our understanding of the importance of epigenetics in heart diseases, giving new input to this area of research. Here, we review recently acquired knowledge on the role of the epigenome in heart failure, and discuss the need of an epigenomics roadmap for cardiovascular disease. PMID:27006627

  6. Changes in the Coding and Non-coding Transcriptome and DNA Methylome that Define the Schwann Cell Repair Phenotype after Nerve Injury.

    PubMed

    Arthur-Farraj, Peter J; Morgan, Claire C; Adamowicz, Martyna; Gomez-Sanchez, Jose A; Fazal, Shaline V; Beucher, Anthony; Razzaghi, Bonnie; Mirsky, Rhona; Jessen, Kristjan R; Aitman, Timothy J

    2017-09-12

    Repair Schwann cells play a critical role in orchestrating nerve repair after injury, but the cellular and molecular processes that generate them are poorly understood. Here, we perform a combined whole-genome, coding and non-coding RNA and CpG methylation study following nerve injury. We show that genes involved in the epithelial-mesenchymal transition are enriched in repair cells, and we identify several long non-coding RNAs in Schwann cells. We demonstrate that the AP-1 transcription factor C-JUN regulates the expression of certain micro RNAs in repair Schwann cells, in particular miR-21 and miR-34. Surprisingly, unlike during development, changes in CpG methylation are limited in injury, restricted to specific locations, such as enhancer regions of Schwann cell-specific genes (e.g., Nedd4l), and close to local enrichment of AP-1 motifs. These genetic and epigenomic changes broaden our mechanistic understanding of the formation of repair Schwann cell during peripheral nervous system tissue repair. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Mobility and generation of mosaic non-autonomous transposons by Tn3-derived inverted-repeat miniature elements (TIMEs).

    PubMed

    Szuplewska, Magdalena; Ludwiczak, Marta; Lyzwa, Katarzyna; Czarnecki, Jakub; Bartosik, Dariusz

    2014-01-01

    Functional transposable elements (TEs) of several Pseudomonas spp. strains isolated from black shale ore of Lubin mine and from post-flotation tailings of Zelazny Most in Poland, were identified using a positive selection trap plasmid strategy. This approach led to the capture and characterization of (i) 13 insertion sequences from 5 IS families (IS3, IS5, ISL3, IS30 and IS1380), (ii) isoforms of two Tn3-family transposons--Tn5563a and Tn4662a (the latter contains a toxin-antitoxin system), as well as (iii) non-autonomous TEs of diverse structure, ranging in size from 262 to 3892 bp. The non-autonomous elements transposed into AT-rich DNA regions and generated 5- or 6-bp sequence duplications at the target site of transposition. Although these TEs lack a transposase gene, they contain homologous 38-bp-long terminal inverted repeat sequences (IRs), highly conserved in Tn5563a and many other Tn3-family transposons. The simplest elements of this type, designated TIMEs (Tn3 family-derived Inverted-repeat Miniature Elements) (262 bp), were identified within two natural plasmids (pZM1P1 and pLM8P2) of Pseudomonas spp. It was demonstrated that TIMEs are able to mobilize segments of plasmid DNA for transposition, which results in the generation of more complex non-autonomous elements, resembling IS-driven composite transposons in structure. Such transposon-like elements may contain different functional genetic modules in their core regions, including plasmid replication systems. Another non-autonomous element "captured" with a trap plasmid was a TIME derivative containing a predicted resolvase gene and a res site typical for many Tn3-family transposons. The identification of a portable site-specific recombination system is another intriguing example confirming the important role of non-autonomous TEs of the TIME family in shuffling genetic information in bacterial genomes. 
Transposition of such mosaic elements may have a significant impact on diversity and evolution, not only of transposons and plasmids, but also of other types of mobile genetic elements.
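
    The defining feature of the non-autonomous elements above is a pair of terminal inverted repeats: the right end, read as its reverse complement, matches the left end. A minimal sketch of that check follows; the function names and the toy sequence are hypothetical illustrations, not data or methods from the study.

```python
# Sketch: score how well the two ends of an element form terminal inverted repeats.
# The toy element below is invented for illustration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_ir_identity(element, ir_len):
    """Fraction of positions at which the left end matches the
    reverse complement of the right end over ir_len bases."""
    if ir_len <= 0 or 2 * ir_len > len(element):
        raise ValueError("ir_len must be positive and fit twice in the element")
    left = element[:ir_len].upper()
    right_rc = reverse_complement(element[-ir_len:])
    matches = sum(a == b for a, b in zip(left, right_rc))
    return matches / ir_len


# Toy element: 9-bp perfect inverted repeats flanking a short AT-rich core.
toy_element = "GGGGTCTGA" + "ATATAT" + "TCAGACCCC"
print(terminal_ir_identity(toy_element, 9))
```

    For a real TIME, ir_len would be set to the reported repeat length (38 bp), and an identity below 1.0 would indicate mismatches between the two ends.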

  8. The Paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences.

    PubMed

    Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

    2012-01-01

    Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. 
We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated.
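
    A periodicity in a size distribution, such as the roughly 10 bp spacing reported for IES lengths, can be looked for by asking which shift best maps the length histogram onto itself. The sketch below uses a raw (uncentred) autocorrelation of length counts; the function name, the lag range and the toy lengths are assumptions made for illustration and are not the authors' analysis.

```python
from collections import Counter


def dominant_period(lengths, min_lag=2, max_lag=30):
    """Return the lag (in bp) that maximises the raw autocorrelation
    of the length histogram, i.e. sum over sizes of n(size) * n(size + lag)."""
    counts = Counter(lengths)

    def score(lag):
        return sum(n * counts.get(size + lag, 0) for size, n in counts.items())

    return max(range(min_lag, max_lag + 1), key=score)


# Invented toy data: element lengths clustered at 28, 38, 48 and 58 bp.
toy_lengths = [28] * 5 + [38] * 4 + [48] * 3 + [58] * 2
print(dominant_period(toy_lengths))
```

    On real data the histogram would be noisy, so a centred autocorrelation or a Fourier transform of the counts would probably be a more robust choice than this raw score.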

  9. The Paramecium Germline Genome Provides a Niche for Intragenic Parasitic DNA: Evolutionary Dynamics of Internal Eliminated Sequences

    PubMed Central

    Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E.; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

    2012-01-01

    Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. 
We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated. PMID:23071448

  10. Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development.

    PubMed

    Sun, Cheng; Wyngaard, Grace; Walton, D Brian; Wichman, Holly A; Mueller, Rachel Lockridge

    2014-03-11

    Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution: some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15-75 Gb, 12-74 Gb of which are lost from pre-somatic cell lineages at germline-soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. 
Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms.

  11. Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development

    PubMed Central

    2014-01-01

    Background: Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution: some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15–75 Gb, 12–74 Gb of which are lost from pre-somatic cell lineages at germline-soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Results: Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Conclusions: Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. 
Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms. PMID:24618421

  12. Mitochondrial DNA deletion percentage in sun exposed and non sun exposed skin.

    PubMed

    Powers, Julia M; Murphy, Gillian; Ralph, Nikki; O'Gorman, Susan M; Murphy, James E J

    2016-12-01

    The percentages of mitochondrial genomes carrying the mtDNA 3895 and the mtDNA 4977 (common) deletion were quantified in sun exposed and non sun exposed skin biopsies, for five cohorts of patients varying either in sun exposure profile, age or skin cancer status. Non-melanoma skin cancer (NMSC) diagnoses are rising in Ireland and worldwide [12] but most risk prediction is based on subjective visual estimations of sun exposure history. A quantitative objective test for pre-neoplastic markers may result in better adherence to sun protective behaviours. Mitochondrial DNA (mtDNA) is known to be subject to the loss of a significant proportion of specific sections of genetic code due to exposure to ultraviolet light in sunlight. Although one such deletion has been deemed more sensitive, another, called the mtDNA 4977 or common deletion, has proved to be a more useful indicator of possible risk in this study. Quantitative molecular analysis was carried out to determine the percentage of genomes carrying the deletion using non sun exposed and sun exposed skin biopsies in cohorts of patients with high or low sun exposure profiles and two high exposure groups undergoing treatment for NMSC. Results indicate that mtDNA deletions correlate with sun exposure; in groups with high sun exposure habits a significant increase in deletion number in exposed over non sun exposed skin occurred. An increase in deletion percentage was also seen in older cohorts compared to the younger group. The mtDNA 3895 deletion was detected in small amounts in exposed skin of many patients; the mtDNA 4977 common deletion, although present to some extent in non sun exposed skin, is suggested to be the more reliable and easily detected marker. In all cohorts except the younger group with relatively lower sun exposure, the mtDNA 4977 deletion was more frequent in sun exposed skin samples compared to non-sun exposed skin. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Proper reprogramming of imprinted and non-imprinted genes in cloned cattle gametogenesis.

    PubMed

    Kaneda, Masahiro; Watanabe, Shinya; Akagi, Satoshi; Inaba, Yasushi; Geshi, Masaya; Nagai, Takashi

    2017-11-01

    Epigenetic abnormalities in cloned animals are caused by incomplete reprogramming of the donor nucleus during the nuclear transfer step (first reprogramming). However, during the second reprogramming step that occurs only in the germline cells, epigenetic errors not corrected during the first step are repaired. Consequently, epigenetic abnormalities in the somatic cells of cloned animals should be erased in their spermatozoa or oocytes. This is supported by the fact that offspring from cloned animals do not exhibit defects at birth or during postnatal development. To test this hypothesis in cloned cattle, we compared the DNA methylation level of two imprinted genes (H19 and PEG3) and three non-imprinted genes (XIST, OCT4 and NANOG) and two repetitive elements (Satellite I and Satellite II) in blood and sperm DNAs from cloned and non-cloned bulls. We found no differences between cloned and non-cloned bulls. We also analyzed the DNA methylation levels of four repetitive elements (Satellite I, Satellite II, Alpha-satellite and Art2) in oocytes recovered from cloned and non-cloned cows. Again, no significant differences were observed between clones and non-clones. These results suggested that imprinted and non-imprinted genes and repetitive elements were properly reprogramed during gametogenesis in cloned cattle; therefore, they contributed to the soundness of cloned cattle offspring. © 2017 Japanese Society of Animal Science.

  14. Characterization of the Fb-Nof Transposable Element of Drosophila Melanogaster

    PubMed Central

    Harden, N.; Ashburner, M.

    1990-01-01

    FB-NOF is a composite transposable element of Drosophila melanogaster. It is composed of foldback sequences, of variable length, which flank a 4-kb NOF sequence with 308-bp inverted repeat termini. The NOF sequence could potentially code for a 120-kD polypeptide. The FB-NOF element is responsible for unstable mutations of the white gene (w(c) and w(DZL)) and is associated with the large TEs of G. Ising. Although most strains of D. melanogaster have 20-30 sites of FB insertion, FB-NOF elements are usually rare; many strains lack this composite element or have only one copy of it. A few strains, including w(DZL) and Basc, have many (8-21) copies of FB-NOF, and these show a tendency to insert at "hot-spots." These strains also have an increased number of FB elements. The DNA sequence of the NOF region associated with TE146(Z) has been determined. PMID:2174013

  15. APOBEC3B cytidine deaminase targets the non-transcribed strand of tRNA genes in yeast.

    PubMed

    Saini, Natalie; Roberts, Steven A; Sterling, Joan F; Malc, Ewa P; Mieczkowski, Piotr A; Gordenin, Dmitry A

    2017-05-01

    Variations in mutation rates across the genome have been demonstrated both in model organisms and in cancers. This phenomenon is largely driven by the damage specificity of diverse mutagens and the differences in DNA repair efficiency in given genomic contexts. Here, we demonstrate that the single-strand DNA-specific cytidine deaminase APOBEC3B (A3B) damages tRNA genes at a 1000-fold higher efficiency than other non-tRNA genomic regions in budding yeast. We found that A3B-induced lesions in tRNA genes were predominantly located on the non-transcribed strand, while no transcriptional strand bias was observed in protein coding genes. Furthermore, tRNA gene mutations were exacerbated in cells where RNaseH expression was completely abolished (Δrnh1Δrnh35). These data suggest a transcription-dependent mechanism for A3B-induced tRNA gene hypermutation. Interestingly, in strains proficient in DNA repair, only 1% of the abasic sites formed upon excision of A3B-deaminated cytosines were not repaired leading to mutations in tRNA genes, while 18% of these lesions failed to be repaired in the remainder of the genome. A3B-induced mutagenesis in tRNA genes was found to be efficiently suppressed by the redundant activities of both base excision repair (BER) and the error-free DNA damage bypass pathway. On the other hand, deficiencies in BER did not have a profound effect on A3B-induced mutations in CAN1, the reporter for protein coding genes. We hypothesize that differences in the mechanisms underlying ssDNA formation at tRNA genes and other genomic loci are the key determinants of the choice of the repair pathways and consequently the efficiency of DNA damage repair in these regions. Overall, our results indicate that tRNA genes are highly susceptible to ssDNA-specific DNA damaging agents. However, increased DNA repair efficacy in tRNA genes can prevent their hypermutation and maintain both genome and proteome homeostasis. Published by Elsevier B.V.

  16. Reading of the non-template DNA by transcription elongation factors.

    PubMed

    Svetlov, Vladimir; Nudler, Evgeny

    2018-05-14

    Unlike transcription initiation and termination, which have easily discernable signals such as promoters and terminators, elongation is regulated through a dynamic network involving RNA/DNA pause signals and states- rather than sequence-specific protein interactions. A report by Nedialkov et al. (in press) provides experimental evidence for sequence-specific recruitment of elongation factor RfaH to transcribing RNA polymerase (RNAP) and outlines the mechanism of gene expression regulation by restraint ("locking") of the DNA non-template strand. According to this model, the elongation complex pauses at the so called "operon polarity sequence" (found in some long bacterial operons coding for virulence genes), when the usually flexible non-template DNA strand adopts a distinct hairpin-loop conformation on the surface of transcribing RNAP. Sequence-specific binding of RfaH to this DNA segment facilitates conversion of RfaH from its inactive closed to its active open conformation. The interaction network formed between RfaH, non-template DNA, and RNAP locks DNA in a conformation that renders the elongation complex resistant to pausing and termination. The effects of such locking on transcript elongation can be mimicked by restraint of the non-template strand due to its shortening. This work advances our understanding of regulation of transcript elongation and has important implications for the action of general transcription factors, such as NusG, which lack apparent sequence-specificity, as well as for the mechanisms of other processes linked to transcription such as transcription-coupled DNA repair. This article is protected by copyright. All rights reserved. © 2018 John Wiley & Sons Ltd.

  17. DNA methylation dynamics during early plant life.

    PubMed

    Bouyer, Daniel; Kramdi, Amira; Kassam, Mohamed; Heese, Maren; Schnittger, Arp; Roudier, François; Colot, Vincent

    2017-09-25

    Cytosine methylation is crucial for gene regulation and silencing of transposable elements in mammals and plants. While this epigenetic mark is extensively reprogrammed in the germline and early embryos of mammals, the extent to which DNA methylation is reset between generations in plants remains largely unknown. Using Arabidopsis as a model, we uncovered distinct DNA methylation dynamics over transposable element sequences during the early stages of plant development. Specifically, transposable elements and their relics show invariably high methylation at CG sites but increasing methylation at CHG and CHH sites. This non-CG methylation culminates in mature embryos, where it reaches saturation for a large fraction of methylated CHH sites, compared to the typical 10-20% methylation level observed in seedlings or adult plants. Moreover, the increase in CHH methylation during embryogenesis matches the hypomethylated state in the early endosperm. Finally, we show that interfering with the embryo-to-seedling transition results in the persistence of high CHH methylation levels after germination, specifically over sequences that are targeted by the RNA-directed DNA methylation (RdDM) machinery. Our findings indicate the absence of extensive resetting of DNA methylation patterns during early plant life and point instead to an important role of RdDM in reinforcing DNA methylation of transposable element sequences in every cell of the mature embryo. Furthermore, we provide evidence that this elevated RdDM activity is a specific property of embryogenesis.
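
    The three sequence contexts named above (CG, CHG and CHH, where H stands for A, C or T) are determined by the two bases that follow a cytosine. A minimal sketch of that classification on a single strand is given below; the function names and the example sequence are hypothetical, and a real analysis would also scan the reverse strand.

```python
from collections import Counter


def cytosine_context(seq, i):
    """Classify the cytosine at index i of seq as 'CG', 'CHG' or 'CHH'.
    Returns None when the context cannot be resolved (sequence end or
    a base outside A/C/G/T)."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position i is not a cytosine")
    following = seq[i + 1:i + 3]
    if len(following) >= 1 and following[0] == "G":
        return "CG"
    if len(following) < 2 or following[0] not in "ACT":
        return None
    if following[1] == "G":
        return "CHG"
    if following[1] in "ACT":
        return "CHH"
    return None


def count_contexts(seq):
    """Tally the resolvable cytosine contexts along one strand."""
    tally = Counter()
    for i, base in enumerate(seq.upper()):
        if base == "C":
            context = cytosine_context(seq, i)
            if context is not None:
                tally[context] += 1
    return tally


print(count_contexts("CGACAGCTTC"))
```

    The final cytosine in the example has no downstream bases, so it is left unclassified rather than guessed.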

  18. Do plant cell walls have a code?

    PubMed

    Tavares, Eveline Q P; Buckeridge, Marcos S

    2015-12-01

    A code is a set of rules that establish correspondence between two worlds, signs (consisting of encrypted information) and meaning (of the decrypted message). A third element, the adaptor, connects both worlds, assigning meaning to a code. We propose that a Glycomic Code exists in plant cell walls where signs are represented by monosaccharides and phenylpropanoids and meaning is cell wall architecture with its highly complex association of polymers. Cell wall biosynthetic mechanisms, structure, architecture and properties are addressed according to Code Biology perspective, focusing on how they oppose to cell wall deconstruction. Cell wall hydrolysis is mainly focused as a mechanism of decryption of the Glycomic Code. Evidence for encoded information in cell wall polymers fine structure is highlighted and the implications of the existence of the Glycomic Code are discussed. Aspects related to fine structure are responsible for polysaccharide packing and polymer-polymer interactions, affecting the final cell wall architecture. The question whether polymers assembly within a wall display similar properties as other biological macromolecules (i.e. proteins, DNA, histones) is addressed, i.e. do they display a code? Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  19. Evolutionary Dynamics of 5S rDNA and Recurrent Association of Transposable Elements in Electric Fish of the Family Gymnotidae (Gymnotiformes): The Case of Gymnotus mamiraua.

    PubMed

    da Silva, Maelin; Barbosa, Patricia; Artoni, Roberto F; Feldberg, Eliana

    2016-01-01

    Gymnotidae is a family of electric fish endemic to the Neotropics consisting of 2 genera: Electrophorus and Gymnotus. The genus Gymnotus is widely distributed and is found in all of the major Brazilian river systems. Physical and molecular mapping data for the ribosomal DNA (rDNA) in this genus are still scarce, with its chromosomal location known in only 11 species. As other species of Gymnotus with 2n = 54 chromosomes from the Paraná-Paraguay basin, G. mamiraua was found to have a large number of 5S rDNA sites. Isolation and cloning of the 5S rDNA sequences from G. mamiraua identified a fragment of a transposable element similar to the Tc1/mariner transposon associated with a non-transcribed spacer. Double fluorescence in situ hybridization analysis of this element and the 5S rDNA showed that they were colocalized on several chromosomes, in addition to acting as nonsyntenic markers on others. Our data show the association between these sequences and suggest that the Tc1 retrotransposon may be the agent that drives the spread of these 5S rDNA-like sequences in the G. mamiraua genome. © 2016 S. Karger AG, Basel.

  20. The ATRX cDNA is prone to bacterial IS10 element insertions that alter its structure.

    PubMed

    Valle-García, David; Griffiths, Lyra M; Dyer, Michael A; Bernstein, Emily; Recillas-Targa, Félix

    2014-01-01

    The SWI/SNF-like chromatin-remodeling protein ATRX has emerged as a key factor in the regulation of α-globin gene expression, incorporation of histone variants into the chromatin template and, more recently, as a frequently mutated gene across a wide spectrum of cancers. Therefore, the availability of a functional ATRX cDNA for expression studies is a valuable tool for the scientific community. We have identified two independent transposon insertions of a bacterial IS10 element into exon 8 of ATRX isoform 2 coding sequence in two different plasmids derived from a single source. We demonstrate that these insertion events are common and there is an insertion hotspot within the ATRX cDNA. Such IS10 insertions produce a truncated form of ATRX, which significantly compromises its nuclear localization. In turn, we describe ways to prevent IS10 insertion during propagation and cloning of ATRX-containing vectors, including optimal growth conditions, bacterial strains, and suggested sequencing strategies. Finally, we have generated an insertion-free plasmid that is available to the community for expression studies of ATRX.

  1. A purified transcription factor (TIF-IB) binds to essential sequences of the mouse rDNA promoter.

    PubMed Central

    Clos, J; Buttgereit, D; Grummt, I

    1986-01-01

    A transcription factor that is specific for mouse rDNA has been partially purified from Ehrlich ascites cells. This factor [designated transcription initiation factor (TIF)-IB] is required for accurate in vitro synthesis of mouse rRNA in addition to RNA polymerase I and another regulatory factor, TIF-IA. TIF-IB activity is present in extracts both from growing and nongrowing cells in comparable amounts. Prebinding competition experiments with wild-type and mutant templates suggest that TIF-IB interacts with the core control element of the rDNA promoter, which is located immediately upstream of the initiation site. The specific binding of TIF-IB to the RNA polymerase I promoter is demonstrated by exonuclease III protection experiments. The 3' border of the sequences protected by TIF-IB is shown to be on the coding strand at position -21 and on the noncoding strand at position -7. The results suggest that direct binding of TIF-IB to sequences in the core promoter element is the mechanism by which this factor imparts promoter selectivity to RNA polymerase I. PMID:3456157

  2. A global transcriptional analysis of Plasmodium falciparum malaria reveals a novel family of telomere-associated lncRNAs

    PubMed Central

    2011-01-01

    Background: Mounting evidence suggests a major role for epigenetic feedback in Plasmodium falciparum transcriptional regulation. Long non-coding RNAs (lncRNAs) have recently emerged as a new paradigm in epigenetic remodeling. We therefore set out to investigate putative roles for lncRNAs in P. falciparum transcriptional regulation. Results: We used a high-resolution DNA tiling microarray to survey transcriptional activity across 22.6% of the P. falciparum strain 3D7 genome. We identified 872 protein-coding genes and 60 putative P. falciparum lncRNAs under developmental regulation during the parasite's pathogenic human blood stage. Further characterization of lncRNA candidates led to the discovery of an intriguing family of lncRNA telomere-associated repetitive element transcripts, termed lncRNA-TARE. We have quantified lncRNA-TARE expression at 15 distinct chromosome ends and mapped putative transcriptional start and termination sites of lncRNA-TARE loci. Remarkably, we observed coordinated and stage-specific expression of lncRNA-TARE on all chromosome ends tested, and two dominant transcripts of approximately 1.5 kb and 3.1 kb transcribed towards the telomere. Conclusions: We have characterized a family of 22 telomere-associated lncRNAs in P. falciparum. Homologous lncRNA-TARE loci are coordinately expressed after parasite DNA replication, and are poised to play an important role in P. falciparum telomere maintenance, virulence gene regulation, and potentially other processes of parasite chromosome end biology. Further study of lncRNA-TARE and other promising lncRNA candidates may provide mechanistic insight into P. falciparum transcriptional regulation. PMID:21689454

  3. Generating and repairing genetically programmed DNA breaks during immunoglobulin class switch recombination

    PubMed Central

    Nicolas, Laura; Cols, Montserrat; Choi, Jee Eun; Chaudhuri, Jayanta; Vuong, Bao

    2018-01-01

    Adaptive immune responses require the generation of a diverse repertoire of immunoglobulins (Igs) that can recognize and neutralize a seemingly infinite number of antigens. V(D)J recombination creates the primary Ig repertoire, which subsequently is modified by somatic hypermutation (SHM) and class switch recombination (CSR). SHM promotes Ig affinity maturation whereas CSR alters the effector function of the Ig. Both SHM and CSR require activation-induced cytidine deaminase (AID) to produce dU:dG mismatches in the Ig locus that are transformed into untemplated mutations in variable coding segments during SHM or DNA double-strand breaks (DSBs) in switch regions during CSR. Within the Ig locus, DNA repair pathways are diverted from their canonical role in maintaining genomic integrity to permit AID-directed mutation and deletion of gene coding segments. Recently identified proteins, genes, and regulatory networks have provided new insights into the temporally and spatially coordinated molecular interactions that control the formation and repair of DSBs within the Ig locus. Unravelling the genetic program that allows B cells to selectively alter the Ig coding regions while protecting non-Ig genes from DNA damage advances our understanding of the molecular processes that maintain genomic integrity as well as humoral immunity. PMID:29744038

  4. New insights into mitogenomic phylogeny and copy number in eight indigenous sheep populations based on the ATP synthase and cytochrome c oxidase genes.

    PubMed

    Xiao, P; Niu, L L; Zhao, Q J; Chen, X Y; Wang, L J; Li, L; Zhang, H P; Guo, J Z; Xu, H Y; Zhong, T

    2017-11-16

    The origins and phylogeny of different sheep breeds have been widely studied using polymorphisms within the mitochondrial hypervariable region. However, little is known about the mitochondrial DNA (mtDNA) content and phylogeny based on mtDNA protein-coding genes. In this study, we assessed the phylogeny and copy number of the mtDNA in eight indigenous (population size, n=184) and three introduced (n=66) sheep breeds in China based on five mitochondrial coding genes (COX1, COX2, ATP8, ATP6 and COX3). The mean haplotype and nucleotide diversities were 0.944 and 0.00322, respectively. We identified a correlation between the lineage distribution and the genetic distance, whereby Valley-type Tibetan sheep had a closer genetic relationship with introduced breeds (Dorper, Poll Dorset and Suffolk) than with other indigenous breeds. Similarly, the median-joining profile of haplotypes revealed the distribution of clusters according to genetic differences. Moreover, copy number analysis based on the five mitochondrial coding genes was affected by the genetic distance combined with genetic phylogeny; we also identified obvious non-synonymous mutations in ATP6 between the different levels of copy number expression. These results imply that differences in mitogenomic compositions resulting from geographical separation lead to differences in mitochondrial function.

  5. BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.

    PubMed

    Yang, Bite; Liu, Feng; Ren, Chao; Ouyang, Zhangyi; Xie, Ziwei; Bo, Xiaochen; Shu, Wenjie

    2017-07-01

    Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences continues to be a challenge in the annotation of mammalian genomes. One of the major issues is the lack of large, sufficiently comprehensive and experimentally validated enhancers for humans or other species. Thus, the development of computational methods based on limited experimentally validated enhancers and deciphering the transcriptional regulatory code encoded in the enhancer sequences is urgent. We present a deep-learning-based hybrid architecture, BiRen, which predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences. The BiRen method can be freely accessed at https://github.com/wenjiegroup/BiRen. Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.

  6. Effective gene prediction by high resolution frequency estimator based on least-norm solution technique

    PubMed Central

    2014-01-01

    The linear algebraic concept of a subspace plays a significant role in recent techniques of spectrum estimation. In this article, the authors have utilized the noise subspace concept for finding hidden periodicities in DNA sequences. With the vast growth of genomic sequences, the demand to accurately identify the protein-coding regions in DNA is rising. Several DNA feature extraction techniques that draw on various fields have emerged in the recent past, among which the application of digital signal processing tools is of prime importance. It is known that coding segments have a 3-base periodicity, while non-coding regions do not have this unique feature. One of the most important spectrum analysis techniques based on the concept of subspace is the least-norm method. The least-norm estimator developed in this paper shows sharp period-3 peaks in coding regions, completely eliminating background noise. The proposed method has been compared with the existing sliding discrete Fourier transform (SDFT) method, popularly known as the modified periodogram method, on several genes from various organisms, and the results show that the proposed method is a better and more effective approach to gene prediction. Resolution, quality factor, sensitivity, specificity, miss rate, and wrong rate are used to establish the superiority of the least-norm gene prediction method over the existing method. PMID:24386895
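
    As a rough illustration of the period-3 property described above, the sketch below computes the plain DFT power at frequency N/3 from the four base-indicator sequences. This is the conventional Fourier measure, not the least-norm estimator developed in the paper; the function name `period3_power` and the normalisation by sequence length are assumptions made for this example.

```python
import cmath


def period3_power(seq: str) -> float:
    """DFT power at frequency N/3, summed over the four base-indicator
    sequences and divided by the sequence length (an illustrative
    normalisation, not taken from the paper)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    # At k = N/3 the DFT kernel exp(-2*pi*i*k*pos/N) reduces to
    # exp(-2*pi*i*pos/3), which depends only on pos mod 3.
    w = cmath.exp(-2j * cmath.pi / 3)
    total = 0.0
    for base in "ACGT":
        # Sum the kernel over every position occupied by this base.
        u = sum(w ** (pos % 3) for pos, ch in enumerate(seq) if ch == base)
        total += abs(u) ** 2
    return total / n


if __name__ == "__main__":
    codon_like = "ATG" * 50   # perfectly 3-periodic toy sequence
    flat = "A" * 150          # no position-in-codon preference at all
    print(f"{period3_power(codon_like):.2f}")  # 50.00
    print(f"{period3_power(flat):.2f}")        # 0.00
```

    In practice a measure like this is usually evaluated in a window that slides along the sequence, so that regions with a pronounced period-3 component can be located; the window length would be a further assumption.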

  7. Identification of the SRC pyrimidine-binding protein (SPy) as hnRNP K: implications in the regulation of SRC1A transcription

    PubMed Central

    Ritchie, Shawn A.; Pasha, Mohammed K.; Batten, Danielle J. P.; Sharma, Rajendra K.; Olson, Douglas J. H.; Ross, Andrew R. S.; Bonham, Keith

    2003-01-01

    The human SRC gene encodes pp60c–src, a non-receptor tyrosine kinase involved in numerous signaling pathways. Activation or overexpression of c-Src has also been linked to a number of important human cancers. Transcription of the SRC gene is complex and regulated by two closely linked but highly dissimilar promoters, each associated with its own distinct non-coding exon. In many tissues SRC expression is regulated by the housekeeping-like SRC1A promoter. In addition to other regulatory elements, three substantial polypurine:polypyrimidine (TC) tracts within this promoter are required for full transcriptional activity. Previously, we described an unusual factor called SRC pyrimidine-binding protein (SPy) that could bind to two of these TC tracts in their double-stranded form, but was also capable of interacting with higher affinity to all three pyrimidine tracts in their single-stranded form. Mutations in the TC tracts, which abolished the ability of SPy to interact with its double-stranded DNA target, significantly reduced SRC1A promoter activity, especially in concert with mutations in critical Sp1 binding sites. Here we expand upon our characterization of this interesting factor and describe the purification of SPy from human SW620 colon cancer cells using a DNA affinity-based approach. Subsequent in-gel tryptic digestion of purified SPy followed by MALDI-TOF mass spectrometric analysis identified SPy as heterogeneous nuclear ribonucleoprotein K (hnRNP K), a known nucleic-acid binding protein implicated in various aspects of gene expression including transcription. These data provide new insights into the double- and single-stranded DNA-binding specificity, as well as functional properties of hnRNP K, and suggest that hnRNP K is a critical component of SRC1A transcriptional processes. PMID:12595559

  8. Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands.

    PubMed

    Vercoe, Reuben B; Chang, James T; Dy, Ron L; Taylor, Corinda; Gristwood, Tamzin; Clulow, James S; Richter, Corinna; Przybilski, Rita; Pitman, Andrew R; Fineran, Peter C

    2013-04-01

    In prokaryotes, clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated (Cas) proteins constitute a defence system against bacteriophages and plasmids. CRISPR/Cas systems acquire short spacer sequences from foreign genetic elements and incorporate these into their CRISPR arrays, generating a memory of past invaders. Defence is provided by short non-coding RNAs that guide Cas proteins to cleave complementary nucleic acids. While most spacers are acquired from phages and plasmids, there are examples of spacers that match genes elsewhere in the host bacterial chromosome. In Pectobacterium atrosepticum the type I-F CRISPR/Cas system has acquired a self-complementary spacer that perfectly matches a protospacer target in a horizontally acquired island (HAI2) involved in plant pathogenicity. Given the paucity of experimental data about CRISPR/Cas-mediated chromosomal targeting, we examined this process by developing a tightly controlled system. Chromosomal targeting was highly toxic via targeting of DNA and resulted in growth inhibition and cellular filamentation. The toxic phenotype was avoided by mutations in the cas operon, the CRISPR repeats, the protospacer target, and protospacer-adjacent motif (PAM) beside the target. Indeed, the natural self-targeting spacer was non-toxic due to a single nucleotide mutation adjacent to the target in the PAM sequence. Furthermore, we show that chromosomal targeting can result in large-scale genomic alterations, including the remodelling or deletion of entire pre-existing pathogenicity islands. These features can be engineered for the targeted deletion of large regions of bacterial chromosomes. In conclusion, in DNA-targeting CRISPR/Cas systems, chromosomal interference is deleterious by causing DNA damage and providing a strong selective pressure for genome alterations, which may have consequences for bacterial evolution and pathogenicity.

  9. Epigenetic deregulation in chronic lymphocytic leukemia: Clinical and biological impact.

    PubMed

    Mansouri, Larry; Wierzbinska, Justyna Anna; Plass, Christoph; Rosenquist, Richard

    2018-02-07

    Deregulated transcriptional control caused by aberrant DNA methylation and/or histone modifications is a hallmark of cancer cells. In chronic lymphocytic leukemia (CLL), the most common adult leukemia, the epigenetic 'landscape' has added a new layer of complexity to our understanding of this clinically and biologically heterogeneous disease. Early studies identified aberrant DNA methylation, often based on single gene promoter analysis with both biological and clinical impact. Subsequent genome-wide profiling studies revealed differential DNA methylation between CLLs and controls and in prognostics subgroups of the disease. From these studies, it became apparent that DNA methylation in regions outside of promoters, such as enhancers, is important for the regulation of coding genes as well as for the regulation of non-coding RNAs. Although DNA methylation profiles are reportedly stable over time and in relation to therapy, a higher epigenetic heterogeneity or 'burden' is seen in more aggressive CLL subgroups, albeit as non-recurrent 'passenger' events. More recently, DNA methylation profiles in CLL analyzed in relation to differentiating normal B-cell populations revealed that the majority of the CLL epigenome reflects the epigenomes present in the cell of origin and that only a small fraction of the epigenetic alterations represents truly CLL-specific changes. Furthermore, CLL patients can be grouped into at least three clinically relevant epigenetic subgroups, potentially originating from different cells at various stages of differentiation and associated with distinct outcomes. In this review, we summarize the current understanding of the DNA methylome in CLL, the role of histone modifying enzymes, highlight insights derived from animal models and attempts made to target epigenetic regulators in CLL along with the future directions of this rapidly advancing field. Copyright © 2018 Elsevier Ltd. All rights reserved.

  10. Cross-species amplification of mitochondrial DNA sequence-tagged-site markers in conifers: the nature of polymorphism and variation within and among species in Picea.

    PubMed

    Jaramillo-Correa, J P; Bousquet, J; Beaulieu, J; Isabel, N; Perron, M; Bouillé, M

    2003-05-01

    Primers previously developed to amplify specific non-coding regions of the mitochondrial genome in Angiosperms, and new primers for additional non-coding mtDNA regions, were tested for their ability to direct DNA amplification in 12 conifer taxa and to detect sequence-tagged-site (STS) polymorphisms within and among eight species in Picea. Out of 12 primer pairs, nine were successful at amplifying mtDNA in most of the taxa surveyed. In conifers, indels and substitutions were observed for several loci, allowing them to distinguish between families, genera and, in some cases, between species within genera. In Picea, interspecific polymorphism was detected for four loci, while intraspecific variation was observed for three of the mtDNA regions studied. One of these (SSU rRNA V1 region) exhibited indel polymorphisms, and the two others (nad1 intron b/c and nad5 intron 1) revealed restriction differences after digestion with Sau3AI (PCR-RFLP). A fourth locus, the nad4L-orf25 intergenic region, showed a multibanding pattern for most of the spruce species, suggesting a possible gene duplication. Maternal inheritance, expected for mtDNA in conifers, was observed for all polymorphic markers except the intergenic region nad4L-orf25. Pooling of the variation observed with the remaining three markers resulted in two to six different mtDNA haplotypes within the different species of Picea. Evidence for intra-genomic recombination was observed in at least two taxa. Thus, these mitotypes are likely to be more informative than single-locus haplotypes. They should be particularly useful for the study of biogeography and the dynamics of hybrid zones.

  11. Ovine mitochondrial DNA sequence variation and its association with production and reproduction traits within an Afec-Assaf flock.

    PubMed

    Reicher, S; Seroussi, E; Weller, J I; Rosov, A; Gootwine, E

    2012-07-01

    Polymorphisms in mitochondrial DNA (mtDNA) protein- and tRNA-coding genes were shown to be associated with various diseases in humans as well as with production and reproduction traits in livestock. Alignment of full length mitochondria sequences from the 5 known ovine haplogroups: HA (n = 3), HB (n = 5), HC (n = 3), HD (n = 2), and HE (n = 2; GenBank accession nos. HE577847-50 and 11 published complete ovine mitochondria sequences) revealed sequence variation in 10 out of the 13 protein coding mtDNA sequences. Twenty-six of the 245 variable sites found in the protein coding sequences represent non-synonymous mutations. Sequence variation was observed also in 8 out of the 22 tRNA mtDNA sequences. On the basis of the mtDNA control region and cytochrome b partial sequences along with information on maternal lineages within an Afec-Assaf flock, 1,126 Afec-Assaf ewes were assigned to mitochondrial haplogroups HA, HB, and HC, with frequencies of 0.43, 0.43, and 0.14, respectively. Analysis of birth weight and growth rate records of lamb (n = 1286) and productivity from 4,993 lambing records revealed no association between mitochondrial haplogroup affiliation and female longevity, lambs perinatal survival rate, birth weight, and daily growth rate of lambs up to 150 d that averaged 1,664 d, 88.3%, 4.5 kg, and 320 g/d, respectively. However, significant (P < 0.0001) differences among the haplogroups were found for prolificacy of ewes, with prolificacies (mean ± SE) of 2.14 ± 0.04, 2.25 ± 0.04, and 2.30 ± 0.06 lamb born/ewe lambing for the HA, HB, and the HC haplogroups, respectively. Our results highlight the ovine mitogenome genetic variation in protein- and tRNA coding genes and suggest that sequence variation in ovine mtDNA is associated with variation in ewe prolificacy.

  12. Genetic evidence for conserved non-coding element function across species–the ears have it

    PubMed Central

    Turner, Eric E.; Cox, Timothy C.

    2014-01-01

    Comparison of genomic sequences from diverse vertebrate species has revealed numerous highly conserved regions that do not appear to encode proteins or functional RNAs. Often these “conserved non-coding elements,” or CNEs, can direct gene expression to specific tissues in transgenic models, demonstrating they have regulatory function. CNEs are frequently found near “developmental” genes, particularly transcription factors, implying that these elements have essential regulatory roles in development. However, actual examples demonstrating CNE regulatory functions across species have been few, and recent loss-of-function studies of several CNEs in mice have shown relatively minor effects. In this Perspectives article, we discuss new findings in “fancy” rats and Highland cattle demonstrating that function of a CNE near the Hmx1 gene is crucial for normal external ear development and when disrupted can mimic loss-of-function Hmx1 coding mutations in mice and humans. These findings provide important support for conserved developmental roles of CNEs in divergent species, and reinforce the concept that CNEs should be examined systematically in the ongoing search for genetic causes of human developmental disorders in the era of genome-scale sequencing. PMID:24478720

  13. Characterization of mitochondrial genome of sea cucumber Stichopus horrens: a novel gene arrangement in Holothuroidea.

    PubMed

    Fan, SiGang; Hu, ChaoQun; Wen, Jing; Zhang, LvPing

    2011-05-01

    The complete mitochondrial DNA sequence contains useful information for phylogenetic analyses of metazoa. In this study, the complete mitochondrial DNA sequence of sea cucumber Stichopus horrens (Holothuroidea: Stichopodidae: Stichopus) is presented. The complete sequence was determined using normal and long PCRs. The mitochondrial genome of Stichopus horrens is a circular molecule 16,257 bp long, composed of 13 protein-coding genes, two ribosomal RNA genes and 22 transfer RNA genes. Most of these genes are coded on the heavy strand except for one protein-coding gene (nad6) and five tRNA genes (tRNA(Ser(UCN)), tRNA(Gln), tRNA(Ala), tRNA(Val), tRNA(Asp)) which are coded on the light strand. The composition of the heavy strand is 30.8% A, 23.7% C, 16.2% G, and 29.3% T bases (AT skew=0.025; GC skew=-0.188). A non-coding region of 675 bp was identified as a putative control region because of its location and AT richness. The intergenic spacers range from 1 to 50 bp in size, totaling 227 bp. A total of 25 overlapping nucleotides, ranging from 1 to 10 bp in size, exist among 11 genes. All 13 protein-coding genes are initiated with an ATG. The TAA codon is used as the stop codon in all the protein coding genes except nad3 and nad4 that use TAG as their termination codon. The most frequently used amino acids are Leu (16.29%), Ser (10.34%) and Phe (8.37%). All of the tRNA genes have the potential to fold into typical cloverleaf secondary structures. We also compared the order of the genes in the mitochondrial DNA from the five holothurians that are now available and found a novel gene arrangement in the mitochondrial DNA of Stichopus horrens.
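
    The skew values quoted in this abstract can be checked from the reported base composition. The sketch below assumes the conventional definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), which the abstract does not spell out; the helper name `skew` is chosen for this example only.

```python
def skew(x: float, y: float) -> float:
    """Strand-asymmetry index (x - y) / (x + y) for two base frequencies."""
    return (x - y) / (x + y)


# Heavy-strand composition (percent) as reported in the abstract.
composition = {"A": 30.8, "C": 23.7, "G": 16.2, "T": 29.3}

at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

if __name__ == "__main__":
    print(f"AT skew = {at_skew:.3f}")  # AT skew = 0.025
    print(f"GC skew = {gc_skew:.3f}")  # GC skew = -0.188
```

    Both values agree with the figures in the abstract, which suggests these are the definitions the authors used.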

  14. Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome

    PubMed Central

    Kuznetsova, Inna S.; Thevasagayam, Natascha M.; Sridatta, Prakki S. R.; Komissarov, Aleksey S.; Saju, Jolly M.; Ngoh, Si Y.; Jiang, Junhui; Shen, Xueyan; Orbán, László

    2014-01-01

    As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8–14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and have chromosome-specific arrangements. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread in both the genome and transcriptome, and accumulated in the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555

  15. Repeated administration of CGP 46381, a gamma-aminobutyric acidB antagonist, and ethosuximide suppresses seizure-associated cyclic adenosine 3'5' monophosphate response element- and activator protein-1 DNA-binding activities in lethargic (lh/lh) mice.

    PubMed

    Ishige, K; Endo, H; Saito, H; Ito, Y

    2001-01-19

    To characterize seizure-associated increases in cerebral cortical and thalamic cyclic AMP responsive element (CRE)- and activator protein 1 (AP-1) DNA-binding activities in lethargic (lh/lh) mice, a genetic model of absence seizures, we examined the effects of ethosuximide and CGP 46381 on these DNA-binding activities. Repeated administration (twice a day for 5 days) of ethosuximide (200 mg/kg) or CGP 46381 (60 mg/kg) attenuated both seizure behavior and the increased DNA-binding activities, and was more effective than a single administration of these drugs. These treatments did not affect either normal behavior or basal DNA-binding activities in non-epileptic control (+/+) mice. Gel supershift assays revealed that the increased CRE-binding activity was attributable to activation of the binding activity of CREB, and that the c-Fos-c-Jun complex was a component of the increased AP-1 DNA-binding activity.

  16. SHARAKU: an algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing.

    PubMed

    Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi

    2016-06-15

    Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns can be detected called read mapping profiles, which are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5'-end processing and 3'-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA004502. 
    Contact: yasu@bio.keio.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  17. Analysis of 16S-23S rRNA intergenic spacer regions of Vibrio cholerae and Vibrio mimicus.

    PubMed

    Chun, J; Huq, A; Colwell, R R

    1999-05-01

    Vibrio cholerae identification based on molecular sequence data has been hampered by a lack of sequence variation from the closely related Vibrio mimicus. The two species share many genes coding for proteins, such as ctxAB, and show almost identical 16S DNA coding for rRNA (rDNA) sequences. Primers targeting conserved sequences flanking the 3' end of the 16S and the 5' end of the 23S rDNAs were used to amplify the 16S-23S rRNA intergenic spacer regions of V. cholerae and V. mimicus. Two major (ca. 580 and 500 bp) and one minor (ca. 750 bp) amplicons were consistently generated for both species, and their sequences were determined. The largest fragment contains three tRNA genes (tDNAs) coding for tRNAGlu, tRNALys, and tRNAVal, which has not previously been found in bacteria examined to date. The 580-bp amplicon contained tDNAIle and tDNAAla, whereas the 500-bp fragment had a single tDNA coding for either tRNAGlu or tRNAAla. Little variation, i.e., 0 to 0.4%, was found among V. cholerae O1 classical, O1 El Tor, and O139 epidemic strains. Slightly more variation was found against the non-O1/non-O139 serotypes (ca. 1% difference) and V. mimicus (2 to 3% difference). A pair of oligonucleotide primers was designed, based on the region differentiating all V. cholerae strains from V. mimicus. The PCR system developed was subsequently evaluated by using representatives of V. cholerae from environmental and clinical sources, and of other taxa, including V. mimicus. This study provides the first molecular tool for identifying the species V. cholerae.

  18. Decoding the genome beyond sequencing: the new phase of genomic research.

    PubMed

    Heng, Henry H Q; Liu, Guo; Stevens, Joshua B; Bremer, Steven W; Ye, Karen J; Abdallah, Batoul Y; Horne, Steven D; Ye, Christine J

    2011-10-01

    While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems; genes code parts like protein and RNA, but the genome codes the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory, which calls for a departure from gene-centric genomic research. To make this transition for the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing. Copyright © 2011 Elsevier Inc. All rights reserved.

  19. Engineering bacteria to solve the Burnt Pancake Problem

    PubMed Central

    Haynes, Karmella A; Broderick, Marian L; Brown, Adam D; Butner, Trevor L; Dickson, James O; Harden, W Lance; Heard, Lane H; Jessen, Eric L; Malloy, Kelly J; Ogden, Brad J; Rosemond, Sabriya; Simpson, Samantha; Zwack, Erin; Campbell, A Malcolm; Eckdahl, Todd T; Heyer, Laurie J; Poet, Jeffrey L

    2008-01-01

    Background: We investigated the possibility of executing DNA-based computation in living cells by engineering Escherichia coli to address a classic mathematical puzzle called the Burnt Pancake Problem (BPP). The BPP is solved by sorting a stack of distinct objects (pancakes) into proper order and orientation using the minimum number of manipulations. Each manipulation reverses the order and orientation of one or more adjacent objects in the stack. We have designed a system that uses site-specific DNA recombination to mediate inversions of genetic elements that represent pancakes within plasmid DNA. Results: Inversions (or "flips") of the DNA fragment pancakes are driven by the Salmonella typhimurium Hin/hix DNA recombinase system that we reconstituted as a collection of modular genetic elements for use in E. coli. Our system sorts DNA segments by inversions to produce different permutations of a promoter and a tetracycline resistance coding region; E. coli cells become antibiotic resistant when the segments are properly sorted. Hin recombinase can mediate all possible inversion operations on adjacent flippable DNA fragments. Mathematical modeling predicts that the system reaches equilibrium after very few flips, where equal numbers of permutations are randomly sorted and unsorted. Semiquantitative PCR analysis of in vivo flipping suggests that inversion products accumulate on a time scale of hours or days rather than minutes. Conclusion: The Hin/hix system is a proof-of-concept demonstration of in vivo computation with the potential to be scaled up to accommodate larger and more challenging problems. Hin/hix may provide a flexible new tool for manipulating transgenic DNA in vivo. PMID:18492232
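
    The sorting task described in this abstract can be sketched in software as a breadth-first search over signed permutations. The encoding is an assumption made for this example: segment k in the correct orientation is the integer k, and the same segment inverted is -k. Following the abstract, any run of adjacent segments may be inverted; the classic puzzle restricts flips to the top of the stack. The names `flip` and `min_flips` are illustrative and do not come from the paper.

```python
from collections import deque
from typing import Optional, Sequence, Tuple

Stack = Tuple[int, ...]


def flip(stack: Stack, i: int, j: int) -> Stack:
    """Invert stack[i:j]: reverse the order of the run and the sign
    (orientation) of every segment in it."""
    return stack[:i] + tuple(-x for x in reversed(stack[i:j])) + stack[j:]


def min_flips(start: Sequence[int]) -> Optional[int]:
    """Minimum number of inversions that sort `start` into (1, 2, ..., n)
    with every segment in the positive orientation."""
    first = tuple(start)
    goal = tuple(range(1, len(first) + 1))
    depth = {first: 0}            # state -> number of flips used to reach it
    queue = deque([first])
    while queue:
        state = queue.popleft()
        if state == goal:
            return depth[state]
        for i in range(len(state)):
            for j in range(i + 1, len(state) + 1):
                nxt = flip(state, i, j)
                if nxt not in depth:
                    depth[nxt] = depth[state] + 1
                    queue.append(nxt)
    return None                   # not reached for valid signed permutations


if __name__ == "__main__":
    print(min_flips((-2, -1)))  # 1: inverting the whole stack sorts it
    print(min_flips((2, 1)))    # 3
```

    The search visits up to 2^n * n! states, so this exhaustive approach is only practical for small stacks such as the two-segment promoter and coding-region arrangement mentioned in the abstract.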

  20. VLF Trimpi modelling on the path NWC-Dunedin using both finite element and 3D Born modelling

    NASA Astrophysics Data System (ADS)

    Nunn, D.; Hayakawa, K. B. M.

    1998-10-01

    This paper investigates the numerical modelling of VLF Trimpis, produced by a D region inhomogeneity on the great circle path. Two different codes are used to model Trimpis on the path NWC-Dunedin. The first is a 2D Finite Element Method Code (FEM), whose solutions are rigorous and valid in the strong scattering or non-Born limit. The second code is a 3D model that invokes the Born approximation. The predicted Trimpis from these codes compare very closely, thus confirming the validity of both models. The modal scattering matrices for both codes are analysed in some detail and are found to have a comparable structure. They indicate strong scattering between the dominant TM modes. Analysis of the scattering matrix from the FEM code shows that departure from linear Born behaviour occurs when the inhomogeneity has a horizontal scale size of about 100 km and a maximum electron density enhancement at 75 km altitude of about 6 electrons.

  1. Demonstration of retrotransposition of the Tf1 element in fission yeast.

    PubMed

    Levin, H L; Boeke, J D

    1992-03-01

    Tf1, a retrotransposon from fission yeast, has LTRs and coding sequences resembling the protease, reverse transcriptase and integrase domains of retroviral pol genes. A unique aspect of Tf1 is that it contains a single open reading frame whereas other retroviruses and retrotransposons usually possess two or more open reading frames. To determine whether Tf1 can transpose, we overproduced Tf1 transcripts encoded by a plasmid copy of the element marked with a neo gene. Approximately 0.1-4.0% of the cell population acquired chromosomally inherited resistance to G418. DNA blot analysis demonstrated that such strains had acquired both Tf1 and neo specific sequences within a restriction fragment of the same size; the size of this restriction fragment varied between different isolates. Structural analysis of the cloned DNA flanking the Tf1-neo element of two transposition candidates with the same regions in the parent strain showed that the ability to grow on G418 was due to transposition of Tf1-neo and not other types of recombination events.

  2. Endonuclease-independent LINE-1 retrotransposition at mammalian telomeres.

    PubMed

    Morrish, Tammy A; Garcia-Perez, José Luis; Stamato, Thomas D; Taccioli, Guillermo E; Sekiguchi, JoAnn; Moran, John V

    2007-03-08

    Long interspersed element-1 (LINE-1 or L1) elements are abundant, non-long-terminal-repeat (non-LTR) retrotransposons that comprise approximately 17% of human DNA. The average human genome contains approximately 80-100 retrotransposition-competent L1s (ref. 2), and they mobilize by a process that uses both the L1 endonuclease and reverse transcriptase, termed target-site primed reverse transcription. We have previously reported an efficient, endonuclease-independent L1 retrotransposition pathway (EN(i)) in certain Chinese hamster ovary (CHO) cell lines that are defective in the non-homologous end-joining (NHEJ) pathway of DNA double-strand-break repair. Here we have characterized EN(i) retrotransposition events generated in V3 CHO cells, which are deficient in DNA-dependent protein kinase catalytic subunit (DNA-PKcs) activity and have both dysfunctional telomeres and an NHEJ defect. Notably, approximately 30% of EN(i) retrotransposition events insert in an orientation-specific manner adjacent to a perfect telomere repeat (5'-TTAGGG-3'). Similar insertions were not detected among EN(i) retrotransposition events generated in controls or in XR-1 CHO cells deficient for XRCC4, an NHEJ factor that is required for DNA ligation but has no known function in telomere maintenance. Furthermore, transient expression of a dominant-negative allele of human TRF2 (also called TERF2) in XRCC4-deficient XR-1 cells, which disrupts telomere capping, enables telomere-associated EN(i) retrotransposition events. These data indicate that L1s containing a disabled endonuclease can use dysfunctional telomeres as an integration substrate. The findings highlight similarities between the mechanism of EN(i) retrotransposition and the action of telomerase, because both processes can use a 3' OH for priming reverse transcription at either internal DNA lesions or chromosome ends. 
    Thus, we propose that EN(i) retrotransposition is an ancestral mechanism of RNA-mediated DNA repair associated with non-LTR retrotransposons that may have been used before the acquisition of an endonuclease domain.

  3. Oncogenomic disruptions in arsenic-induced carcinogenesis

    PubMed Central

    Ng, Kevin W.; Stewart, Greg L.; Dummer, Trevor J.B.; Lam, Wan L.; Martinez, Victor D

    2017-01-01

    Chronic exposure to arsenic affects more than 200 million people worldwide, and has been associated with many adverse health effects, including cancer in several organs. There is accumulating evidence that arsenic biotransformation, a step in the elimination of arsenic from the human body, can induce changes at a genetic and epigenetic level, leading to carcinogenesis. At the genetic level, arsenic interferes with key cellular processes such as DNA damage-repair and chromosomal structure, leading to genomic instability. At the epigenetic level, arsenic places a high demand on the cellular methyl pool, leading to global hypomethylation and hypermethylation of specific gene promoters. These arsenic-associated DNA alterations result in the deregulation of both oncogenic and tumour-suppressive genes. Furthermore, recent reports have implicated aberrant expression of non-coding RNAs and the consequential disruption of signaling pathways in the context of arsenic-induced carcinogenesis. This article provides an overview of the oncogenomic anomalies associated with arsenic exposure and conveys the importance of non-coding RNAs in the arsenic-induced carcinogenic process. PMID:28179585

  4. Ancient DNA sequence revealed by error-correcting codes.

    PubMed

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  5. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  6. DNA sequence requirements for the accurate transcription of a protein-coding plastid gene in a plastid in vitro system from mustard (Sinapis alba L.)

    PubMed Central

    Link, Gerhard

    1984-01-01

    A nuclease-treated plastid extract from mustard (Sinapis alba L.) allows efficient transcription of cloned plastid DNA templates. In this in vitro system, the major runoff transcript of the truncated gene for the 32 000 mol. wt. photosystem II protein was accurately initiated from a site close to or identical with the in vivo start site. By using plasmids with deletions in the 5'-flanking region of this gene as templates, a DNA region required for efficient and selective initiation was detected ~28-35 nucleotides upstream of the transcription start site. This region contains the sequence element TTGACA, which matches the consensus sequence for prokaryotic '−35' promoter elements. In the absence of this region, a region ~13-27 nucleotides upstream of the start site still enables a basic level of specific transcription. This second region contains the sequence element TATATAA, which matches the consensus sequence for the 'TATA' box of genes transcribed by RNA polymerase II (or B). The region between the 'TATA'-like element and the transcription start site is not sufficient but may be required for specific transcription of the plastid gene. This latter region contains the sequence element TATACT, which resembles the prokaryotic '−10' (Pribnow) box. Based on the structural and transcriptional features of the 5' upstream region, a 'promoter switch' mechanism is proposed, which may account for the developmentally regulated expression of this plastid gene. PMID:16453540

  7. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  8. The expanding regulatory universe of p53 in gastrointestinal cancer.

    PubMed

    Fesler, Andrew; Zhang, Ning; Ju, Jingfang

    2016-01-01

    The tumor suppressor gene TP53 is one of the most frequently deleted or mutated genes in gastrointestinal cancers. As a transcription factor, p53 regulates a number of important protein-coding genes to control cell cycle, cell death, DNA damage/repair, stemness, differentiation and other key cellular functions. In addition, p53 is able to activate the expression of a number of small non-coding microRNAs (miRNAs) through direct binding to the promoter region of these miRNAs. Many miRNAs have been identified as potential tumor suppressors that act by regulating key effector target mRNAs. Our understanding of the regulatory network of p53 has recently expanded to include long non-coding RNAs (lncRNAs). Like miRNAs, lncRNAs have been found to play important roles in cancer biology. With our increased understanding of the important functions of these non-coding RNAs and their relationship with p53, we are gaining exciting new insights into the biology and function of cells in response to various growth environment changes. In this review we summarize the current understanding of the ever-expanding involvement of non-coding RNAs in the p53 regulatory network and its implications for our understanding of gastrointestinal cancer.

  9. Definition of RNA Polymerase II CoTC Terminator Elements in the Human Genome

    PubMed Central

    Nojima, Takayuki; Dienstbier, Martin; Murphy, Shona; Proudfoot, Nicholas J.; Dye, Michael J.

    2013-01-01

    Summary Mammalian RNA polymerase II (Pol II) transcription termination is an essential step in protein-coding gene expression that is mediated by pre-mRNA processing activities and DNA-encoded terminator elements. Although much is known about the role of pre-mRNA processing in termination, our understanding of the characteristics and generality of terminator elements is limited. Whereas promoter databases list up to 40,000 known and potential Pol II promoter sequences, fewer than ten Pol II terminator sequences have been described. Using our knowledge of the human β-globin terminator mechanism, we have developed a selection strategy for mapping mammalian Pol II terminator elements. We report the identification of 78 cotranscriptional cleavage (CoTC)-type terminator elements at endogenous gene loci. The results of this analysis pave the way for the full understanding of Pol II termination pathways and their roles in gene expression. PMID:23562152

  10. The Foldback-like element Galileo belongs to the P superfamily of DNA transposons and is widespread within the Drosophila genus.

    PubMed

    Marzo, Mar; Puig, Marta; Ruiz, Alfredo

    2008-02-26

    Galileo is the only transposable element (TE) known to have generated natural chromosomal inversions in the genus Drosophila. It was discovered in Drosophila buzzatii and classified as a Foldback-like element because of its long, internally repetitive, terminal inverted repeats (TIRs) and lack of coding capacity. Here, we characterized a seemingly complete copy of Galileo from the D. buzzatii genome. It is 5,406 bp long, possesses 1,229-bp TIRs, and encodes a 912-aa transposase similar to those of the Drosophila melanogaster 1360 (Hoppel) and P elements. We also searched the recently available genome sequences of 12 Drosophila species for elements similar to Dbuz\Galileo by using bioinformatic tools. Galileo was found in six species (ananassae, willistoni, pseudoobscura, persimilis, virilis, and mojavensis) from the two main lineages within the Drosophila genus. Our observations place Galileo within the P superfamily of cut-and-paste transposons and extend considerably its phylogenetic distribution. The interspecific distribution of Galileo indicates an ancient presence in the genus, but the phylogenetic tree built with the transposase amino acid sequences contrasts significantly with that of the species, indicating lineage sorting and/or horizontal transfer events. Our results also suggest that Foldback-like elements such as Galileo may evolve from DNA-based transposon ancestors by loss of the transposase gene and disproportionate elongation of TIRs.

  11. The Foldback-like element Galileo belongs to the P superfamily of DNA transposons and is widespread within the Drosophila genus

    PubMed Central

    Marzo, Mar; Puig, Marta; Ruiz, Alfredo

    2008-01-01

    Galileo is the only transposable element (TE) known to have generated natural chromosomal inversions in the genus Drosophila. It was discovered in Drosophila buzzatii and classified as a Foldback-like element because of its long, internally repetitive, terminal inverted repeats (TIRs) and lack of coding capacity. Here, we characterized a seemingly complete copy of Galileo from the D. buzzatii genome. It is 5,406 bp long, possesses 1,229-bp TIRs, and encodes a 912-aa transposase similar to those of the Drosophila melanogaster 1360 (Hoppel) and P elements. We also searched the recently available genome sequences of 12 Drosophila species for elements similar to Dbuz\Galileo by using bioinformatic tools. Galileo was found in six species (ananassae, willistoni, pseudoobscura, persimilis, virilis, and mojavensis) from the two main lineages within the Drosophila genus. Our observations place Galileo within the P superfamily of cut-and-paste transposons and extend considerably its phylogenetic distribution. The interspecific distribution of Galileo indicates an ancient presence in the genus, but the phylogenetic tree built with the transposase amino acid sequences contrasts significantly with that of the species, indicating lineage sorting and/or horizontal transfer events. Our results also suggest that Foldback-like elements such as Galileo may evolve from DNA-based transposon ancestors by loss of the transposase gene and disproportionate elongation of TIRs. PMID:18287066

  12. The bornavirus-derived human protein EBLN1 promotes efficient cell cycle transit, microtubule organisation and genome stability.

    PubMed

    Myers, Katie N; Barone, Giancarlo; Ganesh, Anil; Staples, Christopher J; Howard, Anna E; Beveridge, Ryan D; Maslen, Sarah; Skehel, J Mark; Collis, Spencer J

    2016-10-14

    It was recently discovered that vertebrate genomes contain multiple endogenised nucleotide sequences derived from the non-retroviral RNA bornavirus. Strikingly, some of these elements have been evolutionarily maintained as open reading frames in host genomes for over 40 million years, suggesting that some endogenised bornavirus-derived elements (EBL) might encode functional proteins. EBLN1 is one such element established through endogenisation of the bornavirus N gene (BDV N). Here, we functionally characterise human EBLN1 as a novel regulator of genome stability. Cells depleted of human EBLN1 accumulate DNA damage both under non-stressed conditions and following exogenously induced DNA damage. EBLN1-depleted cells also exhibit cell cycle abnormalities and defects in microtubule organisation as well as premature centrosome splitting, which we attribute, in part, to improper localisation of the nuclear envelope protein TPR. Our data therefore reveal that human EBLN1 possesses important cellular functions within human cells, and suggest that other EBLs present within vertebrate genomes may also possess important cellular functions.

  13. Cryo-EM Structures Reveal Mechanism and Inhibition of DNA Targeting by a CRISPR-Cas Surveillance Complex.

    PubMed

    Guo, Tai Wei; Bartesaghi, Alberto; Yang, Hui; Falconieri, Veronica; Rao, Prashant; Merk, Alan; Eng, Edward T; Raczkowski, Ashleigh M; Fox, Tara; Earl, Lesley A; Patel, Dinshaw J; Subramaniam, Sriram

    2017-10-05

    Prokaryotic cells possess CRISPR-mediated adaptive immune systems that protect them from foreign genetic elements, such as invading viruses. A central element of this immune system is an RNA-guided surveillance complex capable of targeting non-self DNA or RNA for degradation in a sequence- and site-specific manner analogous to RNA interference. Although the complexes display considerable diversity in their composition and architecture, many basic mechanisms underlying target recognition and cleavage are highly conserved. Using cryoelectron microscopy (cryo-EM), we show that the binding of target double-stranded DNA (dsDNA) to a type I-F CRISPR system yersinia (Csy) surveillance complex leads to large quaternary and tertiary structural changes in the complex that are likely necessary in the pathway leading to target dsDNA degradation by a trans-acting helicase-nuclease. Comparison of the structure of the surveillance complex before and after dsDNA binding, or in complex with three virally encoded anti-CRISPR suppressors that inhibit dsDNA binding, reveals mechanistic details underlying target recognition and inhibition. Published by Elsevier Inc.

  14. Structures of Escherichia coli DNA adenine methyltransferase (Dam) in complex with a non-GATC sequence: Potential implications for methylation-independent transcriptional repression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Horton, John R.; Zhang, Xing; Blumenthal, Robert M.

    DNA adenine methyltransferase (Dam) is widespread and conserved among the γ-proteobacteria. Methylation of the Ade in GATC sequences regulates diverse bacterial cell functions, including gene expression, mismatch repair and chromosome replication. Dam also controls virulence in many pathogenic Gram-negative bacteria. An unexplained and perplexing observation about Escherichia coli Dam (EcoDam) is that there is no obvious relationship between the genes that are transcriptionally responsive to Dam and the promoter-proximal presence of GATC sequences. Here, we demonstrate that EcoDam interacts with a 5-base pair non-cognate sequence distinct from GATC. The crystal structure of a non-cognate complex allowed us to identify a DNA binding element, GTYTA/TARAC (where Y = C/T and R = A/G). This element immediately flanks GATC sites in some Dam-regulated promoters, including the Pap operon which specifies pyelonephritis-associated pili. In addition, Dam interacts with near-cognate GATC sequences (i.e. 3/4-site ATC and GAT). All together, these results imply that Dam, in addition to being responsible for GATC methylation, could also function as a methylation-independent transcriptional repressor.
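    The degenerate element GTYTA/TARAC quoted in this abstract uses IUPAC ambiguity codes, which map directly onto regular-expression character classes. The following Python sketch shows one way such an element might be located in a sequence. It is not the authors' method: the function names and the toy sequence in the usage note are hypothetical. Because TARAC is the reverse complement of GTYTA, scanning one strand for both patterns covers both strands.

```python
import re

# IUPAC codes needed for this element; Y = C/T and R = A/G as stated in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as 'GTYTA' into a regular expression."""
    return "".join(IUPAC[base] for base in motif)


def find_sites(seq, motifs=("GTYTA", "TARAC")):
    """Return sorted (position, motif, matched_text) tuples, 0-based positions.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
        for match in pattern.finditer(seq):
            hits.append((match.start(), motif, match.group(1)))
    return sorted(hits)
```

    On the made-up sequence AAGTCTAGATCTAGACTT, find_sites reports GTCTA at position 2 and TAGAC at position 11, one on either side of the central GATC.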

  15. Structures of Escherichia coli DNA adenine methyltransferase (Dam) in complex with a non-GATC sequence: Potential implications for methylation-independent transcriptional repression

    DOE PAGES

    Horton, John R.; Zhang, Xing; Blumenthal, Robert M.; ...

    2015-04-06

    DNA adenine methyltransferase (Dam) is widespread and conserved among the γ-proteobacteria. Methylation of the Ade in GATC sequences regulates diverse bacterial cell functions, including gene expression, mismatch repair and chromosome replication. Dam also controls virulence in many pathogenic Gram-negative bacteria. An unexplained and perplexing observation about Escherichia coli Dam (EcoDam) is that there is no obvious relationship between the genes that are transcriptionally responsive to Dam and the promoter-proximal presence of GATC sequences. Here, we demonstrate that EcoDam interacts with a 5-base pair non-cognate sequence distinct from GATC. The crystal structure of a non-cognate complex allowed us to identify a DNA binding element, GTYTA/TARAC (where Y = C/T and R = A/G). This element immediately flanks GATC sites in some Dam-regulated promoters, including the Pap operon which specifies pyelonephritis-associated pili. In addition, Dam interacts with near-cognate GATC sequences (i.e. 3/4-site ATC and GAT). All together, these results imply that Dam, in addition to being responsible for GATC methylation, could also function as a methylation-independent transcriptional repressor.

  16. Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.

    PubMed

    Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M

    2011-01-01

    Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.

  17. Polymerization of non-complementary RNA: systematic symmetric nucleotide exchanges mainly involving uracil produce mitochondrial RNA transcripts coding for cryptic overlapping genes.

    PubMed

    Seligmann, Hervé

    2013-03-01

    Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential. 
    Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
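    The symmetric exchange rules listed in this abstract (for example C↔U, or A↔U combined with C↔G) amount to a character-for-character substitution applied to the whole transcript. A minimal Python sketch of that transformation follows. It assumes RNA sequences written with the letters A, C, G and U. The function name exchange is hypothetical, and the sketch illustrates only the notation, not the author's detection pipeline.

```python
def exchange(seq, *pairs):
    """Apply one or more symmetric nucleotide exchanges to an RNA string.

    Each pair is a two-letter string such as "CU", meaning C<->U.
    Nucleotides not named in any pair are left unchanged.
    """
    table = {}
    for a, b in pairs:
        table[a] = b
        table[b] = a
    return seq.translate(str.maketrans(table))
```

    For instance, exchange("AUGCCU", "CU") gives "ACGUUC", and exchange("AUGC", "AU", "CG") gives "UACG". Because each rule is symmetric, applying the same exchange twice restores the original sequence.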

  18. Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

    PubMed

    Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

    2006-10-25

    Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a Java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including PubMed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).

  19. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-01-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163

  20. Transient Vibration Prediction for Rotors on Ball Bearings Using Load-dependent Non-linear Bearing Stiffness

    NASA Technical Reports Server (NTRS)

    Fleming, David P.; Poplawski, J. V.

    2002-01-01

    Rolling-element bearing forces vary nonlinearly with bearing deflection. Thus an accurate rotordynamic transient analysis requires bearing forces to be determined at each step of the transient solution. Analyses have been carried out to show the effect of accurate bearing transient forces (accounting for non-linear speed and load dependent bearing stiffness) as compared to conventional use of average rolling-element bearing stiffness. Bearing forces were calculated by COBRA-AHS (Computer Optimized Ball and Roller Bearing Analysis - Advanced High Speed) and supplied to the rotordynamics code ARDS (Analysis of Rotor Dynamic Systems) for accurate simulation of rotor transient behavior. COBRA-AHS is a fast-running 5 degree-of-freedom computer code able to calculate high speed rolling-element bearing load-displacement data for radial and angular contact ball bearings and also for cylindrical and tapered roller bearings. Results show that use of nonlinear bearing characteristics is essential for accurate prediction of rotordynamic behavior.
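    The central point of this abstract, that the bearing force must be re-evaluated from the current deflection at every step of the transient solution, can be illustrated with a toy single-degree-of-freedom model in Python. Everything in the sketch is an assumption made for illustration: the power-law force with exponent 1.5 (a textbook Hertzian point-contact form), the stiffness coefficient, and the semi-implicit Euler integrator. It does not reproduce COBRA-AHS or ARDS, which the abstract describes as exchanging computed load-displacement data.

```python
import math


def bearing_force(deflection, k_hertz=1.0e9, exponent=1.5):
    """Restoring force (N) of a hypothetical nonlinear bearing.

    Assumed law: |F| = k_hertz * |deflection| ** exponent, opposing the deflection.
    """
    return -math.copysign(k_hertz * abs(deflection) ** exponent, deflection)


def simulate(mass, x0, v0, dt, steps, force=bearing_force):
    """Integrate m * x'' = force(x) with semi-implicit Euler time stepping.

    The force is recomputed from the current deflection at every step,
    rather than being taken from a single average stiffness.
    """
    x, v = x0, v0
    history = [x]
    for _ in range(steps):
        a = force(x) / mass   # load-dependent force at this step
        v += a * dt
        x += v * dt
        history.append(x)
    return history
```

    Passing a linear law such as force=lambda x: -k_avg * x, with some chosen average stiffness k_avg, gives the conventional constant-stiffness response for comparison.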

  1. Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure

    PubMed Central

    Coyne, Robert S; Thiagarajan, Mathangi; Jones, Kristie M; Wortman, Jennifer R; Tallon, Luke J; Haas, Brian J; Cassidy-Hanley, Donna M; Wiley, Emily A; Smith, Joshua J; Collins, Kathleen; Lee, Suzanne R; Couvillion, Mary T; Liu, Yifan; Garg, Jyoti; Pearlman, Ronald E; Hamilton, Eileen P; Orias, Eduardo; Eisen, Jonathan A; Methé, Barbara A

    2008-01-01

    Background Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing. Results We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified. Conclusion We report here significant progress in genome closure and reannotation of Tetrahymena thermophila. 
Our experience to date suggests that complete closure of the MAC genome is attainable. Using the new EST evidence, automated and manual curation has resulted in substantial improvements to the over 24,000 gene models, which will be valuable to researchers studying this model organism as well as for comparative genomics purposes. PMID:19036158

  2. Impacts of Bt crops on non-target organisms and insecticide use patterns

    USDA-ARS's Scientific Manuscript database

    Bacillus thuringiensis (Bt), a bacterium capable of producing insecticidal proteins, is ubiquitous in the environment, and the genes coding for these proteins are now becoming ubiquitous in major crop plants via recombinant DNA technology where they provide host plant resistance to major lepidopteran...

  3. Finite element modelling of non-linear magnetic circuits using Cosmic NASTRAN

    NASA Technical Reports Server (NTRS)

    Sheerer, T. J.

    1986-01-01

    The general purpose Finite Element Program COSMIC NASTRAN currently has the ability to model magnetic circuits with constant permeabilities. An approach was developed which, through small modifications to the program, allows modelling of non-linear magnetic devices including soft magnetic materials, permanent magnets and coils. Use of the NASTRAN code resulted in output which can be used for subsequent mechanical analysis using a variation of the same computer model. Test problems were found to produce theoretically verifiable results.

  4. Problems with mitochondrial DNA as a marker in population, phylogeographic and phylogenetic studies: the effects of inherited symbionts

    PubMed Central

    Hurst, Gregory D.D; Jiggins, Francis M

    2005-01-01

    Mitochondrial DNA (mtDNA) has been a marker of choice for reconstructing historical patterns of population demography, admixture, biogeography and speciation. However, it has recently been suggested that the pervasive nature of direct and indirect selection on this molecule renders any conclusion derived from it ambiguous. We review here the evidence for indirect selection on mtDNA in arthropods arising from linkage disequilibrium with maternally inherited symbionts. We note first that these symbionts are very common in arthropods and then review studies that reveal the extent to which they shape mtDNA evolution. mtDNA diversity patterns are compatible with neutral expectations for an uninfected population in only 2 of 19 cases. The remaining 17 studies revealed cases of symbiont-driven reduction in mtDNA diversity, symbiont-driven increases in diversity, symbiont-driven changes in mtDNA variation over space and symbiont-associated paraphyly of mtDNA. We therefore conclude that these elements often confound the inference of an organism's evolutionary history from mtDNA data and that mtDNA on its own is an unsuitable marker for the study of recent historical events in arthropods. We also discuss the impact of these studies on the current programme of taxonomy based on DNA bar-coding. PMID:16048766

  5. Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.

    PubMed

    Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi

    2016-03-01

    Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine- and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.

  6. Connexin31.1 deficiency in the mouse impairs object memory and modulates open-field exploration, acetylcholine esterase levels in the striatum, and cAMP response element-binding protein levels in the striatum and piriform cortex.

    PubMed

    Dere, E; Zheng-Fischhöfer, Q; Viggiano, D; Gironi Carnevale, U A; Ruocco, L A; Zlomuzica, A; Schnichels, M; Willecke, K; Huston, J P; Sadile, A G

    2008-05-02

    Neuronal gap junctions in the brain, providing intercellular electrotonic signal transfer, have been implicated in physiological and behavioral correlates of learning and memory. In connexin31.1 (Cx31.1) knockout (KO) mice the coding region of the Cx31.1 gene was replaced by a LacZ reporter gene. We investigated the impact of Cx31.1 deficiency on open-field exploration, the behavioral response to an odor, non-selective attention, learning and memory performance, and the levels of memory-related proteins in the hippocampus, striatum and the piriform cortex. In terms of behavior, the deletion of the Cx31.1 coding DNA in the mouse led to increased exploratory behaviors in a novel environment, and impaired one-trial object recognition at all delays tested. Despite strong Cx31.1 expression in the peripheral and central olfactory system, Cx31.1 KO mice exhibited normal behavioral responses to an odor. We found increased levels of acetylcholine esterase (AChE) and cAMP response element-binding protein (CREB) in the striatum of Cx31.1 KO mice. In the piriform cortex the Cx31.1 KO mice had an increased heterogeneity of CREB expression among neurons. In conclusion, gap-junctions featuring the Cx31.1 protein might be involved in open-field exploration as well as object memory and modulate levels of AChE and CREB in the striatum and piriform cortex.

  7. ANN modeling of DNA sequences: new strategies using DNA shape code.

    PubMed

    Parbhane, R V; Tambe, S S; Kulkarni, B D

    2000-09-01

    Two new encoding strategies, namely, wedge and twist codes, which are based on the DNA helical parameters, are introduced to represent DNA sequences in artificial neural network (ANN)-based modeling of biological systems. The performance of the new coding strategies has been evaluated by conducting three case studies involving mapping (modeling) and classification applications of ANNs. The proposed coding schemes have been compared rigorously and shown to outperform the existing coding strategies especially in situations wherein limited data are available for building the ANN models.

  8. Genome-wide DNA methylation map of human neutrophils reveals widespread inter-individual epigenetic variation

    PubMed Central

    Chatterjee, Aniruddha; Stockwell, Peter A.; Rodger, Euan J.; Duncan, Elizabeth J.; Parry, Matthew F.; Weeks, Robert J.; Morison, Ian M.

    2015-01-01

    The extent of variation in DNA methylation patterns in healthy individuals is not yet well documented. Identification of inter-individual epigenetic variation is important for understanding phenotypic variation and disease susceptibility. Using neutrophils from a cohort of healthy individuals, we generated base-resolution DNA methylation maps to document inter-individual epigenetic variation. We identified 12851 autosomal inter-individual variably methylated fragments (iVMFs). Gene promoters were the least variable, whereas gene body and upstream regions showed higher variation in DNA methylation. The iVMFs were relatively enriched in repetitive elements compared to non-iVMFs, and were associated with genome regulation and chromatin function elements. Further, variably methylated genes were disproportionately associated with regulation of transcription, responsive function and signal transduction pathways. Transcriptome analysis indicates that iVMF methylation at differentially expressed exons has a positive correlation and local effect on the inclusion of that exon in the mRNA transcript. PMID:26612583

  9. Use of wavelet-packet transforms to develop an engineering model for multifractal characterization of mutation dynamics in pathological and nonpathological gene sequences

    NASA Astrophysics Data System (ADS)

    Walker, David Lee

    1999-12-01

    This study uses dynamical analysis to examine in a quantitative fashion the information coding mechanism in DNA sequences. This exceeds the simple dichotomy of either modeling the mechanism by comparing DNA sequence walks as Fractal Brownian Motion (fbm) processes. The 2-D mappings of the DNA sequences for this research are from Iterated Function System (IFS) (also known as the "Chaos Game Representation" (CGR)) mappings of the DNA sequences. This technique converts a 1-D sequence into a 2-D representation that preserves subsequence structure and provides a visual representation. The second step of this analysis involves the application of Wavelet Packet Transforms, a recently developed technique from the field of signal processing. A multi-fractal model is built by using wavelet transforms to estimate the Hurst exponent, H. The Hurst exponent is a non-parametric measurement of the dynamism of a system. This procedure is used to evaluate gene-coding events in the DNA sequence of cystic fibrosis mutations. The H exponent is calculated for various mutation sites in this gene. The results of this study indicate the presence of anti-persistent, random walks and persistent "sub-periods" in the sequence. This indicates the hypothesis of a multi-fractal model of DNA information encoding warrants further consideration. This work examines the model's behavior in both pathological (mutations) and non-pathological (healthy) base pair sequences of the cystic fibrosis gene. These mutations, both natural and synthetic, were introduced by computer manipulation of the original base pair text files. The results show that disease severity and system "information dynamics" correlate. These results have implications for genetic engineering as well as in mathematical biology. They suggest that there is scope for more multi-fractal models to be developed.
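
    The Chaos Game Representation mentioned in this record can be sketched in a few lines. This is a minimal illustration of the general technique, not the study's implementation: the corner assignment for A, C, G and T is one common convention assumed here, and the function name is hypothetical. Each base moves the current point halfway toward that base's corner of the unit square, so a sequence's position encodes its recent subsequence.

```python
from typing import Dict, List, Tuple

# Assumed corner layout (a common convention; the record does not specify one).
CORNERS: Dict[str, Tuple[float, float]] = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def chaos_game_representation(sequence: str) -> List[Tuple[float, float]]:
    """Map a 1-D DNA sequence to 2-D points with the chaos game iteration.

    Starts at the centre of the unit square and, for every base, jumps
    halfway toward that base's corner. Symbols other than A, C, G, T
    (for example N) are skipped.
    """
    x, y = 0.5, 0.5
    points: List[Tuple[float, float]] = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in chaos_game_representation("ACGT"):
        print(point)
```

    Plotting the returned points for a long sequence gives the 2-D picture; sequences that share a suffix of length k land in the same sub-square of side 2^-k, which is how subsequence structure is preserved.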

  10. T cells are influenced by a long non-coding RNA in the autoimmune associated PTPN2 locus.

    PubMed

    Houtman, Miranda; Shchetynsky, Klementy; Chemin, Karine; Hensvold, Aase Haj; Ramsköld, Daniel; Tandre, Karolina; Eloranta, Maija-Leena; Rönnblom, Lars; Uebe, Steffen; Catrina, Anca Irinel; Malmström, Vivianne; Padyukov, Leonid

    2018-06-01

    Non-coding SNPs in the protein tyrosine phosphatase non-receptor type 2 (PTPN2) locus have been linked with several autoimmune diseases, including rheumatoid arthritis, type I diabetes, and inflammatory bowel disease. However, the functional consequences of these SNPs are poorly characterized. Herein, we show in blood cells that SNPs in the PTPN2 locus are highly correlated with DNA methylation levels at four CpG sites downstream of PTPN2 and expression levels of the long non-coding RNA (lncRNA) LINC01882 downstream of these CpG sites. We observed that LINC01882 is mainly expressed in T cells and that anti-CD3/CD28 activated naïve CD4+ T cells downregulate the expression of LINC01882. RNA sequencing analysis of LINC01882 knockdown in Jurkat T cells, using a combination of antisense oligonucleotides and RNA interference, revealed the upregulation of the transcription factor ZEB1 and kinase MAP2K4, both involved in IL-2 regulation. Overall, our data suggests the involvement of LINC01882 in T cell activation and hints towards an auxiliary role of these non-coding SNPs in autoimmunity associated with the PTPN2 locus. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  11. A Transcriptome Map of Actinobacillus pleuropneumoniae at Single-Nucleotide Resolution Using Deep RNA-Seq

    PubMed Central

    Su, Zhipeng; Zhu, Jiawen; Xu, Zhuofei; Xiao, Ran; Zhou, Rui; Li, Lu; Chen, Huanchun

    2016-01-01

    Actinobacillus pleuropneumoniae is the pathogen of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs were never reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures). 
The transcriptional units described in this study provide a foundation for future studies concerning the gene functions and the transcriptional regulatory architectures of this pathogen. PMID:27018591

  12. UV-induced DNA damage is an intermediate step in UV-induced expression of human immunodeficiency virus type 1, collagenase, c-fos, and metallothionein.

    PubMed Central

    Stein, B; Rahmsdorf, H J; Steffen, A; Litfin, M; Herrlich, P

    1989-01-01

    UV irradiation of human and murine cells enhances the transcription of several genes. Here we report on the primary target of relevant UV absorption, on pathways leading to gene activation, and on the elements receiving the UV-induced signal in the human immunodeficiency virus type 1 (HIV-1) long terminal repeat, in the gene coding for collagenase, and in the cellular oncogene fos. In order to induce the expression of genes, UV radiation needs to be absorbed by DNA and to cause DNA damage of the kind that cannot be repaired by cells from patients with xeroderma pigmentosum group A. UV-induced activation of the three genes is mediated by the major enhancer elements (located between nucleotide positions -105 and -79 of HIV-1, between positions -72 and -65 of the collagenase gene, and between positions -320 and -299 of fos). These elements share no apparent sequence motif and bind different trans-acting proteins; a member of the NF kappa B family binds to the HIV-1 enhancer, the heterodimer of Jun and Fos (AP-1) binds to the collagenase enhancer, and the serum response factors p67 and p62 bind to fos. DNA-binding activities of the factors recognizing the HIV-1 and collagenase enhancers are augmented in extracts from UV-treated cells. The increase in activity is due to posttranslational modification. While AP-1 resides in the nucleus and must be modulated there, NF kappa B is activated in the cytoplasm, indicating the existence of a cytoplasmic signal transduction pathway triggered by UV-induced DNA damage. In addition to activation, new synthesis of AP-1 is induced by UV radiation. PMID:2557547

  13. Discovery of functional non-coding conserved regions in the α-synuclein gene locus

    PubMed Central

    Sterling, Lori; Walter, Michael; Ting, Dennis; Schüle, Birgitt

    2014-01-01

    Several single nucleotide polymorphisms (SNPs) and the Rep-1 microsatellite marker of the α-synuclein (SNCA) gene have consistently been shown to be associated with Parkinson’s disease, but the functional relevance is unclear. Based on these findings we hypothesized that conserved cis-regulatory elements in the SNCA genomic region regulate expression of SNCA, and that SNPs in these regions could be functionally modulating the expression of SNCA, thus contributing to neuronal demise and predisposing to Parkinson’s disease. In a pair-wise comparison of a 206kb genomic region encompassing the SNCA gene, we revealed 34 evolutionary conserved DNA sequences between human and mouse. All elements were cloned into reporter vectors and assessed for expression modulation in dual luciferase reporter assays. We found that 12 out of 34 elements exhibited either an enhancement or reduction of the expression of the reporter gene. Three elements upstream of the SNCA gene displayed an approximately 1.5 fold (p<0.009) increase in expression. Of the intronic regions, three showed a 1.5 fold increase and two others indicated a 2 and 2.5 fold increase in expression (p<0.002). Three elements downstream of the SNCA gene showed 1.5 fold and 2.5 fold increase (p<0.0009). One element downstream of SNCA had a reduced expression of the reporter gene of 0.35 fold (p<0.0009) of normal activity. Our results demonstrate that the SNCA gene contains cis-regulatory regions that might regulate the transcription and expression of SNCA. Further studies in disease-relevant tissue types will be important to understand the functional impact of regulatory regions and specific Parkinson’s disease-associated SNPs and its function in the disease process. PMID:25566351

  14. Modulating the DNA polymerase β reaction equilibrium to dissect the reverse reaction

    PubMed Central

    Shock, David D.; Freudenthal, Bret D.; Beard, William A.; Wilson, Samuel H.

    2017-01-01

    DNA polymerases catalyze efficient and high fidelity DNA synthesis. While this reaction favors nucleotide incorporation, polymerases also catalyze a reverse reaction, pyrophosphorolysis, removing the DNA primer terminus and generating deoxynucleoside triphosphates. Since pyrophosphorolysis can influence polymerase fidelity and sensitivity to chain-terminating nucleosides, we analyzed pyrophosphorolysis with human DNA polymerase β and found the reaction to be inefficient. The lack of a thio-elemental effect indicated that it was limited by a non-chemical step. Utilizing a pyrophosphate analog, where the bridging oxygen is replaced with an imido-group (PNP), increased the rate of the reverse reaction and displayed a large thio-elemental effect indicating that chemistry was now rate determining. Time-lapse crystallography with PNP captured structures consistent with a chemical equilibrium that favored the reverse reaction. These results highlight the importance of the bridging atom between the β- and γ-phosphates of the incoming nucleotide in reaction chemistry, enzyme conformational changes, and overall reaction equilibrium. PMID:28759020

  15. Signatures of DNA Methylation across Insects Suggest Reduced DNA Methylation Levels in Holometabola

    PubMed Central

    Provataris, Panagiotis; Meusemann, Karen; Niehuis, Oliver; Grath, Sonja; Misof, Bernhard

    2018-01-01

    It has been experimentally shown that DNA methylation is involved in the regulation of gene expression and the silencing of transposable element activity in eukaryotes. The variable levels of DNA methylation among different insect species indicate an evolutionarily flexible role of DNA methylation in insects, which due to a lack of comparative data is not yet well-substantiated. Here, we use computational methods to trace signatures of DNA methylation across insects by analyzing transcriptomic and genomic sequence data from all currently recognized insect orders. We conclude that: 1) a functional methylation system relying exclusively on DNA methyltransferase 1 is widespread across insects. 2) DNA methylation has potentially been lost or extremely reduced in species belonging to springtails (Collembola), flies and relatives (Diptera), and twisted-winged parasites (Strepsiptera). 3) Holometabolous insects display signs of reduced DNA methylation levels in protein-coding sequences compared with hemimetabolous insects. 4) Evolutionarily conserved insect genes associated with housekeeping functions tend to display signs of heavier DNA methylation in comparison to the genomic/transcriptomic background. With this comparative study, we provide the much-needed basis for experimental and detailed comparative analyses required to gain a deeper understanding on the evolution and function of DNA methylation in insects. PMID:29697817

  16. Theria-Specific Homeodomain and cis-Regulatory Element Evolution of the Dlx3–4 Bigene Cluster in 12 Different Mammalian Species

    PubMed Central

    SUMIYAMA, KENTA; MIYAKE, TSUTOMU; GRIMWOOD, JANE; STUART, ANDREW; DICKSON, MARK; SCHMUTZ, JEREMY; RUDDLE, FRANK H.; MYERS, RICHARD M.; AMEMIYA, CHRIS T.

    2013-01-01

    The mammalian Dlx3 and Dlx4 genes are configured as a bigene cluster, and their respective expression patterns are controlled temporally and spatially by cis-elements that largely reside within the intergenic region of the cluster. Previous work revealed that there are conspicuously conserved elements within the intergenic region of the Dlx3–4 bigene clusters of mouse and human. In this paper we have extended these analyses to include 12 additional mammalian taxa (including a marsupial and a monotreme) in order to better define the nature and molecular evolutionary trends of the coding and non-coding functional elements among morphologically divergent mammals. Dlx3–4 regions were fully sequenced from 12 divergent taxa of interest. We identified three theria-specific amino acid replacements in homeodomain of Dlx4 gene that functions in placenta. Sequence analyses of constrained nucleotide sites in the intergenic non-coding region showed that many of the intergenic conserved elements are highly conserved and have evolved slowly within the mammals. In contrast, a branchial arch/craniofacial enhancer I37-2 exhibited accelerated evolution at the branch between the monotreme and therian common ancestor despite being highly conserved among therian species. Functional analysis of I37-2 in transgenic mice has shown that the equivalent region of the platypus fails to drive transcriptional activity in branchial arches. These observations, taken together with our molecular evolutionary data, suggest that theria-specific episodic changes in the I37-2 element may have contributed to craniofacial innovation at the base of the mammalian lineage. PMID:22951979

  17. Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules

    PubMed Central

    Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex

    2012-01-01

    Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789

  18. Conserved non-coding regulatory signatures in Arabidopsis co-expressed gene modules.

    PubMed

    Spangler, Jacob B; Ficklin, Stephen P; Luo, Feng; Freeling, Michael; Feltus, F Alex

    2012-01-01

    Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome.

  19. Charged particle tracking through electrostatic wire meshes using the finite element method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devlin, L. J.; Karamyshev, O.; Welsch, C. P., E-mail: carsten.welsch@cockcroft.ac.uk

    Wire meshes are used across many disciplines to accelerate and focus charged particles; however, analytical solutions are non-exact and few codes exist which simulate the exact fields around a mesh with physical sizes. A tracking code based in Matlab-Simulink using field maps generated using finite element software has been developed which tracks electrons or ions through electrostatic wire meshes. The fields around such a geometry are presented as an analytical expression using several basic assumptions; however, it is apparent that computational calculations are required to obtain realistic values of electric potential and fields, particularly when multiple wire meshes are deployed. The tracking code is flexible in that any quantitatively describable particle distribution can be used for both electrons and ions as well as other benefits such as ease of export to other programs for analysis. The code is made freely available and physical examples are highlighted where this code could be beneficial for different applications.

  20. Epigenetic Effects of Cadmium in Cancer: Focus on Melanoma

    PubMed Central

    Venza, Mario; Visalli, Maria; Biondo, Carmelo; Oteri, Rosaria; Agliano, Federica; Morabito, Silvia; Caruso, Gerardo; Caffo, Maria; Teti, Diana; Venza, Isabella

    2014-01-01

    Cadmium is a highly toxic heavy metal, which has a destructive impact on organs. Exposure to cadmium causes severe health problems to human beings due to its ubiquitous environmental presence and features of the pathologies associated with prolonged exposure. Cadmium is a well-established carcinogen, although the underlying mechanisms have not been fully understood yet. Recently, there has been considerable interest in the impact of this environmental pollutant on the epigenome. Because of the role of epigenetic alterations in regulating gene expression, there is a potential for the integration of cadmium-induced epigenetic alterations as critical elements in the cancer risk assessment process. Here, after a brief review of the major diseases related to cadmium exposure, we focus our interest on the carcinogenic potential of this heavy metal. Among the several proposed pathogenetic mechanisms, particular attention is given to epigenetic alterations, including changes in DNA methylation, histone modifications and non-coding RNA expression. We review evidence for a link between cadmium-induced epigenetic changes and cell transformation, with special emphasis on melanoma. DNA methylation, with reduced expression of key genes that regulate cell proliferation and apoptosis, has emerged as a possible cadmium-induced epigenetic mechanism in melanoma. A wider comprehension of mechanisms related to this common environmental contaminant would allow a better cancer risk evaluation. PMID:25646071

  1. High-throughput analysis of the satellitome illuminates satellite DNA evolution

    NASA Astrophysics Data System (ADS)

    Ruiz-Ruano, Francisco J.; López-León, María Dolores; Cabrero, Josefa; Camacho, Juan Pedro M.

    2016-07-01

    Satellite DNA (satDNA) is a major component yet the great unknown of eukaryote genomes and clearly underrepresented in genome sequencing projects. Here we show the high-throughput analysis of satellite DNA content in the migratory locust by means of the bioinformatic analysis of Illumina reads with the RepeatExplorer and RepeatMasker programs. This unveiled 62 satDNA families and we propose the term “satellitome” for the whole collection of different satDNA families in a genome. The finding that satDNAs were present in many contigs of the migratory locust draft genome indicates that they show many genomic locations invisible by fluorescent in situ hybridization (FISH). The cytological pattern of five satellites showing common descent (belonging to the SF3 superfamily) suggests that non-clustered satDNAs can become clustered through local amplification at any of the many genomic loci resulting from previous dissemination of short satDNA arrays. The fact that all kinds of satDNA (micro-, mini- and satellites) can show the non-clustered and clustered states suggests that all these elements are mostly similar, except for repeat length. Finally, the presence of VNTRs in bacteria, showing similar properties to non-clustered satDNAs in eukaryotes, suggests that this kind of tandem repeat shows common properties in all living beings.

  2. Surveying DNA Elements within Functional Genes of Heterocyst-Forming Cyanobacteria

    PubMed Central

    Hilton, Jason A.; Meeks, John C.; Zehr, Jonathan P.

    2016-01-01

    Some cyanobacteria are capable of differentiating a variety of cell types in response to environmental factors. For instance, in low nitrogen conditions, some cyanobacteria form heterocysts, which are specialized for N2 fixation. Many heterocyst-forming cyanobacteria have DNA elements interrupting key N2 fixation genes, elements that are excised during heterocyst differentiation. While the mechanism for the excision of the element has been well-studied, many questions remain regarding the introduction of the elements into the cyanobacterial lineage and whether they have been retained ever since or have been lost and reintroduced. To examine the evolutionary relationships and possible function of DNA sequences that interrupt genes of heterocyst-forming cyanobacteria, we identified and compared 101 interruption element sequences within genes from 38 heterocyst-forming cyanobacterial genomes. The interruption element lengths ranged from about 1 kb (the minimum able to encode the recombinase responsible for element excision), up to nearly 1 Mb. The recombinase gene sequences served as genetic markers that were common across the interruption elements and were used to track element evolution. Elements were found that interrupted 22 different orthologs, only five of which had been previously observed to be interrupted by an element. Most of the newly identified interrupted orthologs encode proteins that have been shown to have heterocyst-specific activity. However, the presence of interruption elements within genes with no known role in N2 fixation, as well as in three non-heterocyst-forming cyanobacteria, indicates that the processes that trigger the excision of elements may not be limited to heterocyst development or that the elements move randomly within genomes. 
    This comprehensive analysis provides the framework to study the history and behavior of these unique sequences, and offers new insight regarding the frequency and persistence of interruption elements in heterocyst-forming cyanobacteria. PMID:27206019

  3. Surveying DNA Elements within Functional Genes of Heterocyst-Forming Cyanobacteria.

    PubMed

    Hilton, Jason A; Meeks, John C; Zehr, Jonathan P

    2016-01-01

    Some cyanobacteria are capable of differentiating a variety of cell types in response to environmental factors. For instance, in low nitrogen conditions, some cyanobacteria form heterocysts, which are specialized for N2 fixation. Many heterocyst-forming cyanobacteria have DNA elements interrupting key N2 fixation genes, elements that are excised during heterocyst differentiation. While the mechanism for the excision of the element has been well-studied, many questions remain regarding the introduction of the elements into the cyanobacterial lineage and whether they have been retained ever since or have been lost and reintroduced. To examine the evolutionary relationships and possible function of DNA sequences that interrupt genes of heterocyst-forming cyanobacteria, we identified and compared 101 interruption element sequences within genes from 38 heterocyst-forming cyanobacterial genomes. The interruption element lengths ranged from about 1 kb (the minimum able to encode the recombinase responsible for element excision), up to nearly 1 Mb. The recombinase gene sequences served as genetic markers that were common across the interruption elements and were used to track element evolution. Elements were found that interrupted 22 different orthologs, only five of which had been previously observed to be interrupted by an element. Most of the newly identified interrupted orthologs encode proteins that have been shown to have heterocyst-specific activity. However, the presence of interruption elements within genes with no known role in N2 fixation, as well as in three non-heterocyst-forming cyanobacteria, indicates that the processes that trigger the excision of elements may not be limited to heterocyst development or that the elements move randomly within genomes. 
    This comprehensive analysis provides the framework to study the history and behavior of these unique sequences, and offers new insight regarding the frequency and persistence of interruption elements in heterocyst-forming cyanobacteria.

  4. DNA AND THE FINE STRUCTURE OF SYNAPTIC CHROMOSOMES IN THE DOMESTIC ROOSTER (GALLUS DOMESTICUS)

    PubMed Central

    Coleman, James R.; Moses, Montrose J.

    1964-01-01

    The indium trichloride method of Watson and Aldridge (38) for staining nucleic acids for electron microscopy was employed to study the relationship of DNA to the structure of the synaptinemal complex in meiotic prophase chromosomes of the domestic rooster. The selectivity of the method was demonstrated in untreated and DNase-digested testis material by comparing the distribution of indium staining in the electron microscope to Feulgen staining and ultraviolet absorption in thicker sections seen with the light microscope. Following staining by indium, DNA was found mainly in the microfibril component of the synaptinemal complex. When DNA was known to have been removed from aldehyde-fixed material by digestion with DNase, indium stainability was also lost. However, staining of the digested material with non-selective heavy metal techniques demonstrated the presence of material other than DNA in the microfibrils and showed that little alteration in appearance of the chromosome resulted from DNA removal. The two dense lateral axial elements of the synaptinemal complex, but not the central one to any extent, also contained DNA, together with non-DNA material. PMID:14228519

  5. Turnover of R1 (Type I) and R2 (Type II) Retrotransposable Elements in the Ribosomal DNA of Drosophila Melanogaster

    PubMed Central

    Jakubczak, J. L.; Zenni, M. K.; Woodruff, R. C.; Eickbush, T. H.

    1992-01-01

    R1 and R2 are distantly related non-long terminal repeat retrotransposable elements each of which inserts into a specific site in the 28S rRNA genes of most insects. We have analyzed aspects of R1 and R2 abundance and sequence variation in 27 geographical isolates of Drosophila melanogaster. The fraction of 28S rRNA genes containing these elements varied greatly between strains, 17-67% for R1 elements and 2-28% for R2 elements. The total percentage of the rDNA repeats inserted ranged from 32 to 77%. The fraction of the rDNA repeats that contained both of these elements suggested that R1 and R2 exhibit neither an inhibition of nor preference for insertion into a 28S gene already containing the other type of element. Based on the conservation of restriction sites in the elements of all strains, and sequence analysis of individual elements from three strains, nucleotide divergence is very low for R1 and R2 elements within or between strains (<0.6%). This sequence uniformity is the expected result of the forces of concerted evolution (unequal crossovers and gene conversion) which act on the rRNA genes themselves. Evidence for the role of retrotransposition in the turnover of R1 and R2 was obtained by using naturally occurring 5' length polymorphisms of the elements as markers for independent transposition events. The pattern of these different length 5' truncations of R1 and R2 was found to be diverse and unique to most strains analyzed. Because recombination can only, with time, amplify or eliminate those length variants already present, the diversity found in each strain suggests that retrotransposition has played a critical role in maintaining these elements in the rDNA repeats of D. melanogaster. PMID:1317313

  6. The Autographa californica Multiple Nucleopolyhedrovirus ac83 Gene Contains a cis-Acting Element That Is Essential for Nucleocapsid Assembly.

    PubMed

    Huang, Zhihong; Pan, Mengjia; Zhu, Silei; Zhang, Hao; Wu, Wenbi; Yuan, Meijin; Yang, Kai

    2017-03-01

    Baculoviridae is a family of insect-specific viruses that have a circular double-stranded DNA genome packaged within a rod-shaped capsid. The mechanism of baculovirus nucleocapsid assembly remains unclear. Previous studies have shown that deletion of the ac83 gene of Autographa californica multiple nucleopolyhedrovirus (AcMNPV) blocks viral nucleocapsid assembly. Interestingly, the ac83-encoded protein Ac83 is not a component of the nucleocapsid, implying a particular role for ac83 in nucleocapsid assembly that may be independent of its protein product. To examine this possibility, Ac83 synthesis was disrupted by insertion of a chloramphenicol resistance gene into its coding sequence or by deleting its promoter and translation start codon. Both mutants produced progeny viruses normally, indicating that the Ac83 protein is not required for nucleocapsid assembly. Subsequently, complementation assays showed that the production of progeny viruses required the presence of ac83 in the AcMNPV genome instead of its presence in trans. Therefore, we reasoned that ac83 is involved in nucleocapsid assembly via an internal cis-acting element, which we named the nucleocapsid assembly-essential element (NAE). The NAE was identified to lie within nucleotides 1651 to 1850 of ac83 and had 8 conserved A/T-rich regions. Sequences homologous to the NAE were found only in alphabaculoviruses and have a conserved positional relationship with another essential cis-acting element that was recently identified. The identification of the NAE may help to connect the data of viral cis-acting elements and related proteins in the baculovirus nucleocapsid assembly, which is important for elucidating DNA-protein interaction events during this process.
    IMPORTANCE Virus nucleocapsid assembly usually requires specific cis-acting elements in the viral genome for various processes, such as the selection of the viral genome from the cellular nucleic acids, the cleavage of concatemeric viral genome replication intermediates, and the encapsidation of the viral genome into procapsids. In linear DNA viruses, such elements generally locate at the ends of the viral genome; however, most of these elements remain unidentified in circular DNA viruses (including baculovirus) due to their circular genomic conformation. Here, we identified a nucleocapsid assembly-essential element in the AcMNPV (the archetype of baculovirus) genome. This finding provides an important reference for studies of nucleocapsid assembly-related elements in baculoviruses and other circular DNA viruses. Moreover, as most of the previous studies of baculovirus nucleocapsid assembly have been focused on viral proteins, our study provides a novel entry point to investigate this mechanism via cis-acting elements in the viral genome. Copyright © 2017 American Society for Microbiology.

  7. Anisotropic constitutive model for nickel base single crystal alloys: Development and finite element implementation

    NASA Technical Reports Server (NTRS)

    Dame, L. T.; Stouffer, D. C.

    1986-01-01

    A tool for the mechanical analysis of nickel base single crystal superalloys, specifically Rene N4, used in gas turbine engine components is developed. This is achieved by a rate dependent anisotropic constitutive model implemented in a nonlinear three dimensional finite element code. The constitutive model is developed from metallurgical concepts utilizing a crystallographic approach. A non Schmid's law formulation is used to model the tension/compression asymmetry and orientation dependence in octahedral slip. Schmid's law is a good approximation to the inelastic response of the material in cube slip. The constitutive equations model the tensile behavior, creep response, and strain rate sensitivity of these alloys. Methods for deriving the material constants from standard tests are presented. The finite element implementation utilizes an initial strain method and twenty noded isoparametric solid elements. The ability to model piecewise linear load histories is included in the finite element code. The constitutive equations are accurately and economically integrated using a second order Adams-Moulton predictor-corrector method with a dynamic time incrementing procedure. Computed results from the finite element code are compared with experimental data for tensile, creep and cyclic tests at 760 deg C. The strain rate sensitivity and stress relaxation capabilities of the model are evaluated.

  8. Stability of non-Watson-Crick G-A/A-G base pair in synthetic DNA and RNA oligonucleotides.

    PubMed

    Ito, Yuko; Sone, Yumiko; Mizutani, Takaharu

    2004-03-01

    A non-Watson-Crick G-A/A-G base pair is found in SECIS (selenocysteine-insertion sequence) element in the 3'-untranslated region of Se-protein mRNAs and in the functional site of the hammerhead ribozyme. We studied the stability of G-A/A-G base pair (bold) in 17mer GT(U)GACGGAAACCGGAAC synthetic DNA and RNA oligonucleotides by thermal melting experiments and gel electrophoresis. The measured Tm value of DNA oligonucleotide having G-A/A-G pair showed an intermediate value (58 degrees C) between that of Watson-Crick G-C/C-G base pair (75 degrees C) and that of G-G/A-A of non-base-pair (40 degrees C). Similar thermal melting patterns were obtained with RNA oligonucleotides. This result indicates that the secondary structure of oligonucleotide having G-A/A-G base pair is looser than that of the G-C type Watson-Crick base pair. In the comparison between RNA and DNA having G-A/A-G base pair, the Tm value of the RNA oligonucleotide was 11 degrees C lower than that of DNA, indicating that DNA has a more rigid structure than RNA. The stained pattern of oligonucleotide on polyacrylamide gel clarified that the mobility of the DNA oligonucleotide G-A/A-G base pair changed according to the urea concentration from the rigid state (near the mobility of G-C/C-G oligonucleotide) in the absence of urea to the random state (near the mobility of G-G/A-A oligonucleotide) in 7 M urea. However, the RNA oligonucleotide with G-A/A-G pair moved at an intermediate mobility between that of oligonucleotide with G-C/C-G and of the oligonucleotide with G-G/A-A, and the mobility pattern did not depend on urea concentration. Thus, DNA and RNA oligonucleotides with the G-A/A-G base pair showed a pattern indicating an intermediate structure between the rigid Watson-Crick base pair and the random structure of non-base pair. RNA with G-A/A-G base pair has the intermediate structure not influenced by urea concentration. 
    Finally, this study indicated that the intermediate rigidity imparted by the non-Watson-Crick base pair in the SECIS element plays an important role in selenocysteine expression by the UGA codon.

  9. A Helitron-like Transposon Superfamily from Lepidoptera Disrupts (GAAA)n Microsatellites and is Responsible for Flanking Sequence Similarity within a Microsatellite Family

    USDA-ARS's Scientific Manuscript database

    Transposable elements (TEs) are mobile DNA regions that alter host genome structure and gene expression. A novel 588 bp non-autonomous high copy number TE in the Ostrinia nubilalis genome has features in common with miniature inverted-repeat transposable elements (MITEs): high A+T content (62.3%),...

  10. Transcription of Gypsy Elements in a Y-Chromosome Male Fertility Gene of Drosophila Hydei

    PubMed Central

    Hochstenbach, R.; Harhangi, H.; Schouren, K.; Bindels, P.; Suijkerbuijk, R.; Hennig, W.

    1996-01-01

    We have found that defective gypsy retrotransposons are a major constituent of the lampbrush loop pair Nooses in the short arm of the Y chromosome of Drosophila hydei. The loop pair is formed by male fertility gene Q during the primary spermatocyte stage of spermatogenesis, each loop being a single transcription unit with an estimated length of 260 kb. Using fluorescent in situ hybridization, we show that throughout the loop transcripts gypsy elements are interspersed with blocks of a tandemly repetitive Y-specific DNA sequence, ay1. Nooses transcripts containing both sequence types show a wide size range on Northern blots, do not migrate to the cytoplasm, and are degraded just before the first meiotic division. Only one strand of ay1 and only the coding strand of gypsy can be detected in the loop transcripts. However, as cloned genomic DNA fragments also display opposite orientations of ay1 and gypsy, such DNA sections cannot be part of the Nooses. Hence, they are most likely derived from the flanking heterochromatin. The direction of transcription of ay1 and gypsy thus appears to be of a functional significance. PMID:8852843

  11. The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101

    NASA Astrophysics Data System (ADS)

    Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.

    2014-08-01

    Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.

  12. Many human accelerated regions are developmental enhancers

    PubMed Central

    Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.

    2013-01-01

    The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637

  13. Aberrant methylation and associated transcriptional mobilization of Alu elements contributes to genomic instability in hypoxia.

    PubMed

    Pal, Arnab; Srivastava, Tapasya; Sharma, Manish K; Mehndiratta, Mohit; Das, Prerna; Sinha, Subrata; Chattopadhyay, Parthaprasad

    2010-11-01

    Hypoxia is an integral part of tumorigenesis and contributes extensively to the neoplastic phenotype including drug resistance and genomic instability. It has also been reported that hypoxia results in global demethylation. Because a majority of the cytosine-phosphate-guanine (CpG) islands are found within the repeat elements of DNA, and are usually methylated under normoxic conditions, we suggested that retrotransposable Alu or short interspersed nuclear elements (SINEs) which show altered methylation and associated changes of gene expression during hypoxia, could be associated with genomic instability. U87MG glioblastoma cells were cultured in 0.1% O₂ for 6 weeks and compared with cells cultured in 21% O₂ for the same duration. Real-time PCR analysis showed a significant increase in SINE and reverse transcriptase coding long interspersed nuclear element (LINE) transcripts during hypoxia. Sequencing of bisulphite treated DNA as well as the Combined Bisulfite Restriction Analysis (COBRA) assay showed that the SINE loci studied underwent significant hypomethylation though there was patchy hypermethylation at a few sites. The inter-alu PCR profile of DNA from cells cultured under 6-week hypoxia, its 4-week revert back to normoxia and 6-week normoxia showed several changes in the band pattern indicating increased alu mediated genomic alteration. Our results show that aberrant methylation leading to increased transcription of SINE and reverse transcriptase associated LINE elements could lead to increased genomic instability in hypoxia. This might be a cause of genetic heterogeneity in tumours especially in variegated hypoxic environment and lead to a development of foci of more aggressive tumour cells. © 2009 The Authors Journal compilation © 2010 Foundation for Cellular and Molecular Medicine/Blackwell Publishing Ltd.

  14. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

    PubMed

    Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

    2017-10-03

    Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.

  15. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes

    PubMed Central

    Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

    2017-01-01

    Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274

  16. Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs

    PubMed Central

    Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.

    2013-01-01

    Cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite the wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle against non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175

  17. Variations in the non-coding transcriptome as a driver of inter-strain divergence and physiological adaptation in bacteria.

    PubMed

    Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R; Voß, Björn

    2015-04-22

    In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5'UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5'UTR. Such an sRNA/mRNA structure, which we name 'actuaton', represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation.

  18. Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers

    PubMed Central

    Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Jerusalem, Guy

    2018-01-01

    Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers. PMID:29301303

  19. Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers.

    PubMed

    Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Josse, Claire; Jerusalem, Guy

    2018-01-02

    Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers.

  20. Genes in sport and doping.

    PubMed

    Pokrywka, A; Kaliszewski, P; Majorczyk, E; Zembroń-Łacny, A

    2013-09-01

    Genes control biological processes such as muscle production of energy, mitochondria biogenesis, bone formation, erythropoiesis, angiogenesis, vasodilation, neurogenesis, etc. DNA profiling for athletes reveals genetic variations that may be associated with endurance ability, muscle performance and power exercise, tendon susceptibility to injuries and psychological aptitude. Already, over 200 genes relating to physical performance have been identified by several research groups. Athletes' genotyping is developing as a tool for the formulation of personalized training and nutritional programmes to optimize sport training as well as for the prediction of exercise-related injuries. On the other hand, development of molecular technology and gene therapy creates a risk of non-therapeutic use of cells, genes and genetic elements to improve athletic performance. Therefore, the World Anti-Doping Agency decided to include prohibition of gene doping within their World Anti-Doping Code in 2003. In this review article, we will provide a current overview of genes for use in athletes' genotyping and gene doping possibilities, including their development and detection techniques.

  1. GENES IN SPORT AND DOPING

    PubMed Central

    Kaliszewski, P.; Majorczyk, E.; Zembroń-Łacny, A.

    2013-01-01

    Genes control biological processes such as muscle production of energy, mitochondria biogenesis, bone formation, erythropoiesis, angiogenesis, vasodilation, neurogenesis, etc. DNA profiling for athletes reveals genetic variations that may be associated with endurance ability, muscle performance and power exercise, tendon susceptibility to injuries and psychological aptitude. Already, over 200 genes relating to physical performance have been identified by several research groups. Athletes’ genotyping is developing as a tool for the formulation of personalized training and nutritional programmes to optimize sport training as well as for the prediction of exercise-related injuries. On the other hand, development of molecular technology and gene therapy creates a risk of non-therapeutic use of cells, genes and genetic elements to improve athletic performance. Therefore, the World Anti-Doping Agency decided to include prohibition of gene doping within their World Anti-Doping Code in 2003. In this review article, we will provide a current overview of genes for use in athletes’ genotyping and gene doping possibilities, including their development and detection techniques. PMID:24744482

  2. Dcode.org anthology of comparative genomic tools.

    PubMed

    Loots, Gabriela G; Ovcharenko, Ivan

    2005-07-01

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the non-coding encryption of gene regulation across genomes. To facilitate the practical application of comparative sequence analysis to genetics and genomics, we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools, zPicture and Mulan; a phylogenetic shadowing tool, eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools, rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, Creme 2.0; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here, we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ website.

  3. A genomic island integrated into recA of Vibrio cholerae contains a divergent recA and provides multi-pathway protection from DNA damage.

    PubMed

    Rapa, Rita A; Islam, Atiqul; Monahan, Leigh G; Mutreja, Ankur; Thomson, Nicholas; Charles, Ian G; Stokes, Harold W; Labbate, Maurizio

    2015-04-01

    Lateral gene transfer (LGT) has been crucial in the evolution of the cholera pathogen, Vibrio cholerae. The two major virulence factors are present on two different mobile genetic elements, a bacteriophage containing the cholera toxin genes and a genomic island (GI) containing the intestinal adhesin genes. Non-toxigenic V. cholerae in the aquatic environment are a major source of novel DNA that allows the pathogen to morph via LGT. In this study, we report a novel GI from a non-toxigenic V. cholerae strain containing multiple genes involved in DNA repair including the recombination repair gene recA that is 23% divergent from the indigenous recA and genes involved in the translesion synthesis pathway. This is the first report of a GI containing the critical gene recA and the first report of a GI that targets insertion into a specific site within recA. We show that possession of the island in Escherichia coli is protective against DNA damage induced by UV-irradiation and DNA targeting antibiotics. This study highlights the importance of genetic elements such as GIs in the evolution of V. cholerae and emphasizes the importance of environmental strains as a source of novel DNA that can influence the pathogenicity of toxigenic strains. © 2014 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  4. Increasing Nucleosome Occupancy Is Correlated with an Increasing Mutation Rate so Long as DNA Repair Machinery Is Intact

    PubMed Central

    Taylor, Jared F.; Khattab, Omar S.; Chen, Yu-Han; Chen, Yumay; Jacobsen, Steven E.; Wang, Ping H.

    2015-01-01

    Deciphering the multitude of epigenomic and genomic factors that influence the mutation rate is an area of great interest in modern biology. Recently, chromatin has been shown to play a part in this process. To elucidate this relationship further, we integrated our own ultra-deep sequenced human nucleosomal DNA data set with a host of published human genomic and cancer genomic data sets. Our results revealed that differences in nucleosome occupancy are associated with changes in base-specific mutation rates. Increasing nucleosome occupancy is associated with an increasing transition to transversion ratio and an increased germline mutation rate within the human genome. Additionally, cancer single nucleotide variants and microindels are enriched within nucleosomes, and both the coding and non-coding cancer mutation rates increase with increasing nucleosome occupancy. There is an enrichment of cancer indels at the theoretical start (74 bp) and end (115 bp) of linker DNA between two nucleosomes. We then hypothesized that increasing nucleosome occupancy decreases access to DNA by DNA repair machinery and could account for the increasing mutation rate. Such a relationship should not exist in DNA repair knockouts, and we thus repeated our analysis in DNA repair machinery knockouts to test our hypothesis. Indeed, our results revealed no correlation between increasing nucleosome occupancy and increasing mutation rate in DNA repair knockouts. Our findings emphasize the linkage of the genome and epigenome through the nucleosome whose properties can affect genome evolution and genetic aberrations such as cancer. PMID:26308346
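
    The transition to transversion ratio mentioned in this abstract can be illustrated with a short sketch. It is not the analysis pipeline of the study; the function names and the example variant list are hypothetical and chosen only for illustration. A transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T); any other substitution is counted as a transversion.

```python
# Illustrative sketch only; not the pipeline used in the study above.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions for a list of (ref, alt) pairs."""
    ts = 0
    tv = 0
    for ref, alt in substitutions:
        if ref.upper() == alt.upper():
            continue  # not a substitution; ignore
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv


# Hypothetical variant calls, made up for illustration.
example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(example))  # 3 transitions / 2 transversions = 1.5
```

    Computing this ratio separately for bins of nucleosome occupancy would be one way to look for the kind of trend the abstract describes, assuming per-site occupancy values are available.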

  5. Radiation-Induced Epigenetic Alterations after Low and High LET Irradiations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aypar, Umut; Morgan, William F.; Baulch, Janet E.

    Epigenetics, including DNA methylation and microRNA (miRNA) expression, could be the missing link in understanding the delayed, non-targeted effects of radiation including radiation-induced genomic instability (RIGI). This study tests the hypothesis that irradiation induces epigenetic aberrations, which could eventually lead to RIGI, and that the epigenetic aberrations induced by low linear energy transfer (LET) irradiation are different than those induced by high LET irradiations. GM10115 cells were irradiated with low LET x-rays and high LET iron (Fe) ions and evaluated for DNA damage, cell survival and chromosomal instability. The cells were also evaluated for specific locus methylation of nuclear factor-kappa B (NFκB), tumor suppressor in lung cancer 1 (TSLC1) and cadherin 1 (CDH1) gene promoter regions, long interspersed nuclear element 1 (LINE-1) and Alu repeat element methylation, CpG and non-CpG global methylation and miRNA expression levels. Irradiated cells showed increased micronucleus induction and cell killing immediately following exposure, but were chromosomally stable at delayed times post-irradiation. At this same delayed time, alterations in repeat element and global DNA methylation and miRNA expression were observed. Analyses of DNA methylation predominantly showed hypomethylation; however, hypermethylation was also observed. MiRNA shown to be altered in expression level after x-ray irradiation are involved in chromatin remodeling and DNA methylation. Different and higher incidence of epigenetic changes were observed after exposure to low LET x-rays than high LET Fe ions even though Fe ions elicited more chromosomal damage and cell killing. This study also shows that the irradiated cells acquire epigenetic changes even though they are chromosomally stable, suggesting that epigenetic aberrations may arise in the cell without initiating RIGI.

  6. The PARTRAC code: Status and recent developments

    NASA Astrophysics Data System (ADS)

    Friedland, Werner; Kundrat, Pavel

    Biophysical modeling is of particular value for predictions of radiation effects due to manned space missions. PARTRAC is an established tool for Monte Carlo-based simulations of radiation track structures, damage induction in cellular DNA and its repair [1]. Dedicated modules describe interactions of ionizing particles with the traversed medium, the production and reactions of reactive species, and score DNA damage determined by overlapping track structures with multi-scale chromatin models. The DNA repair module describes the repair of DNA double-strand breaks (DSB) via the non-homologous end-joining pathway; the code explicitly simulates the spatial mobility of individual DNA ends in parallel with their processing by major repair enzymes [2]. To simulate the yields and kinetics of radiation-induced chromosome aberrations, the repair module has been extended by tracking the information on the chromosome origin of ligated fragments as well as the presence of centromeres [3]. PARTRAC calculations have been benchmarked against experimental data on various biological endpoints induced by photon and ion irradiation. The calculated DNA fragment distributions after photon and ion irradiation reproduce corresponding experimental data and their dose- and LET-dependence. However, in particular for high-LET radiation many short DNA fragments are predicted below the detection limits of the measurements, so that the experiments significantly underestimate DSB yields by high-LET radiation [4]. The DNA repair module correctly describes the LET-dependent repair kinetics after 60Co gamma-rays and different N-ion radiation qualities [2]. First calculations on the induction of chromosome aberrations have overestimated the absolute yields of dicentrics, but correctly reproduced their relative dose-dependence and the difference between gamma- and alpha particle irradiation [3]. Recent developments of the PARTRAC code include a model of hetero- vs euchromatin structures to enable accounting for variations in DNA damage yields, complexity and repair between these regions. Second, the applicability of the code to low-energy ions has been extended to full stopping by using a modified Barkas scaling of proton cross sections for ions heavier than helium. Third, ongoing studies aim at hitherto unprecedented benchmarking of the code against experiments with sub-µm focused bunches of low-LET ions mimicking single high-LET ion tracks [5] which separate effects of damage clustering on a sub-µm scale from DNA damage complexity on a nanometer scale. Fourth, motivated by implications for the involvement of mitochondria in intercellular signaling and radiation-induced bystander effects, ongoing work extends the range of PARTRAC DNA models to radiation effects on mitochondrial DNA. The contribution will discuss the PARTRAC modules, benchmarks to experimental data, recent and ongoing developments of the code, with special attention to its implications and potential applications in radiation protection and space research. Acknowledgement. This work was partially funded by the EU (Contract FP7-249689 ‘DoReMi’). References 1. Friedland et al., Mutat. Res. 711, 28 (2011) 2. Friedland et al., Int. J. Radiat. Biol. 88, 129 (2012) 3. Friedland et al., Mutat. Res. 756, 213 (2013) 4. Alloni et al., Radiat. Res. 179, 690 (2013) 5. Schmid et al., Phys. Med. Biol. 57, 5889 (2012)

  7. The mitochondrial genome of Moniliophthora roreri, the frosty pod rot pathogen of cacao.

    PubMed

    Costa, Gustavo G L; Cabrera, Odalys G; Tiburcio, Ricardo A; Medrano, Francisco J; Carazzolle, Marcelo F; Thomazella, Daniela P T; Schuster, Stephen C; Carlson, John E; Guiltinan, Mark J; Bailey, Bryan A; Mieczkowski, Piotr; Pereira, Gonçalo A G; Meinhardt, Lyndel W

    2012-05-01

    In this study, we report the sequence of the mitochondrial (mt) genome of the Basidiomycete fungus Moniliophthora roreri, which is the etiologic agent of frosty pod rot of cacao (Theobroma cacao L.). We also compare it to the mtDNA from the closely-related species Moniliophthora perniciosa, which causes witches' broom disease of cacao. The 94 Kb mtDNA genome of M. roreri has a circular topology and codes for the typical 14 mt genes involved in oxidative phosphorylation. It also codes for both rRNA genes, a ribosomal protein subunit, 13 intronic open reading frames (ORFs), and a full complement of 27 tRNA genes. The conserved genes of M. roreri mtDNA are completely syntenic with homologous genes of the 109 Kb mtDNA of M. perniciosa. As in M. perniciosa, M. roreri mtDNA contains a high number of hypothetical ORFs (28), a remarkable feature that makes Moniliophthoras the largest reservoir of hypothetical ORFs among sequenced fungal mtDNA. Additionally, the mt genome of M. roreri has three free invertron-like linear mt plasmids, one of which is very similar to that previously described as integrated into the main M. perniciosa mtDNA molecule. Moniliophthora roreri mtDNA also has a region of suspected plasmid origin containing 15 hypothetical ORFs distributed in both strands. One of these ORFs is similar to an ORF in the mtDNA gene encoding DNA polymerase in Pleurotus ostreatus. The comparison to M. perniciosa showed that the 15 Kb difference in mtDNA sizes is mainly attributed to a lower abundance of repetitive regions in M. roreri (5.8 Kb vs 20.7 Kb). The most notable differences between M. roreri and M. perniciosa mtDNA are attributed to repeats and regions of plasmid origin. These elements might have contributed to the rapid evolution of mtDNA. Since M. roreri is the second species of the genus Moniliophthora whose mtDNA genome has been sequenced, the data presented here contribute valuable information for understanding the evolution of fungal mt genomes among closely-related species. Crown Copyright © 2012. Published by Elsevier Ltd. All rights reserved.

  8. Characterization of human glucocorticoid receptor complexes formed with DNA fragments containing or lacking glucocorticoid response elements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tully, D.B.; Cidlowski, J.A.

    1989-03-07

    Sucrose density gradient shift assays were used to study the interactions of human glucocorticoid receptors (GR) with small DNA fragments either containing or lacking glucocorticoid response element (GRE) DNA consensus sequences. When crude cytoplasmic extracts containing [3H]triamcinolone acetonide ([3H]TA) labeled GR were incubated with unlabeled DNA under conditions of DNA excess, a GRE-containing DNA fragment obtained from the 5' long terminal repeat of mouse mammary tumor virus (MMTV LTR) formed a stable 12-16S complex with activated, but not nonactivated, [3H]TA receptor. By contrast, if the cytosols were treated with calf thymus DNA-cellulose to deplete non-GR-DNA-binding proteins prior to heat activation, a smaller 7-10S complex was formed with the MMTV LTR DNA fragment. Activated [3H]TA receptor from DNA-cellulose pretreated cytosols also interacted with two similarly sized fragments from pBR322 DNA. Stability of the complexes formed between GR and these three DNA fragments was strongly affected by even moderate alterations in either the salt concentration or the pH of the gradient buffer. Under all conditions tested, the complex formed with the MMTV LTR DNA fragment was more stable than the complexes formed with either of the pBR322 DNA fragments. Together these observations indicate that the formation of stable complexes between activated GR and isolated DNA fragments requires the presence of GRE consensus sequences in the DNA.

  9. Gene expression analysis upon lncRNA DDSR1 knockdown in human fibroblasts

    PubMed Central

    Jia, Li; Sun, Zhonghe; Wu, Xiaolin; Misteli, Tom; Sharma, Vivek

    2015-01-01

    Long non-coding RNAs (lncRNAs) play important roles in regulating diverse biological processes including DNA damage and repair. We have recently reported that the DNA damage inducible lncRNA DNA damage-sensitive RNA1 (DDSR1) regulates DNA repair by homologous recombination (HR). Since lncRNAs also modulate gene expression, we identified gene expression changes upon DDSR1 knockdown in human fibroblast cells. Gene expression analysis after RNAi treatment targeted against DDSR1 revealed 119 genes that show differential expression. Here we provide a detailed description of the microarray data (NCBI GEO accession number GSE67048) and the data analysis procedure associated with the publication by Sharma et al., 2015 in EMBO Reports [1]. PMID:26697398

  10. A deep learning method for lincRNA detection using auto-encoder algorithm.

    PubMed

    Yu, Ning; Yu, Zeng; Pan, Yi

    2017-12-06

    RNA sequencing technique (RNA-seq) enables scientists to develop novel data-driven methods for discovering more unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by the new deep learning methods. By scanning the newly found data set from RNA-seq, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, the relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two pieces of evidence give the reasoning for adopting knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in the biological process as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with support vector machine and traditional neural network. In addition, it is validated through the newly discovered lincRNA data set and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for the lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data. Driven by those newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application to adopt deep learning techniques for identifying lincRNA transcription sequences.

  11. Extracellular self-DNA as a damage-associated molecular pattern (DAMP) that triggers self-specific immunity induction in plants.

    PubMed

    Duran-Flores, Dalia; Heil, Martin

    2017-10-16

    Mammals sense self or non-self extracellular or extranuclear DNA fragments (hereinafter collectively termed eDNA) as indicators of injury or infection and respond with immunity. We hypothesised that eDNA acts as a damage-associated molecular pattern (DAMP) also in plants and that it contributes to self versus non-self discrimination. Treating plants and suspension-cultured cells of common bean (Phaseolus vulgaris) with fragmented self eDNA (obtained from other plants of the same species) induced early, immunity-related signalling responses such as H2O2 generation and MAPK activation, decreased the infection by a bacterial pathogen (Pseudomonas syringae) and increased an indirect defence to herbivores (extrafloral nectar secretion). By contrast, non-self DNA (obtained from lima bean, Phaseolus lunatus, and Acacia farnesiana) had significantly lower or no detectable effects. Only fragments below a size of 700 bp were active, and treating the eDNA preparation with DNAse abolished its inducing effects, whereas treatment with RNAse or proteinase had no detectable effect. These findings indicate that DNA fragments, rather than small RNAs, single nucleotides or proteins, accounted for the observed effects. We suggest that eDNA functions as a DAMP in plants and that plants discriminate self from non-self at a species-specific level. The immune systems of plants and mammals share multiple central elements, but further work will be required to understand the mechanisms and the selective benefits of an immunity response that is triggered by eDNA in a species-specific manner. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Origin and evolution of the long non-coding genes in the X-inactivation center.

    PubMed

    Romito, Antonio; Rougeulle, Claire

    2011-11-01

    Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst it provided in eutherian lineage the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The rising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.

  13. The changing epitome of species identification – DNA barcoding

    PubMed Central

    Ajmal Ali, M.; Gyulai, Gábor; Hidvégi, Norbert; Kerti, Balázs; Al Hemaid, Fahad M.A.; Pandey, Arun K.; Lee, Joongku

    2014-01-01

    The discipline of taxonomy (the science of naming and classifying organisms, the original bioinformatics and a basis for all biology) is fundamentally important in ensuring the quality of life of future human generations on the earth; yet over the past few decades, teaching and research funding in taxonomy have declined because of its classical way of practice, which often reduced the discipline to a matter of opinion. This ultimately gave rise to several problems and challenges, and the taxonomist became an endangered race in the era of genomics. Taxonomy has now suddenly become fashionable again owing to a revolutionary approach called DNA barcoding (a novel technology to provide rapid, accurate, and automated species identifications using short orthologous DNA sequences). In DNA barcoding, a complete data set can be obtained from a single specimen irrespective of morphological or life stage characters. The core idea of DNA barcoding is based on the fact that highly conserved stretches of DNA, either coding or non-coding regions, vary only to a very minor degree within a species during evolution. Sequences suggested to be useful in DNA barcoding include cytoplasmic mitochondrial DNA (e.g. cox1) and chloroplast DNA (e.g. rbcL, trnL-F, matK, ndhF, and atpB rbcL), and nuclear DNA (ITS, and housekeeping genes e.g. gapdh). Plant DNA barcoding is now changing the epitome of species identification, and thus ultimately helping in the molecularization of taxonomy, a need of the hour. The ‘DNA barcodes’ show promise in providing a practical, standardized, species-level identification tool that can be used for biodiversity assessment, life history and ecological studies, forensic analysis, and many more. PMID:24955007

  14. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

    PubMed

    2004-12-09

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

  15. Introduction to the Natural Anticipator and the Artificial Anticipator

    NASA Astrophysics Data System (ADS)

    Dubois, Daniel M.

    2010-11-01

    This short communication deals with the introduction of the concept of anticipator, which is one who anticipates, in the framework of computing anticipatory systems. The definition of anticipation deals with the concept of program. Indeed, the word program comes from "pro-gram", meaning "to write before" by anticipation, and means a plan for the programming of a mechanism, or a sequence of coded instructions that can be inserted into a mechanism, or a sequence of coded instructions, as genes or behavioural responses, that is part of an organism. Any natural or artificial programs are thus related to anticipatory rewriting systems, as shown in this paper. All the cells in the body, and the neurons in the brain, are programmed by the anticipatory genetic code, DNA, in a low-level language with four signs. The programs in computers are also computing anticipatory systems. It will be shown, on the one hand, that the genetic code DNA is a natural anticipator. As demonstrated by Nobel laureate McClintock [8], genomes are programmed. The fundamental program deals with the DNA genetic code. The properties of the DNA consist in self-replication and self-modification. The self-replicating process leads to reproduction of the species, while the self-modifying process leads to new species or evolution and adaptation in existing ones. The genetic code DNA keeps its instructions in memory in the DNA coding molecule. The genetic code DNA is a rewriting system, from DNA coding to DNA template molecule. The DNA template molecule is a rewriting system to the Messenger RNA molecule. The information is not destroyed during the execution of the rewriting program. On the other hand, it will be demonstrated that the Turing machine is an artificial anticipator. The Turing machine is a rewriting system. The head reads and writes, modifying the content of the tape. The information is destroyed during the execution of the program. This is an irreversible process. The input data are lost.
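
    The rewriting chain described in this abstract (DNA coding strand to DNA template strand to messenger RNA) can be sketched as two string rewrites. This is a minimal illustration under standard base-pairing rules, not code from the paper; the function names are hypothetical, and the template strand is written position by position against the coding strand (that is, read 3' to 5').

```python
# Illustrative sketch; function names are hypothetical, not from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def coding_to_template(coding: str) -> str:
    """Rewrite the coding strand into its base-paired template strand."""
    return coding.upper().translate(COMPLEMENT)


def template_to_mrna(template: str) -> str:
    """Rewrite the template strand into mRNA: complement, with U for T."""
    return template.upper().translate(COMPLEMENT).replace("T", "U")


coding = "ATGGCC"
template = coding_to_template(coding)  # "TACCGG"
mrna = template_to_mrna(template)      # "AUGGCC"
print(coding, template, mrna)
```

    Each step returns a new string and leaves its input untouched, which mirrors the abstract's point that the information is not destroyed during the execution of the rewriting program.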

  16. HyDEn: A Hybrid Steganocryptographic Approach for Data Encryption Using Randomized Error-Correcting DNA Codes

    PubMed Central

    Regoui, Chaouki; Durand, Guillaume; Belliveau, Luc; Léger, Serge

    2013-01-01

    This paper presents a novel hybrid DNA encryption (HyDEn) approach that uses randomized assignments of unique error-correcting DNA Hamming code words for single characters in the extended ASCII set. HyDEn relies on custom-built quaternary codes and a private key used in the randomized assignment of code words and the cyclic permutations applied on the encoded message. Along with its ability to detect and correct errors, HyDEn equals or outperforms existing cryptographic methods and represents a promising in silico DNA steganographic approach. PMID:23984392
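
    To make the idea of quaternary code words for extended-ASCII characters concrete, here is a deliberately simplified sketch. It is not the HyDEn scheme: it uses no private key, no randomized assignment and no Hamming code. It assumes a plain base-4 mapping (four bases per character) plus one checksum base, which can detect, but not correct, a single substituted base in a code word. All names are hypothetical.

```python
# Illustrative sketch only: NOT the HyDEn scheme. A plain base-4 mapping with
# one checksum base per character; it detects (but cannot correct) a single
# substituted base within a code word.
BASES = "ACGT"
WORD_LEN = 5  # 4 data bases + 1 checksum base


def encode_char(ch: str) -> str:
    value = ord(ch)
    if value > 255:
        raise ValueError("only extended-ASCII characters (0-255) are supported")
    # Split the byte into four base-4 digits, most significant first.
    digits = [(value >> shift) & 3 for shift in (6, 4, 2, 0)]
    check = sum(digits) % 4
    return "".join(BASES[d] for d in digits + [check])


def decode_word(word: str) -> str:
    if len(word) != WORD_LEN:
        raise ValueError(f"code word must have {WORD_LEN} bases: {word!r}")
    digits = [BASES.index(b) for b in word]  # raises ValueError on a bad base
    *data, check = digits
    if sum(data) % 4 != check:
        raise ValueError(f"corrupted code word: {word!r}")
    value = 0
    for d in data:
        value = (value << 2) | d
    return chr(value)


def encode(text: str) -> str:
    return "".join(encode_char(c) for c in text)


def decode(dna: str) -> str:
    return "".join(decode_word(dna[i:i + WORD_LEN])
                   for i in range(0, len(dna), WORD_LEN))


message = "DNA"
dna = encode(message)
print(dna, decode(dna))
```

    An error-correcting design such as the one the abstract describes would need code words with a larger minimum distance than this checksum provides.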

  17. Sequence of the non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase from Nicotiana plumbaginifolia and phylogenetic origin of the gene family.

    PubMed

    Habenicht, A; Quesada, A; Cerff, R

    1997-10-01

    A cDNA library has been constructed from Nicotiana plumbaginifolia seedlings, and the non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase (GapN, EC 1.2.1.9) was isolated by plaque hybridization using the cDNA from pea as a heterologous probe. The cDNA comprises the entire GapN coding region. A putative polyadenylation signal is identified. Phylogenetic analysis based on the deduced amino acid sequences revealed that the GapN gene family represents a separate ancient branch within the aldehyde dehydrogenase superfamily. It can be shown that the GapN gene family and other distinct branches of the superfamily have their phylogenetic origin before the separation of primary life-forms. This further demonstrates that already very early in evolution, a broad diversification of the aldehyde dehydrogenases led to the formation of the superfamily.

  18. Concerted copy number variation balances ribosomal DNA dosage in human and mouse genomes

    PubMed Central

    Gibbons, John G.; Branco, Alan T.; Godinho, Susana A.; Yu, Shoukai; Lemos, Bernardo

    2015-01-01

    Tandemly repeated ribosomal DNA (rDNA) arrays are among the most evolutionarily dynamic loci of eukaryotic genomes. The loci code for essential cellular components, yet exhibit extensive copy number (CN) variation within and between species. CN might be partly determined by the requirement of dosage balance between the 5S and 45S rDNA arrays. The arrays are nonhomologous, physically unlinked in mammals, and encode functionally interdependent RNA components of the ribosome. Here we show that the 5S and 45S rDNA arrays exhibit concerted CN variation (cCNV). Despite 5S and 45S rDNA elements residing on different chromosomes and lacking sequence similarity, cCNV between these loci is strong, evolutionarily conserved in humans and mice, and manifested across individual genotypes in natural populations and pedigrees. Finally, we observe that bisphenol A induces rapid and parallel modulation of 5S and 45S rDNA CN. Our observations reveal a novel mode of genome variation, indicate that natural selection contributed to the evolution and conservation of cCNV, and support the hypothesis that 5S CN is partly determined by the requirement of dosage balance with the 45S rDNA array. We suggest that human disease variation might be traced to disrupted rDNA dosage balance in the genome. PMID:25583482

  19. Evolutional dynamics of 45S and 5S ribosomal DNA in ancient allohexaploid Atropa belladonna.

    PubMed

    Volkov, Roman A; Panchuk, Irina I; Borisjuk, Nikolai V; Hosiawa-Baranska, Marta; Maluszynska, Jolanta; Hemleben, Vera

    2017-01-23

    Polyploid hybrids represent a rich natural resource to study molecular evolution of plant genes and genomes. Here, we applied a combination of karyological and molecular methods to investigate chromosomal structure, molecular organization and evolution of ribosomal DNA (rDNA) in nightshade, Atropa belladonna (fam. Solanaceae), one of the oldest known allohexaploids among flowering plants. Because of their abundance and specific molecular organization (evolutionarily conserved coding regions linked to variable intergenic spacers, IGS), 45S and 5S rDNA are widely used in plant taxonomic and evolutionary studies. Molecular cloning and nucleotide sequencing of A. belladonna 45S rDNA repeats revealed a general structure characteristic of other Solanaceae species, and a very high sequence similarity of two length variants, with the only difference in the number of short IGS subrepeats. These results, combined with the detection of three pairs of 45S rDNA loci on separate chromosomes, presumably inherited from both tetraploid and diploid ancestor species, exemplify intensive sequence homogenization that led to substitution/elimination of rDNA repeats of one parent. Chromosome silver-staining revealed that only four out of six 45S rDNA sites are frequently transcriptionally active, demonstrating nucleolar dominance. For 5S rDNA, three size variants of repeats were detected, with the major class represented by repeats containing all functional IGS elements required for transcription, the intermediate size repeats containing partially deleted IGS sequences, and the short 5S repeats containing severe defects both in the IGS and coding sequences. While shorter variants demonstrate an increased rate of base substitution, probably in their transition into pseudogenes, the functional 5S rDNA variants are nearly identical at the sequence level, pointing to their origin from a single parental species. Localization of the 5S rDNA genes on two chromosome pairs further supports uniparental inheritance from the tetraploid progenitor. The obtained molecular, cytogenetic and phylogenetic data demonstrate complex evolutionary dynamics of rDNA loci in the allohexaploid species Atropa belladonna. The high level of sequence unification revealed in 45S and 5S rDNA loci of this ancient hybrid species has seemingly been achieved by different molecular mechanisms.

  20. Origin, evolution, and biogeography of Juglans: a phylogenetic perspective

    USDA-ARS's Scientific Manuscript database

    The eastern Asian and eastern North American disjunction in Juglans offers an opportunity to estimate the time since divergence of the Eurasian and American lineages and to compare it with paleobotanical evidence. Five chloroplast DNA non-coding spacer (NCS) sequences: trnT-trnF, psbA-trnH, atpB-r...

  1. Detection of human microRNAs across miRNA Array and Next Generation DNA Sequencing Platforms

    EPA Science Inventory

    MicroRNAs (miRNAs) are non-coding RNA molecules between 19 and 30 nucleotides in length that are believed to regulate approximately 30 per cent of all human genes. They act as negative regulators of their gene targets in many biological processes. Recent developments in microar...

  2. Decoding the non-coding RNAs in Alzheimer's disease.

    PubMed

    Schonrock, Nicole; Götz, Jürgen

    2012-11-01

    Non-coding RNAs (ncRNAs) are integral components of biological networks with fundamental roles in regulating gene expression. They can integrate sequence information from the DNA code, epigenetic regulation and functions of multimeric protein complexes to potentially determine the epigenetic status and transcriptional network in any given cell. Humans potentially contain more ncRNAs than any other species, especially in the brain, where they may well play a significant role in human development and cognitive ability. This review discusses their emerging role in Alzheimer's disease (AD), a human pathological condition characterized by the progressive impairment of cognitive functions. We discuss the complexity of the ncRNA world and how this is reflected in the regulation of the amyloid precursor protein and Tau, two proteins with central functions in AD. By understanding this intricate regulatory network, there is hope for a better understanding of disease mechanisms and ultimately developing diagnostic and therapeutic tools.

  3. Comparative Genomics of Oral Isolates of Streptococcus mutans by in silico Genome Subtraction Does Not Reveal Accessory DNA Associated with Severe Early Childhood Caries

    PubMed Central

    Argimón, Silvia; Konganti, Kranti; Chen, Hao; Alekseyenko, Alexander V.; Brown, Stuart; Caufield, Page W.

    2014-01-01

    Comparative genomics is a popular method for the identification of microbial virulence determinants, especially since the sequencing of a large number of whole bacterial genomes from pathogenic and non-pathogenic strains has become relatively inexpensive. The bioinformatics pipelines for comparative genomics usually include gene prediction and annotation and can require significant computer power. To circumvent this, we developed a rapid method for genome-scale in silico subtractive hybridization, based on blastn and independent of feature identification and annotation. Whole genome comparisons by in silico genome subtraction were performed to identify genetic loci specific to Streptococcus mutans strains associated with severe early childhood caries (S-ECC), compared to strains isolated from caries-free (CF) children. The genome similarity of the 20 S. mutans strains included in this study, calculated by Simrank k-mer sharing, ranged from 79.5 to 90.9%, confirming this is a genetically heterogeneous group of strains. We identified strain-specific genetic elements in 19 strains, with sizes ranging from 200 bp to 39 kb. These elements contained protein-coding regions with functions mostly associated with mobile DNA. We did not, however, identify any genetic loci consistently associated with dental caries, i.e., shared by all the S-ECC strains and absent in the CF strains. Conversely, we did not identify any genetic loci specific to the healthy group. Comparison of previously published genomes from pathogenic and carriage strains of Neisseria meningitidis with our in silico genome subtraction yielded the same set of genes specific to the pathogenic strains, thus validating our method. Our results suggest that S. mutans strains derived from caries-active or caries-free dentitions cannot be differentiated based on the presence or absence of specific genetic elements. Our in silico genome subtraction method is available as the Microbial Genome Comparison (MGC) tool, with a user-friendly Java graphical interface. PMID:24291226
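
    The k-mer sharing idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the Simrank scoring scheme and not the blastn-based MGC method; it is a generic set-based calculation, and the function names, the default k and the toy sequences are assumptions made only for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_sharing(query, reference, k=7):
    """Percentage of distinct k-mers in query that also occur in reference."""
    q, r = kmers(query, k), kmers(reference, k)
    if not q:
        return 0.0
    return 100.0 * len(q & r) / len(q)


def query_specific_kmers(query, reference, k=7):
    """Crude 'subtraction': k-mers present in query but absent from reference."""
    return kmers(query, k) - kmers(reference, k)


# Toy sequences (hypothetical, not taken from any S. mutans genome).
strain_a = "ACGTACGTTTGACCA"
strain_b = "ACGTACGTTTGACGG"
print(round(kmer_sharing(strain_a, strain_b, k=4), 1))
print(sorted(query_specific_kmers(strain_a, strain_b, k=4)))
```

    A real comparison would also have to handle reverse complements, ambiguous bases and genome-scale memory use, none of which this sketch attempts.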

  4. Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

    PubMed

    de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

    2014-06-01

    The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Biallelic insertion of a transcriptional terminator via the CRISPR/Cas9 system efficiently silences expression of protein-coding and non-coding RNA genes.

    PubMed

    Liu, Yangyang; Han, Xiao; Yuan, Junting; Geng, Tuoyu; Chen, Shihao; Hu, Xuming; Cui, Isabelle H; Cui, Hengmi

    2017-04-07

    The type II bacterial CRISPR/Cas9 system is a simple, convenient, and powerful tool for targeted gene editing. Here, we describe a CRISPR/Cas9-based approach for inserting a poly(A) transcriptional terminator into both alleles of a targeted gene to silence protein-coding and non-protein-coding genes, which often play key roles in gene regulation but are difficult to silence via insertion or deletion of short DNA fragments. The integration of 225 bp of bovine growth hormone poly(A) signals into either the first intron or the first exon or behind the promoter of target genes caused efficient termination of expression of PPP1R12C, NSUN2 (protein-coding genes), and MALAT1 (non-protein-coding gene). Both NeoR and PuroR were used as markers in the selection of clonal cell lines with biallelic integration of a poly(A) signal. Genotyping analysis indicated that the cell lines displayed the desired biallelic silencing after a brief selection period. These combined results indicate that this CRISPR/Cas9-based approach offers an easy, convenient, and efficient novel technique for gene silencing in cell lines, especially for those in which gene integration is difficult because of a low efficiency of homology-directed repair. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  6. Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term.

    PubMed

    Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard

    2014-09-01

    To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.

  7. The bornavirus-derived human protein EBLN1 promotes efficient cell cycle transit, microtubule organisation and genome stability

    PubMed Central

    Myers, Katie N.; Barone, Giancarlo; Ganesh, Anil; Staples, Christopher J.; Howard, Anna E.; Beveridge, Ryan D.; Maslen, Sarah; Skehel, J. Mark; Collis, Spencer J.

    2016-01-01

    It was recently discovered that vertebrate genomes contain multiple endogenised nucleotide sequences derived from the non-retroviral RNA bornavirus. Strikingly, some of these elements have been evolutionarily maintained as open reading frames in host genomes for over 40 million years, suggesting that some endogenised bornavirus-derived elements (EBL) might encode functional proteins. EBLN1 is one such element established through endogenisation of the bornavirus N gene (BDV N). Here, we functionally characterise human EBLN1 as a novel regulator of genome stability. Cells depleted of human EBLN1 accumulate DNA damage both under non-stressed conditions and following exogenously induced DNA damage. EBLN1-depleted cells also exhibit cell cycle abnormalities and defects in microtubule organisation as well as premature centrosome splitting, which we attribute, in part, to improper localisation of the nuclear envelope protein TPR. Our data therefore reveal that human EBLN1 possesses important cellular functions within human cells, and suggest that other EBLs present within vertebrate genomes may also possess important cellular functions. PMID:27739501

  8. Decoding DNA labels by melting curve analysis using real-time PCR.

    PubMed

    Balog, József A; Fehér, Liliána Z; Puskás, László G

    2017-12-01

    Synthetic DNA has been used as an authentication code for a diverse number of applications. However, existing decoding approaches are based on either DNA sequencing or the determination of DNA length variations. Here, we present a simple alternative protocol for labeling different objects using a small number of short DNA sequences that differ in their melting points. Code amplification and decoding can be done in two steps using quantitative PCR (qPCR). To obtain a DNA barcode with high complexity, we defined 8 template groups, each having 4 different DNA templates, yielding 15^8 (>2.5 billion) combinations of different individual melting temperature (Tm) values and corresponding ID codes. The reproducibility and specificity of the decoding was confirmed by using the most complex template mixture, which had 32 different products in 8 groups with different Tm values. The industrial applicability of our protocol was also demonstrated by labeling a drone with an oil-based paint containing a predefined DNA code, which was then successfully decoded. The method presented here consists of a simple code system based on a small number of synthetic DNA sequences and a cost-effective, rapid decoding protocol using a few qPCR reactions, enabling a wide range of authentication applications.
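
    The size of the code space quoted in this abstract can be checked with a short calculation. The sketch below assumes that each of the 8 groups contributes any non-empty subset of its 4 templates (2^4 - 1 = 15 choices per group); that reading is an interpretation that matches the reported figure and the 32-product "most complex" mixture, not a detail stated in the abstract. All names are illustrative.

```python
from itertools import combinations

TEMPLATES_PER_GROUP = 4   # from the abstract
GROUPS = 8                # from the abstract


def nonempty_subsets(n):
    """Count the non-empty subsets of n items by explicit enumeration."""
    return sum(1 for r in range(1, n + 1) for _ in combinations(range(n), r))


per_group = nonempty_subsets(TEMPLATES_PER_GROUP)   # assumed: 2**4 - 1 choices
total_codes = per_group ** GROUPS                   # independent choice per group
max_products = TEMPLATES_PER_GROUP * GROUPS         # every template present

print(per_group, total_codes, max_products)
```

    Under that assumption the total is 2,562,890,625, which agrees with the ">2.5 billion" in the text.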

  9. Genetics Home Reference: isolated Pierre Robin sequence

    MedlinePlus

    ... PG, Fitzpatrick DR, Lyonnet S. Highly conserved non-coding elements on either side of SOX9 associated with Pierre ... Citation on PubMed or Free article on PubMed Central Jakobsen LP, Ullmann R, Christensen SB, Jensen KE, ...

  10. Non-DSB clustered DNA lesions. Does theory colocalize with the experiment?

    NASA Astrophysics Data System (ADS)

    Nikitaki, Zacharenia; Nikolov, Vladimir; Mavragani, Ifigeneia V.; Plante, Ianik; Emfietzoglou, Dimitris; Iliakis, George; Georgakilas, Alexandros G.

    2016-11-01

    Ionizing radiation results in various kinds of DNA lesions such as double strand breaks (DSBs) and other non-DSB base lesions. These lesions may be formed in close proximity (i.e., within a few nanometers) resulting in clustered types of DNA lesions. These damage clusters are considered the fingerprint of ionizing radiation, notably charged particles of high linear energy transfer (LET). Accumulating theoretical and experimental evidence suggests that the induction of these clustered lesions appears under various irradiation conditions but also as a result of high levels of oxidative stress. The biological significance of these clustered DNA lesions pertains to the inability of cells to process them efficiently compared to isolated DNA lesions. The results in the case of unsuccessful or erroneous repair can vary from mutations up to chromosomal instability. In this mini review, we discuss several Monte Carlo simulation codes and experimental evidence regarding the induction and repair of radiation-induced non-DSB complex DNA lesions. We also critically present the most widely used methodologies (i.e., gel electrophoresis and fluorescence microscopy [in situ colocalization assays]). Based on the comparison of different approaches, we provide examples and suggestions for the improved detection of these lesions in situ. Based on the current status of knowledge, we conclude that there is a great need for improvement of the detection techniques at the cellular or tissue level, which will provide valuable information for understanding the mechanisms used by the cell to process clustered DNA lesions.

  11. Converting Panax ginseng DNA and chemical fingerprints into two-dimensional barcode.

    PubMed

    Cai, Yong; Li, Peng; Li, Xi-Wen; Zhao, Jing; Chen, Hai; Yang, Qing; Hu, Hao

    2017-07-01

    In this study, we investigated how to convert the Panax ginseng DNA sequence code and chemical fingerprints into a two-dimensional code. In order to improve the compression efficiency, GATC2Bytes and digital merger compression algorithms are proposed. HPLC chemical fingerprint data of 10 groups of P. ginseng from Northeast China and the internal transcribed spacer 2 (ITS2) sequence code as the DNA sequence code were ready for conversion. In order to convert such data into a two-dimensional code, the following six steps were performed: First, the chemical fingerprint characteristic data sets were obtained through the inflection filtering algorithm. Second, precompression processing of such data sets was undertaken. Third, precompression processing was undertaken with the P. ginseng DNA (ITS2) sequence codes. Fourth, the precompressed chemical fingerprint data and the DNA (ITS2) sequence code were combined in accordance with the set data format. Fifth, the combined data were compressed with Zlib, an open-source data compression library. Finally, the compressed data generated a two-dimensional code called a quick response code (QR code). Through the abovementioned converting process, it can be found that the number of bytes needed for storing P. ginseng chemical fingerprints and its DNA (ITS2) sequence code can be greatly reduced. After GATC2Bytes algorithm processing, the ITS2 compression rate reaches 75% and the chemical fingerprint compression rate exceeds 99.65% via filtration and digital merger compression algorithm processing. Therefore, the overall compression ratio even exceeds 99.36%. The capacity of the formed QR code is around 0.5k, which can easily and successfully be read and identified by any smartphone. P. ginseng chemical fingerprints and its DNA (ITS2) sequence code can form a QR code after data processing, and therefore the QR code can be a perfect carrier of the authenticity and quality of P. ginseng information. This study provides a theoretical basis for the development of a quality traceability system of traditional Chinese medicine based on a two-dimensional code.
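
    A 75% reduction is what one would expect from storing each base in 2 bits instead of one 8-bit character. The sketch below shows that generic 2-bit packing followed by zlib compression. It is an illustration under that assumption, not the paper's GATC2Bytes algorithm, whose details the abstract does not give; the bit assignment, function names and sample sequence are hypothetical.

```python
import zlib

# Hypothetical 2-bit code; the paper's actual mapping is not given in the abstract.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def pack_bases(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        byte <<= 2 * (4 - len(chunk))   # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack_bases(data, length):
    """Inverse of pack_bases; length is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 3])
    return "".join(bases[:length])


seq = "ACGTTGCAACGTTGCA"                 # made-up 16-base example
packed = pack_bases(seq)                  # 16 characters -> 4 bytes
payload = zlib.compress(packed)           # further general-purpose compression
restored = unpack_bases(zlib.decompress(payload), len(seq))
print(len(seq), len(packed), restored == seq)
```

    For very short inputs such as this one, zlib's header overhead can make the payload larger than the packed bytes; the benefit appears only on longer, more redundant data. The original sequence length has to be stored alongside the payload so that padding in the final byte can be discarded.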

  12. Identification of a Retroelement from the Resurrection Plant Boea hygrometrica That Confers Osmotic and Alkaline Tolerance in Arabidopsis thaliana

    PubMed Central

    Shen, Chun-Ying; Xu, Guang-Hui; Chen, Shi-Xuan; Song, Li-Zhen; Li, Mei-Jing; Wang, Li-Li; Zhu, Yan; Lv, Wei-Tao; Gong, Zhi-Zhong; Liu, Chun-Ming; Deng, Xin

    2014-01-01

    Functional genomic elements, including transposable elements, small RNAs and non-coding RNAs, are involved in regulation of gene expression in response to plant stress. To identify genomic elements that regulate dehydration and alkaline tolerance in Boea hygrometrica, a resurrection plant that inhabits drought and alkaline Karst areas, a genomic DNA library from B. hygrometrica was constructed and subsequently transformed into Arabidopsis using binary bacterial artificial chromosome (BIBAC) vectors. Transgenic lines were screened under osmotic and alkaline conditions, leading to the identification of Clone L1-4 that conferred osmotic and alkaline tolerance. Sequence analyses revealed that L1-4 contained a 49-kb retroelement fragment from B. hygrometrica, of which only a truncated sequence was present in L1-4 transgenic Arabidopsis plants. Additional subcloning revealed that activity resided in a 2-kb sequence, designated Osmotic and Alkaline Resistance 1 (OAR1). In addition, transgenic Arabidopsis lines carrying an OAR1-homologue also showed similar stress tolerance phenotypes. Physiological and molecular analyses demonstrated that OAR1-transgenic plants exhibited improved photochemical efficiency and membrane integrity and biomarker gene expression under both osmotic and alkaline stresses. Short transcripts that originated from OAR1 were increased under stress conditions in both B. hygrometrica and Arabidopsis carrying OAR1. The relative copy number of OAR1 was stable in transgenic Arabidopsis under stress but increased in B. hygrometrica. Taken together, our results indicated a potential role of OAR1 element in plant tolerance to osmotic and alkaline stresses, and verified the feasibility of the BIBAC transformation technique to identify functional genomic elements from physiological model species. PMID:24851859

  13. Comparison of Ultra-Conserved Elements in Drosophilids and Vertebrates

    PubMed Central

    Makunin, Igor V.; Shloma, Viktor V.; Stephen, Stuart J.; Pheasant, Michael; Belyakin, Stepan N.

    2013-01-01

    Metazoan genomes contain many ultra-conserved elements (UCEs), long sequences identical between distant species. In this study we identified UCEs in drosophilid and vertebrate species with a similar level of phylogenetic divergence measured at protein-coding regions, and demonstrated that both the length and number of UCEs are larger in vertebrates. The proportion of non-exonic UCEs declines in distant drosophilids whilst an opposite trend was observed in vertebrates. We generated a set of 2,126 Sophophora UCEs by merging elements identified in several drosophila species and compared these to the eutherian UCEs identified in placental mammals. In contrast to vertebrates, the Sophophora UCEs are depleted around transcription start sites. Analysis of 52,954 P-element, piggyBac and Minos insertions in the D. melanogaster genome revealed depletion of the P-element and piggyBac insertions in and around the Sophophora UCEs. We examined eleven fly strains with transposon insertions into the intergenic UCEs and identified associated phenotypes in five strains. Four insertions behave as recessive lethals, and in one case we observed a suppression of the marker gene within the transgene, presumably by silenced chromatin around the integration site. To confirm the lethality is caused by integration of transposons we performed a phenotype rescue experiment for two stocks and demonstrated that the excision of the transposons from the intergenic UCEs restores viability. Sequencing of DNA after the transposon excision in one fly strain with the restored viability revealed a 47 bp insertion at the original transposon integration site suggesting that the nature of the mutation is important for the appearance of the phenotype. Our results suggest that the UCEs in flies and vertebrates have both common and distinct features, and demonstrate that a significant proportion of intergenic drosophila UCEs are sensitive to disruption. PMID:24349264

  14. The contribution of alu elements to mutagenic DNA double-strand break repair.

    PubMed

    Morales, Maria E; White, Travis B; Streva, Vincent A; DeFreece, Cecily B; Hedges, Dale J; Deininger, Prescott L

    2015-03-01

    Alu elements make up the largest family of human mobile elements, numbering 1.1 million copies and comprising 11% of the human genome. As a consequence of evolution and genetic drift, Alu elements of various sequence divergence exist throughout the human genome. Alu/Alu recombination has been shown to cause approximately 0.5% of new human genetic diseases and contribute to extensive genomic structural variation. To begin understanding the molecular mechanisms leading to these rearrangements in mammalian cells, we constructed Alu/Alu recombination reporter cell lines containing Alu elements ranging in sequence divergence from 0%-30% that allow detection of both Alu/Alu recombination and large non-homologous end joining (NHEJ) deletions that range from 1.0 to 1.9 kb in size. Introduction of as little as 0.7% sequence divergence between Alu elements resulted in a significant reduction in recombination, which indicates even small degrees of sequence divergence reduce the efficiency of homology-directed DNA double-strand break (DSB) repair. Further reduction in recombination was observed in a sequence divergence-dependent manner for diverged Alu/Alu recombination constructs with up to 10% sequence divergence. With greater levels of sequence divergence (15%-30%), we observed a significant increase in DSB repair due to a shift from Alu/Alu recombination to variable-length NHEJ which removes sequence between the two Alu elements. This increase in NHEJ deletions depends on the presence of Alu sequence homeology (similar but not identical sequences). Analysis of recombination products revealed that Alu/Alu recombination junctions occur more frequently in the first 100 bp of the Alu element within our reporter assay, just as they do in genomic Alu/Alu recombination events. 
This is the first extensive study characterizing the influence of Alu element sequence divergence on DNA repair, which will inform predictions regarding the effect of Alu element sequence divergence on both the rate and nature of DNA repair events.

  15. Maximizing mutagenesis with solubilized CRISPR-Cas9 ribonucleoprotein complexes.

    PubMed

    Burger, Alexa; Lindsay, Helen; Felker, Anastasia; Hess, Christopher; Anders, Carolin; Chiavacci, Elena; Zaugg, Jonas; Weber, Lukas M; Catena, Raul; Jinek, Martin; Robinson, Mark D; Mosimann, Christian

    2016-06-01

    CRISPR-Cas9 enables efficient sequence-specific mutagenesis for creating somatic or germline mutants of model organisms. Key constraints in vivo remain the expression and delivery of active Cas9-sgRNA ribonucleoprotein complexes (RNPs) with minimal toxicity, variable mutagenesis efficiencies depending on targeting sequence, and high mutation mosaicism. Here, we apply in vitro assembled, fluorescent Cas9-sgRNA RNPs in solubilizing salt solution to achieve maximal mutagenesis efficiency in zebrafish embryos. MiSeq-based sequence analysis of targeted loci in individual embryos using CrispRVariants, a customized software tool for mutagenesis quantification and visualization, reveals efficient bi-allelic mutagenesis that reaches saturation at several tested gene loci. Such virtually complete mutagenesis exposes loss-of-function phenotypes for candidate genes in somatic mutant embryos for subsequent generation of stable germline mutants. We further show that targeting of non-coding elements in gene regulatory regions using saturating mutagenesis uncovers functional control elements in transgenic reporters and endogenous genes in injected embryos. Our results establish that optimally solubilized, in vitro assembled fluorescent Cas9-sgRNA RNPs provide a reproducible reagent for direct and scalable loss-of-function studies and applications beyond zebrafish experiments that require maximal DNA cutting efficiency in vivo. © 2016. Published by The Company of Biologists Ltd.

  16. The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase.

    PubMed Central

    Hitzeman, R A; Hagie, F E; Hayflick, J S; Chen, C Y; Seeburg, P H; Derynck, R

    1982-01-01

    The DNA sequence of the gene for the yeast glycolytic enzyme, 3-phosphoglycerate kinase (PGK), has been obtained by sequencing part of a 3.1 kbp HindIII fragment obtained from the yeast genome. The structural gene sequence corresponds to a reading frame of 1251 bp coding for 416 amino acids with no intervening DNA sequences. The amino acid sequence is approximately 65 percent homologous with human and horse PGK protein sequences and is in general agreement with the published protein sequence for yeast PGK. As for other highly expressed structural genes in yeast, the coding sequence is highly codon biased with 95 percent of the amino acids coded for by a select 25 codons (out of 61 possible). Besides structural DNA sequence, 291 bp of 5'-flanking sequence and 286 bp of 3'-flanking sequence were determined. Transcription starts 36 nucleotides upstream from the translational start and stops 86-93 nucleotides downstream from the translational stop. These results suggest a non-polyadenylated mRNA length of 1373 to 1380 nucleotides, which is consistent with the observed length of 1500 nucleotides for polyadenylated PGK mRNA. A sequence TATATATAAA is found at 145 nucleotides upstream from the translational start. This sequence resembles the TATAAA box that is possibly associated with RNA polymerase II binding. PMID:6296791
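
    The lengths reported in this abstract are mutually consistent, as the following arithmetic sketch shows. It assumes that the 1251 bp reading frame includes the stop codon, which is the reading under which 1251 bp yields 416 amino acids; variable names are illustrative.

```python
CODING_BP = 1251          # reading frame length from the abstract
UPSTREAM_NT = 36          # transcription start relative to translational start
DOWNSTREAM_NT = (86, 93)  # transcription stop range past the translational stop
POLY_A_MRNA_NT = 1500     # observed polyadenylated mRNA length

codons = CODING_BP // 3                 # 417 codons in the frame
amino_acids = codons - 1                # assumed: last codon is the stop codon

# Non-polyadenylated transcript = 5' leader + coding region + 3' trailer.
mrna_range = tuple(UPSTREAM_NT + CODING_BP + d for d in DOWNSTREAM_NT)

# Implied poly(A) tail length range, longest transcript first.
tail_range = tuple(POLY_A_MRNA_NT - n for n in reversed(mrna_range))

print(amino_acids, mrna_range, tail_range)
```

    The result reproduces the 416 amino acids and the 1373 to 1380 nucleotide transcript quoted above, and implies a poly(A) tail of roughly 120 to 127 nucleotides if the 1500-nucleotide figure is taken at face value.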

  17. Nothing in Evolution Makes Sense Except in the Light of Genomics: Read-Write Genome Evolution as an Active Biological Process.

    PubMed

    Shapiro, James A

    2016-06-08

    The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess "Read-Write Genomes" they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification.

  18. Nothing in Evolution Makes Sense Except in the Light of Genomics: Read–Write Genome Evolution as an Active Biological Process

    PubMed Central

    Shapiro, James A.

    2016-01-01

    The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess “Read–Write Genomes” they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification. PMID:27338490

  19. Specific DNA binding of the two chicken Deformed family homeodomain proteins, Chox-1.4 and Chox-a.

    PubMed Central

    Sasaki, H; Yokoyama, E; Kuroiwa, A

    1990-01-01

    The cDNA clones encoding two chicken Deformed (Dfd) family homeobox containing genes Chox-1.4 and Chox-a were isolated. Comparison of their amino acid sequences with another chicken Dfd family homeodomain protein and with those of mouse homologues revealed that strong homologies are located in the amino terminal regions and around the homeodomains. Although homologies in other regions were relatively low, some short conserved sequences were also identified. E. coli-made full length proteins were purified and used for the production of specific antibodies and for DNA binding studies. The binding profiles of these proteins to the 5'-leader and 5'-upstream sequences of Chox-1.4 and Chox-a coding regions were analyzed by immunoprecipitation and DNase I footprint assays. These two Chox proteins bound to the same sites in the 5'-flanking sequences of their coding regions with various affinities and their binding affinities to each site were nearly the same. The consensus sequences of the high and low affinity binding sites were TAATGA(C/G) and CTAATTTT, respectively. A clustered binding site was identified in the 5'-upstream of the Chox-a gene, suggesting that this clustered binding site works as a cis-regulatory element for auto- and/or cross-regulation of Chox-a gene expression. PMID:1970866
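
    As a rough illustration of how the two consensus sequences reported above could be located in a sequence, the following Python sketch scans a single strand with regular expressions. The function name find_chox_sites and the demonstration sequence are hypothetical, and the sketch makes simplifying assumptions: it ignores the reverse strand and says nothing about binding affinity beyond the high/low labels given in the abstract.

```python
import re

# Consensus sequences taken from the abstract; (C/G) becomes a character class.
HIGH_AFFINITY = re.compile(r"(?=(TAATGA[CG]))")
LOW_AFFINITY = re.compile(r"(?=(CTAATTTT))")


def find_chox_sites(sequence):
    """Return sorted (position, class, matched site) tuples for one strand.

    Positions are 0-based. Lookahead patterns are used so that overlapping
    matches are not skipped.
    """
    seq = sequence.upper()
    hits = []
    for label, pattern in (("high", HIGH_AFFINITY), ("low", LOW_AFFINITY)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), label, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the Chox-1.4 or Chox-a loci.
    demo = "GGTAATGACCCCTAATTTTGGTAATGAGTT"
    for position, label, site in find_chox_sites(demo):
        print(position, label, site)
```

    Scanning the reverse complement as well would be a natural extension, since footprinted sites can lie on either strand.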

  20. Hypochondria as withdrawal and comedy as cure in Dr. Willibald's Der Hypochondrist (1824).

    PubMed

    Potter, Edward T

    2012-01-01

    Balthasar von Ammann's comedy Der Hypochondrist, published in 1824 under the pseudonym Dr. Willibald, foregrounds the social, sexual, and political implications of hypochondria. The play engages with early nineteenth-century medical and popular conceptions of hypochondria to co-opt potentially subversive elements and to promote a specific social, sexual, and political agenda. The text promotes literature — specifically comedic drama — as a cure for hypochondria. Hypochondria functions as a code for withdrawal. The hypochondriac withdraws medically from healthy society, gaining exceptional status. He withdraws sexually from society by remaining a bachelor, possibly engaged in non-normative sexual behaviour. Furthermore, the politically disenfranchised protagonist voices his political frustrations via a coded medical metaphor. The hypochondriac poses a threefold challenge to the social, sexual, and political order, and the play engages with contemporary conceptions of the disease to provide the solution: comedy. The text, presented as a cure for hypochondria, replaces the coded questioning of the social order via hypochondria with the less threatening code of heraldry. A comedy-within-the-comedy uses the hypochondriac's love of heraldry to cure him, resulting in the elimination of his medical problems and exceptional status, in the purification of his bachelorhood from non-normative elements, and in the pre-emption of political frustrations.

  1. Enrichment of colorectal cancer associations in functional regions: Insight for using epigenomics data in the analysis of whole genome sequence-imputed GWAS data.

    PubMed

    Bien, Stephanie A; Auer, Paul L; Harrison, Tabitha A; Qu, Conghui; Connolly, Charles M; Greenside, Peyton G; Chen, Sai; Berndt, Sonja I; Bézieau, Stéphane; Kang, Hyun M; Huyghe, Jeroen; Brenner, Hermann; Casey, Graham; Chan, Andrew T; Hopper, John L; Banbury, Barbara L; Chang-Claude, Jenny; Chanock, Stephen J; Haile, Robert W; Hoffmeister, Michael; Fuchsberger, Christian; Jenkins, Mark A; Leal, Suzanne M; Lemire, Mathieu; Newcomb, Polly A; Gallinger, Steven; Potter, John D; Schoen, Robert E; Slattery, Martha L; Smith, Joshua D; Le Marchand, Loic; White, Emily; Zanke, Brent W; Abeçasis, Goncalo R; Carlson, Christopher S; Peters, Ulrike; Nickerson, Deborah A; Kundaje, Anshul; Hsu, Li

    2017-01-01

    The evaluation of less frequent genetic variants and their effect on complex disease poses new challenges for genomic research. To investigate whether epigenetic data can be used to inform aggregate rare-variant association methods (RVAM), we assessed whether variants more significantly associated with colorectal cancer (CRC) were preferentially located in non-coding regulatory regions, and whether enrichment was specific to colorectal tissues. Active regulatory elements (ARE) were mapped using data from 127 tissues and cell-types from NIH Roadmap Epigenomics and Encyclopedia of DNA Elements (ENCODE) projects. We investigated whether CRC association p-values were more significant for common variants 1) inside versus outside AREs, or 2) inside colorectal (CR) AREs versus AREs of other tissues and cell-types. We employed an integrative epigenomic RVAM for variants with allele frequency <1%. Gene sets were defined as ARE variants within 200 kilobases of a transcription start site (TSS) using either CR ARE or ARE from non-digestive tissues. CRC-set association p-values were used to evaluate enrichment of less frequent variant associations in CR ARE versus non-digestive ARE. ARE from 126/127 tissues and cell-types were significantly enriched for stronger CRC-variant associations. Strongest enrichment was observed for digestive tissues and immune cell types. CR-specific ARE were also enriched for stronger CRC-variant associations compared to ARE combined across non-digestive tissues (p-value = 9.6 × 10⁻⁴). Additionally, we found enrichment of stronger CRC association p-values for rare variant sets of CR ARE compared to non-digestive ARE (p-value = 0.029). Integrative epigenomic RVAM may enable discovery of less frequent variants associated with CRC, and ARE of digestive and immune tissues are most informative. Although distance-based aggregation of less frequent variants in CR ARE surrounding TSS showed modest enrichment, future association studies would likely benefit from joint analysis of transcriptomes and epigenomes to better link regulatory variation with target genes.

  2. The phylogenetic position of the roughskin skate Dipturus trachyderma (Krefft & Stehmann, 1975) (Rajiformes, Rajidae) inferred from the mitochondrial genome.

    PubMed

    Vargas-Caro, Carolina; Bustamante, Carlos; Lamilla, Julio; Bennett, Michael B; Ovenden, Jennifer R

    2016-07-01

    The complete mitochondrial genome of the roughskin skate Dipturus trachyderma is described from 1 455 724 sequences obtained using Illumina NGS technology. Total length of the mitogenome was 16 909 base pairs, comprising 2 rRNAs, 13 protein-coding genes, 22 tRNAs and 2 non-coding regions. Phylogenetic analysis based on mtDNA revealed low genetic divergence among longnose skates, in particular, those dwelling the continental shelf and slope off the coasts of Chile and Argentina.

  3. Single-tube, non-isotopic, multiplex PCR/OLA assay and sequence-coded separation for simultaneous screening of 31 cystic fibrosis mutations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brinson, E.C.; Adriano, T.; Bloch, W.

    1994-09-01

    We have developed a rapid, single-tube, non-isotopic assay that screens a patient sample for the presence of 31 cystic fibrosis (CF) mutations. This assay can identify these mutations in a single reaction tube and a single electrophoresis run. Sample preparation is a simple, boil-and-go procedure, completed in less than an hour. The assay is composed of a 15-plex PCR, followed by a 61-plex oligonucleotide ligation assay (OLA), and incorporates a novel detection scheme, Sequence Coded Separation. Initially, the multiplex PCR amplifies 15 relevant segments of the CFTR gene, simultaneously. These PCR amplicons serve as templates for the multiplex OLA, which detects the normal or mutant allele at all loci, simultaneously. Each polymorphic site is interrogated by three oligonucleotide probes, a common probe and two allele-specific probes. Each common probe is tagged with a fluorescent dye, and the competing normal and mutant allelic probes incorporate different, non-nucleotide, mobility modifiers. These modifiers are composed of hexaethylene oxide (HEO) units, incorporated as HEO phosphoramidite monomers during automated DNA synthesis. The OLA is based on both probe hybridization and the ability of DNA ligase to discriminate single base mismatches at the junction between paired probes. Each single tube assay is electrophoresed in a single gel lane of a 4-color fluorescent DNA sequencer (Applied Biosystems, Model 373A). Each of the ligation products is identified by its unique combination of electrophoretic mobility and one of three colors. The fourth color is reserved for the in-lane size standard, used by GENESCAN™ software (Applied Biosystems) to size the OLA electrophoresis products. The Genotyper™ software (Applied Biosystems) decodes these Sequence-Coded-Separation data to create a patient summary report for all loci tested.

  4. Comet assay in reconstructed 3D human epidermal skin models--investigation of intra- and inter-laboratory reproducibility with coded chemicals.

    PubMed

    Reus, Astrid A; Reisinger, Kerstin; Downs, Thomas R; Carr, Gregory J; Zeller, Andreas; Corvi, Raffaella; Krul, Cyrille A M; Pfuhler, Stefan

    2013-11-01

    Reconstructed 3D human epidermal skin models are being used increasingly for safety testing of chemicals. Based on EpiDerm™ tissues, an assay was developed in which the tissues were topically exposed to test chemicals for 3h followed by cell isolation and assessment of DNA damage using the comet assay. Inter-laboratory reproducibility of the 3D skin comet assay was initially demonstrated using two model genotoxic carcinogens, methyl methane sulfonate (MMS) and 4-nitroquinoline-n-oxide, and the results showed good concordance among three different laboratories and with in vivo data. In Phase 2 of the project, intra- and inter-laboratory reproducibility was investigated with five coded compounds with different genotoxicity liability tested at three different laboratories. For the genotoxic carcinogens MMS and N-ethyl-N-nitrosourea, all laboratories reported a dose-related and statistically significant increase (P < 0.05) in DNA damage in every experiment. For the genotoxic carcinogen, 2,4-diaminotoluene, the overall result from all laboratories showed a smaller, but significant genotoxic response (P < 0.05). For cyclohexanone (CHN) (non-genotoxic in vitro and in vivo, and non-carcinogenic), an increase compared to the solvent control acetone was observed only in one laboratory. However, the response was not dose related and CHN was judged negative overall, as was p-nitrophenol (p-NP) (genotoxic in vitro but not in vivo and non-carcinogenic), which was the only compound showing clear cytotoxic effects. For p-NP, significant DNA damage generally occurred only at doses that were substantially cytotoxic (>30% cell loss), and the overall response was comparable in all laboratories despite some differences in doses tested. The results of the collaborative study for the coded compounds were generally reproducible among the laboratories involved and intra-laboratory reproducibility was also good. 
These data indicate that the comet assay in EpiDerm™ skin models is a promising model for the safety assessment of compounds with a dermal route of exposure.

  5. Comet assay in reconstructed 3D human epidermal skin models—investigation of intra- and inter-laboratory reproducibility with coded chemicals

    PubMed Central

    Pfuhler, Stefan

    2013-01-01

    Reconstructed 3D human epidermal skin models are being used increasingly for safety testing of chemicals. Based on EpiDerm™ tissues, an assay was developed in which the tissues were topically exposed to test chemicals for 3h followed by cell isolation and assessment of DNA damage using the comet assay. Inter-laboratory reproducibility of the 3D skin comet assay was initially demonstrated using two model genotoxic carcinogens, methyl methane sulfonate (MMS) and 4-nitroquinoline-n-oxide, and the results showed good concordance among three different laboratories and with in vivo data. In Phase 2 of the project, intra- and inter-laboratory reproducibility was investigated with five coded compounds with different genotoxicity liability tested at three different laboratories. For the genotoxic carcinogens MMS and N-ethyl-N-nitrosourea, all laboratories reported a dose-related and statistically significant increase (P < 0.05) in DNA damage in every experiment. For the genotoxic carcinogen, 2,4-diaminotoluene, the overall result from all laboratories showed a smaller, but significant genotoxic response (P < 0.05). For cyclohexanone (CHN) (non-genotoxic in vitro and in vivo, and non-carcinogenic), an increase compared to the solvent control acetone was observed only in one laboratory. However, the response was not dose related and CHN was judged negative overall, as was p-nitrophenol (p-NP) (genotoxic in vitro but not in vivo and non-carcinogenic), which was the only compound showing clear cytotoxic effects. For p-NP, significant DNA damage generally occurred only at doses that were substantially cytotoxic (>30% cell loss), and the overall response was comparable in all laboratories despite some differences in doses tested. The results of the collaborative study for the coded compounds were generally reproducible among the laboratories involved and intra-laboratory reproducibility was also good. 
These data indicate that the comet assay in EpiDerm™ skin models is a promising model for the safety assessment of compounds with a dermal route of exposure. PMID:24150594

  6. A Locus Encoding Variable Defense Systems against Invading DNA Identified in Streptococcus suis

    PubMed Central

    Okura, Masatoshi; Nozawa, Takashi; Watanabe, Takayasu; Murase, Kazunori; Nakagawa, Ichiro; Takamatsu, Daisuke; Osaki, Makoto; Sekizaki, Tsutomu; Gottschalk, Marcelo; Hamada, Shigeyuki

    2017-01-01

    Streptococcus suis, an important zoonotic pathogen, is known to have an open pan-genome and to develop a competent state. In S. suis, limited genetic lineages are suggested to be associated with zoonosis. However, little is known about the evolution of diversified lineages and their respective phenotypic or ecological characteristics. In this study, we performed comparative genome analyses of S. suis, with a focus on the competence genes, mobile genetic elements, and genetic elements related to various defense systems against exogenous DNAs (defense elements) that are associated with gene gain/loss/exchange mediated by horizontal DNA movements and their restrictions. Our genome analyses revealed a conserved competence-inducing peptide type (pherotype) of the competence system and large-scale genome rearrangements in certain clusters based on the genome phylogeny of 58 S. suis strains. Moreover, the profiles of the defense elements were similar or identical to each other among the strains belonging to the same genomic clusters. Our findings suggest that these genetic characteristics of each cluster might exert specific effects on the phenotypic or ecological differences between the clusters. We also found certain loci that shift several types of defense elements in S. suis. Of note, one of these loci is a previously unrecognized variable region in bacteria, at which strains of distinct clusters code for different and various defense elements. This locus might represent a novel defense mechanism that has evolved through an arms race between bacteria and invading DNAs, mediated by mobile genetic elements and genetic competence. PMID:28379509

  7. Definition of RNA polymerase II CoTC terminator elements in the human genome.

    PubMed

    Nojima, Takayuki; Dienstbier, Martin; Murphy, Shona; Proudfoot, Nicholas J; Dye, Michael J

    2013-04-25

    Mammalian RNA polymerase II (Pol II) transcription termination is an essential step in protein-coding gene expression that is mediated by pre-mRNA processing activities and DNA-encoded terminator elements. Although much is known about the role of pre-mRNA processing in termination, our understanding of the characteristics and generality of terminator elements is limited. Whereas promoter databases list up to 40,000 known and potential Pol II promoter sequences, fewer than ten Pol II terminator sequences have been described. Using our knowledge of the human β-globin terminator mechanism, we have developed a selection strategy for mapping mammalian Pol II terminator elements. We report the identification of 78 cotranscriptional cleavage (CoTC)-type terminator elements at endogenous gene loci. The results of this analysis pave the way for the full understanding of Pol II termination pathways and their roles in gene expression. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  8. A single nucleotide polymorphism associated with isolated cleft lip and palate, thyroid cancer and hypothyroidism alters the activity of an oral epithelium and thyroid enhancer near FOXE1

    PubMed Central

    Lidral, Andrew C.; Liu, Huan; Bullard, Steven A.; Bonde, Greg; Machida, Junichiro; Visel, Axel; Uribe, Lina M. Moreno; Li, Xiao; Amendt, Brad; Cornell, Robert A.

    2015-01-01

    Three common diseases, isolated cleft lip and cleft palate (CLP), hypothyroidism and thyroid cancer all map to the FOXE1 locus, but causative variants have yet to be identified. In patients with CLP, the frequency of coding mutations in FOXE1 fails to account for the risk attributable to this locus, suggesting that the common risk alleles reside in nearby regulatory elements. Using a combination of zebrafish and mouse transgenesis, we screened 15 conserved non-coding sequences for enhancer activity, identifying three that regulate expression in a tissue specific pattern consistent with endogenous foxe1 expression. These three, located −82.4, −67.7 and +22.6 kb from the FOXE1 start codon, are all active in the oral epithelium or branchial arches. The −67.7 and +22.6 kb elements are also active in the developing heart, and the −67.7 kb element uniquely directs expression in the developing thyroid. Within the −67.7 kb element is the SNP rs7850258 that is associated with all three diseases. Quantitative reporter assays in oral epithelial and thyroid cell lines show that the rs7850258 allele (G) associated with CLP and hypothyroidism has significantly greater enhancer activity than the allele associated with thyroid cancer (A). Moreover, consistent with predicted transcription factor binding differences, the −67.7 kb element containing rs7850258 allele G is significantly more responsive to both MYC and ARNT than allele A. By demonstrating that this common non-coding variant alters FOXE1 expression, we have identified at least in part the functional basis for the genetic risk of these seemingly disparate disorders. PMID:25652407

  9. Molecular Regulatory Pathways Link Sepsis With Metabolic Syndrome: Non-coding RNA Elements Underlying the Sepsis/Metabolic Cross-Talk.

    PubMed

    Meydan, Chanan; Bekenstein, Uriya; Soreq, Hermona

    2018-01-01

    Sepsis and metabolic syndrome (MetS) are both inflammation-related entities with high impact for human health and the consequences of concussions. Both represent imbalanced parasympathetic/cholinergic response to insulting triggers and variably uncontrolled inflammation that indicates shared upstream regulators, including short microRNAs (miRs) and long non-coding RNAs (lncRNAs). These may cross talk across multiple systems, leading to complex molecular and clinical outcomes. Notably, biomedical and RNA-sequencing based analyses both highlight new links between the acquired and inherited pathogenic, cardiac and inflammatory traits of sepsis/MetS. Those include the HOTAIR and MIAT lncRNAs and their targets, such as miR-122, -150, -155, -182, -197, -375, -608 and HLA-DRA. Implicating non-coding RNA regulators in sepsis and MetS may delineate novel high-value biomarkers and targets for intervention.

  10. DNA labelling of varieties covered by patent protection: a new solution for managing intellectual property rights in the seed industry.

    PubMed

    Fister, Karin; Fister, Iztok; Murovec, Jana; Bohanec, Borut

    2017-02-01

    Plant breeders' rights are undergoing dramatic changes due to changes in patent rights in terms of plant variety rights protection. Although differences in the interpretation of "breeder's exemption", termed research exemption in the 1991 UPOV, did exist in the past in some countries, allowing breeders to use protected varieties as parents in the creation of new varieties of plants, current developments brought about by patenting conventionally bred varieties with the European Patent Office (such as EP2140023B1) have opened new challenges. Legal restrictions on germplasm availability are therefore imposed on breeders while, at the same time, no practical information on how to distinguish protected from non-protected varieties is given. We propose here a novel approach that would solve this problem by the insertion of short DNA stretches (labels) into protected plant varieties by genetic transformation. This information will then be available to breeders by a simple and standardized procedure. We propose that such a procedure should consist of using a pair of universal primers that will generate a sequence in a PCR reaction, which can be read and translated into ordinary text by a computer application. To demonstrate the feasibility of such an approach, we conducted a case study. Using the Agrobacterium tumefaciens transformation protocol, we inserted a stretch of DNA code into Nicotiana benthamiana. We also developed an on-line application that enables coding of any text message into DNA nucleotide code and, on sequencing, decoding it back into text. In the presented case study, a short command line coding the phrase "Hello world" was transformed into a DNA sequence that was inserted in the plant genome. The encoded message was reconstructed from the resulting T1 seedlings with 100% accuracy. The feasibility and possible other applications of this approach are discussed.
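
    The idea of coding a text message into nucleotides and decoding it after sequencing can be sketched in a few lines of Python. The abstract does not describe the authors' coding scheme, so the mapping below (two bits per base, A=00, C=01, G=10, T=11, applied to UTF-8 bytes) is purely an assumption for illustration, and the function names text_to_dna and dna_to_text are hypothetical.

```python
# Hypothetical 2-bit mapping; the abstract does not state the authors' scheme.
BASES = "ACGT"


def text_to_dna(text):
    """Encode text as DNA: each UTF-8 byte becomes four bases (2 bits each)."""
    dna = []
    for byte in text.encode("utf-8"):
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            dna.append(BASES[(byte >> shift) & 0b11])
    return "".join(dna)


def dna_to_text(dna):
    """Decode a sequence produced by text_to_dna back into text."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


if __name__ == "__main__":
    encoded = text_to_dna("Hello world")
    print(encoded)
    print(dna_to_text(encoded))
```

    A practical label would probably need further constraints that this sketch ignores, such as avoiding long homopolymer runs or sequences with unintended biological function.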

  11. Analysis of correlation structures in the Synechocystis PCC6803 genome.

    PubMed

    Wu, Zuo-Bing

    2014-12-01

    Transfer of nucleotide strings in the Synechocystis sp. PCC6803 genome is investigated to exhibit periodic and non-periodic correlation structures by using the recurrence plot method and the phase space reconstruction technique. The periodic correlation structures are generated by periodic transfer of several substrings in long periodic or non-periodic nucleotide strings embedded in the coding regions of genes. The non-periodic correlation structures are generated by non-periodic transfer of several substrings covering or overlapping with the coding regions of genes. In the periodic and non-periodic transfer, some gaps divide the long nucleotide strings into the substrings and prevent their global transfer. Most of the gaps are either the replacement of one base or the insertion/reduction of one base. In the reconstructed phase space, the points generated from two or three steps for the continuous iterative transfer via the second maximal distance can be fitted by two lines. It partly reveals an intrinsic dynamics in the transfer of nucleotide strings. Due to the comparison of the relative positions and lengths, the substrings concerned with the non-periodic correlation structures are almost identical to the mobile elements annotated in the genome. The mobile elements are thus endowed with the basic results on the correlation structures. Copyright © 2014 Elsevier Ltd. All rights reserved.

  12. Carbon-14 decay as a source of non-canonical bases in DNA.

    PubMed

    Sassi, Michel; Carter, Damien J; Uberuaga, Blas P; Stanek, Chris R; Marks, Nigel A

    2014-01-01

    Significant experimental effort has been applied to study radioactive beta-decay in biological systems. Atomic-scale knowledge of this transmutation process is lacking due to the absence of computer simulations. Carbon-14 is an important beta-emitter, being ubiquitous in the environment and an intrinsic part of the genetic code. Over a lifetime, around 50 billion ¹⁴C decays occur within human DNA. We apply ab initio molecular dynamics to quantify ¹⁴C-induced bond rupture in a variety of organic molecules, including DNA base pairs. We show that double bonds and ring structures confer radiation resistance. These features, present in the canonical bases of the DNA, enhance their resistance to ¹⁴C-induced bond-breaking. In contrast, the sugar group of the DNA and RNA backbone is vulnerable to single-strand breaking. We also show that Carbon-14 decay provides a mechanism for creating mutagenic wobble-type mispairs. The observation that DNA has a resistance to natural radioactivity has not previously been recognized. We show that ¹⁴C decay can be a source for generating non-canonical bases. Our findings raise questions such as how the genetic apparatus deals with the appearance of an extra nitrogen in the canonical bases. It is not obvious whether or not the DNA repair mechanism detects this modification nor how DNA replication is affected by a non-canonical nucleobase. Accordingly, ¹⁴C may prove to be a source of genetic alteration that is impossible to avoid due to the universal presence of radiocarbon in the environment. © 2013.

  13. What Information is Stored in DNA: Does it Contain Digital Error Correcting Codes?

    NASA Astrophysics Data System (ADS)

    Liebovitch, Larry

    1998-03-01

    The longest term correlations in living systems are the information stored in DNA which reflects the evolutionary history of an organism. The 4 bases (A,T,G,C) encode sequences of amino acids as well as locations of binding sites for proteins that regulate DNA. The fidelity of this important information is maintained by ANALOG error check mechanisms. When a single strand of DNA is replicated the complementary base is inserted in the new strand. Sometimes the wrong base is inserted that sticks out disrupting the phosphate backbone. The new base is not yet methylated, so repair enzymes, that slide along the DNA, can tear out the wrong base and replace it with the right one. The bases in DNA form a sequence of 4 different symbols and so the information is encoded in a DIGITAL form. All the digital codes in our society (ISBN book numbers, UPC product codes, bank account numbers, airline ticket numbers) use error checking code, where some digits are functions of other digits to maintain the fidelity of transmitted information. Does DNA also utilize a DIGITAL error checking code to maintain the fidelity of its information and increase the accuracy of replication? That is, are some bases in DNA functions of other bases upstream or downstream? This raises the interesting mathematical problem: How does one determine whether some symbols in a sequence of symbols are a function of other symbols? It also bears on the issue of determining algorithmic complexity: What is the function that generates the shortest algorithm for reproducing the symbol sequence? The error checking codes most used in our technology are linear block codes. We developed an efficient method to test for the presence of such codes in DNA. We coded the 4 bases as (0,1,2,3) and used Gaussian elimination, modified for modulus 4, to test if some bases are linear combinations of other bases. We used this method to analyze the base sequence in the genes from the lac operon and cytochrome C. We did not find evidence for such error correcting codes in these genes. However, we analyzed only a small amount of DNA and if digital error correcting schemes are present in DNA, they may be more subtle than such simple linear block codes. The basic issue we raise here is how information is stored in DNA and an appreciation that digital symbol sequences, such as DNA, admit of interesting schemes to store and protect the fidelity of their information content. Liebovitch, Tao, Todorov, Levine. 1996. Biophys. J. 71:1539-1544. Supported by NIH grant EY6234.
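
    The question of whether some bases are linear combinations of others modulo 4 can be illustrated with a small Python sketch. It does not reproduce the modified Gaussian elimination mentioned in the abstract. Instead it uses a plain brute-force search over all coefficient vectors, which is only practical for short block lengths (4^n candidates for blocks of length n). The base-to-integer labelling, the non-overlapping block framing, and all function names are assumptions made for illustration.

```python
from itertools import product

BASE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}  # arbitrary labelling (assumption)


def to_blocks(sequence, block_len):
    """Split a DNA string into non-overlapping integer blocks; drop the remainder."""
    digits = [BASE_TO_INT[b] for b in sequence.upper()]
    usable = len(digits) - len(digits) % block_len
    return [tuple(digits[i:i + block_len]) for i in range(0, usable, block_len)]


def parity_relations(blocks, block_len):
    """Find coefficient vectors c with sum(c[i] * x[i]) % 4 == 0 for every block.

    Only vectors containing a coefficient of 1 or 3 are kept: those are
    invertible mod 4, so the matching position can be written as a linear
    combination of the other positions.
    """
    found = []
    for coeffs in product(range(4), repeat=block_len):
        if not any(c in (1, 3) for c in coeffs):
            continue
        if all(sum(c * x for c, x in zip(coeffs, block)) % 4 == 0
               for block in blocks):
            found.append(coeffs)
    return found


def add_check_symbol(block):
    """Append a check digit so that the digits of the block sum to 0 mod 4."""
    return block + ((-sum(block)) % 4,)


if __name__ == "__main__":
    # Synthetic codewords, not real genes: three data digits plus one check digit.
    data = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 3, 1), (3, 3, 0)]
    coded = [add_check_symbol(b) for b in data]
    print(parity_relations(coded, 4))
```

    An empty result for a real sequence would only rule out this one block length and framing, which is in line with the abstract's caution that any such codes may be more subtle than simple linear block codes.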

  14. Self-entanglement of long linear DNA vectors using transient non-B-DNA attachment points: a new concept for improvement of non-viral therapeutic gene delivery.

    PubMed

    Tolmachov, Oleg E

    2012-05-01

    The cell-specific and long-term expression of therapeutic transgenes often requires a full array of native gene control elements including distal enhancers, regulatory introns and chromatin organisation sequences. The delivery of such extended gene expression modules to human cells can be accomplished with non-viral high-molecular-weight DNA vectors, in particular with several classes of linear DNA vectors. All high-molecular-weight DNA vectors are susceptible to damage by shear stress, and while for some of the vectors the harmful impact of shear stress can be minimised through the transformation of the vectors to compact topological configurations by supercoiling and/or knotting, linear DNA vectors with terminal loops or covalently attached terminal proteins cannot be self-compacted in this way. In this case, the only available self-compacting option is self-entangling, which can be defined as the folding of single DNA molecules into a configuration with mutual restriction of molecular motion by the individual segments of bent DNA. A negatively charged phosphate backbone makes DNA self-repulsive, so it is reasonable to assume that a certain number of 'sticky points' dispersed within DNA could facilitate the entangling by bringing DNA segments into proximity and by interfering with the DNA slipping away from the entanglement. I propose that the spontaneous entanglement of vector DNA can be enhanced by the interlacing of the DNA with sites capable of mutual transient attachment through the formation of non-B-DNA forms, such as interacting cruciform structures, inter-segment triplexes, slipped-strand DNA, left-handed duplexes (Z-forms) or G-quadruplexes. It is expected that the non-B-DNA based entanglement of the linear DNA vectors would consist of the initial transient and co-operative non-B-DNA mediated binding events followed by tight self-ensnarement of the vector DNA. 
Once in the nucleoplasm of the target human cells, the DNA can be disentangled by type II topoisomerases. The technology for such self-entanglement can be an avenue for the improvement of gene delivery with high-molecular-weight naked DNA using therapeutically important methods associated with considerable shear stress. Priority applications include in vivo muscle electroporation and sonoporation for Duchenne muscular dystrophy patients, aerosol inhalation to reach the target lung cells of cystic fibrosis patients and bio-ballistic delivery to skin melanomas with the vector DNA adsorbed on gold or tungsten projectiles. Copyright © 2012 Elsevier Ltd. All rights reserved.

  15. Alternative DNA structure formation in the mutagenic human c-MYC promoter

    PubMed Central

    del Mundo, Imee Marie A.; Zewail-Foote, Maha; Kerwin, Sean M.

    2017-01-01

    Mutation ‘hotspot’ regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K⁺ ions or H-DNA-stabilizing Mg²⁺ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K⁺-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. PMID:28334873

  16. Variations in the non-coding transcriptome as a driver of inter-strain divergence and physiological adaptation in bacteria

    PubMed Central

    Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R.; Voß, Björn

    2015-01-01

    In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5′UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5′UTR. Such an sRNA/mRNA structure, which we name ‘actuaton’, represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation. PMID:25902393

  17. Development of 3D electromagnetic modeling tools for airborne vehicles

    NASA Technical Reports Server (NTRS)

    Volakis, John L.

    1992-01-01

    The main goal of this project is to develop methodologies for scattering by airborne composite vehicles. Although our primary focus continues to be the development of a general purpose code for analyzing the entire structure as a single unit, a number of other tasks are also pursued in parallel with this effort. These tasks are important in testing the overall approach and in developing suitable models for materials coatings, junctions and, more generally, in assessing the effectiveness of the various parts comprising the final code. Here, we briefly discuss our progress on the five different tasks which were pursued during this period. Our progress on each of these tasks is described in the detailed reports (listed at the end of this report) and the memoranda included. The first task described below is, of course, the core of this project and deals with the development of the overall code. Undoubtedly, it is the outcome of the research which was funded by NASA-Ames and the Navy over the past three years. During this year we developed the first finite element code for scattering by structures of arbitrary shape and composition. The code employs a new absorbing boundary condition which allows termination of the finite element mesh only 0.3 lambda from the outer surface of the target. This leads to a remarkable reduction of the mesh size and is a unique feature of the code. Other unique features of this code include capabilities to model resistive sheets, impedance sheets and anisotropic materials. This last capability is the latest feature of the code and is still under development. The code has been extensively validated for a number of composite geometries and some examples are given. The validation of the code is still in progress for anisotropic and larger non-metallic geometries and cavities. 
The developed finite element code is based on a Galerkin's formulation and employs edge-based tetrahedral elements for discretizing the dielectric sections and the region between the target and the outer mesh termination boundary (ATB). This boundary is placed in conformity with the target's outer surface, thus resulting in additional reduction of the unknown count.

  18. De Novo Origin of Human Protein-Coding Genes

    PubMed Central

    Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping

    2011-01-01

    The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831

  19. RNA therapeutics: RNAi and antisense mechanisms and clinical applications.

    PubMed

    Chery, Jessica

    2016-07-01

    RNA therapeutics refers to the use of oligonucleotides to target primarily ribonucleic acids (RNA) for therapeutic efforts or in research studies to elucidate functions of genes. Oligonucleotides are distinct from other pharmacological modalities, such as small molecules and antibodies that target mainly proteins, due to their mechanisms of action and chemical properties. Nucleic acids come in two forms: deoxyribonucleic acids (DNA) and ribonucleic acids (RNA). Although DNA is more stable, RNA offers more structural variety ranging from messenger RNA (mRNA) that codes for protein to non-coding RNAs, microRNA (miRNA), transfer RNA (tRNA), short interfering RNAs (siRNAs), ribosomal RNA (rRNA), and long-noncoding RNAs (lncRNAs). As our understanding of the wide variety of RNAs deepens, researchers have sought to target RNA since >80% of the genome is estimated to be transcribed. These transcripts include non-coding RNAs such as miRNAs and siRNAs that function in gene regulation by playing key roles in the transfer of genetic information from DNA to protein, the final product of the central dogma in biology. Currently there are two main approaches used to target RNA: double stranded RNA-mediated interference (RNAi) and antisense oligonucleotides (ASO). Both approaches are currently in clinical trials for targeting of RNAs involved in various diseases, such as cancer and neurodegeneration. In fact, ASOs targeting spinal muscular atrophy and amyotrophic lateral sclerosis have shown positive results in clinical trials. Advantages of ASOs include chemical modifications that increase affinity and selectivity while decreasing toxicity due to off-target effects. This review will highlight the major therapeutic approaches of RNA medicine currently being applied with a focus on RNAi and ASOs.

  20. Parallel-vector computation for linear structural analysis and non-linear unconstrained optimization problems

    NASA Technical Reports Server (NTRS)

    Nguyen, D. T.; Al-Nasra, M.; Zhang, Y.; Baddourah, M. A.; Agarwal, T. K.; Storaasli, O. O.; Carmona, E. A.

    1991-01-01

    Several parallel-vector computational improvements to the unconstrained optimization procedure are described which speed up the structural analysis-synthesis process. A fast parallel-vector Choleski-based equation solver, pvsolve, is incorporated into the well-known SAP-4 general-purpose finite-element code. The new code, denoted PV-SAP, is tested for static structural analysis. Initial results on a four processor CRAY 2 show that using pvsolve reduces the equation solution time by a factor of 14-16 over the original SAP-4 code. In addition, parallel-vector procedures for the Golden Block Search technique and the BFGS method are developed and tested for nonlinear unconstrained optimization. A parallel version of an iterative solver and the pvsolve direct solver are incorporated into the BFGS method. Preliminary results on nonlinear unconstrained optimization test problems, using pvsolve in the analysis, show excellent parallel-vector performance indicating that these parallel-vector algorithms can be used in a new generation of finite-element based structural design/analysis-synthesis codes.
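To make the role of a Choleski-based equation solver concrete, here is a minimal serial sketch of solving a symmetric positive-definite system A x = b by factoring A = L Lᵀ and then doing forward and back substitution. It is only an illustration of the underlying technique: it is not pvsolve and contains none of the parallel-vector optimizations described in the abstract. The function names and the 2x2 example system are hypothetical.

```python
import math


def cholesky_factor(a):
    """Return lower-triangular L with A = L * L^T (A symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(d)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_spd(a, b):
    """Solve A x = b via Cholesky: L y = b (forward), then L^T x = y (backward)."""
    l = cholesky_factor(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


# Hypothetical small stiffness-like system, for illustration only.
stiffness = [[4.0, 2.0], [2.0, 3.0]]
load = [10.0, 8.0]
displacement = solve_spd(stiffness, load)
print(displacement)
```

In a finite-element setting the matrix would be large and sparse or banded, which is where the vectorized and parallel variants reported in the abstract matter; the dense loops above ignore that structure.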

  1. The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).

    PubMed

    Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai

    2014-12-01

    The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes consisting respectively of a partial copy of COI, and terminal sequences at two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mtDNA. The COI gene begins with GTG as its start codon, whereas the other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as the stop codon.

  2. Magnesium and Calcium in Isolated Cell Nuclei

    PubMed Central

    Naora, H.; Naora, H.; Mirsky, A. E.; Allfrey, V. G.

    1961-01-01

    The calcium and magnesium contents of thymus nuclei have been determined and the nuclear sites of attachment of these two elements have been studied. The nuclei used for these purposes were isolated in non-aqueous media and in sucrose solutions. Non-aqueous nuclei contain 0.024 per cent calcium and 0.115 per cent magnesium. Calcium and magnesium are held at different sites. The greater part of the magnesium is bound to DNA, probably to its phosphate groups. Evidence is presented that the magnesium atoms combined with the phosphate groups of DNA are also attached to mononucleotides. There is reason to believe that those DNA-phosphate groups to which magnesium is bound, less than 1/10th of the total, are metabolically active, while those to which histones are attached seem to be inactive. PMID:13727745

  3. A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

    PubMed

    Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

    1996-08-01

    DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
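The copy-number comparison in the abstract above (26 versus 28 copies of TTTAGGG) comes down to counting a 7-bp motif within a cloned sequence. A minimal sketch of that kind of count follows. It matches exact copies only, so mutated copies of the element, which the abstract says are common, are not counted. The function names and the toy sequence are hypothetical; the toy is not pSA35 or pSA52.

```python
import re


def count_exact_copies(seq, motif="TTTAGGG"):
    """Count non-overlapping exact copies of the motif anywhere in seq."""
    return seq.upper().count(motif)


def longest_tandem_run(seq, motif="TTTAGGG"):
    """Return the largest number of back-to-back exact copies of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical toy sequence: 3 tandem copies, a spacer base, 2 more copies,
# then one degenerate copy (TTAAGGG) that an exact match does not count.
toy_repeat = "ACGA" + "TTTAGGG" * 3 + "C" + "TTTAGGG" * 2 + "TTAAGGG"

total_copies = count_exact_copies(toy_repeat)
longest_run = longest_tandem_run(toy_repeat)
print(total_copies, longest_run)
```

Counting degenerate copies would need an approximate-matching criterion (for example, allowing one mismatch), which is a separate design choice not specified in the abstract.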

  4. Evaluation of fluorescence in situ hybridization techniques to study long non-coding RNA expression in cultured cells

    PubMed Central

    Soares, Ricardo J; Maglieri, Giulia; Gutschner, Tony; Lund, Anders H; Nielsen, Boye S

    2018-01-01

    Deciphering the functions of long non-coding RNAs (lncRNAs) is facilitated by visualization of their subcellular localization using in situ hybridization (ISH) techniques. We evaluated four different ISH methods for detection of MALAT1 and CYTOR in cultured cells: a multiple probe detection approach with or without enzymatic signal amplification, a branched-DNA (bDNA) probe and an LNA-modified probe with enzymatic signal amplification. All four methods adequately stained MALAT1 in the nucleus in all of three cell lines investigated, HeLa, NHDF and T47D, and three of the methods detected the less expressed CYTOR. The sensitivity of the four ISH methods was evaluated by image analysis. In all three cell lines, the two methods involving enzymatic amplification gave the most intense MALAT1 signal, but the signal-to-background ratios were not different. CYTOR was best detected using the bDNA method. All four ISH methods showed significantly reduced MALAT1 signal in knock-out cells, and siRNA-induced knock-down of CYTOR resulted in significantly reduced CYTOR ISH signal, indicating good specificity of the probe designs and detection systems. Our data suggest that the ISH methods allow detection of both abundant and less abundantly expressed lncRNAs, although the latter required the use of the most specific and sensitive probe detection system. PMID:29059327

  5. Colon Cancer-Upregulated Long Non-Coding RNA lincDUSP Regulates Cell Cycle Genes and Potentiates Resistance to Apoptosis.

    PubMed

    Forrest, Megan E; Saiakhova, Alina; Beard, Lydia; Buchner, David A; Scacheri, Peter C; LaFramboise, Thomas; Markowitz, Sanford; Khalil, Ahmad M

    2018-05-09

    Long non-coding RNAs (lncRNAs) are frequently dysregulated in many human cancers. We sought to identify candidate oncogenic lncRNAs in human colon tumors by utilizing RNA sequencing data from 22 colon tumors and 22 adjacent normal colon samples from The Cancer Genome Atlas (TCGA). The analysis led to the identification of ~200 differentially expressed lncRNAs. Validation in an independent cohort of normal colon and patient-derived colon cancer cell lines identified a novel lncRNA, lincDUSP, as a potential candidate oncogene. Knockdown of lincDUSP in patient-derived colon tumor cell lines resulted in significantly decreased cell proliferation and clonogenic potential, and increased susceptibility to apoptosis. The knockdown of lincDUSP affects the expression of ~800 genes, and NCI pathway analysis showed enrichment of DNA damage response and cell cycle control pathways. Further, identification of lincDUSP chromatin occupancy sites by ChIRP-Seq demonstrated association with genes involved in the replication-associated DNA damage response and cell cycle control. Consistent with these findings, lincDUSP knockdown in colon tumor cell lines increased both the accumulation of cells in early S-phase and γH2AX foci formation, indicating increased DNA damage response induction. Taken together, these results demonstrate a key role of lincDUSP in the regulation of important pathways in colon cancer.

  6. Analysis of LexA binding sites and transcriptomics in response to genotoxic stress in Leptospira interrogans.

    PubMed

    Schons-Fonseca, Luciane; da Silva, Josefa B; Milanez, Juliana S; Domingos, Renan H; Smith, Janet L; Nakaya, Helder I; Grossman, Alan D; Ho, Paulo L; da Costa, Renata M A

    2016-02-18

    We determined the effects of DNA damage caused by ultraviolet radiation on gene expression in Leptospira interrogans using DNA microarrays. These data were integrated with DNA binding in vivo of LexA1, a regulator of the DNA damage response, assessed by chromatin immunoprecipitation and massively parallel DNA sequencing (ChIP-seq). In response to DNA damage, Leptospira induced expression of genes involved in DNA metabolism, in mobile genetic elements and defective prophages. The DNA repair genes involved in removal of photo-damage (e.g. nucleotide excision repair uvrABC, recombinases recBCD and resolvases ruvABC) were not induced. Genes involved in various metabolic pathways were down regulated, including genes involved in cell growth, RNA metabolism and the tricarboxylic acid cycle. From ChIP-seq data, we observed 24 LexA1 binding sites located throughout chromosome 1 and one binding site in chromosome 2. Expression of many, but not all, genes near those sites was increased following DNA damage. Binding sites were found as far as 550 bp upstream from the start codon, or 1 kb into the coding sequence. Our findings indicate that there is a shift in gene expression following DNA damage that represses genes involved in cell growth and virulence, and induces genes involved in mutagenesis and recombination. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Conserved expression of transposon-derived non-coding transcripts in primate stem cells.

    PubMed

    Ramsay, LeeAnn; Marchetto, Maria C; Caron, Maxime; Chen, Shu-Huang; Busche, Stephan; Kwan, Tony; Pastinen, Tomi; Gage, Fred H; Bourque, Guillaume

    2017-02-28

    A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, it has been shown that various long non-coding RNAs (lncRNAs), which come from the human endogenous retrovirus subfamily H (HERVH), are not only expressed but required for pluripotency in human embryonic stem cells (hESCs). To identify additional TE-derived functional non-coding transcripts, we generated RNA-seq data from induced pluripotent stem cells (iPSCs) of four primate species (human, chimpanzee, gorilla, and rhesus) and searched for transcripts whose expression was conserved. We observed that about 30% of TE instances expressed in human iPSCs had orthologous TE instances that were also expressed in chimpanzee and gorilla. Notably, our analysis revealed a number of repeat families with highly conserved expression profiles including HERVH but also MER53, which is known to be the source of a placental-specific family of microRNAs (miRNAs). We also identified a number of repeat families from all classes of TEs, including MLT1-type and Tigger families, that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved. Together, these results describe TE families and TE-derived lncRNAs whose conserved expression patterns can be used to identify what are likely functional TE-derived non-coding transcripts in primate iPSCs.

  8. Charge-reversal Lipids, Peptide-based Lipids, and Nucleoside-based Lipids for Gene Delivery

    PubMed Central

    LaManna, Caroline M.; Lusic, Hrvoje; Camplo, Michel; McIntosh, Thomas J.; Barthélémy, Philippe; Grinstaff, Mark W.

    2013-01-01

    Twenty years after gene therapy was introduced in the clinic, advances in the technique continue to garner headlines as successes pique the interest of clinicians, researchers, and the public. Gene therapy’s appeal stems from its potential to revolutionize modern medical therapeutics by offering solutions to a myriad of diseases by tailoring the treatment to a specific individual’s genetic code. Both viral and non-viral vectors have been used in the clinic, but the low transfection efficiencies when utilizing non-viral vectors have led to an increased focus on engineering new gene delivery vectors. To address the challenges facing non-viral or synthetic vectors, specifically lipid-based carriers, we have focused on three main themes throughout our research: 1) that releasing the nucleic acid from the carrier will increase gene transfection; 2) that utilizing biologically inspired designs, such as DNA binding proteins, to create lipids with peptide-based headgroups will improve delivery; and 3) that mimicking the natural binding patterns observed within DNA, by using lipids having a nucleoside headgroup, will give unique supramolecular assemblies with high transfection efficiency. The results presented in this Account demonstrate that cellular uptake and transfection efficacy can be improved by engineering the chemical components of the lipid vectors to enhance nucleic acid binding and release kinetics. Specifically, our research has shown that the incorporation of a charge-reversal moiety to initiate change of the lipid from positive to negative net charge during the transfection process improves transfection. In addition, by varying the composition of the spacer (rigid, flexible, short, long, and aromatic) between the cationic headgroup and the hydrophobic chains, lipids can be tailored to interact with different nucleic acids (DNA, RNA, siRNA) and accordingly affect delivery, uptake outcomes, and transfection efficiency.
Introduction of a peptide headgroup into the lipid provides a mechanism to affect the binding of the lipid to the nucleic acid, to influence the supramolecular lipoplex structure, and to enhance gene transfection activity. Lastly, we discuss the in-vitro successes we have had when using lipids possessing a nucleoside headgroup to create unique self-assembled structures and to deliver DNA to cells. In this Account, we state our hypotheses and design elements as well as describe the techniques that we have utilized in our research, in order to provide readers with the tools to characterize and engineer new vectors. PMID:22439686

  9. Informational structure of genetic sequences and nature of gene splicing

    NASA Astrophysics Data System (ADS)

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequences are largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  10. Mitoepigenetics and drug addiction.

    PubMed

    Sadakierska-Chudy, Anna; Frankowska, Małgorzata; Filip, Małgorzata

    2014-11-01

    Being the center of energy production in eukaryotic cells, mitochondria are also crucial for various cellular processes including intracellular Ca(2+) signaling and generation of reactive oxygen species (ROS). Mitochondria contain their own circular DNA which encodes not only proteins, transfer RNA and ribosomal RNAs but also non-coding RNAs. The most recent line of evidence indicates the presence of 5-methylcytosine and 5-hydroxymethylcytosine in mitochondrial DNA (mtDNA); thus, the level of gene expression - in a way similar to nuclear DNA - can be regulated by direct epigenetic modifications. Up to now, very little data shows the possibility of epigenetic regulation of mtDNA. Mitochondria and mtDNA are particularly important in the nervous system and may participate in the initiation of drug addiction. In fact, some addictive drugs enhance ROS production and generate oxidative stress that in turn alters mitochondrial and nuclear gene expression. This review summarizes recent findings on mitochondrial function, mtDNA copy number and epigenetics in drug addiction. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Advanced composites structural concepts and materials technologies for primary aircraft structures: Structural response and failure analysis

    NASA Technical Reports Server (NTRS)

    Dorris, William J.; Hairr, John W.; Huang, Jui-Tien; Ingram, J. Edward; Shah, Bharat M.

    1992-01-01

    Non-linear analysis methods were adapted and incorporated in a finite element based DIAL code. These methods are necessary to evaluate the global response of a stiffened structure under combined in-plane and out-of-plane loading. These methods include the Arc Length method and target point analysis procedure. A new interface material model was implemented that can model elastic-plastic behavior of the bond adhesive. Direct application of this method is in skin/stiffener interface failure assessment. Addition of the AML (angle minus longitudinal or load) failure procedure and Hashin's failure criteria provides added capability in the failure predictions. Interactive Stiffened Panel Analysis modules were developed as interactive pre- and post-processors. Each module provides the means of performing self-initiated finite element based analysis of primary structures such as a flat or curved stiffened panel; a corrugated flat sandwich panel; and a curved geodesic fuselage panel. This module brings finite element analysis into the design of composite structures without the requirement for the user to know much about the techniques and procedures needed to actually perform a finite element analysis from scratch. An interactive finite element code was developed to predict bolted joint strength considering material and geometrical non-linearity. The developed method conducts an ultimate strength failure analysis using a set of material degradation models.

  12. Transposable element evolution in Heliconius suggests genome diversity within Lepidoptera

    PubMed Central

    2013-01-01

    Background: Transposable elements (TEs) have the potential to impact genome structure, function and evolution in profound ways. In order to understand the contribution of TEs to Heliconius melpomene, we queried the H. melpomene draft sequence to identify repetitive sequences. Results: We determined that TEs comprise ~25% of the genome. The predominant class of TEs (~12% of the genome) was the non-long terminal repeat (non-LTR) retrotransposons, including a novel SINE family. However, this was only slightly higher than content derived from DNA transposons, which are diverse, with several families having mobilized in the recent past. Compared to the only other well-studied lepidopteran genome, Bombyx mori, H. melpomene exhibits a higher DNA transposon content and a distinct repertoire of retrotransposons. We also found that H. melpomene exhibits a high rate of TE turnover with few older elements accumulating in the genome. Conclusions: Our analysis represents the first complete, de novo characterization of TE content in a butterfly genome and suggests that, while TEs are able to invade and multiply, TEs have an overall deleterious effect and/or that maintaining a small genome is advantageous. Our results also hint that analysis of additional lepidopteran genomes will reveal substantial TE diversity within the group. PMID:24088337

  13. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data.

    PubMed

    Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2017-01-04

    The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents an approximately 20-fold expansion compared with the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed the 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulation of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Transposable Elements and DNA Methylation Create in Embryonic Stem Cells Human-Specific Regulatory Sequences Associated with Distal Enhancers and Noncoding RNAs

    PubMed Central

    Glinsky, Gennadi V.

    2015-01-01

    Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in Neanderthals’ genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed that there was 49.7-fold acceleration of creation rates of NANOG-binding sites in genomes of Chimpanzees compared with the mouse genomes and further 5.7-fold acceleration in genomes of Modern Humans compared with the Chimpanzees genomes. Preliminary estimates suggest that emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. 
Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions such as cancer, diseases of cardiovascular and reproductive systems, metabolic diseases, multiple neurological and psychological disorders. A proximity placement model is proposed explaining how a 33–47% excess of NANOG, CTCF, and POU5F1 proteins immobilized on a DNA scaffold may play a functional role at distal regulatory elements. PMID:25956794

  15. Genome organization of epidemic Acinetobacter baumannii strains.

    PubMed

    Di Nocera, Pier Paolo; Rocco, Francesco; Giannouli, Maria; Triassi, Maria; Zarrilli, Raffaele

    2011-10-10

    Acinetobacter baumannii is an opportunistic pathogen responsible for hospital-acquired infections. A. baumannii epidemics described world-wide were caused by few genotypic clusters of strains. The occurrence of epidemics caused by multi-drug resistant strains assigned to novel genotypes has been reported over the last few years. In the present study, we compared whole genome sequences of three A. baumannii strains assigned to genotypes ST2, ST25 and ST78, representative of the most frequent genotypes responsible for epidemics in several Mediterranean hospitals, and four complete genome sequences of A. baumannii strains assigned to genotypes ST1, ST2 and ST77. Comparative genome analysis showed extensive synteny and identified 3068 coding regions which are conserved, at the same chromosomal position, in all A. baumannii genomes. Genome alignments also identified 63 DNA regions, ranging in size from 4 to 126 kb, all defined as genomic islands, which were present in some genomes, but were either missing or replaced by non-homologous DNA sequences in others. Some islands are involved in resistance to drugs and metals, others carry genes encoding surface proteins or enzymes involved in specific metabolic pathways, and others correspond to prophage-like elements. Accessory DNA regions encode 12 to 19% of the potential gene products of the analyzed strains. The analysis of a collection of epidemic A. baumannii strains showed that some islands were restricted to specific genotypes. The definition of the genome components of A. baumannii provides a scaffold to rapidly evaluate the genomic organization of novel clinical A. baumannii isolates. Changes in island profiling will be useful in genomic epidemiology of A. baumannii population.

  16. Germ line insertion of mtDNA at the breakpoint junction of a reciprocal constitutional translocation.

    PubMed

    Willett-Brozick, J E; Savul, S A; Richey, L E; Baysal, B E

    2001-08-01

    Constitutional chromosomal translocations are relatively common causes of human morbidity, yet the DNA double-strand break (DSB) repair mechanisms that generate them are incompletely understood. We cloned, sequenced and analyzed the breakpoint junctions of a familial constitutional reciprocal translocation t(9;11)(p24;q23). Within the 10-kb region flanking the breakpoints, chromosome 11 had 25% repeat elements, whereas chromosome 9 had 98% repeats, 95% of which were L1-type LINE elements. The breakpoints occurred within an L1-type repeat element at 9p24 and at the 3'-end of an Alu sequence at 11q23. At the breakpoint junction of derivative chromosome 9, we discovered an unusually large 41-bp insertion, which showed 100% identity to 12S mitochondrial DNA (mtDNA) between nucleotides 896 and 936 of the mtDNA sequence. Analysis of the human genome failed to show the preexistence of the inserted sequence at normal chromosomes 9 and 11 breakpoint junctions or elsewhere in the genome, strongly suggesting that the insertion was derived from human mtDNA and captured into the junction during the DSB repair process. To our knowledge, these findings represent the first observation of spontaneous germ line insertion of modern human mtDNA sequences and suggest that DSB repair may play a role in inter-organellar gene transfer in vivo. Our findings also provide evidence for a previously unrecognized insertional mechanism in human, by which non-mobile extra-chromosomal fragments can be inserted into the genome at DSB repair junctions.

  17. Understanding the relationship between DNA methylation and histone lysine methylation

    PubMed Central

    Rose, Nathan R.; Klose, Robert J.

    2014-01-01

    DNA methylation acts as an epigenetic modification in vertebrate DNA. Recently it has become clear that the DNA and histone lysine methylation systems are highly interrelated and rely mechanistically on each other for normal chromatin function in vivo. Here we examine some of the functional links between these systems, with a particular focus on several recent discoveries suggesting how lysine methylation may help to target DNA methylation during development, and vice versa. In addition, the emerging role of non-methylated DNA found in CpG islands in defining histone lysine methylation profiles at gene regulatory elements will be discussed in the context of gene regulation. This article is part of a Special Issue entitled: Methylation: A Multifaceted Modification — looking at transcription and beyond. PMID:24560929

  18. The emerging role of epigenetics in rheumatic diseases.

    PubMed

    Gay, Steffen; Wilson, Anthony G

    2014-03-01

    Epigenetics is a key mechanism regulating the expression of genes. There are three main and interrelated mechanisms: DNA methylation, post-translational modification of histone proteins and non-coding RNA. Gene activation is generally associated with lower levels of DNA methylation in promoters and with distinct histone marks such as acetylation of amino acids in histones. Unlike the genetic code, the epigenome is altered by endogenous (e.g. hormonal) and environmental (e.g. diet, exercise) factors and changes with age. Recent evidence implicates epigenetic mechanisms in the pathogenesis of common rheumatic disease, including RA, OA, SLE and scleroderma. Epigenetic drift has been implicated in age-related changes in the immune system that result in the development of a pro-inflammatory status termed inflammageing, potentially increasing the risk of age-related conditions such as polymyalgia rheumatica. Therapeutic targeting of the epigenome has shown promise in animal models of rheumatic diseases. Rapid advances in computational biology and DNA sequencing technology will lead to a more comprehensive understanding of the roles of epigenetics in the pathogenesis of common rheumatic diseases.

  19. Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function.

    PubMed

    Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J

    2007-06-01

    As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.

  20. Comprehensive Reconstruction and Visualization of Non-Coding Regulatory Networks in Human

    PubMed Central

    Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

    2014-01-01

    Research attention has been powered to understand the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them a great potential as non-invasive biomarkers. However, non-coding RNAs have been relatively recently discovered and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNAs interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNAs data interactions from a large number of well established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape. PMID:25540777

  1. Comprehensive reconstruction and visualization of non-coding regulatory networks in human.

    PubMed

    Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

    2014-01-01

    Research attention has been powered to understand the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them a great potential as non-invasive biomarkers. However, non-coding RNAs have been relatively recently discovered and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNAs interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNAs data interactions from a large number of well established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape.

  2. Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.

    PubMed

    Schnitzler, P; Darai, G

    1989-09-01

    The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.

  3. Genomic assessment of the evolution of the prion protein gene family in vertebrates.

    PubMed

    Harrison, Paul M; Khachane, Amit; Kumar, Manish

    2010-05-01

    Prion diseases are devastating neurological disorders caused by the propagation of particles containing an alternative beta-sheet-rich form of the prion protein (PrP). Genes paralogous to PrP, called Doppel and Shadoo, have been identified, that also have neuropathological relevance. To aid in the further functional characterization of PrP and its relatives, we annotated completely the PrP gene family (PrP-GF), in the genomes of 42 vertebrates, through combined strategic application of gene prediction programs and advanced remote homology detection techniques (such as HMMs, PSI-TBLASTN and pGenThreader). We have uncovered several previously undescribed paralogous genes and pseudogenes. We find that current high-quality genomic evidence indicates that the PrP relative Doppel, was likely present in the last common ancestor of present-day Tetrapoda, but was lost in the bird lineage, since its divergence from reptiles. Using the new gene annotations, we have defined the consensus of structural features that are characteristic of the PrP and Doppel structures, across diverse Tetrapoda clades. Furthermore, we describe in detail a transcribed pseudogene derived from Shadoo that is conserved across primates, and that overlaps the meiosis gene, SYCE1, thus possibly regulating its expression. In addition, we analysed the locus of PRNP/PRND for significant conservation across the genomic DNA of eleven mammals, and determined the phylogenetic penetration of non-coding exons. The genomic evidence indicates that the second PRNP non-coding exon found in even-toed ungulates and rodents, is conserved in all high-coverage genome assemblies of primates (human, chimp, orang utan and macaque), and is, at least, likely to have fallen out of use during primate speciation. 
Furthermore, we have demonstrated that the PRNT gene (at the PRNP human locus) is conserved across at least sixteen mammals, and evolves like a long non-coding RNA, fashioned from fragments of ancient, long, interspersed elements. These annotations and evolutionary analyses will be of further use for functional characterisation of the PrP-GF, and will be updatable in a semi-automated fashion as more genomes accumulate. Copyright 2010 Elsevier Inc. All rights reserved.

  4. High-efficiency transformation of Pichia stipitis based on its URA3 gene and a homologous autonomous replication sequence, ARS2.

    PubMed Central

    Yang, V W; Marks, J A; Davis, B P; Jeffries, T W

    1994-01-01

    This paper describes the first high-efficiency transformation system for the xylose-fermenting yeast Pichia stipitis. The system includes integrating and autonomously replicating plasmids based on the gene for orotidine-5'-phosphate decarboxylase (URA3) and an autonomous replicating sequence (ARS) element (ARS2) isolated from P. stipitis CBS 6054. Ura- auxotrophs were obtained by selecting for resistance to 5-fluoroorotic acid and were identified as ura3 mutants by transformation with P. stipitis URA3. P. stipitis URA3 was cloned by its homology to Saccharomyces cerevisiae URA3, with which it is 69% identical in the coding region. P. stipitis ARS elements were cloned functionally through plasmid rescue. These sequences confer autonomous replication when cloned into vectors bearing the P. stipitis URA3 gene. P. stipitis ARS2 has features similar to those of the consensus ARS of S. cerevisiae and other ARS elements. Circular plasmids bearing the P. stipitis URA3 gene with various amounts of flanking sequences produced 600 to 8,600 Ura+ transformants per microgram of DNA by electroporation. Most transformants obtained with circular vectors arose without integration of vector sequences. One vector yielded 5,200 to 12,500 Ura+ transformants per microgram of DNA after it was linearized at various restriction enzyme sites within the P. stipitis URA3 insert. Transformants arising from linearized vectors produced stable integrants, and integration events were site specific for the genomic ura3 in 20% of the transformants examined. Plasmids bearing the P. stipitis URA3 gene and ARS2 element produced more than 30,000 transformants per microgram of plasmid DNA. Autonomously replicating plasmids were stable for at least 50 generations in selection medium and were present at an average of 10 copies per nucleus. PMID:7811063

  5. Automating the generation of finite element dynamical cores with Firedrake

    NASA Astrophysics Data System (ADS)

    Ham, David; Mitchell, Lawrence; Homolya, Miklós; Luporini, Fabio; Gibson, Thomas; Kelly, Paul; Cotter, Colin; Lange, Michael; Kramer, Stephan; Shipton, Jemma; Yamazaki, Hiroe; Paganini, Alberto; Kärnä, Tuomas

    2017-04-01

    The development of a dynamical core is an increasingly complex software engineering undertaking. As the equations become more complete, the discretisations more sophisticated and the hardware acquires ever more fine-grained parallelism and deeper memory hierarchies, the problem of building, testing and modifying dynamical cores becomes increasingly complex. Here we present Firedrake, a code generation system for the finite element method with specialist features designed to support the creation of geoscientific models. Using Firedrake, the dynamical core developer writes the partial differential equations in weak form in a high level mathematical notation. Appropriate function spaces are chosen and time stepping loops written at the same high level. When the programme is run, Firedrake generates high performance C code for the resulting numerics which are executed in parallel. Models in Firedrake typically take a tiny fraction of the lines of code required by traditional hand-coding techniques. They support more sophisticated numerics than are easily achieved by hand, and the resulting code is frequently higher performance. Critically, debugging, modifying and extending a model written in Firedrake is vastly easier than by traditional methods due to the small, highly mathematical code base. Firedrake supports a wide range of key features for dynamical core creation: a vast range of discretisations, including both continuous and discontinuous spaces and mimetic (C-grid-like) elements which optimally represent force balances in geophysical flows; high aspect ratio layered meshes suitable for ocean and atmosphere domains; curved elements for high accuracy representations of the sphere; support for non-finite element operators, such as parametrisations; access to PETSc, a world-leading library of programmable linear and nonlinear solvers; and high performance adjoint models generated automatically by symbolically reasoning about the forward model.
This poster will present the key features of the Firedrake system, as well as those of Gusto, an atmospheric dynamical core, and Thetis, a coastal ocean model, both of which are written in Firedrake.

  6. DNABIT Compress - Genome compression algorithm.

    PubMed

    Rajarajeswari, Pothuraju; Apparao, Allam

    2011-01-22

    Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress", for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genomes. Significantly better compression results show that the "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve a compression ratio as low as 1.58 bits/base, where the existing best methods could not achieve a ratio below 1.72 bits/base.
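
    The bits-per-base figures quoted in this abstract are easier to read against the naive baseline, in which each of the four bases gets a fixed 2-bit code. The sketch below shows only that baseline; it is not the DNABIT Compress algorithm, whose bit assignments the abstract does not give. The names `pack`, `unpack`, and `bits_per_base` are hypothetical helpers introduced for illustration.

```python
# Fixed 2-bit code for the four DNA bases.
# Illustrative baseline only; NOT the DNABIT Compress scheme.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack an ACGT string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads from the top bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n_bases: int) -> str:
    """Recover the first n_bases bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


def bits_per_base(n_bits: int, n_bases: int) -> float:
    """Compression ratio in the units used by the abstract."""
    return n_bits / n_bases
```

    For long sequences this fixed code approaches exactly 2 bits/base, so a reported ratio below 2 means the method is exploiting redundancy such as repeats rather than merely re-encoding the alphabet.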

  7. CRITICA: coding region identification tool invoking comparative analysis

    NASA Technical Reports Server (NTRS)

    Badger, J. H.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

    1999-01-01

    Gene recognition is essential to understanding existing and future DNA sequence data. CRITICA (Coding Region Identification Tool Invoking Comparative Analysis) is a suite of programs for identifying likely protein-coding sequences in DNA by combining comparative analysis of DNA sequences with more common noncomparative methods. In the comparative component of the analysis, regions of DNA are aligned with related sequences from the DNA databases; if the translation of the aligned sequences has greater amino acid identity than expected for the observed percentage nucleotide identity, this is interpreted as evidence for coding. CRITICA also incorporates noncomparative information derived from the relative frequencies of hexanucleotides in coding frames versus other contexts (i.e., dicodon bias). The dicodon usage information is derived by iterative analysis of the data, such that CRITICA is not dependent on the existence or accuracy of coding sequence annotations in the databases. This independence makes the method particularly well suited for the analysis of novel genomes. CRITICA was tested by analyzing the available Salmonella typhimurium DNA sequences. Its predictions were compared with the DNA sequence annotations and with the predictions of GenMark. CRITICA proved to be more accurate than GenMark, and moreover, many of its predictions that would seem to be errors instead reflect problems in the sequence databases. The source code of CRITICA is freely available by anonymous FTP (rdp.life.uiuc.edu in /pub/critica) and on the World Wide Web (http://rdpwww.life.uiuc.edu).
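
    The comparative signal described in this abstract can be illustrated by computing nucleotide identity and translated amino acid identity for one pair of aligned, gap-free sequences. This is a minimal sketch of the two quantities only, assuming the standard genetic code; it does not reproduce how CRITICA derives the expected amino acid identity or scores a region, and the function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def nucleotide_identity(a: str, b: str) -> float:
    return identity(a, b)


def amino_acid_identity(a: str, b: str) -> float:
    return identity(translate(a), translate(b))
```

    For example, ATGGCTAAA and ATGGCCAAG differ at two third-codon positions but translate to the same peptide, so amino acid identity exceeds nucleotide identity, which is the direction of the signal the abstract interprets as evidence for coding.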

  8. DNA methylation aberrancies as a guide for surveillance and treatment of human cancers

    PubMed Central

    Liang, Gangning; Weisenberger, Daniel J.

    2017-01-01

    DNA methylation aberrancies are hallmarks of human cancers and are characterized by global DNA hypomethylation of repetitive elements and non-CpG rich regions concomitant with locus-specific DNA hypermethylation. DNA methylation changes may result in altered gene expression profiles, most notably the silencing of tumor suppressors, microRNAs, endogenous retroviruses and tumor antigens due to promoter DNA hypermethylation, as well as oncogene upregulation due to gene-body DNA hypermethylation. Here, we review DNA methylation aberrancies in human cancers, their use in cancer surveillance and the interplay between DNA methylation and histone modifications in gene regulation. We also summarize DNA methylation inhibitors and their therapeutic effects in cancer treatment. In this context, we describe the integration of DNA methylation inhibitors with conventional chemotherapies, DNA repair inhibitors and immune-based therapies, to bring the epigenome closer to its normal state and increase sensitivity to other therapeutic agents to improve patient outcome and survival. PMID:28358281

  9. Expression of the genetic suppressor element 24.2 (GSE24.2) decreases DNA damage and oxidative stress in X-linked dyskeratosis congenita cells.

    PubMed

    Manguan-Garcia, Cristina; Pintado-Berninches, Laura; Carrillo, Jaime; Machado-Pinilla, Rosario; Sastre, Leandro; Pérez-Quilis, Carme; Esmoris, Isabel; Gimeno, Amparo; García-Giménez, Jose Luis; Pallardó, Federico V; Perona, Rosario

    2014-01-01

    The predominant X-linked form of Dyskeratosis congenita results from mutations in DKC1, which encodes dyskerin, a protein required for ribosomal RNA modification that is also a component of the telomerase complex. We have previously found that expression of an internal fragment of dyskerin (GSE24.2) rescues telomerase activity in X-linked dyskeratosis congenita (X-DC) patient cells. Here we have found that an increased basal and induced DNA damage response occurred in X-DC cells in comparison with normal cells. DNA damage that is also localized in telomeres results in increased heterochromatin formation and senescence. Expression of a cDNA coding for GSE24.2 rescues both global and telomeric DNA damage. Furthermore, transfection of bacterial purified or a chemically synthesized GSE24.2 peptide is able to rescue basal DNA damage in X-DC cells. We have also observed an increase in oxidative stress in X-DC cells and expression of GSE24.2 was able to diminish it. Altogether our data indicated that supplying GSE24.2, either from a cDNA vector or as a peptide reduces the pathogenic effects of Dkc1 mutations and suggests a novel therapeutic approach.

  10. Rotifer rDNA-specific R9 retrotransposable elements generate an exceptionally long target site duplication upon insertion.

    PubMed

    Gladyshev, Eugene A; Arkhipova, Irina R

    2009-12-15

    Ribosomal DNA genes in many eukaryotes contain insertions of non-LTR retrotransposable elements belonging to the R2 clade. These elements persist in the host genomes by inserting site-specifically into multicopy target sites, thereby avoiding random disruption of single-copy host genes. Here we describe R9 retrotransposons from the R2 clade in the 28S RNA genes of bdelloid rotifers, small freshwater invertebrate animals best known for their long-term asexuality and for their ability to survive repeated cycles of desiccation and rehydration. While the structural organization of R9 elements is highly similar to that of other members of the R2 clade, they are characterized by two distinct features: site-specific insertion into a previously unreported target sequence within the 28S gene, and an unusually long target site duplication of 126 bp. We discuss the implications of these findings in the context of bdelloid genome organization and the mechanisms of target-primed reverse transcription.

  11. Heterochromatin and molecular characterization of DsmarMITE transposable element in the beetle Dichotomius schiffleri (Coleoptera: Scarabaeidae).

    PubMed

    Xavier, Crislaine; Cabral-de-Mello, Diogo Cavalcanti; de Moura, Rita Cássia

    2014-12-01

    Cytogenetic studies of the Neotropical beetle genus Dichotomius (Scarabaeinae, Coleoptera) have shown dynamism for centromeric constitutive heterochromatin sequences. In the present work we studied the chromosomes and isolated repetitive sequences of Dichotomius schiffleri aiming to contribute to the understanding of coleopteran genome/chromosomal organization. Dichotomius schiffleri presented a conserved karyotype and heterochromatin distribution in comparison to other species of the genus with 2n = 18, biarmed chromosomes, and pericentromeric C-positive blocks. Similarly to heterochromatin distributional patterns, the highly and moderately repetitive DNA fraction (C0t-1 DNA) was detected in pericentromeric areas, contrasting with the euchromatic mapping of an isolated TE (named DsmarMITE). After structural analyses, the DsmarMITE was classified as a non-autonomous element of the type miniature inverted-repeat transposable element (MITE) with terminal inverted repeats similar to Mariner elements of insects from different orders. The euchromatic distribution for DsmarMITE indicates that it does not play a part in the dynamics of constitutive heterochromatin sequences.

  12. Chimeric NP Non Coding Regions between Type A and C Influenza Viruses Reveal Their Role in Translation Regulation

    PubMed Central

    Crescenzo-Chaigne, Bernadette; Barbezange, Cyril; Frigard, Vianney; Poulain, Damien; van der Werf, Sylvie

    2014-01-01

    Exchange of the non coding regions of the NP segment between type A and C influenza viruses was used to demonstrate the importance not only of the proximal panhandle, but also of the initial distal panhandle strength in type specificity. Both elements were found to be compulsory to rescue infectious virus by reverse genetics systems. Interestingly, in type A influenza virus infectious context, the length of the NP segment 5′ NC region once transcribed into mRNA was found to impact its translation, and the level of produced NP protein consequently affected the level of viral genome replication. PMID:25268971

  13. Noncoding transcripts in sense and antisense orientation regulate the epigenetic state of ribosomal RNA genes.

    PubMed

    Bierhoff, H; Schmitz, K; Maass, F; Ye, J; Grummt, I

    2010-01-01

    Alternative transcription of the same gene in sense and antisense orientation regulates expression of protein-coding genes. Here we show that noncoding RNA (ncRNA) in sense and antisense orientation also controls transcription of rRNA genes (rDNA). rDNA exists in two types of chromatin--a euchromatic conformation that is permissive to transcription and a heterochromatic conformation that is transcriptionally silent. Silencing of rDNA is mediated by NoRC, a chromatin-remodeling complex that triggers heterochromatin formation. NoRC function requires RNA that is complementary to the rDNA promoter (pRNA). pRNA forms a DNA:RNA triplex with a regulatory element in the rDNA promoter, and this triplex structure is recognized by DNMT3b. The results imply that triplex-mediated targeting of DNMT3b to specific sequences may be a common pathway in epigenetic regulation. We also show that rDNA is transcribed in antisense orientation. The level of antisense RNA (asRNA) is down-regulated in cancer cells and up-regulated in senescent cells. Ectopic asRNA triggers trimethylation of histone H4 at lysine 20 (H4K20me3), suggesting that antisense transcripts guide the histone methyltransferase Suv4-20 to rDNA. The results reveal that noncoding RNAs in sense and antisense orientation are important determinants of the epigenetic state of rDNA.

  14. Relative stability of DNA as a generic criterion for promoter prediction: whole genome annotation of microbial genomes with varying nucleotide base composition.

    PubMed

    Rangannan, Vetriselvi; Bansal, Manju

    2009-12-01

    The rapid increase in genome sequence information has necessitated the annotation of their functional elements, particularly those occurring in the non-coding regions, in the genomic context. The promoter region is the key regulatory region, which enables the gene to be transcribed or repressed, but it is difficult to determine experimentally. Hence an in silico identification of promoters is crucial in order to guide experimental work and to pinpoint the key region that controls the transcription initiation of a gene. In this analysis, we demonstrate that while the promoter regions are in general less stable than the flanking regions, their average free energy varies depending on the GC composition of the flanking genomic sequence. We have therefore obtained a set of free energy threshold values for genomic DNA with varying GC content and used them as generic criteria for predicting promoter regions in several microbial genomes, using an in-house developed tool, PromPredict. On applying it to predict promoter regions corresponding to the 1144 and 612 experimentally validated TSSs in E. coli (50.8% GC) and B. subtilis (43.5% GC), sensitivities of 99% and 95% and precision values of 58% and 60%, respectively, were achieved. For the limited data set of 81 TSSs available for M. tuberculosis (65.6% GC), a sensitivity of 100% and a precision of 49% were obtained.
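
    The sensitivity and precision values reported above follow the usual definitions for a prediction task. The sketch below states those definitions in code; the counts in the usage note are hypothetical numbers chosen for illustration and are not taken from the PromPredict evaluation.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of validated sites that were recovered: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no validated sites to evaluate against")
    return true_pos / total


def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of predictions that hit a validated site: TP / (TP + FP)."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no predictions were made")
    return true_pos / total
```

    With hypothetical counts of 95 recovered sites, 5 missed sites and 63 unsupported predictions, `sensitivity(95, 5)` is 0.95 and `precision(95, 63)` is about 0.60. High sensitivity paired with moderate precision, as in the abstract, means nearly every validated TSS is covered while a sizeable share of predictions has no validated TSS nearby.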

  15. Behind the curtain of non-coding RNAs; long non-coding RNAs regulating hepatocarcinogenesis

    PubMed Central

    El Khodiry, Aya; Afify, Menna; El Tayebi, Hend M

    2018-01-01

    Hepatocellular carcinoma (HCC) is one of the most common and aggressive cancers worldwide. HCC is the fifth most common malignancy in the world and the second leading cause of cancer death in Asia. Long non-coding RNAs (lncRNAs) are RNAs with a length greater than 200 nucleotides that do not encode proteins. lncRNAs can regulate gene expression and protein synthesis in several ways by interacting with DNA, RNA and proteins in a sequence specific manner. They could regulate cellular and developmental processes through either gene inhibition or gene activation. Many studies have shown that dysregulation of lncRNAs is related to many human diseases such as cardiovascular diseases, genetic disorders, neurological diseases, immune mediated disorders and cancers. However, the study of lncRNAs is challenging as they are poorly conserved between species, and their expression levels are not as high as those of mRNAs and show great interpatient variation. The study of lncRNA expression in cancers has been a breakthrough as it unveils potential biomarkers and drug targets for cancer therapy and helps understand the mechanism of pathogenesis. This review discusses many long non-coding RNAs and their contribution in HCC, their role in development, metastasis, and prognosis of HCC and how to regulate and target these lncRNAs as a therapeutic tool in HCC treatment in the future. PMID:29434445

  16. Extracting DNA words based on the sequence features: non-uniform distribution and integrity.

    PubMed

    Li, Zhi; Cao, Hongyan; Cui, Yuehua; Zhang, Yanbo

    2016-01-25

    A DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms, such as motif discovery algorithms, depend on the quality of background information about sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered non-uniform distribution and integrity to be two important features of a word, and on that basis we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used to test whether the distribution of DNA sequences is consistent with a uniform distribution, and integrity was judged by sequence and position alignment. Two random base sequences were adopted as negative controls, and an English book was used as a positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the method. The results provide strong evidence that the algorithm is a promising tool for building a DNA dictionary ab initio. Our method provides a fast way to screen for important DNA elements on a large scale and offers potential insights into the understanding of a genome.
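
The uniformity check mentioned above can be illustrated with a one-sample Kolmogorov-Smirnov statistic. This is a minimal sketch, not the authors' implementation: the helper names `word_positions` and `ks_statistic_uniform` are hypothetical, and the choice to normalise start positions by sequence length is an assumption made for illustration.

```python
def word_positions(seq: str, word: str) -> list[int]:
    """Return start indices of every (possibly overlapping) occurrence."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def ks_statistic_uniform(positions: list[int], seq_len: int) -> float:
    """One-sample KS statistic D against the uniform distribution on [0, 1].

    Positions are scaled to [0, 1]; D is the largest gap between the
    empirical CDF and the uniform CDF F(x) = x.
    """
    n = len(positions)
    if n == 0:
        raise ValueError("word does not occur in the sequence")
    xs = sorted(p / seq_len for p in positions)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    pos = word_positions(seq, "AC")
    print(pos, ks_statistic_uniform(pos, len(seq)))
```

A large D suggests the word is clustered rather than evenly spread; turning D into a p-value requires the KS distribution, which is omitted here.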

  17. Phylogenetic Network for European mtDNA

    PubMed Central

    Finnilä, Saara; Lehtonen, Mervi S.; Majamaa, Kari

    2001-01-01

    The sequence in the first hypervariable segment (HVS-I) of the control region has been used as a source of evolutionary information in most phylogenetic analyses of mtDNA. Population genetic inference would benefit from a better understanding of the variation in the mtDNA coding region, but, thus far, complete mtDNA sequences have been rare. We determined the nucleotide sequence in the coding region of mtDNA from 121 Finns, by conformation-sensitive gel electrophoresis and subsequent sequencing and by direct sequencing of the D loop. Furthermore, 71 sequences from our previous reports were included, so that the samples represented all the mtDNA haplogroups present in the Finnish population. We found a total of 297 variable sites in the coding region, which allowed the compilation of unambiguous phylogenetic networks. The D loop harbored 104 variable sites, and, in most cases, these could be localized within the coding-region networks, without discrepancies. Interestingly, many homoplasies were detected in the coding region. Nucleotide variation in the rRNA and tRNA genes was 6%, and that in the third nucleotide positions of structural genes amounted to 22% of that in the HVS-I. The complete networks enabled the relationships between the mtDNA haplogroups to be analyzed. Phylogenetic networks based on the entire coding-region sequence in mtDNA provide a rich source for further population genetic studies, and complete sequences make it easier to differentiate between disease-causing mutations and rare polymorphisms. PMID:11349229

  18. Transient Non Lin Deformation in Fractured Rock

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sartori, Enrico

    1998-10-14

    MATLOC is a nonlinear, transient, two-dimensional (planar and axisymmetric), thermal stress, finite-element code designed to determine the deformation within a fractured rock mass. The mass is modeled as a nonlinear anisotropic elastic material which can exhibit stress-dependent bi-linear locking behavior.

  19. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata.

    PubMed

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-07

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem-loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping-pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration.

  20. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata

    PubMed Central

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-01

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem–loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping–pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration. PMID:23166307

  1. Cis-acting RNA elements in the Hepatitis C virus RNA genome

    PubMed Central

    Sagan, Selena M.; Chahal, Jasmin; Sarnow, Peter

    2017-01-01

    Hepatitis C virus (HCV) infection is a rapidly increasing global health problem with an estimated 170 million people infected worldwide. HCV is a hepatotropic, positive-sense RNA virus of the family Flaviviridae. As a positive-sense RNA virus, the HCV genome itself must serve as a template for translation, replication and packaging. The viral RNA must therefore be a dynamic structure that is able to readily accommodate structural changes to expose different regions of the genome to viral and cellular proteins to carry out the HCV life cycle. The ∼9600 nucleotide viral genome contains a single long open reading frame flanked by 5′ and 3′ non-coding regions that contain cis-acting RNA elements important for viral translation, replication and stability. Additional cis-acting RNA elements have also been identified in the coding sequences as well as in the 3′ end of the negative-strand replicative intermediate. Herein, we provide an overview of the importance of these cis-acting RNA elements in the HCV life cycle. PMID:25576644

  2. New t-gap insertion-deletion-like metrics for DNA hybridization thermodynamic modeling.

    PubMed

    D'yachkov, Arkadii G; Macula, Anthony J; Pogozelski, Wendy K; Renz, Thomas E; Rykov, Vyacheslav V; Torney, David C

    2006-05-01

    We discuss the concept of t-gap block isomorphic subsequences and use it to describe new abstract string metrics that are similar to the Levenshtein insertion-deletion metric. Some of the metrics that we define can be used to model a thermodynamic distance function on single-stranded DNA sequences. Our model captures a key aspect of the nearest neighbor thermodynamic model for hybridized DNA duplexes. One version of our metric gives the maximum number of stacked pairs of hydrogen bonded nucleotide base pairs that can be present in any secondary structure in a hybridized DNA duplex without pseudoknots. Thermodynamic distance functions are important components in the construction of DNA codes, and DNA codes are important components in biomolecular computing, nanotechnology, and other biotechnical applications that employ DNA hybridization assays. We show how our new distances can be calculated by using a dynamic programming method, and we derive a Varshamov-Gilbert-like lower bound on the size of some of the codes using these distance functions as constraints. We also discuss software implementation of our DNA code design methods.
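
For orientation, the classical insertion-deletion metric that the abstract uses as its point of comparison can be computed by dynamic programming over the longest common subsequence (LCS). The sketch below covers only that baseline metric, not the t-gap block variants introduced in the paper; the function names are illustrative.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, using O(len(b)) memory."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions and deletions
    needed to turn a into b (no substitutions allowed)."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


if __name__ == "__main__":
    print(indel_distance("ACGT", "AGT"))
```

The t-gap metrics described above restrict which common subsequences count, so they would replace `lcs_length` with a constrained recurrence while keeping the same overall structure.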

  3. Highly sensitive detection of mutations in CHO cell recombinant DNA using multi-parallel single molecule real-time DNA sequencing.

    PubMed

    Cartwright, Joseph F; Anderson, Karin; Longworth, Joseph; Lobb, Philip; James, David C

    2018-06-01

    High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may both affect the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences (Menlo Park, CA) single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ∼40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kb plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%), and there was no positional bias in mutation across the plasmid sequence. There was no discernible difference between the mutation frequencies of coding and non-coding DNA. The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. © 2018 Wiley Periodicals, Inc.

  4. Identification of a high-efficiency baculovirus DNA replication origin that functions in insect and mammalian cells.

    PubMed

    Wu, Yueh-Lung; Wu, Carol-P; Huang, Yu-Hui; Huang, Sheng-Ping; Lo, Huei-Ru; Chang, Hao-Shuo; Lin, Pi-Hsiu; Wu, Ming-Cheng; Chang, Chia-Jung; Chao, Yu-Chan

    2014-11-01

    The p143 gene from Autographa californica multinucleocapsid nucleopolyhedrovirus (AcMNPV) has been found to increase the expression of luciferase, which is driven by the polyhedrin gene promoter, in a plasmid with virus coinfection. Further study indicated that this is due to the presence of a replication origin (ori) in the coding region of this gene. Transient DNA replication assays showed that a specific fragment of the p143 coding sequence, p143-3, underwent virus-dependent DNA replication in Spodoptera frugiperda IPLB-Sf-21 (Sf-21) cells. Deletion analysis of the p143-3 fragment showed that subfragment p143-3.2a contained the essential sequence of this putative ori. Sequence analysis of this region revealed a unique distribution of imperfect palindromes with high AT contents. No sequence homology or similarity between p143-3.2a and any other known ori was detected, suggesting that it is a novel baculovirus ori. Further study showed that the p143-3.2a ori can replicate more efficiently in infected Sf-21 cells than baculovirus homologous regions (hrs), the major baculovirus ori, or non-hr oris during virus replication. Previously, hr on its own was unable to replicate in mammalian cells, and for mammalian viral oris, viral proteins are generally required for their proper replication in host cells. However, the p143-3.2a ori was, surprisingly, found to function as an efficient ori in mammalian cells without the need for any viral proteins. We conclude that p143 contains a unique sequence that can function as an ori to enhance gene expression in not only insect cells but also mammalian cells. Baculovirus DNA replication relies on both hr and non-hr oris; however, so far very little is known about the latter oris. Here we have identified a new non-hr ori, the p143 ori, which resides in the coding region of p143. By developing a novel DNA replication-enhanced reporter system, we have identified and located the core region required for the p143 ori. 
    This ori contains a large number of imperfect inverted repeats and is the most active ori in the viral genome during virus infection in insect cells. We also found that it is a unique ori that can replicate in mammalian cells without the assistance of baculovirus gene products. The identification of this ori should contribute to a better understanding of baculovirus DNA replication. Also, this ori is very useful in assisting with gene expression in mammalian cells. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  5. Long non-coding RNAs in anti-cancer drug resistance.

    PubMed

    Chen, Qin-Nan; Wei, Chen-Chen; Wang, Zhao-Xia; Sun, Ming

    2017-01-03

    Chemotherapy is one of the basic treatments for cancers; however, drug resistance is mainly responsible for the failure of clinical treatment. The mechanism of drug resistance is complicated because of interactions among various factors, including drug efflux, DNA damage repair, apoptosis and target mutation. Long non-coding RNAs (lncRNAs) have been a focus of research in the field of bioscience, and the latest studies have revealed that lncRNAs play essential roles in drug resistance in breast cancer, gastric cancer and lung cancer, among others. Dysregulation of multiple targets and pathways by lncRNAs results in the occurrence of chemoresistance. In this review, we will discuss the mechanisms underlying lncRNA-mediated resistance to chemotherapy and the therapeutic potential of lncRNAs in future cancer treatment.

  6. Alternative DNA structure formation in the mutagenic human c-MYC promoter.

    PubMed

    Del Mundo, Imee Marie A; Zewail-Foote, Maha; Kerwin, Sean M; Vasquez, Karen M

    2017-05-05

    Mutation 'hotspot' regions in the genome are susceptible to genetic instability, implicating them in diseases. These hotspots are not random and often co-localize with DNA sequences potentially capable of adopting alternative DNA structures (non-B DNA, e.g. H-DNA and G4-DNA), which have been identified as endogenous sources of genomic instability. There are regions that contain overlapping sequences that may form more than one non-B DNA structure. The extent to which one structure impacts the formation/stability of another, within the sequence, is not fully understood. To address this issue, we investigated the folding preferences of oligonucleotides from a chromosomal breakpoint hotspot in the human c-MYC oncogene containing both potential G4-forming and H-DNA-forming elements. We characterized the structures formed in the presence of G4-DNA-stabilizing K+ ions or H-DNA-stabilizing Mg2+ ions using multiple techniques. We found that under conditions favorable for H-DNA formation, a stable intramolecular triplex DNA structure predominated; whereas, under K+-rich, G4-DNA-forming conditions, a plurality of unfolded and folded species were present. Thus, within a limited region containing sequences with the potential to adopt multiple structures, only one structure predominates under a given condition. The predominance of H-DNA implicates this structure in the instability associated with the human c-MYC oncogene. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Dnmt2 mediates intergenerational transmission of paternally acquired metabolic disorders through sperm small non-coding RNAs.

    PubMed

    Zhang, Yunfang; Zhang, Xudong; Shi, Junchao; Tuorto, Francesca; Li, Xin; Liu, Yusheng; Liebers, Reinhard; Zhang, Liwen; Qu, Yongcun; Qian, Jingjing; Pahima, Maya; Liu, Ying; Yan, Menghong; Cao, Zhonghong; Lei, Xiaohua; Cao, Yujing; Peng, Hongying; Liu, Shichao; Wang, Yue; Zheng, Huili; Woolsey, Rebekah; Quilici, David; Zhai, Qiwei; Li, Lei; Zhou, Tong; Yan, Wei; Lyko, Frank; Zhang, Ying; Zhou, Qi; Duan, Enkui; Chen, Qi

    2018-05-01

    The discovery of RNAs (for example, messenger RNAs, non-coding RNAs) in sperm has opened the possibility that sperm may function by delivering additional paternal information aside from solely providing the DNA 1. Increasing evidence now suggests that sperm small non-coding RNAs (sncRNAs) can mediate intergenerational transmission of paternally acquired phenotypes, including mental stress 2,3 and metabolic disorders 4-6. How sperm sncRNAs encode paternal information remains unclear, but the mechanism may involve RNA modifications. Here we show that deletion of a mouse tRNA methyltransferase, DNMT2, abolished sperm sncRNA-mediated transmission of high-fat-diet-induced metabolic disorders to offspring. Dnmt2 deletion prevented the elevation of RNA modifications (m5C, m2G) in sperm 30-40 nt RNA fractions that are induced by a high-fat diet. Also, Dnmt2 deletion altered the sperm small RNA expression profile, including levels of tRNA-derived small RNAs and rRNA-derived small RNAs, which might be essential in composing a sperm RNA 'coding signature' that is needed for paternal epigenetic memory. Finally, we show that Dnmt2-mediated m5C contributes to the secondary structure and biological properties of sncRNAs, implicating sperm RNA modifications as an additional layer of paternal hereditary information.

  8. Small Open Reading Frames, Non-Coding RNAs and Repetitive Elements in Bradyrhizobium japonicum USDA 110

    PubMed Central

    Hahn, Julia; Tsoy, Olga V.; Thalmann, Sebastian; Čuklina, Jelena; Gelfand, Mikhail S.

    2016-01-01

    Small open reading frames (sORFs) and genes for non-coding RNAs are poorly investigated components of most genomes. Our analysis of 1391 ORFs recently annotated in the soybean symbiont Bradyrhizobium japonicum USDA 110 revealed that 78% of them contain less than 80 codons. Twenty-one of these sORFs are conserved in or outside Alphaproteobacteria and most of them are similar to genes found in transposable elements, in line with their broad distribution. Stabilizing selection was demonstrated for sORFs with proteomic evidence and bll1319_ISGA which is conserved at the nucleotide level in 16 alphaproteobacterial species, 79 species from other taxa and 49 other Proteobacteria. Further we used Northern blot hybridization to validate ten small RNAs (BjsR1 to BjsR10) belonging to new RNA families. We found that BjsR1 and BjsR3 have homologs outside the genus Bradyrhizobium, and BjsR5, BjsR6, BjsR7, and BjsR10 have up to four imperfect copies in Bradyrhizobium genomes. BjsR8, BjsR9, and BjsR10 are present exclusively in nodules, while the other sRNAs are also expressed in liquid cultures. We also found that the level of BjsR4 decreases after exposure to tellurite and iron, and this down-regulation contributes to survival under high iron conditions. Analysis of additional small RNAs overlapping with 3’-UTRs revealed two new repetitive elements named Br-REP1 and Br-REP2. These REP elements may play roles in the genomic plasticity and gene regulation and could be useful for strain identification by PCR-fingerprinting. Furthermore, we studied two potential toxin genes in the symbiotic island and confirmed toxicity of the yhaV homolog bll1687 but not of the newly annotated higB homolog blr0229_ISGA in E. coli. Finally, we revealed transcription interference resulting in an antisense RNA complementary to blr1853, a gene induced in symbiosis. The presented results expand our knowledge on sORFs, non-coding RNAs and repetitive elements in B. japonicum and related bacteria. 
    PMID:27788207

  9. The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister-group to protostomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Helfenbein, Kevin G.; Fourcade, H. Matthew; Vanjani, Rohit G.

    2004-05-01

    We report the first complete mitochondrial (mt) DNA sequence from a member of the phylum Chaetognatha (arrow worms). The Paraspadella gotoi mtDNA is highly unusual, missing 23 of the genes commonly found in animal mtDNAs, including atp6, which has otherwise been found universally to be present. Its 14 genes are unusually arranged into two groups, one on each strand. One group is punctuated by numerous non-coding intergenic nucleotides, while the other group is tightly packed, having no non-coding nucleotides, leading to speculation that there are two transcription units with differing modes of expression. The phylogenetic position of the Chaetognatha within the Metazoa has long been uncertain, with conflicting or equivocal results from various morphological analyses and rRNA sequence comparisons. Comparisons here of amino acid sequences from mitochondrially encoded proteins give a single most parsimonious tree that supports a position of Chaetognatha as sister to the protostomes studied here. From this, one can more clearly interpret the patterns of evolution of various developmental features, especially regarding the embryological fate of the blastopore.

  10. Capturing Snapshots of APE1 Processing DNA Damage

    PubMed Central

    Freudenthal, Bret D.; Beard, William A.; Cuneo, Matthew J.; Dyrkheeva, Nadezhda S.; Wilson, Samuel H.

    2015-01-01

    DNA apurinic-apyrimidinic (AP) sites are prevalent non-coding threats to genomic stability and are processed by AP endonuclease 1 (APE1). APE1 incises the AP-site phosphodiester backbone, generating a DNA repair intermediate that is potentially cytotoxic. The molecular events of the incision reaction remain elusive due in part to limited structural information. We report multiple high-resolution human APE1:DNA structures that divulge novel features of the APE1 reaction, including the metal binding site, nucleophile, and arginine clamps that mediate product release. We also report APE1:DNA structures with a T:G mismatch 5′ to the AP-site, representing a clustered lesion occurring in methylated CpG dinucleotides. These reveal that APE1 molds the T:G mismatch into a unique Watson-Crick like geometry that distorts the active site reducing incision. These snapshots provide mechanistic clarity for APE1, while affording a rational framework to manipulate biological responses to DNA damage. PMID:26458045

  11. OnTheFly: a database of Drosophila melanogaster transcription factors and their binding sites.

    PubMed

    Shazman, Shula; Lee, Hunjoong; Socol, Yakov; Mann, Richard S; Honig, Barry

    2014-01-01

    We present OnTheFly (http://bhapp.c2b2.columbia.edu/OnTheFly/index.php), a database comprising a systematic collection of transcription factors (TFs) of Drosophila melanogaster and their DNA-binding sites. TFs predicted in the Drosophila melanogaster genome are annotated and classified and their structures, obtained via experiment or homology models, are provided. All known preferred TF DNA-binding sites obtained from the B1H, DNase I and SELEX methodologies are presented. DNA shape parameters predicted for these sites are obtained from a high throughput server or from crystal structures of protein-DNA complexes where available. An important feature of the database is that all DNA-binding domains and their binding sites are fully annotated in a eukaryote using structural criteria and evolutionary homology. OnTheFly thus provides a comprehensive view of TFs and their binding sites that will be a valuable resource for deciphering non-coding regulatory DNA.

  12. Organisation of the plant genome in chromosomes.

    PubMed

    Heslop-Harrison, J S Pat; Schwarzacher, Trude

    2011-04-01

    The plant genome is organized into chromosomes that provide the structure for the genetic linkage groups and allow faithful replication, transcription and transmission of the hereditary information. Genome sizes in plants are remarkably diverse, with a 2350-fold range from 63 to 149,000 Mb, divided into n=2 to n= approximately 600 chromosomes. Despite this huge range, structural features of chromosomes like centromeres, telomeres and chromatin packaging are well-conserved. The smallest genomes consist of mostly coding and regulatory DNA sequences present in low copy, along with highly repeated rDNA (rRNA genes and intergenic spacers), centromeric and telomeric repetitive DNA and some transposable elements. The larger genomes have similar numbers of genes, with abundant tandemly repeated sequence motifs, and transposable elements alone represent more than half the DNA present. Chromosomes evolve by fission, fusion, duplication and insertion events, allowing evolution of chromosome size and chromosome number. A combination of sequence analysis, genetic mapping and molecular cytogenetic methods with comparative analysis, all only becoming widely available in the 21st century, is elucidating the exact nature of the chromosome evolution events at all timescales, from the base of the plant kingdom, to intraspecific or hybridization events associated with recent plant breeding. As well as being of fundamental interest, understanding and exploiting evolutionary mechanisms in plant genomes is likely to be a key to crop development for food production. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.

  13. The Use and Effectiveness of Triple Multiplex System for Coding Region Single Nucleotide Polymorphism in Mitochondrial DNA Typing of Archaeologically Obtained Human Skeletons from Premodern Joseon Tombs of Korea

    PubMed Central

    Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon

    2015-01-01

    A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis of coding region SNPs could also be applied to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. In the study of Joseon skeleton samples, the mtDNA haplogroup determined by coding region SNP markers fell within the same haplogroup assigned by sequence analysis of the control region. Considering that control region mtDNA sequencing of ancient samples in previous studies has produced a considerable number of errors, coding region SNP analysis can be used as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190

  14. A compositional segmentation of the human mitochondrial genome is related to heterogeneities in the guanine mutation rate

    PubMed Central

    Samuels, David C.; Boys, Richard J.; Henderson, Daniel A.; Chinnery, Patrick F.

    2003-01-01

    We applied a hidden Markov model segmentation method to the human mitochondrial genome to identify patterns in the sequence, to compare these patterns to the gene structure of mtDNA and to see whether these patterns reveal additional characteristics important for our understanding of genome evolution, structure and function. Our analysis identified three segmentation categories based upon the sequence transition probabilities. Category 2 segments corresponded to the tRNA and rRNA genes, with a greater strand-symmetry in these segments. Category 1 and 3 segments covered the protein-coding genes and almost all of the non-coding D-loop. Compared to category 1, the mtDNA segments assigned to category 3 had much lower guanine abundance. A comparison to two independent databases of mitochondrial mutations and polymorphisms showed that the high substitution rate of guanine in human mtDNA is largest in the category 3 segments. Analysis of synonymous mutations showed the same pattern. This suggests that this heterogeneity in the mutation rate is partly independent of respiratory chain function and is a direct property of the genome sequence itself. This has important implications for our understanding of mtDNA evolution and its use as a ‘molecular clock’ to determine the rate of population and species divergence. PMID:14530452

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerr, J.M.; Fisher, L.W.; Termine, J.D.

    The authors have isolated and partially sequenced the human bone sialoprotein gene (IBSP). IBSP has been sublocalized by in situ hybridization to chromosome 4q38-q31 and is composed of six small exons (51 to 159 bp) and one large exon (~2.6 kb). The intron/exon junctions defined by sequence analysis are of class O, retaining an intact coding triplet. Sequence analysis of the 5′ upstream region revealed a TATAA (nucleotides -30 to -25 from the transcriptional start point) and a CCAAT (nucleotides -56 to -52) box, both in the reverse orientation. Intron 1 contains interesting structural elements composed of polypyrimidine repeats followed by a poly(AC)n tract. Both types of structural elements have been detected in promoter regions of other genes and have been implicated in transcriptional regulation. Several differences between the previously published cDNA sequence and the authors' sequence have been identified, most of which are contained within the untranslated exon 1. Three base revisions in the coding region include a G to T (Gly to Val, amino acid 195), T to C (Val to Ala, amino acid 268), and T to A (Glu to Asp, amino acid 270). In conclusion, the genomic organization and potential regulatory elements of human IBSP have been elucidated. 42 refs., 4 figs., 1 tab.

  16. Organization and transient expression of the gene for human U11 snRNA

    PubMed Central

    Suter-Crazzolara, Clemens; Keller, Walter

    1991-01-01

    The nucleotide sequence of U11 small nuclear RNA, a minor U RNA from HeLa cells, was determined. Computer analysis of the sequence (135 residues) predicts two strong hairpin loops which are separated by seventeen nucleotides containing an Sm binding site (AAUUUUUUGG). A synthetic gene was constructed in which the coding region of U11 RNA is under the control of a T7 promoter. This vector can be used to produce U11 RNA in vitro. Southern hybridization and PCR analysis of HeLa genomic DNA suggest that U11 RNA is encoded by a single copy gene, and that at least three genomic regions could be U11 RNA pseudogenes. A HeLa genomic copy of a U11 gene was isolated by inverted PCR. This gene contains the U11 RNA coding sequence and several sequence elements unique for the U RNA genes. These include a Distal Sequence Element (DSE, ATTTGCATA) present between positions −215 and −223 relative to the start of transcription; a Proximal Sequence Element (PSE, TTCACCTTTACCAAAAATG) located between positions −43 and −63; and a 3′ box (GTTAGGCGAAATATTA) between positions +150 and +166. Transfection of HeLa cells with this gene revealed that it is functioning in vivo and can produce U11 RNA. PMID:1820214

  17. Three-dimensional integral imaging displays using a quick-response encoded elemental image array: an overview

    NASA Astrophysics Data System (ADS)

    Markman, A.; Javidi, B.

    2016-06-01

    Quick-response (QR) codes are barcodes that can store information such as numeric data and hyperlinks. The QR code can be scanned using a QR code reader, such as those built into smartphone devices, revealing the information stored in the code. Moreover, the QR code is robust to noise, rotation, and illumination when scanning due to error correction built in the QR code design. Integral imaging is an imaging technique used to generate a three-dimensional (3D) scene by combining the information from two-dimensional (2D) elemental images (EIs) each with a different perspective of a scene. Transferring these 2D images in a secure manner can be difficult. In this work, we overview two methods to store and encrypt EIs in multiple QR codes. The first method uses run-length encoding with Huffman coding and the double-random-phase encryption (DRPE) to compress and encrypt an EI. This information is then stored in a QR code. An alternative compression scheme is to perform photon-counting on the EI prior to compression. Photon-counting is a non-linear transformation of data that creates redundant information thus improving image compression. The compressed data is encrypted using the DRPE. Once information is stored in the QR codes, it is scanned using a smartphone device. The information scanned is decompressed and decrypted and an EI is recovered. Once all EIs have been recovered, a 3D optical reconstruction is generated.
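
    The run-length encoding stage named in the abstract can be illustrated with a minimal Python sketch. It covers only that one stage: Huffman coding, DRPE encryption and QR packing are omitted, the function names are hypothetical, and the pixel row is invented for illustration rather than taken from the paper.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of equal values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out

# Invented example row: long runs of identical values encode into few pairs.
row = [0, 0, 0, 0, 1, 1, 0, 0, 0, 2]
encoded = rle_encode(row)
assert rle_decode(encoded) == row  # the encoding is lossless
```

    Rows dominated by long runs of a single value shrink the most, which is consistent with the abstract's remark that redundant information improves compression.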

  18. Variability in interhospital trauma data coding and scoring: A challenge to the accuracy of aggregated trauma registries.

    PubMed

    Arabian, Sandra S; Marcus, Michael; Captain, Kevin; Pomphrey, Michelle; Breeze, Janis; Wolfe, Jennefer; Bugaev, Nikolay; Rabinovici, Reuven

    2015-09-01

    Analyses of data aggregated in state and national trauma registries provide the platform for clinical, research, development, and quality improvement efforts in trauma systems. However, the interhospital variability and accuracy in data abstraction and coding have not yet been directly evaluated. This multi-institutional, Web-based, anonymous study examines interhospital variability and accuracy in data coding and scoring by registrars. Eighty-two American College of Surgeons (ACS)/state-verified Level I and II trauma centers were invited to determine different data elements including diagnostic, procedure, and Abbreviated Injury Scale (AIS) coding as well as selected National Trauma Data Bank definitions for the same fictitious case. Variability and accuracy in data entries were assessed by the maximal percent agreement among the registrars for the tested data elements, and 95% confidence intervals were computed to compare this level of agreement to the ideal value of 100%. Variability and accuracy in all elements were compared (χ² testing) based on Trauma Quality Improvement Program (TQIP) membership, level of trauma center, ACS verification, and registrar's certifications. Fifty registrars (61%) completed the survey. The overall accuracy for all tested elements was 64%. Variability was noted in all examined parameters except for the place of occurrence code in all groups and the lower extremity AIS code in Level II trauma centers and in the Certified Specialist in Trauma Registry- and Certified Abbreviated Injury Scale Specialist-certified registrar groups. No differences in variability were noted when groups were compared based on TQIP membership, level of center, ACS verification, and registrar's certifications, except for prehospital Glasgow Coma Scale (GCS), where TQIP respondents agreed more than non-TQIP centers (p = 0.004). There is variability and inaccuracy in interhospital data coding and scoring of injury information. This finding casts doubt on the validity of registry data used in all aspects of trauma care and injury surveillance.
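
    One plausible reading of "maximal percent agreement" is the share of registrars who gave the most common answer for a data element. The Python sketch below follows that reading. The abstract does not say how the 95% confidence intervals were computed, so the Wilson score interval used here is an assumption, the function names are hypothetical, and the responses are invented.

```python
from collections import Counter
from math import sqrt

def max_percent_agreement(responses):
    """Return the most common answer and the fraction of respondents giving it."""
    top_value, top_count = Counter(responses).most_common(1)[0]
    return top_value, top_count / len(responses)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, not from the study)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Invented answers from 50 registrars for one data element.
responses = ["A"] * 30 + ["B"] * 14 + ["C"] * 6
value, agreement = max_percent_agreement(responses)
low, high = wilson_interval(30, len(responses))
```

    An interval whose upper bound stays below 1.0 indicates that agreement on that element falls short of the ideal value of 100%.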

  19. The control of lambda DNA terminase synthesis.

    PubMed Central

    Murialdo, H; Davidson, A; Chow, S; Gold, M

    1987-01-01

    Nu1 and A, the genes coding for bacteriophage lambda DNA terminase, rank among the most poorly translated genes expressed in E. coli. To understand the reason for this low level of translation, the genes were cloned into plasmids and their expression measured. In addition, the wild type DNA sequences immediately preceding the genes were reduced and modified. It was found that the elements that control translation are contained in the 100 base pairs upstream from the initiation codon. Interchanging these upstream sequences with those of an efficiently translated gene dramatically increased the translation of terminase subunits. It seems unlikely that the rare codons present in the genes or any feature of their mRNA secondary structure play a role in the control of their translation. The elimination of cos from plasmids containing Nu1 and A also resulted in an increase in terminase production. This result suggests a role for cos in the control of late gene expression. The terminase subunit overproducer strains are potentially very useful for the design of improved DNA packaging and cosmid mapping techniques. PMID:3029667

  20. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    PubMed

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties in generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored (and presumably not restricted to Paramecium) association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
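
    Screening 3' UTRs for candidate 8-mers, as the abstract describes, can be sketched in Python as a plain substring search. This is not the authors' pipeline: the function names are hypothetical, and both the 8-mer and the sequences are placeholders invented for illustration.

```python
def utrs_containing(kmer, utrs):
    """List the names of UTRs whose sequence contains the k-mer (case-insensitive)."""
    kmer = kmer.upper()
    return [name for name, seq in utrs.items() if kmer in seq.upper()]

def screen(kmers, utrs):
    """Map each candidate k-mer to the UTRs in which it occurs."""
    return {kmer: utrs_containing(kmer, utrs) for kmer in kmers}

# Placeholder data: not real Paramecium sequences or a real motif.
toy_utrs = {
    "geneA": "ttACGTACGTaa",
    "geneB": "GGGGCCCC",
    "geneC": "ACGTACGTACGT",
}
hits = screen(["ACGTACGT", "TTTTTTTT"], toy_utrs)
```

    Comparing the hit lists between gene classes (for example ribosomal versus other genes) would be the next step in this kind of screen; that comparison is left out of the sketch.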
