Science.gov

Sample records for cis-regulatory motif directs

  1. Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs

    PubMed Central

    Ivan, Andra; Halfon, Marc S; Sinha, Saurabh

    2008-01-01

    We consider the problem of predicting cis-regulatory modules without knowledge of motifs. We formulate this problem in a pragmatic setting, and create over 30 new data sets, using Drosophila modules, to use as a 'benchmark'. We propose two new methods for the problem, and evaluate these, as well as two existing methods, on our benchmark. We find that the challenge of predicting cis-regulatory modules ab initio, without any input of relevant motifs, is a realizable goal. PMID:18226245

  2. An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes.

    PubMed

    Liu, Bingqiang; Zhang, Hanyuan; Zhou, Chuan; Li, Guojun; Fennell, Anne; Wang, Guanghui; Kang, Yu; Liu, Qi; Ma, Qin

    2016-08-09

    Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve more slowly than their surrounding non-functional sequences. Its application, however, faces several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation procedure, an additional promoter scoring and pruning method and an integration of six existing motif finding algorithms as basic motif search engines. Specifically, we collected orthologous genes from available prokaryotic genomes and built the orthologous regulatory regions based on sequence similarity of promoter regions. This procedure made full use of the large-scale genomic data and taxonomy information and filtered out the promoters with limited contribution to produce a high quality orthologous promoter set. The promoter scoring and pruning is implemented through motif voting by a set of complementary predicting tools that mine as many motif candidates as possible and simultaneously eliminate the effect of random noise. We have applied the framework to the Escherichia coli K12 genome and evaluated the prediction performance through comparison with seven existing programs. This evaluation was systematically carried out at the nucleotide and binding site level, and the results showed that MP(3) consistently outperformed other popular motif finding tools. We have integrated MP(3) into our motif identification and analysis server DMINDA, allowing users to efficiently identify and analyze motifs in 2,072 completely sequenced prokaryotic genomes. The performance evaluation indicated that MP(3) is effective for predicting regulatory motifs in prokaryotic genomes. Its application may enhance

  3. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.

  4. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    NASA Astrophysics Data System (ADS)

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-03-01

    Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria.

  5. Mutagenesis of GATA motifs controlling the endoderm regulator elt-2 reveals distinct dominant and secondary cis-regulatory elements

    PubMed Central

    Du, Lawrence; Tracy, Sharon

    2016-01-01

    Cis-regulatory elements (CREs) are crucial links in developmental gene regulatory networks, but in many cases, it can be difficult to discern whether similar CREs are functionally equivalent. We found that despite similar conservation and binding capability to upstream activators, different GATA cis-regulatory motifs within the promoter of the C. elegans endoderm regulator elt-2 play distinctive roles in activating and modulating gene expression throughout development. We fused wild-type and mutant versions of the elt-2 promoter to a gfp reporter and inserted these constructs as single copies into the C. elegans genome. We then counted early embryonic gfp transcripts using single-molecule RNA FISH (smFISH) and quantified gut GFP fluorescence. We determined that a single primary dominant GATA motif located -527 bp upstream of the elt-2 start codon was necessary for both embryonic activation and later maintenance of transcription, while nearby secondary GATA motifs played largely subtle roles in modulating postembryonic levels of elt-2. Mutation of the primary activating site increased low-level spatiotemporally ectopic stochastic transcription, indicating that this site acts repressively in non-endoderm cells. Our results reveal that CREs with similar GATA factor binding affinities in close proximity can play very divergent context-dependent roles in regulating the expression of a developmentally critical gene in vivo. PMID:26896592

  6. Mutagenesis of GATA motifs controlling the endoderm regulator elt-2 reveals distinct dominant and secondary cis-regulatory elements.

    PubMed

    Du, Lawrence; Tracy, Sharon; Rifkin, Scott A

    2016-04-01

    Cis-regulatory elements (CREs) are crucial links in developmental gene regulatory networks, but in many cases, it can be difficult to discern whether similar CREs are functionally equivalent. We found that despite similar conservation and binding capability to upstream activators, different GATA cis-regulatory motifs within the promoter of the C. elegans endoderm regulator elt-2 play distinctive roles in activating and modulating gene expression throughout development. We fused wild-type and mutant versions of the elt-2 promoter to a gfp reporter and inserted these constructs as single copies into the C. elegans genome. We then counted early embryonic gfp transcripts using single-molecule RNA FISH (smFISH) and quantified gut GFP fluorescence. We determined that a single primary dominant GATA motif located 527 bp upstream of the elt-2 start codon was necessary for both embryonic activation and later maintenance of transcription, while nearby secondary GATA motifs played largely subtle roles in modulating postembryonic levels of elt-2. Mutation of the primary activating site increased low-level spatiotemporally ectopic stochastic transcription, indicating that this site acts repressively in non-endoderm cells. Our results reveal that CREs with similar GATA factor binding affinities in close proximity can play very divergent context-dependent roles in regulating the expression of a developmentally critical gene in vivo. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Evolution of New cis-Regulatory Motifs Required for Cell-Specific Gene Expression in Caenorhabditis

    PubMed Central

    Félix, Marie-Anne

    2016-01-01

    Patterning of C. elegans vulval cell fates relies on inductive signaling. In this induction event, a single cell, the gonadal anchor cell, secretes LIN-3/EGF and induces three out of six competent precursor cells to acquire a vulval fate. We previously showed that this developmental system is robust to a four-fold variation in lin-3/EGF genetic dose. Here, using single-molecule FISH, we find that the mean level of expression of lin-3 in the anchor cell is remarkably conserved. No change in lin-3 expression level could be detected among C. elegans wild isolates and only a low level of change (less than 30%) in the Caenorhabditis genus and in Oscheius tipulae. In C. elegans, lin-3 expression in the anchor cell is known to require three transcription factor binding sites, specifically two E-boxes and a nuclear-hormone-receptor (NHR) binding site. Mutation of any of these three elements in C. elegans results in a dramatic decrease in lin-3 expression. Yet only a single E-box is found in the Drosophilae supergroup of Caenorhabditis species, including C. angaria, while the NHR-binding site likely only evolved at the base of the Elegans group. We find that a transgene from C. angaria bearing a single E-box is sufficient for normal expression in C. elegans. Even a short 58 bp cis-regulatory fragment from C. angaria with this single E-box is able to replace the three transcription factor binding sites at the endogenous C. elegans lin-3 locus, resulting in the wild-type expression level. Thus, regulatory evolution occurring in cis within a 58 bp lin-3 fragment results in a strict requirement for the NHR binding site and a second E-box in C. elegans. This single-cell, single-molecule, quantitative and functional evo-devo study demonstrates that conserved expression levels can hide extensive change in cis-regulatory site requirements and highlights the evolution of new cis-regulatory elements required for cell-specific gene expression. PMID:27588814

  8. Evidence for the role of transposons in the recruitment of cis-regulatory motifs during the evolution of C4 photosynthesis.

    PubMed

    Cao, Chensi; Xu, Jiajia; Zheng, Guangyong; Zhu, Xin-Guang

    2016-03-08

    C4 photosynthesis evolved from C3 photosynthesis and has higher light, water, and nitrogen use efficiencies. Several C4 photosynthesis genes show cell-specific expression patterns, which are required for these high resource-use efficiencies. However, the mechanisms underlying the evolution of cis-regulatory elements that control these cell-specific expression patterns remain elusive. In the present study, we tested the hypothesis that the cis-regulatory motifs related to C4 photosynthesis genes were recruited from non-photosynthetic genes and further examined potential mechanisms facilitating this recruitment. We examined 65 predicted bundle sheath cell-specific motifs, 17 experimentally validated cell-specific cis-regulatory elements, and 1,034 motifs derived from gene regulatory networks. Approximately 7, 5, and 1,000 of these three categories of motifs, respectively, were apparently recruited during the evolution of C4 photosynthesis. In addition, we checked 1) the distance between the acceptors and the donors of potentially recruited motifs in a chromosome, and 2) whether the potentially recruited motifs reside within the overlapping region of transposable elements and the promoter of donor genes. The results showed that 7, 4, and 658 of the potentially recruited motifs might have moved via the transposable elements. Furthermore, the potentially recruited motifs showed higher binding affinity to transcription factors compared to randomly generated sequences of the same length as the motifs. This study provides molecular evidence supporting the hypothesis that transposon-driven recruitment of pre-existing cis-regulatory elements from non-photosynthetic genes into photosynthetic genes plays an important role during C4 evolution. The findings of the present study coincide with the observed repetitive emergence of C4 during evolution.

  9. A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements.

    PubMed

    Henry, Kelli F; Kawashima, Tomokazu; Goldberg, Robert B

    2015-06-01

    Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10 bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. A homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.
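
    The five motif sequences quoted in the abstract above lend themselves to a simple exact-match scan. The following is only a minimal sketch, not the authors' method: the function names, the dictionary labels, and the toy input sequence are illustrative assumptions, and real cis-regulatory analysis would normally allow degenerate positions (for example, via the consensus sequences the abstract mentions but does not list).

```python
import re

# Motif sequences as quoted in the abstract (5'->3').
# The dictionary keys are illustrative labels, not official identifiers.
MOTIFS = {
    "10-bp": "GAAAAGCGAA",
    "10-bp-like": "GAAAAACGAA",
    "10-bp-related": "GAAAACCACA",
    "Region 2 (partial)": "TTGGT",
    "Fifth": "GAGTTA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(sequence, motifs=MOTIFS):
    """Return sorted (start, strand, name) tuples for exact motif matches.

    Positions are 0-based starts on the forward strand. A "-" hit means
    the reverse complement of the motif occurs at that forward position.
    """
    sequence = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            # A zero-width lookahead reports overlapping occurrences too.
            for match in re.finditer(f"(?={pattern})", sequence):
                hits.append((match.start(), strand, name))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the G564 upstream region.
    for hit in find_motifs("CCGAAAAGCGAATTGAGTTACC"):
        print(hit)
```

    Exact matching keeps the sketch dependency-free; a position weight matrix would be the usual next step once consensus sequences are available.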

  10. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  11. Retinoic acid-induced down-regulation of the interleukin-2 promoter via cis-regulatory sequences containing an octamer motif.

    PubMed Central

    Felli, M P; Vacca, A; Meco, D; Screpanti, I; Farina, A R; Maroder, M; Martinotti, S; Petrangeli, E; Frati, L; Gulino, A

    1991-01-01

    Retinoic acid (RA) is known to influence the proliferation and differentiation of a wide variety of transformed and developing cells. We found that RA and the specific RA receptor (RAR) ligand Ch55 inhibited the phorbol ester and calcium ionophore-induced expression of the T-cell growth factor interleukin-2 (IL-2) gene. Expression of transiently transfected chloramphenicol acetyltransferase vectors containing the 5'-flanking region of the IL-2 gene was also inhibited by RA. RA-induced down-regulation of the IL-2 enhancer is mediated by RAR, since overexpression of transfected RARs increased RA sensitivity of the IL-2 promoter. Functional analysis of chloramphenicol acetyltransferase vectors containing either internal deletion mutants of the region from -317 to +47 bp of the IL-2 enhancer or multimerized cis-regulatory elements showed that the RA-responsive element in the IL-2 promoter mapped to sequences containing an octamer motif. RAR also inhibited the transcriptional activity of the octamer motif of the immunoglobulin heavy chain enhancer. In spite of the transcriptional inhibition of the IL-2 octamer motif, RA did not decrease the in vitro DNA-binding capability of octamer-1 protein. These results identify a regulatory pathway within the IL-2 promoter which involves the octamer motif and RAR. PMID:1652063

  12. Promoter analysis reveals cis-regulatory motifs associated with the expression of the WRKY transcription factor CrWRKY1 in Catharanthus roseus.

    PubMed

    Yang, Zhirong; Patra, Barunava; Li, Runzhi; Pattanaik, Sitakanta; Yuan, Ling

    2013-12-01

    WRKY transcription factors (TFs) are emerging as an important group of regulators of plant secondary metabolism. However, the cis-regulatory elements associated with their regulation have not been well characterized. We have previously demonstrated that CrWRKY1, a member of subgroup III of the WRKY TF family, regulates biosynthesis of terpenoid indole alkaloids in the ornamental and medicinal plant, Catharanthus roseus. Here, we report the isolation and functional characterization of the CrWRKY1 promoter. In silico analysis of the promoter sequence reveals the presence of several potential TF binding motifs, indicating the involvement of additional TFs in the regulation of the TIA pathway. The CrWRKY1 promoter can drive the expression of a β-glucuronidase (GUS) reporter gene in native (C. roseus protoplasts and transgenic hairy roots) and heterologous (transgenic tobacco seedlings) systems. Analysis of 5'- or 3'-end deletions indicates that the sequence located between positions -140 to -93 bp and -3 to +113 bp, relative to the transcription start site, is critical for promoter activity. Mutation analysis shows that two overlapping as-1 elements and a CT-rich motif contribute significantly to promoter activity. The CrWRKY1 promoter is induced in response to methyl jasmonate (MJ) treatment and the promoter region between -230 and -93 bp contains a putative MJ-responsive element. The CrWRKY1 promoter can potentially be used as a tool to isolate novel TFs involved in the regulation of the TIA pathway.

  13. A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

    DOE PAGES

    Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.

    2015-03-22

    Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10-bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.

  14. Identification, occurrence, and validation of DRE and ABRE Cis-regulatory motifs in the promoter regions of genes of Arabidopsis thaliana.

    PubMed

    Mishra, Sonal; Shukla, Aparna; Upadhyay, Swati; Sanchita; Sharma, Pooja; Singh, Seema; Phukan, Ujjal J; Meena, Abha; Khan, Feroz; Tripathi, Vineeta; Shukla, Rakesh Kumar; Shrama, Ashok

    2014-04-01

    Plants possess a complex co-regulatory network which helps them to elicit a response under diverse adverse conditions. We used an in silico approach to identify the genes with both DRE and ABRE motifs in their promoter regions in Arabidopsis thaliana. Our results showed that Arabidopsis contains a set of 2,052 genes with ABRE and DRE motifs in their promoter regions. Approximately 72% or more of the total predicted 2,052 genes had a gap distance of less than 400 bp between DRE and ABRE motifs. For positional orientation of the DRE and ABRE motifs, we found that the DR form (one in direct and the other one in reverse orientation) was more prevalent than other forms. These predicted 2,052 genes include 155 transcription factors. Using microarray data from The Arabidopsis Information Resource (TAIR) database, we present 44 transcription factors out of 155 which are upregulated by more than twofold in response to osmotic stress and ABA treatment. Fifty-one transcripts from those predicted above were validated using semiquantitative expression analysis to support the microarray data in TAIR. Taken together, we report a set of genes containing both DRE and ABRE motifs in their promoter regions in A. thaliana, which can be useful to understand the role of ABA under osmotic stress conditions. © 2013 Institute of Botany, Chinese Academy of Sciences.
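
    The gap-distance statistic reported in the abstract above (the share of genes whose DRE and ABRE motifs lie less than 400 bp apart) can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' pipeline: the abstract does not say how the gap is measured, so the code assumes it is the smallest distance between motif start positions, and the gene names and coordinates in the usage example are hypothetical.

```python
def min_motif_gap(dre_starts, abre_starts):
    """Smallest absolute distance (bp) between any DRE and any ABRE start.

    Returns None when either motif is absent from the promoter.
    Measuring from start positions is an assumption of this sketch.
    """
    if not dre_starts or not abre_starts:
        return None
    return min(abs(d - a) for d in dre_starts for a in abre_starts)


def fraction_within_gap(promoters, max_gap=400):
    """Fraction of genes carrying both motifs whose minimum gap is < max_gap.

    `promoters` maps a gene name to (dre_starts, abre_starts); genes that
    lack either motif are excluded from the denominator.
    """
    gaps = [min_motif_gap(dre, abre) for dre, abre in promoters.values()]
    gaps = [gap for gap in gaps if gap is not None]
    if not gaps:
        return 0.0
    return sum(gap < max_gap for gap in gaps) / len(gaps)


if __name__ == "__main__":
    # Hypothetical promoters with made-up motif start coordinates.
    example = {
        "geneA": ([100], [350]),
        "geneB": ([50], [900]),
        "geneC": ([10, 700], [720]),
        "geneD": ([5], []),  # no ABRE, so excluded from the fraction
    }
    print(fraction_within_gap(example))
```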

  15. Twine: display and analysis of cis-regulatory modules

    PubMed Central

    Pearson, Joseph C.; Crews, Stephen T.

    2013-01-01

    Summary: Many algorithms analyze enhancers for overrepresentation of known and novel motifs, with the goal of identifying binding sites for direct regulators of gene expression. Twine is a Java GUI with multiple graphical representations (‘Views’) of enhancer alignments that displays motifs, as IUPAC consensus sequences or position frequency matrices, in the context of phylogenetic conservation to facilitate cis-regulatory element discovery. Thresholds of phylogenetic conservation and motif stringency can be altered dynamically to facilitate detailed analysis of enhancer architecture. Views can be exported to vector graphics programs to generate high-quality figures for publication. Twine can be extended via Java plugins to manipulate alignments and analyze sequences. Availability: Twine is freely available as a compiled Java .jar package or Java source code at http://labs.bio.unc.edu/crews/twine/. Contact: steve_crews@unc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23658420

  16. cis-Regulatory control circuits in development.

    PubMed

    Howard, Meredith L; Davidson, Eric H

    2004-07-01

    During development, an organism undergoes many rounds of pattern formation, generating ever-greater complexity with each ensuing round of cell division and specification. The instructions for executing this process are encoded in the cis-regulatory modules that direct the expression of developmental transcription factors and signaling molecules. Each transcription factor binding site within a cis-regulatory module contributes information about when, where, or how much a gene is turned on, and by dissecting the modules driving a given gene, all the inputs governing expression of the gene can be accurately identified. Furthermore, by mapping the output of each gene to the inputs of other genes, it is possible to reverse engineer developmental circuits and even whole networks. At this higher level of organization, common bilaterian strategies for specifying progenitor fields, locking down regulatory states, and driving development forward emerge.

  17. Cis-regulatory Elements and Human Evolution

    PubMed Central

    Siepel, Adam

    2014-01-01

    Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider “new frontiers” in this field stemming from recent research on transcriptional regulation. PMID:25218861

  18. A cis-Regulatory Signature for Chordate Anterior Neuroectodermal Genes

    PubMed Central

    Christiaen, Lionel; Joly, Jean-Stéphane

    2010-01-01

    One of the striking findings of comparative developmental genetics was that expression patterns of core transcription factors are extraordinarily conserved in bilaterians. However, it remains unclear whether cis-regulatory elements of their target genes also exhibit common signatures associated with conserved embryonic fields. To address this question, we focused on genes that are active in the anterior neuroectoderm and non-neural ectoderm of the ascidian Ciona intestinalis. Following the dissection of a prototypic anterior placodal enhancer, we searched all genomic conserved non-coding elements for duplicated motifs around genes showing anterior neuroectodermal expression. Strikingly, we identified an over-represented pentamer motif corresponding to the binding site of the homeodomain protein OTX, which plays a pivotal role in the anterior development of all bilaterian species. Using an in vivo reporter gene assay, we observed that 10 of 23 candidate cis-regulatory elements containing duplicated OTX motifs are active in the anterior neuroectoderm, thus showing that this cis-regulatory signature is predictive of neuroectodermal enhancers. These results show that a common cis-regulatory signature corresponding to K50-Paired homeodomain transcription factors is found in non-coding sequences flanking anterior neuroectodermal genes in chordate embryos. Thus, field-specific selector genes impose architectural constraints in the form of combinations of short tags on their target enhancers. This could account for the strong evolutionary conservation of the regulatory elements controlling field-specific selector genes responsible for body plan formation. PMID:20419150

  19. Decoding cis-regulatory systems in ascidians.

    PubMed

    Kusakabe, Takehiro

    2005-02-01

    Ascidians, or sea squirts, are lower chordates, and share basic gene repertoires and many characteristics, both developmental and physiological, with vertebrates. Therefore, decoding cis-regulatory systems in ascidians will contribute toward elucidating the genetic regulatory systems underlying the developmental and physiological processes of vertebrates. cis-Regulatory DNAs can also be used for tissue-specific genetic manipulation, a powerful tool for studying ascidian development and physiology. Because the ascidian genome is compact compared with vertebrate genomes, both intergenic regions and introns are relatively small in ascidians. Short upstream intergenic regions contain a complete set of cis-regulatory elements for spatially regulated expression of a majority of ascidian genes. These features of the ascidian genome are a great advantage in identifying cis-regulatory sequences and in analyzing their functions. Function of cis-regulatory DNAs has been analyzed for a number of tissue-specific and developmentally regulated genes of ascidians by introducing promoter-reporter fusion constructs into ascidian embryos. The availability of the whole genome sequences of the two Ciona species, Ciona intestinalis and Ciona savignyi, facilitates comparative genomics approaches to identify cis-regulatory DNAs. Recent studies demonstrate that computational methods can help identify cis-regulatory elements in the ascidian genome. This review presents a comprehensive list of ascidian genes whose cis-regulatory regions have been subjected to functional analysis, and highlights the recent advances in bioinformatics and comparative genomics approaches to cis-regulatory systems in ascidians.

  20. The Role of cis Regulatory Evolution in Maize Domestication

    PubMed Central

    Lemmon, Zachary H.; Bukowski, Robert; Sun, Qi; Doebley, John F.

    2014-01-01

    Gene expression differences between divergent lineages caused by modification of cis regulatory elements are thought to be important in evolution. We assayed genome-wide cis and trans regulatory differences between maize and its wild progenitor, teosinte, using deep RNA sequencing in F1 hybrid and parent inbred lines for three tissue types (ear, leaf and stem). Pervasive regulatory variation was observed with approximately 70% of ∼17,000 genes showing evidence of regulatory divergence between maize and teosinte. However, many fewer genes (1,079 genes) show consistent cis differences with all sampled maize and teosinte lines. For ∼70% of these 1,079 genes, the cis differences are specific to a single tissue. The number of genes with cis regulatory differences is greatest for ear tissue, which underwent a drastic transformation in form during domestication. As expected from the domestication bottleneck, maize possesses less cis regulatory variation than teosinte with this deficit greatest for genes showing maize-teosinte cis regulatory divergence, suggesting selection on cis regulatory differences during domestication. Consistent with selection on cis regulatory elements, genes with cis effects correlated strongly with genes under positive selection during maize domestication and improvement, while genes with trans regulatory effects did not. We observed a directional bias such that genes with cis differences showed higher expression of the maize allele more often than the teosinte allele, suggesting domestication favored up-regulation of gene expression. Finally, this work documents the cis and trans regulatory changes between maize and teosinte in over 17,000 genes for three tissues. PMID:25375861
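
    The logic behind separating cis from trans effects with parents and F1 hybrids can be sketched as below. This is a simplified illustration, not the statistical procedure used in the study: in the hybrid both alleles share one trans environment, so the allelic ratio in the F1 estimates the cis effect, and whatever part of the parental difference is left over is attributed to trans. The function name, the read-count inputs, and the fixed log2 threshold are assumptions; real analyses test significance across replicates instead of applying a cutoff.

```python
import math


def classify_regulatory_divergence(parent_maize, parent_teosinte,
                                   f1_maize, f1_teosinte, threshold=1.0):
    """Split a parental expression difference into cis and trans components.

    All inputs are positive, normalized expression values (hypothetical).
    threshold is an arbitrary log2 cutoff used only for labeling.
    """
    parental = math.log2(parent_maize / parent_teosinte)  # total divergence
    cis = math.log2(f1_maize / f1_teosinte)               # allele-specific ratio in the hybrid
    trans = parental - cis                                # remainder attributed to trans
    has_cis = abs(cis) >= threshold
    has_trans = abs(trans) >= threshold
    if has_cis and has_trans:
        label = "cis+trans"
    elif has_cis:
        label = "cis only"
    elif has_trans:
        label = "trans only"
    else:
        label = "conserved"
    return {"cis": cis, "trans": trans, "label": label}


# The 4-fold parental difference persists between alleles in the hybrid.
print(classify_regulatory_divergence(400, 100, 200, 50))
# The parental difference disappears in the hybrid.
print(classify_regulatory_divergence(400, 100, 100, 100))
```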

  1. The cis-regulatory code of Hox function in Drosophila

    PubMed Central

    Sorge, Sebastian; Ha, Nati; Polychronidou, Maria; Friedrich, Jana; Bezdan, Daniela; Kaspar, Petra; Schaefer, Martin H; Ossowski, Stephan; Henz, Stefan R; Mundorf, Juliane; Rätzer, Jenny; Papagiannouli, Fani; Lohmann, Ingrid

    2012-01-01

    Precise gene expression is a fundamental aspect of organismal function and depends on the combinatorial interplay of transcription factors (TFs) with cis-regulatory DNA elements. While much is known about TF function in general, our understanding of their cell type-specific activities is still poor. To address how widely expressed transcriptional regulators modulate downstream gene activity with high cellular specificity, we have identified binding regions for the Hox TF Deformed (Dfd) in the Drosophila genome. Our analysis of architectural features within Hox cis-regulatory response elements (HREs) shows that HRE structure is essential for cell type-specific gene expression. We also find that Dfd and Ultrabithorax (Ubx), another Hox TF specifying different morphological traits, interact with non-overlapping regions in vivo, despite their similar DNA binding preferences. While Dfd and Ubx HREs exhibit comparable design principles, their motif compositions and motif-pair associations are distinct, explaining the highly selective interaction of these Hox proteins with the regulatory environment. Thus, our results uncover the regulatory code imprinted in Hox enhancers and elucidate the mechanisms underlying functional specificity of TFs in vivo. PMID:22781127

  2. The cis-regulatory code of Hox function in Drosophila.

    PubMed

    Sorge, Sebastian; Ha, Nati; Polychronidou, Maria; Friedrich, Jana; Bezdan, Daniela; Kaspar, Petra; Schaefer, Martin H; Ossowski, Stephan; Henz, Stefan R; Mundorf, Juliane; Rätzer, Jenny; Papagiannouli, Fani; Lohmann, Ingrid

    2012-08-01

    Precise gene expression is a fundamental aspect of organismal function and depends on the combinatorial interplay of transcription factors (TFs) with cis-regulatory DNA elements. While much is known about TF function in general, our understanding of their cell type-specific activities is still poor. To address how widely expressed transcriptional regulators modulate downstream gene activity with high cellular specificity, we have identified binding regions for the Hox TF Deformed (Dfd) in the Drosophila genome. Our analysis of architectural features within Hox cis-regulatory response elements (HREs) shows that HRE structure is essential for cell type-specific gene expression. We also find that Dfd and Ultrabithorax (Ubx), another Hox TF specifying different morphological traits, interact with non-overlapping regions in vivo, despite their similar DNA binding preferences. While Dfd and Ubx HREs exhibit comparable design principles, their motif compositions and motif-pair associations are distinct, explaining the highly selective interaction of these Hox proteins with the regulatory environment. Thus, our results uncover the regulatory code imprinted in Hox enhancers and elucidate the mechanisms underlying functional specificity of TFs in vivo.

  3. Cis-regulatory mutations in human disease

    PubMed Central

    2009-01-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, the prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases including diabetes, autism, Crohn's, colorectal cancer, and asthma, to name a few. PMID:19641089

  4. Cis-regulatory mutations in human disease.

    PubMed

    Epstein, Douglas J

    2009-07-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, the prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases including diabetes, autism, Crohn's, colorectal cancer, and asthma, to name a few.

  5. Enhancer divergence and cis-regulatory evolution in the human and chimp neural crest.

    PubMed

    Prescott, Sara L; Srinivasan, Rajini; Marchetto, Maria Carolina; Grishina, Irina; Narvaiza, Iñigo; Selleri, Licia; Gage, Fred H; Swigut, Tomek; Wysocka, Joanna

    2015-09-24

    cis-regulatory changes play a central role in morphological divergence, yet the regulatory principles underlying emergence of human traits remain poorly understood. Here, we use epigenomic profiling from human and chimpanzee cranial neural crest cells to systematically and quantitatively annotate divergence of craniofacial cis-regulatory landscapes. Epigenomic divergence is often attributable to genetic variation within TF motifs at orthologous enhancers, with a novel motif being most predictive of activity biases. We explore properties of this cis-regulatory change, revealing the role of particular retroelements, uncovering broad clusters of species-biased enhancers near genes associated with human facial variation, and demonstrating that cis-regulatory divergence is linked to quantitative expression differences of crucial neural crest regulators. Our work provides a wealth of candidates for future evolutionary studies and demonstrates the value of "cellular anthropology," a strategy of using in-vitro-derived embryonic cell types to elucidate both fundamental and evolving mechanisms underlying morphological variation in higher primates.

  6. Redundancy and the evolution of cis-regulatory element multiplicity.

    PubMed

    Paixão, Tiago; Azevedo, Ricardo B R

    2010-07-08

    The promoter regions of many genes contain multiple binding sites for the same transcription factor (TF). One possibility is that this multiplicity evolved through transitional forms showing redundant cis-regulation. To evaluate this hypothesis, we must disentangle the relative contributions of different evolutionary mechanisms to the evolution of binding site multiplicity. Here, we attempt to do this using a model of binding site evolution. Our model considers binding sequences and their interactions with TFs explicitly, and allows us to cast the evolution of gene networks into a neutral network framework. We then test some of the model's predictions using data from yeast. Analysis of the model suggested three candidate nonadaptive processes favoring the evolution of cis-regulatory element redundancy and multiplicity: neutral evolution in long promoters, recombination and TF promiscuity. We find that recombination rate is positively associated with binding site multiplicity in yeast. Our model also indicated that weak direct selection for multiplicity (partial redundancy) can play a major role in organisms with large populations. Our data suggest that selection for changes in gene expression level may have contributed to the evolution of multiple binding sites in yeast. We conclude that the evolution of cis-regulatory element redundancy and multiplicity is impacted by many aspects of the biology of an organism: both adaptive and nonadaptive processes, both changes in cis to binding sites and in trans to the TFs that interact with them, both the functional setting of the promoter and the population genetic context of the individuals carrying them.

  7. Prediction of Cis-Regulatory Elements Controlling Genes Differentially Expressed by Retinal and Choroidal Vascular Endothelial Cells.

    PubMed

    Choi, Dongseok; Appukuttan, Binoy; Binek, Sierra J; Planck, Stephen R; Stout, J Timothy; Rosenbaum, James T; Smith, Justine R

    2008-01-01

    Cultured endothelial cells of the human retina and choroid demonstrate distinct patterns of gene expression. We hypothesized that differential gene expression reflected differences in the interactions of transcription factors and respective cis-regulatory motif(s) in these two endothelial cell subpopulations, recognizing that motifs often exist as modules. We tested this hypothesis in silico by using TRANSFAC Professional and CisModule to identify cis-regulatory motifs and modules in genes that were differentially expressed by human retinal versus choroidal endothelial cells, as identified by analysis of a microarray data set. Motifs corresponding to eight transcription factors were significantly (p < 0.05) differentially abundant in genes that were relatively highly expressed in retinal (i.e., GCCR, HMGIY, HSF1, p53, VDR) or choroidal (i.e., E2F, YY1, ZF5) endothelial cells. Predicted cis-regulatory modules were quite different for these two groups of genes. Our findings raise the possibility of exploiting specific cis-regulatory motifs to target therapy at the ocular endothelial cell subtypes responsible for neovascular age-related macular degeneration or proliferative diabetic retinopathy.
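
    One common way to ask whether a motif is differentially abundant in a gene set is a one-sided hypergeometric test, sketched below. The abstract does not say which test produced its p-values, so this is a swapped-in technique shown for illustration only; the function name and the example counts are hypothetical. The gene set is assumed to be a subset of the background.

```python
from math import comb


def hypergeom_enrichment_p(hits_in_set, set_size, hits_in_background, background_size):
    """One-sided P(X >= hits_in_set) for X ~ Hypergeometric.

    background_size: all genes considered (the gene set is a subset of these)
    hits_in_background: background genes whose promoters carry the motif
    set_size: genes in the set of interest
    hits_in_set: genes in the set whose promoters carry the motif
    """
    total = comb(background_size, set_size)
    upper = min(set_size, hits_in_background)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / total


# Hypothetical counts: 12 of 40 set genes carry the motif vs 60 of 1000 background genes.
p = hypergeom_enrichment_p(12, 40, 60, 1000)
print(f"p = {p:.3g}")
```

    Exact integer arithmetic with math.comb keeps the sketch dependency-free; for large gene universes a statistics library would normally be used instead.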

  8. Computational discovery of soybean promoter cis-regulatory elements for the construction of soybean cyst nematode inducible synthetic promoters

    USDA-ARS's Scientific Manuscript database

    Computational methods offer great hope but limited accuracy in the prediction of functional cis-regulatory elements; improvements are needed to enable synthetic promoter design. We applied an ensemble strategy for de novo soybean cyst nematode (SCN)-inducible motif discovery among promoters of 18 co...

  9. Histone replacement marks the boundaries of cis-regulatory domains.

    PubMed

    Mito, Yoshiko; Henikoff, Jorja G; Henikoff, Steven

    2007-03-09

    Cellular memory is maintained at homeotic genes by cis-regulatory elements whose mechanism of action is unknown. We have examined chromatin at Drosophila homeotic gene clusters by measuring, at high resolution, levels of histone replacement and nucleosome occupancy. Homeotic gene clusters display conspicuous peaks of histone replacement at boundaries of cis-regulatory domains superimposed over broad regions of low replacement. Peaks of histone replacement closely correspond to nuclease-hypersensitive sites, binding sites for Polycomb and trithorax group proteins, and sites of nucleosome depletion. Our results suggest the existence of a continuous process that disrupts nucleosomes and maintains accessibility of cis-regulatory elements.

  10. Functional cis-regulatory modules encoded by mouse-specific endogenous retrovirus

    PubMed Central

    Sundaram, Vasavi; Choudhary, Mayank N. K.; Pehrsson, Erica; Xing, Xiaoyun; Fiore, Christopher; Pandey, Manishi; Maricque, Brett; Udawatta, Methma; Ngo, Duc; Chen, Yujie; Paguntalan, Asia; Ray, Tammy; Hughes, Ava; Cohen, Barak A.; Wang, Ting

    2017-01-01

    Cis-regulatory modules contain multiple transcription factor (TF)-binding sites and integrate the effects of each TF to control gene expression in specific cellular contexts. Transposable elements (TEs) are uniquely equipped to deposit their regulatory sequences across a genome, which could also contain cis-regulatory modules that coordinate the control of multiple genes with the same regulatory logic. We provide the first evidence of mouse-specific TEs that encode a module of TF-binding sites in mouse embryonic stem cells (ESCs). The majority (77%) of the individual TEs tested exhibited enhancer activity in mouse ESCs. By mutating individual TF-binding sites within the TE, we identified a module of TF-binding motifs that cooperatively enhanced gene expression. Interestingly, we also observed the same motif module in the in silico constructed ancestral TE that also acted cooperatively to enhance gene expression. Our results suggest that ancestral TE insertions might have brought in cis-regulatory modules into the mouse genome. PMID:28348391

  11. Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines

    PubMed Central

    Xu, Xing; Ji, Yongmei; Stormo, Gary D.

    2009-01-01

    An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA

  12. Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

    PubMed Central

    Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

    2011-01-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences indicates that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875

  13. Validation of skeletal muscle cis-regulatory module predictions reveals nucleotide composition bias in functional enhancers.

    PubMed

    Kwon, Andrew T; Chou, Alice Yi; Arenillas, David J; Wasserman, Wyeth W

    2011-12-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences indicates that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions.

  14. E-cadherin intron 2 contains cis-regulatory elements essential for gene expression.

    PubMed

    Stemmler, Marc P; Hecht, Andreas; Kemler, Rolf

    2005-03-01

    Cadherin-mediated cell-cell adhesion plays important roles in mouse embryonic development, and changes in cadherin expression are often linked to morphogenetic events. For proper embryonic development and organ formation, the expression of E-cadherin must be tightly regulated. Dysregulated expression during tumorigenesis confers invasiveness and metastasis. Except for the E-box motifs in the E-cadherin promoter, little is known about the existence and location of cis-regulatory elements controlling E-cadherin gene expression. We have examined putative cis-regulatory elements in the E-cadherin gene and we show a pivotal role for intron 2 in activating transcription. Upon deleting the genomic intron 2 entirely, the E-cadherin locus becomes completely inactive in embryonic stem cells and during early embryonic development. Later in development, from E11.5 onwards, the locus is activated only weakly in the absence of intron 2 sequences. We demonstrate that in differentiated epithelia, intron 2 sequences are required both to initiate transcriptional activation and additionally to maintain E-cadherin expression. Detailed analysis also revealed that expression in the yolk sac is intron 2 independent, whereas expression in the lens and the salivary glands absolutely relies on cis-regulatory sequences of intron 2. Taken together, our findings reveal a complex mechanism of gene regulation, with a vital role for the large intron 2.

  15. Comparative genomics allows the discovery of cis-regulatory elements in mosquitoes

    PubMed Central

    Sieglaff, Douglas H.; Dunn, W. Augustine; Xie, Xiaohui S.; Megy, Karyn; Marinotti, Osvaldo; James, Anthony A.

    2009-01-01

    The discovery and mapping of cis-regulatory elements is important for understanding regulation of gene transcription in mosquito vectors of human diseases. Genome sequence data are available for 3 species, Aedes aegypti, Anopheles gambiae, and Culex quinquefasciatus (Diptera: Culicidae), representing 2 subfamilies (Culicinae and Anophelinae) that are estimated to have diverged 145 to 200 million years ago. Comparative genomics tools were used to screen genomic DNA fragments located in the 5′-end flanking regions of orthologous genes. These analyses resulted in the identification of 137 sequences, designated “mosquito motifs,” 7 to 9 nucleotides in length, representing 18 families of putative cis-regulatory elements conserved significantly among the 3 species when compared to the fruit fly, Drosophila melanogaster. Forty-one of the motifs were implicated previously in experiments as sites for binding transcription factors or functioning in the regulation of mosquito gene expression. Further analyses revealed associations between specific motifs and expression profiles, particularly in those genes that show increased or decreased mRNA abundance in females following a blood meal, and those accumulating transcription products exclusively or preferentially in the midgut, fat bodies, or ovaries. These results validate the methodology and support a relationship between the discovered motifs and the conservation of hematophagy in mosquitoes. PMID:19211788
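
    The core idea of screening orthologous 5′ flanking regions for short conserved sequences can be sketched as a shared k-mer search. This is only a rough illustration under stated assumptions: the study's own tools and its significance test against Drosophila melanogaster are not reproduced, and the function names and toy sequences are hypothetical. Only the 7 to 9 nucleotide length range comes from the abstract.

```python
def kmers(seq, k):
    """Set of all k-length substrings of a DNA sequence (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(flanks, k_values=(7, 8, 9)):
    """For each k, return the k-mers present in every species' flanking sequence.

    flanks maps a species label to the 5' flanking sequence of one orthologous gene.
    """
    shared = {}
    for k in k_values:
        sets = [kmers(seq, k) for seq in flanks.values()]
        shared[k] = set.intersection(*sets) if sets else set()
    return shared


# Toy sequences, not real mosquito data.
flanks = {
    "species_a": "AAACGTGACGTTT",
    "species_b": "GGACGTGACGCC",
    "species_c": "TTTACGTGACGAA",
}
for k, found in shared_kmers(flanks).items():
    print(k, sorted(found))
```

    Exact matching on one strand is the simplest possible criterion; a realistic screen would also consider the reverse strand, allow mismatches, and compare against a background expectation.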

  16. Global identification of the genetic networks and cis-regulatory elements of the cold response in zebrafish

    PubMed Central

    Hu, Peng; Liu, Mingli; Zhang, Dong; Wang, Jinfeng; Niu, Hongbo; Liu, Yimeng; Wu, Zhichao; Han, Bingshe; Zhai, Wanying; Shen, Yu; Chen, Liangbiao

    2015-01-01

    The transcriptional programs of ectothermic teleosts are directly influenced by water temperature. However, the cis- and trans-factors governing cold responses are not well characterized. We profiled transcriptional changes in eight zebrafish tissues exposed to mildly and severely cold temperatures using RNA-Seq. A total of 1943 differentially expressed genes (DEGs) were identified, from which 34 clusters representing distinct tissue and temperature response expression patterns were derived using the k-means fuzzy clustering algorithm. The promoter regions of the clustered DEGs that demonstrated strong co-regulation were analysed for enriched cis-regulatory elements with a motif discovery program, DREME. Seventeen motifs, ten known and seven novel, were identified, which covered 23% of the DEGs. Two motifs predicted to be the binding sites for the transcription factors Bcl6 and Jun, respectively, were chosen for experimental verification, and they demonstrated the expected cold-induced and cold-repressed patterns of gene regulation. Protein interaction modeling of the network components followed by experimental validation suggested that Jun physically interacts with Bcl6 and might be a hub factor that orchestrates the cold response in zebrafish. Thus, the methodology used and the regulatory networks uncovered in this study provide a foundation for exploring the mechanisms of cold adaptation in teleosts. PMID:26227973

  17. A Cis-Regulatory Map of the Drosophila Genome

    PubMed Central

    Nègre, Nicolas; Brown, Christopher D.; Ma, Lijia; Bristow, Christopher Aaron; Miller, Steven W.; Wagner, Ulrich; Kheradpour, Pouya; Eaton, Matthew L.; Loriaux, Paul; Sealfon, Rachel; Li, Zirong; Ishii, Haruhiko; Spokony, Rebecca F.; Chen, Jia; Hwang, Lindsay; Cheng, Chao; Auburn, Richard P.; Davis, Melissa B.; Domanus, Marc; Shah, Parantu K.; Morrison, Carolyn A.; Zieba, Jennifer; Suchy, Sarah; Senderowicz, Lionel; Victorsen, Alec; Bild, Nicholas A.; Grundstad, A. Jason; Hanley, David; MacAlpine, David M.; Mannervik, Mattias; Venken, Koen; Bellen, Hugo; White, Robert; Russell, Steven; Grossman, Robert L.; Ren, Bing; Gerstein, Mark; Posakony, James W.; Kellis, Manolis; White, Kevin P.

    2011-01-01

    Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcriptional factor binding sites genome-wide has successfully identified specific subtypes of regulatory elements. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb-Response Elements, chromatin states, transcription factor binding sites (TFBS), PolII regulation, and insulator elements; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome based on more than 300 chromatin immuno-precipitation (ChIP) datasets for eight chromatin features, five histone deacetylases (HDACs) and thirty-eight site-specific transcription factors (TFs) at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and we validated a subset of predictions for promoters, enhancers, and insulators in vivo. We also identified nearly 2,000 genomic regions of dense TF binding associated with chromatin activity and accessibility. We discovered hundreds of new TF co-binding relationships and defined a TF network with over 800 potential regulatory relationships. PMID:21430782

  18. The molecular signature and cis-regulatory architecture of a C. elegans gustatory neuron

    PubMed Central

    Etchberger, John F.; Lorch, Adam; Sleumer, Monica C.; Zapf, Richard; Jones, Steven J.; Marra, Marco A.; Holt, Robert A.; Moerman, Donald G.; Hobert, Oliver

    2007-01-01

    Taste receptor cells constitute a highly specialized cell type that perceives and conveys specific sensory information to the brain. The detailed molecular composition of these cells and the mechanisms that program their fate are, in general, poorly understood. We have generated serial analysis of gene expression (SAGE) libraries from two distinct populations of single, isolated sensory neuron classes, the gustatory neuron class ASE and the thermosensory neuron class AFD, from the nematode Caenorhabditis elegans. By comparing these two libraries, we have identified >1000 genes that define the ASE gustatory neuron class on a molecular level. This set of genes contains determinants of the differentiated state of the ASE neuron, such as a surprisingly complex repertoire of transcription factors (TFs), ion channels, neurotransmitters, and receptors, as well as seven-transmembrane receptor (7TMR)-type putative gustatory receptor genes. Through the in vivo dissection of the cis-regulatory regions of several ASE-expressed genes, we identified a small cis-regulatory motif, the “ASE motif,” that is required for the expression of many ASE-expressed genes. We demonstrate that the ASE motif is a binding site for the C2H2 zinc finger TF CHE-1, which is essential for the correct differentiation of the ASE gustatory neuron. Taken together, our results provide a unique view of the molecular landscape of a single neuron type and reveal an important aspect of the regulatory logic for gustatory neuron specification in C. elegans. PMID:17606643

  19. The molecular signature and cis-regulatory architecture of a C. elegans gustatory neuron.

    PubMed

    Etchberger, John F; Lorch, Adam; Sleumer, Monica C; Zapf, Richard; Jones, Steven J; Marra, Marco A; Holt, Robert A; Moerman, Donald G; Hobert, Oliver

    2007-07-01

    Taste receptor cells constitute a highly specialized cell type that perceives and conveys specific sensory information to the brain. The detailed molecular composition of these cells and the mechanisms that program their fate are, in general, poorly understood. We have generated serial analysis of gene expression (SAGE) libraries from two distinct populations of single, isolated sensory neuron classes, the gustatory neuron class ASE and the thermosensory neuron class AFD, from the nematode Caenorhabditis elegans. By comparing these two libraries, we have identified >1000 genes that define the ASE gustatory neuron class on a molecular level. This set of genes contains determinants of the differentiated state of the ASE neuron, such as a surprisingly complex repertoire of transcription factors (TFs), ion channels, neurotransmitters, and receptors, as well as seven-transmembrane receptor (7TMR)-type putative gustatory receptor genes. Through the in vivo dissection of the cis-regulatory regions of several ASE-expressed genes, we identified a small cis-regulatory motif, the "ASE motif," that is required for the expression of many ASE-expressed genes. We demonstrate that the ASE motif is a binding site for the C2H2 zinc finger TF CHE-1, which is essential for the correct differentiation of the ASE gustatory neuron. Taken together, our results provide a unique view of the molecular landscape of a single neuron type and reveal an important aspect of the regulatory logic for gustatory neuron specification in C. elegans.

  20. Population genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus.

    PubMed

    Garfield, David; Haygood, Ralph; Nielsen, William J; Wray, Gregory A

    2012-01-01

    Despite the fact that noncoding sequences comprise a substantial fraction of functional sites within all genomes, the evolutionary mechanisms that operate on genetic variation within regulatory elements remain poorly understood. In this study, we examine the population genetics of the core, upstream cis-regulatory regions of eight genes (AN, CyIIa, CyIIIa, Endo16, FoxB, HE, SM30 a, and SM50) that function during the early development of the purple sea urchin, Strongylocentrotus purpuratus. Quantitative and qualitative measures of segregating variation are not conspicuously different between cis-regulatory and closely linked "proxy neutral" noncoding regions containing no known functional sites. Length and compound mutations are common in noncoding sequences; conventional descriptive statistics ignore such mutations, under-representing true genetic variation by approximately 28% for these loci in this population. Patterns of variation in the cis-regulatory regions of six of the genes examined (CyIIa, CyIIIa, Endo16, FoxB, AN, and HE) are consistent with directional selection. Genetic variation within annotated transcription factor binding sites is comparable to, and frequently greater than, that of surrounding sequences. Comparisons of two paralog pairs (CyIIa/CyIIIa and AN/HE) suggest that distinct evolutionary processes have operated on their cis-regulatory regions following gene duplication. Together, these analyses provide a detailed view of the evolutionary mechanisms operating on noncoding sequences within a natural population, and underscore how little is known about how these processes operate on cis-regulatory sequences.

  1. Abundant raw material for cis-regulatory evolution in humans

    NASA Technical Reports Server (NTRS)

    Rockman, Matthew V.; Wray, Gregory A.

    2002-01-01

    Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.

  2. Abundant raw material for cis-regulatory evolution in humans.

    PubMed

    Rockman, Matthew V; Wray, Gregory A

    2002-11-01

    Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.

  4. Expression-Guided In Silico Evaluation of Candidate Cis Regulatory Codes for Drosophila Muscle Founder Cells

    PubMed Central

    Gisselbrecht, Stephen S; He, Fangxue Sherry; Estrada, Beatriz; Michelson, Alan M; Bulyk, Martha L

    2006-01-01

    While combinatorial models of transcriptional regulation can be inferred for metazoan systems from a priori biological knowledge, validation requires extensive and time-consuming experimental work. Thus, there is a need for computational methods that can evaluate hypothesized cis regulatory codes before the difficult task of experimental verification is undertaken. We have developed a novel computational framework (termed “CodeFinder”) that integrates transcription factor binding site and gene expression information to evaluate whether a hypothesized transcriptional regulatory model (TRM; i.e., a set of co-regulating transcription factors) is likely to target a given set of co-expressed genes. Our basic approach is to simultaneously predict cis regulatory modules (CRMs) associated with a given gene set and quantify the enrichment for combinatorial subsets of transcription factor binding site motifs comprising the hypothesized TRM within these predicted CRMs. As a model system, we have examined a TRM experimentally demonstrated to drive the expression of two genes in a sub-population of cells in the developing Drosophila mesoderm, the somatic muscle founder cells. This TRM was previously hypothesized to be a general mode of regulation for genes expressed in this cell population. In contrast, the present analyses suggest that a modified form of this cis regulatory code applies to only a subset of founder cell genes, those whose gene expression responds to specific genetic perturbations in a similar manner to the gene on which the original model was based. We have confirmed this hypothesis by experimentally discovering six (out of 12 tested) new CRMs driving expression in the embryonic mesoderm, four of which drive expression in founder cells. PMID:16733548

  5. SMCis: An Effective Algorithm for Discovery of Cis-Regulatory Modules

    PubMed Central

    Guo, Haitao; Huo, Hongwei; Yu, Qiang

    2016-01-01

    The discovery of cis-regulatory modules (CRMs) is a challenging problem in computational biology. Limited by the difficulty of using an HMM to model dependent features in transcriptional regulatory sequences (TRSs), the probabilistic modeling methods based on HMMs cannot accurately represent the distance between regulatory elements in TRSs and are cumbersome for modeling the prevailing dependencies between motifs within CRMs. We propose a probabilistic modeling algorithm called SMCis, which builds a more powerful CRM discovery model based on a hidden semi-Markov model. Our model characterizes the regulatory structure of CRMs and effectively models dependencies between motifs at a higher level of abstraction based on segments rather than nucleotides. Experimental results on three benchmark datasets indicate that our method performs better than the compared algorithms. PMID:27637070

  6. Functional Evolution of a cis-Regulatory Module

    PubMed Central

    Palsson, Arnar; Alekseeva, Elena; Bergman, Casey M; Nathan, Janaki; Kreitman, Martin

    2005-01-01

    Lack of knowledge about how regulatory regions evolve in relation to their structure–function may limit the utility of comparative sequence analysis in deciphering cis-regulatory sequences. To address this we applied reverse genetics to carry out a functional genetic complementation analysis of a eukaryotic cis-regulatory module—the even-skipped stripe 2 enhancer—from four Drosophila species. The evolution of this enhancer is non-clock-like, with important functional differences between closely related species and functional convergence between distantly related species. Functional divergence is attributable to differences in activation levels rather than spatiotemporal control of gene expression. Our findings have implications for understanding enhancer structure–function, mechanisms of speciation and computational identification of regulatory modules. PMID:15757364

  7. CREME: Cis-Regulatory Module Explorer for the Human Genome

    SciTech Connect

    Loots, G G; Sharan, R; Ovcharenko, I; Ben-Hur, A

    2004-02-11

    The binding of transcription factors to specific regulatory sequence elements is a primary mechanism for controlling gene transcription. Eukaryotic genes are often regulated by several transcription factors, whose binding sites are tightly clustered and form cis-regulatory modules. In this paper we present a web-server, CREME, for identifying and visualizing cis-regulatory modules in the promoter regions of a given set of potentially co-regulated genes. CREME relies on a database of putative transcription factor binding sites that have been annotated across the human genome using a library of position weight matrices and evolutionary conservation with the mouse and rat genomes. A search algorithm is applied to this dataset to identify combinations of transcription factors whose binding sites tend to co-occur in close proximity in the promoter regions of the input gene set. The identified cis-regulatory modules are statistically scored and significant combinations are reported and graphically visualized. Our web-server is available at http://creme.dcode.org/.
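
    The co-occurrence search described above can be illustrated with a minimal sketch. It is not CREME's algorithm or scoring: the function name, the toy site annotations, and the 100 bp window are all assumptions made for illustration, and the sketch only counts how many promoters contain a given pair of factors within the window, leaving out any statistical scoring.

```python
from collections import defaultdict
from itertools import combinations


def cooccurring_pairs(promoter_sites, window=100):
    """Count, for each TF pair, the promoters where both bind within `window` bp.

    promoter_sites: dict mapping gene name -> list of (tf_name, position).
    Returns a dict mapping a sorted (tf_a, tf_b) tuple -> number of promoters.
    """
    support = defaultdict(set)
    for gene, sites in promoter_sites.items():
        for (tf_a, pos_a), (tf_b, pos_b) in combinations(sites, 2):
            # Only distinct factors whose sites lie close together count.
            if tf_a != tf_b and abs(pos_a - pos_b) <= window:
                support[tuple(sorted((tf_a, tf_b)))].add(gene)
    return {pair: len(genes) for pair, genes in support.items()}


# Hypothetical annotated binding sites (positions in bp along each promoter).
toy_sites = {
    "geneA": [("SRF", 120), ("MEF2", 160), ("GATA", 900)],
    "geneB": [("MEF2", 40), ("SRF", 95)],
    "geneC": [("SRF", 10), ("GATA", 700)],
}
pair_counts = cooccurring_pairs(toy_sites, window=100)
print(pair_counts)
```

    A real analysis would compare such counts against a background gene set before calling any combination significant.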

  8. A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model

    PubMed Central

    2017-01-01

    The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them. PMID:28497059
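
    As a rough illustration of HMM-based module detection in general, and not of ComSPS itself, the sketch below decodes a two-state HMM (background versus module) over a binary track that marks motif hits. The state names, all probabilities, and the toy track are assumptions chosen for the example.

```python
from math import log


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for `observations` (log-space Viterbi)."""
    # scores[t][s]: best log-probability of any path ending in state s at step t.
    scores = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    backptr = [{}]
    for obs in observations[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            cur[s] = prev[best_prev] + log(trans_p[best_prev][s]) + log(emit_p[s][obs])
            ptr[s] = best_prev
        scores.append(cur)
        backptr.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(backptr[1:]):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical parameters: motif hits (1) are rare in background, common in modules.
states = ("bg", "crm")
start_p = {"bg": 0.9, "crm": 0.1}
trans_p = {"bg": {"bg": 0.9, "crm": 0.1}, "crm": {"bg": 0.1, "crm": 0.9}}
emit_p = {"bg": {0: 0.9, 1: 0.1}, "crm": {0: 0.2, 1: 0.8}}

hit_track = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
decoded = viterbi(hit_track, states, start_p, trans_p, emit_p)
print(decoded)
```

    With these toy parameters the run of four hits is labelled as a module and the flanks as background.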

  9. Identification of cis-regulatory mutations generating de novo edges in personalized cancer gene regulatory networks.

    PubMed

    Kalender Atak, Zeynep; Imrichova, Hana; Svetlichnyy, Dmitry; Hulselmans, Gert; Christiaens, Valerie; Reumers, Joke; Ceulemans, Hugo; Aerts, Stein

    2017-08-30

    The identification of functional non-coding mutations is a key challenge in the field of genomics. Here we introduce μ-cisTarget to filter, annotate and prioritize cis-regulatory mutations based on their putative effect on the underlying "personal" gene regulatory network. We validated μ-cisTarget by re-analyzing the TAL1 and LMO1 enhancer mutations in T-ALL, and the TERT promoter mutation in melanoma. Next, we re-sequenced the full genomes of ten cancer cell lines and used matched transcriptome data and motif discovery to identify master regulators with de novo binding sites that result in the up-regulation of nearby oncogenic drivers. μ-cisTarget is available from http://mucistarget.aertslab.org.

  10. A primer on regression methods for decoding cis-regulatory logic

    SciTech Connect

    Das, Debopriya; Pellegrini, Matteo; Gray, Joe W.

    2009-03-03

    The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A). Learning cis-Regulatory Elements from Omics Data: A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6]--in particular, by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together--whereas only the primary response genes are expected to contain the functional motifs [10]. A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain

  11. Deep conservation of cis-regulatory elements in metazoans

    PubMed Central

    Maeso, Ignacio; Irimia, Manuel; Tena, Juan J.; Casares, Fernando; Gómez-Skarmeta, José Luis

    2013-01-01

    Despite the vast morphological variation observed across phyla, animals share multiple basic developmental processes orchestrated by a common ancestral gene toolkit. These genes interact with each other building complex gene regulatory networks (GRNs), which are encoded in the genome by cis-regulatory elements (CREs) that serve as computational units of the network. Although GRN subcircuits involved in ancient developmental processes are expected to be at least partially conserved, identification of CREs that are conserved across phyla has remained elusive. Here, we review recent studies that revealed such deeply conserved CREs do exist, discuss the difficulties associated with their identification and describe new approaches that will facilitate this search. PMID:24218633

  12. CisMiner: genome-wide in-silico cis-regulatory module prediction by fuzzy itemset mining.

    PubMed

    Navarro, Carmen; Lopez, Francisco J; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

    2014-01-01

    Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) their scope is limited to promoter or conserved regions of the genome; 2) they cannot identify combinations involving more than two motifs; 3) they require prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner performs a blind search of CRMs without any prior information about target CRMs or any limitation on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by

  13. A set of structural features defines the cis-regulatory modules of antenna-expressed genes in Drosophila melanogaster.

    PubMed

    López, Yosvany; Vandenbon, Alexis; Nakai, Kenta

    2014-01-01

    Unraveling the biological information within the regulatory region (RR) of genes has become one of the major focuses of current genomic research. It has been hypothesized that RRs of co-expressed genes share similar architecture, but to the best of our knowledge, no studies have simultaneously examined multiple structural features, such as positioning of cis-regulatory elements relative to transcription start sites and to each other, and the order and orientation of regulatory motifs, to accurately describe overall cis-regulatory structure. In our work we present an improved computational method that builds a feature collection based on all of these structural features. We demonstrate the utility of this approach by modeling the cis-regulatory modules of antenna-expressed genes in Drosophila melanogaster. Six potential antenna-related motifs were predicted initially, including three that appeared to be novel. A feature set was created with the predicted motifs, where a correlation-based filter was used to remove irrelevant features, and a genetic algorithm was designed to optimize the feature set. Finally, a set of eight highly informative structural features was obtained for the RRs of antenna-expressed genes, achieving an area under the curve of 0.841. We used these features to score all D. melanogaster RRs for potentially unknown antenna-expressed genes sharing a similar regulatory structure. Validation of our predictions with an independent RNA sequencing dataset showed that 76.7% of genes with high scoring RRs were expressed in antenna. In addition, we found that the structural features we identified are highly conserved in RRs of orthologs in other Drosophila sibling species. This approach to identify tissue-specific regulatory structures showed comparable performance to previous approaches, but also uncovered additional interesting features because it also considered the order and orientation of motifs.

  14. High-throughput discovery of post-transcriptional cis-regulatory elements.

    PubMed

    Wissink, Erin M; Fogarty, Elizabeth A; Grimson, Andrew

    2016-03-03

    Post-transcriptional gene regulation controls the amount of protein produced from an individual mRNA by altering rates of decay and translation. Many sequence elements that direct post-transcriptional regulation have been found; in mammals, most such elements are located within the 3' untranslated regions (3'UTRs). Comparative genomic studies demonstrate that mammalian 3'UTRs contain extensive conserved sequence tracts, yet only a small fraction corresponds to recognized elements, implying that many additional novel elements exist. Despite a variety of computational, molecular, and biochemical approaches, identifying functional 3'UTR elements remains difficult. We created a high-throughput cell-based screen that enables identification of functional post-transcriptional 3'UTR regulatory elements. Our system exploits integrated single-copy reporters, which are expressed and processed as endogenous genes. We screened many thousands of short random sequences for their regulatory potential. Control sequences with known effects were captured effectively using our approach, establishing that our methodology was robust. We found hundreds of functional sequences, which we validated in traditional reporter assays, including verifying their regulatory impact in native sequence contexts. Although 3'UTRs are typically considered repressive, most of the functional elements were activating, including ones that were preferentially conserved. Additionally, we adapted our screening approach to examine the effect of elements on RNA abundance, revealing that most elements act by altering mRNA stability. We developed and used a high-throughput approach to discover hundreds of post-transcriptional cis-regulatory elements. These results imply that most human 3'UTRs contain many previously unrecognized cis-regulatory elements, many of which are activating, and that the post-transcriptional fate of an mRNA is largely due to the actions of many individual cis-regulatory elements within its

  15. Motif-directed redesign of enzyme specificity.

    PubMed

    Borgo, Benjamin; Havranek, James J

    2014-03-01

    Computational protein design relies on several approximations, including the use of fixed backbones and rotamers, to reduce protein design to a computationally tractable problem. However, allowing backbone and off-rotamer flexibility leads to more accurate designs and greater conformational diversity. Exhaustive sampling of this additional conformational space is challenging, and often impossible. Here, we report a computational method that utilizes a preselected library of native interactions to direct backbone flexibility to accommodate placement of these functional contacts. Using these native interaction modules, termed motifs, improves the likelihood that the interaction can be realized, provided that suitable backbone perturbations can be identified. Furthermore, it allows a directed search of the conformational space, reducing the sampling needed to find low energy conformations. We implemented the motif-based design algorithm in Rosetta, and tested the efficacy of this method by redesigning the substrate specificity of methionine aminopeptidase. In summary, native enzymes have evolved to catalyze a wide range of chemical reactions with extraordinary specificity. Computational enzyme design seeks to generate novel chemical activities by altering the target substrates of these existing enzymes. We have implemented a novel approach to redesign the specificity of an enzyme and demonstrated its effectiveness on a model system.

  16. Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora.

    PubMed

    Steige, Kim A; Laenen, Benjamin; Reimegård, Johan; Scofield, Douglas G; Slotte, Tanja

    2017-01-31

    Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species.

  17. Genomic analysis reveals major determinants of cis-regulatory variation in Capsella grandiflora

    PubMed Central

    Steige, Kim A.; Laenen, Benjamin; Reimegård, Johan; Slotte, Tanja

    2017-01-01

    Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species. PMID:28096395

  18. Evolution of lineage-specific functions in ancient cis-regulatory modules

    PubMed Central

    Pauls, Stefan; Goode, Debbie K.; Petrone, Libero; Oliveri, Paola; Elgar, Greg

    2015-01-01

    Morphological evolution is driven both by coding sequence variation and by changes in regulatory sequences. However, how cis-regulatory modules (CRMs) evolve to generate entirely novel expression domains is largely unknown. Here, we reconstruct the evolutionary history of a lens enhancer located within a CRM that not only predates the lens, a vertebrate innovation, but bilaterian animals in general. Alignments of orthologous sequences from different deuterostomes sub-divide the CRM into a deeply conserved core and a more divergent flanking region. We demonstrate that all deuterostome flanking regions, including invertebrate sequences, activate gene expression in the zebrafish lens through the same ancient cluster of activator sites. However, levels of gene expression vary between species due to the presence of repressor motifs in flanking region and core. These repressor motifs are responsible for the relatively weak enhancer activity of tetrapod flanking regions. Ray-finned fish, however, have gained two additional lineage-specific activator motifs which in combination with the ancient cluster of activators and the core constitute a potent lens enhancer. The exploitation and modification of existing regulatory potential in flanking regions but not in the highly conserved core might represent a more general model for the emergence of novel regulatory functions in complex CRMs. PMID:26538567

  19. Evolution of lineage-specific functions in ancient cis-regulatory modules.

    PubMed

    Pauls, Stefan; Goode, Debbie K; Petrone, Libero; Oliveri, Paola; Elgar, Greg

    2015-11-01

    Morphological evolution is driven both by coding sequence variation and by changes in regulatory sequences. However, how cis-regulatory modules (CRMs) evolve to generate entirely novel expression domains is largely unknown. Here, we reconstruct the evolutionary history of a lens enhancer located within a CRM that not only predates the lens, a vertebrate innovation, but bilaterian animals in general. Alignments of orthologous sequences from different deuterostomes sub-divide the CRM into a deeply conserved core and a more divergent flanking region. We demonstrate that all deuterostome flanking regions, including invertebrate sequences, activate gene expression in the zebrafish lens through the same ancient cluster of activator sites. However, levels of gene expression vary between species due to the presence of repressor motifs in flanking region and core. These repressor motifs are responsible for the relatively weak enhancer activity of tetrapod flanking regions. Ray-finned fish, however, have gained two additional lineage-specific activator motifs which in combination with the ancient cluster of activators and the core constitute a potent lens enhancer. The exploitation and modification of existing regulatory potential in flanking regions but not in the highly conserved core might represent a more general model for the emergence of novel regulatory functions in complex CRMs.

  20. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  1. TFM-Explorer: mining cis-regulatory regions in genomes

    PubMed Central

    Tonon, Laurie; Varré, Jean-Stéphane

    2010-01-01

    DNA-binding transcription factors (TFs) play a central role in transcription regulation, and computational approaches that help elucidate the complex mechanisms governing this basic biological process are of great use. From this perspective, we present the TFM-Explorer web server, a toolbox to identify putative TF binding sites within a set of upstream regulatory sequences of genes sharing some regulatory mechanisms. TFM-Explorer finds local regions showing overrepresentation of binding sites. Accepted organisms are human, mouse, rat, chicken and Drosophila. The server offers a number of features to help users analyze their data: visualization of selected binding sites on genomic sequences, and selection of cis-regulatory modules. TFM-Explorer is available at http://bioinfo.lifl.fr/TFM. PMID:20522509

  2. Cis-regulatory landscapes in development and evolution.

    PubMed

    Maeso, Ignacio; Acemel, Rafael D; Gómez-Skarmeta, José Luis

    2017-04-01

    The recent advances in our understanding of the 3D organization of the chromatin together with an almost unlimited ability to detect cis-regulatory elements genome-wide using different biochemical signatures has provided us with an unprecedented power to study gene regulation. It is now possible to profile the complete regulatory apparatus controlling the spatio-temporal expression of any given gene, the so-called gene Regulatory Landscapes (RLs). Here we review several studies over the last two years demonstrating the functional consequences of altering RL structure in development, disease and evolution. These works clearly show that a deep understanding of transcriptional regulation is no longer conceivable without considering the 3D modular organization of animal genomes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Identifying cis-regulatory changes involved in the evolution of aerobic fermentation in yeasts.

    PubMed

    Lin, Zhenguo; Wang, Tzi-Yuan; Tsai, Bing-Shi; Wu, Fang-Ting; Yu, Fu-Jung; Tseng, Yu-Jung; Sung, Huang-Mo; Li, Wen-Hsiung

    2013-01-01

    Gene regulation change has long been recognized as an important mechanism for phenotypic evolution. We used the evolution of yeast aerobic fermentation as a model to explore how gene regulation has evolved and how this process has contributed to phenotypic evolution and adaptation. Most eukaryotes fully oxidize glucose to CO2 and H2O in mitochondria to maximize energy yield, whereas some yeasts, such as Saccharomyces cerevisiae and its relatives, predominantly ferment glucose into ethanol even in the presence of oxygen, a phenomenon known as aerobic fermentation. We examined the genome-wide gene expression levels among 12 different yeasts and found that a group of genes involved in the mitochondrial respiration process showed the largest reduction in gene expression level during the evolution of aerobic fermentation. Our analysis revealed that the downregulation of these genes was significantly associated with massive loss of binding motifs of Cbf1p in the fermentative yeasts. Our experimental assays confirmed the binding of Cbf1p to the predicted motif and the activator role of Cbf1p. In summary, our study laid a foundation to unravel the long-time mystery about the genetic basis of evolution of aerobic fermentation, providing new insights into understanding the role of cis-regulatory changes in phenotypic evolution.

  4. An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome

    PubMed Central

    Won, Kyoung-Jae; Agarwal, Saurabh; Shen, Li; Shoemaker, Robert; Ren, Bing; Wang, Wei

    2009-01-01

    In eukaryotic genomes, it is challenging to accurately determine target sites of transcription factors (TFs) by only using sequence information. Previous efforts were made to tackle this task by considering the fact that TF binding sites tend to be more conserved than other functional sites and the binding sites of several TFs are often clustered. Recently, ChIP-chip and ChIP-sequencing experiments have been accumulated to identify TF binding sites as well as survey the chromatin modification patterns at the regulatory elements such as promoters and enhancers. We propose here a hidden Markov model (HMM) to incorporate sequence motif information, TF-DNA interaction data and chromatin modification patterns to precisely identify cis-regulatory modules (CRMs). We conducted ChIP-chip experiments on four TFs, CREB, E2F1, MAX, and YY1 in 1% of the human genome. We then trained a hidden Markov model (HMM) to identify the labels of the CRMs by incorporating the sequence motifs recognized by these TFs and the ChIP-chip ratio. Chromatin modification data was used to predict the functional sites and to further remove false positives. Cross-validation showed that our integrated HMM had a performance superior to other existing methods on predicting CRMs. Incorporating histone signature information successfully penalized false prediction and improved the whole performance. The dataset we used and the software are available at http://nash.ucsd.edu/CIS/. PMID:19434238
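
    The decoding step of an HMM-based CRM caller can be sketched as follows. This is only a minimal illustration, not the authors' model: the two states ("bg", "crm"), the discretized "low"/"high" observations standing in for a combined motif and ChIP signal, and all probabilities are assumptions chosen for the example.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely state path for a sequence of discrete observations."""
    # best[t][s] is the best log-probability of any path ending in state s at position t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def logs(probs):
    """Convert a dict of probabilities to log space."""
    return {k: math.log(v) for k, v in probs.items()}

# Hypothetical parameters: background vs. CRM, both "sticky" states
states = ("bg", "crm")
log_start = logs({"bg": 0.9, "crm": 0.1})
log_trans = {"bg": logs({"bg": 0.9, "crm": 0.1}),
             "crm": logs({"bg": 0.1, "crm": 0.9})}
log_emit = {"bg": logs({"low": 0.9, "high": 0.1}),
            "crm": logs({"low": 0.2, "high": 0.8})}

obs = ["low", "low", "high", "high", "high", "low", "low"]
path = viterbi(obs, states, log_start, log_trans, log_emit)
print(path)
```

    In a real application the emissions would be continuous or multivariate (motif scores, ChIP-chip ratios), and the parameters would be trained rather than set by hand.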

  5. Putative cis-regulatory elements in genes highly expressed in rice sperm cells

    PubMed Central

    2011-01-01

    Background: The male germ line in flowering plants is initiated within developing pollen grains via asymmetric division. The smaller cell then becomes totally encased within a much larger vegetative cell, forming a unique "cell within a cell structure". The generative cell subsequently divides to give rise to two non-motile diminutive sperm cells, which take part in double fertilization and lead to the seed set. Sperm cells are difficult to investigate because of their presence within the confines of the larger vegetative cell. However, recently developed techniques for the isolation of rice sperm cells and the fully annotated rice genome sequence have allowed for the characterization of the transcriptional repertoire of sperm cells. Microarray gene expression data has identified a subset of rice genes that show unique or highly preferential expression in sperm cells. This information has led to the identification of cis-regulatory elements (CREs), which are conserved in sperm-expressed genes and are putatively associated with the control of cell-specific expression. Findings: We aimed to identify the CREs associated with rice sperm cell-specific gene expression data using in silico prediction tools. We analyzed 1-kb upstream regions of the top 40 sperm cell co-expressed genes for over-represented conserved and novel motifs. Analysis of upstream regions with the SIGNALSCAN program with the PLACE database, MEME and the Mclip tool helped to find combinatorial sets of known transcriptional factor-binding sites along with two novel motifs putatively associated with the co-expression of sperm cell-specific genes. Conclusions: Our data shows the occurrence of novel motifs, which are putative CREs and are likely targets of transcriptional factors regulating sperm cell gene expression. These motifs can be used to design the experimental verification of regulatory elements and the identification of transcriptional factors that regulate sperm cell-specific gene expression.

  6. Nucleotide sequence conservation of novel and established cis-regulatory sites within the tyrosine hydroxylase gene promoter

    PubMed Central

    Wang, Meng; Banerjee, Kasturi; Baker, Harriet; Cave, John W.

    2015-01-01

    Tyrosine hydroxylase (TH) is the rate-limiting enzyme in catecholamine biosynthesis and its gene proximal promoter (<1 kb upstream from the transcription start site) is essential for regulating transcription in both the developing and adult nervous systems. Several putative regulatory elements within the TH proximal promoter have been reported, but evolutionary conservation of these elements has not been thoroughly investigated. Since many vertebrate species are used to model development, function and disorders of human catecholaminergic neurons, identifying evolutionarily conserved transcription regulatory mechanisms is a high priority. In this study, we align TH proximal promoter nucleotide sequences from several vertebrate species to identify evolutionarily conserved motifs. This analysis identified three elements (a TATA box, cyclic AMP response element (CRE) and a 5′-GGTGG-3′ site) that constitute the core of an ancient vertebrate TH promoter. Focusing on only eutherian mammals, two regions of high conservation within the proximal promoter were identified: a ∼250 bp region adjacent to the transcription start site and a ∼85 bp region located approximately 350 bp further upstream. Within both regions, conservation of previously reported cis-regulatory motifs and human single nucleotide variants was evaluated. Transcription reporter assays in a TH-expressing cell line demonstrated the functionality of highly conserved motifs in the proximal promoter regions and electromobility shift assays showed that brain-region specific complexes assemble on these motifs. These studies also identified a non-canonical CRE binding (CREB) protein recognition element in the proximal promoter. Together, these studies provide a detailed analysis of evolutionary conservation within the TH promoter and identify potential cis-regulatory motifs that underlie a core set of regulatory mechanisms in mammals. PMID:25774193

  7. Changes in Cis-regulatory Elements during Morphological Evolution

    PubMed Central

    Gaunt, Stephen J.; Paul, Yu-Lee

    2012-01-01

    How have animals evolved new body designs (morphological evolution)? This requires explanations both for simple morphological changes, such as differences in pigmentation and hair patterns between different Drosophila populations and species, and also for more complex changes, such as differences in the forelimbs of mice and bats, and the necks of amphibians and reptiles. The genetic changes and pathways involved in these evolutionary steps require identification. Many, though not all, of these events occur by changes in cis-regulatory (enhancer) elements within developmental genes. Enhancers are modular, each affecting expression in only one or a few tissues. Therefore it is possible to add, remove or alter an enhancer without producing changes in multiple tissues, and thereby avoid widespread (pleiotropic) deleterious effects. Ideally, for a given step in morphological evolution it is necessary to identify (i) the change in phenotype, (ii) the changes in gene expression, (iii) the DNA region, enhancer or otherwise, affected, (iv) the mutation involved, (v) the nature of the transcription or other factors that bind to this site. In practice these data are incomplete for most of the published studies upon morphological evolution. Here, the investigations are categorized according to how far these analyses have proceeded. PMID:24832508

  8. Computational methods for the detection of cis-regulatory modules.

    PubMed

    Van Loo, Peter; Marynen, Peter

    2009-09-01

    Metazoan transcription regulation occurs through the concerted action of multiple transcription factors that bind co-operatively to cis-regulatory modules (CRMs). The annotation of these key regulators of transcription is lagging far behind the annotation of the transcriptome itself. Here, we give an overview of existing computational methods to detect these CRMs in metazoan genomes. We subdivide these methods into three classes: CRM scanners screen sequences for CRMs based on predefined models that often consist of multiple position weight matrices (PWMs). CRM builders construct models of similar CRMs controlling a set of co-regulated or co-expressed genes. CRM genome screeners screen sequences or complete genomes for CRMs as homotypic or heterotypic clusters of binding sites for any combination of transcription factors. We believe that CRM scanners are currently the most advanced methods, although their applicability is limited. Finally, we argue that CRM builders that make use of PWM libraries will benefit greatly from future advances and will prove to be most instrumental for the annotation of regulatory regions in metazoan genomes.
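
    The "CRM scanner" idea described above (scoring sequence windows against a position weight matrix, then looking for clusters of hits) can be sketched in a few lines. This is a toy illustration under stated assumptions: the GATA-like count matrix, the uniform background, the pseudocount, the score threshold and the window size are all invented for the example and do not come from any of the reviewed tools.

```python
import math

def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into a log2-odds PWM (uniform background assumed)."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
                    for b in "ACGT"})
    return pwm

def scan(sequence, pwm, threshold):
    """Return (position, score) for every window scoring at or above threshold.

    The sequence is assumed to contain only A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        score = sum(pwm[j][sequence[i + j]] for j in range(width))
        if score >= threshold:
            hits.append((i, score))
    return hits

def homotypic_clusters(positions, window, min_hits):
    """Return (first, last) hit positions where min_hits or more sorted hits start within `window` bp."""
    spans = []
    for i, start in enumerate(positions):
        inside = [p for p in positions[i:] if p - start < window]
        if len(inside) >= min_hits:
            spans.append((start, inside[-1]))
    return spans

# Hypothetical count matrix for a GATA-like site (10 aligned sites, all consensus)
counts = [{"G": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
pwm = log_odds_matrix(counts)

hits = scan("CCGATATTGATACC", pwm, threshold=6.0)
positions = [p for p, _ in hits]
print(positions, homotypic_clusters(positions, window=20, min_hits=2))
```

    A heterotypic cluster search would follow the same pattern with several PWMs, counting hits for any of them inside the window.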

  9. Assessing Computational Methods of Cis-Regulatory Module Prediction

    PubMed Central

    Su, Jing; Teichmann, Sarah A.; Down, Thomas A.

    2010-01-01

    Computational methods attempting to identify instances of cis-regulatory modules (CRMs) in the genome face a challenging problem of searching for potentially interacting transcription factor binding sites while knowledge of the specific interactions involved remains limited. Without a comprehensive comparison of their performance, the reliability and accuracy of these tools remain unclear. Faced with a large number of different tools that address this problem, we summarized and categorized them based on search strategy and input data requirements. Twelve representative methods were chosen and applied to predict CRMs from the Drosophila CRM database REDfly, and across the human ENCODE regions. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. When discriminating CRMs from non-coding regions, those methods considering evolutionary conservation have a stronger predictive power than methods designed to be run on a single genome. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. For example, some favour homotypical clusters of binding sites, while others perform best on short CRMs. Furthermore, most methods appear to be sensitive to the composition and structure of the genome to which they are applied. We analyze the principal features that distinguish the methods that performed well, identify weaknesses leading to poor performance, and provide a guide for users. We also propose key considerations for the development and evaluation of future CRM-prediction methods. PMID:21152003

  10. Detailed map of a cis-regulatory input function

    NASA Astrophysics Data System (ADS)

    Setty, Y.; Mayo, A. E.; Surette, M. G.; Alon, U.

    2003-06-01

    Most genes are regulated by multiple transcription factors that bind specific sites in DNA regulatory regions. These cis-regulatory regions perform a computation: the rate of transcription is a function of the active concentrations of each of the input transcription factors. Here, we used accurate gene expression measurements from living cell cultures, bearing GFP reporters, to map in detail the input function of the classic lacZYA operon of Escherichia coli, as a function of about a hundred combinations of its two inducers, cAMP and isopropyl β-D-thiogalactoside (IPTG). We found an unexpectedly intricate function with four plateau levels and four thresholds. This result compares well with a mathematical model of the binding of the regulatory proteins cAMP receptor protein (CRP) and LacI to the lac regulatory region. The model is also used to demonstrate that with few mutations, the same region could encode much purer AND-like or even OR-like functions. This possibility means that the wild-type region is selected to perform an elaborate computation in setting the transcription rate. The present approach can be generally used to map the input functions of other genes.
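
    To make the notion of an "input function" concrete, the sketch below contrasts idealized AND-like and OR-like functions of two inducer concentrations. It is not the binding model used in the study (which reproduced four plateaus and four thresholds); the Hill form, the parameter names and all parameter values are assumptions for illustration only.

```python
def hill(x, k, n):
    """Fraction of maximal response at input level x (half-maximal at k, steepness n)."""
    return x**n / (k**n + x**n)

def and_like_rate(camp, iptg, basal=0.05, vmax=1.0, k_camp=1.0, k_iptg=1.0, n=2):
    """Toy AND-like input function: high output only when both inducers are high."""
    a = hill(camp, k_camp, n)
    b = hill(iptg, k_iptg, n)
    return basal + (vmax - basal) * a * b

def or_like_rate(camp, iptg, basal=0.05, vmax=1.0, k_camp=1.0, k_iptg=1.0, n=2):
    """Toy OR-like input function: high output when either inducer is high."""
    a = hill(camp, k_camp, n)
    b = hill(iptg, k_iptg, n)
    return basal + (vmax - basal) * (a + b - a * b)

# Evaluate both functions on a small grid of (cAMP, IPTG) levels in arbitrary units
for camp in (0.0, 1.0, 100.0):
    for iptg in (0.0, 1.0, 100.0):
        print(camp, iptg,
              round(and_like_rate(camp, iptg), 3),
              round(or_like_rate(camp, iptg), 3))
```

    Mapping a measured input function amounts to filling in such a grid from reporter data instead of from a formula, then asking which functional form fits it.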

  11. Comparative genomics-based identification and analysis of cis-regulatory elements

    PubMed Central

    Ogino, Hajime; Ochi, Haruki; Uchiyama, Chihiro; Louie, Sarah; Grainger, Robert M.

    2014-01-01

    Summary: Identification of cis-regulatory elements, such as enhancers and promoters, is very important not only for analysis of gene regulatory networks but also as a tool for targeted gene expression experiments. In this chapter, we introduce an easy but reliable approach to predict enhancers of a gene of interest by comparing mammalian and Xenopus genome sequences, and to examine their activity using a co-transgenesis technique in Xenopus embryos. Since the bioinformatics analysis utilizes publicly available web tools, bench biologists can easily perform it without any need for special computing capability. The co-transgenesis assay, which directly uses polymerase chain reaction (PCR) products, quickly screens for activity of the candidate elements in a cloning-free manner. PMID:22956093

  12. Favorable genomic environments for cis-regulatory evolution: A novel theoretical framework.

    PubMed

    Maeso, Ignacio; Tena, Juan J

    2016-09-01

    Cis-regulatory changes are arguably the primary evolutionary source of animal morphological diversity. With the recent explosion of genome-wide comparisons of the cis-regulatory content in different animal species, it is now possible to infer general principles underlying enhancer evolution. However, these studies have also revealed numerous discrepancies and paradoxes, suggesting that the mechanistic causes and modes of cis-regulatory evolution are still not well understood and are probably much more complex than generally appreciated. Here, we argue that the mutational mechanisms and genomic regions generating new regulatory activities must comply with the constraints imposed by the molecular properties of cis-regulatory elements (CREs) and the organizational features of long-range chromatin interactions. Accordingly, we propose a new integrative evolutionary framework for cis-regulatory evolution based on two major premises for the origin of novel enhancer activity: (i) an accessible chromatin environment and (ii) compatibility with the 3D structure and interactions of pre-existing CREs. Mechanisms and DNA sequences not fulfilling these premises will be less likely to have a measurable impact on gene expression and, as such, will make a minor contribution to the evolution of gene regulation. Finally, we discuss current comparative cis-regulatory data in the light of this new evolutionary model, and propose that the two most prominent mechanisms for the evolution of cis-regulatory changes are the overprinting of ancestral CREs and the exaptation of transposable elements. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Overview Article: Identifying transcriptional cis-regulatory modules in animal genomes

    PubMed Central

    Suryamohan, Kushal; Halfon, Marc S.

    2014-01-01

    Gene expression is regulated through the activity of transcription factors and chromatin modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily-identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods have led to an explosion of both computational and empirical methods for CRM discovery in model and non-model organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against transcription factors or histone post-translational modifications, identification of nucleosome-depleted “open” chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted transcription factor binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. PMID:25704908

  14. Pitx1 Broadly Associates with Limb Enhancers and is Enriched on Hindlimb cis-Regulatory Elements

    PubMed Central

    Infante, Carlos R.; Park, Sungdae; Mihala, Alexandra; Kingsley, David M.; Menke, Douglas B.

    2013-01-01

    Extensive functional analyses have demonstrated that the pituitary homeodomain transcription factor Pitx1 plays a critical role in specifying hindlimb morphology in vertebrates. However, much less is known regarding the target genes and cis-regulatory elements through which Pitx1 acts. Earlier studies suggested that the hindlimb transcription factors Tbx4, HoxC10, and HoxC11 might be transcriptional targets of Pitx1, but definitive evidence for direct regulatory interactions has been lacking. Using ChIP-Seq on embryonic mouse hindlimbs, we have pinpointed the genome-wide location of Pitx1 binding sites during mouse hindlimb development and identified potential gene targets for Pitx1. We determined that Pitx1 binding is significantly enriched near genes involved in limb morphogenesis, including Tbx4, HoxC10, and HoxC11. Notably, Pitx1 is bound to the previously identified HLEA and HLEB hindlimb enhancers of the Tbx4 gene and to a newly identified Tbx2 hindlimb enhancer. Moreover, Pitx1 binding is significantly enriched on hindlimb relative to forelimb-specific cis-regulatory features that are differentially marked by H3K27ac. However, our analysis revealed that Pitx1 also strongly associates with many functionally verified limb enhancers that exhibit similar levels of activity in the embryonic mesenchyme of forelimbs and hindlimbs. We speculate that Pitx1 influences hindlimb morphology both through the activation of hindlimb specific enhancers as well as through the hindlimb-specific modulation of enhancers that are active in both sets of limbs. PMID:23201014

  15. Selected heterozygosity at cis-regulatory sequences increases the expression homogeneity of a cell population in humans.

    PubMed

    Sung, Min Kyung; Jang, Juneil; Lee, Kang Seon; Ghim, Cheol-Min; Choi, Jung Kyoon

    2016-07-28

    Examples of heterozygote advantage in humans are scarce and limited to protein-coding sequences. Here, we attempt a genome-wide functional inference of advantageous heterozygosity at cis-regulatory regions. The single-nucleotide polymorphisms bearing the signatures of balancing selection are enriched in active cis-regulatory regions of immune cells and epithelial cells, the latter of which provide barrier function and innate immunity. Examples associated with ancient trans-specific balancing selection are also discovered. Allelic imbalance in chromatin accessibility and divergence in transcription factor motif sequences indicate that these balanced polymorphisms cause distinct regulatory variation. However, a majority of these variants show no association with the expression level of the target gene. Instead, single-cell experimental data for gene expression and chromatin accessibility demonstrate that heterozygous sequences can lower cell-to-cell variability in proportion to selection strengths. This negative correlation is more pronounced for highly expressed genes and consistently observed when using different data and methods. Based on mathematical modeling, we hypothesize that extrinsic noise from fluctuations in transcription factor activity may be amplified in homozygotes, whereas it is buffered in heterozygotes. While high expression levels are coupled with intrinsic noise reduction, regulatory heterozygosity can contribute to the suppression of extrinsic noise. This mechanism may confer a selective advantage by increasing cell population homogeneity and thereby enhancing the collective action of the cells, especially of those involved in the defense systems in humans.

  16. Global reorganisation of cis-regulatory units upon lineage commitment of human embryonic stem cells.

    PubMed

    Freire-Pritchett, Paula; Schoenfelder, Stefan; Várnai, Csilla; Wingett, Steven W; Cairns, Jonathan; Collier, Amanda J; García-Vílchez, Raquel; Furlan-Magaril, Mayra; Osborne, Cameron S; Fraser, Peter; Rugg-Gunn, Peter J; Spivakov, Mikhail

    2017-03-23

    Long-range cis-regulatory elements such as enhancers coordinate cell-specific transcriptional programmes by engaging in DNA looping interactions with target promoters. Deciphering the interplay between the promoter connectivity and activity of cis-regulatory elements during lineage commitment is crucial for understanding developmental transcriptional control. Here, we use Promoter Capture Hi-C to generate a high-resolution atlas of chromosomal interactions involving ~22,000 gene promoters in human pluripotent and lineage-committed cells, identifying putative target genes for known and predicted enhancer elements. We reveal extensive dynamics of cis-regulatory contacts upon lineage commitment, including the acquisition and loss of promoter interactions. This spatial rewiring occurs preferentially with predicted changes in the activity of cis-regulatory elements and is associated with changes in target gene expression. Our results provide a global and integrated view of promoter interactome dynamics during lineage commitment of human pluripotent cells.

  17. cis-Regulatory remodeling of the SCL locus during vertebrate evolution.

    PubMed

    Göttgens, Berthold; Ferreira, Rita; Sanchez, Maria-José; Ishibashi, Shoko; Li, Juan; Spensberger, Dominik; Lefevre, Pascal; Ottersbach, Katrin; Chapman, Michael; Kinston, Sarah; Knezevic, Kathy; Hoogenkamp, Maarten; Follows, George A; Bonifer, Constanze; Amaya, Enrique; Green, Anthony R

    2010-12-01

    Development progresses through a sequence of cellular identities which are determined by the activities of networks of transcription factor genes. Alterations in cis-regulatory elements of these genes play a major role in evolutionary change, but little is known about the mechanisms responsible for maintaining conserved patterns of gene expression. We have studied the evolution of cis-regulatory mechanisms controlling the SCL gene, which encodes a key transcriptional regulator of blood, vasculature, and brain development and exhibits conserved function and pattern of expression throughout vertebrate evolution. SCL cis-regulatory elements are conserved between frog and chicken but accrued alterations at an accelerated rate between 310 and 200 million years ago, with subsequent fixation of a new cis-regulatory pattern at the beginning of the mammalian radiation. As a consequence, orthologous elements shared by mammals and lower vertebrates exhibit functional differences and binding site turnover between widely separated cis-regulatory modules. However, the net effect of these alterations is constancy of overall regulatory inputs and of expression pattern. Our data demonstrate remarkable cis-regulatory remodelling across the SCL locus and indicate that stable patterns of expression can mask extensive regulatory change. These insights illuminate our understanding of vertebrate evolution.

  18. Characterization of a putative cis-regulatory element that controls transcriptional activity of the pig uroplakin II gene promoter

    SciTech Connect

    Kwon, Deug-Nam; Park, Mi-Ryung; Park, Jong-Yi; Cho, Ssang-Goo; Park, Chankyu; Oh, Jae-Wook; Song, Hyuk; Kim, Jae-Hwan; Kim, Jin-Hoi

    2011-07-01

    Highlights: The sequences from -604 to -84 bp of the pUPII promoter contained the region of a putative negative cis-regulatory element. The core promoter was located in the 5F-1 region. Transcription factor HNF4 can directly bind in the pUPII core promoter region, which plays a critical role in controlling promoter activity. These features of the pUPII promoter are fundamental to development of a target-specific vector. Abstract: Uroplakin II (UPII) is one of the integral membrane proteins synthesized as a major differentiation product of mammalian urothelium. UPII gene expression is bladder specific and differentiation dependent, but little is known about its transcription response elements and molecular mechanism. To identify the cis-regulatory elements in the pig UPII (pUPII) gene promoter region, we constructed pUPII 5' upstream region deletion mutants and demonstrated that each of the deletion mutants participates in controlling the expression of the pUPII gene in human bladder carcinoma RT4 cells. We also identified a new core promoter region and putative negative cis-regulatory element within a minimal promoter region. In addition, we showed that hepatocyte nuclear factor 4 (HNF4) can directly bind in the pUPII core promoter (5F-1) region, which plays a critical role in controlling promoter activity. Transient cotransfection experiments showed that HNF4 positively regulates pUPII gene promoter activity. Thus, the binding element and its binding protein, HNF4 transcription factor, may be involved in the mechanism that specifically regulates pUPII gene transcription.

  19. Identification of Cis-regulatory elements in the mouse Pax9/Nkx2-9 genomic region: implication for evolutionary conserved synteny.

    PubMed Central

    Santagati, Fabio; Abe, Kuniya; Schmidt, Volker; Schmitt-John, Thomas; Suzuki, Misao; Yamamura, Ken-Ichi; Imai, Kenji

    2003-01-01

    We previously reported close physical linkage between Pax9 and Nkx2-9 in the human, mouse, and pufferfish (Fugu rubripes) genomes. In this study, we analyzed cis-regulatory elements of the two genes by comparative sequencing in the three species and by transgenesis in the mouse. We identified two regions including conserved noncoding sequences that possessed specific enhancer activities for expression of Pax9 in the medial nasal process and of Nkx2-9 in the ventral neural tube. Remarkably, the latter contained the consensus Gli-binding motif. Interestingly, the identified Pax9 cis-regulatory sequences were located in an intron of the neighboring gene Slc25a21. Close examination of an extended genomic interval around Pax9 revealed the presence of strong synteny conservation in the human, mouse, and Fugu genomes. We propose such an intersecting organization of cis-regulatory sequences in multigenic regions as a possible mechanism that maintains evolutionary conserved synteny. PMID:14504231

  20. Characterization and identification of cis-regulatory elements in Arabidopsis based on single-nucleotide polymorphism information.

    PubMed

    Korkuc, Paula; Schippers, Jos H M; Walther, Dirk

    2014-01-01

    Identifying regulatory elements and revealing their role in gene expression regulation remains a central goal of plant genome research. We exploited the detailed genomic sequencing information of a large number of Arabidopsis (Arabidopsis thaliana) accessions to characterize known and to identify novel cis-regulatory elements in gene promoter regions of Arabidopsis by relying on conservation as the hallmark signal of functional relevance. Based on the genomic layout and the obtained density profiles of single-nucleotide polymorphisms (SNPs) in sequence regions upstream of transcription start sites, the average length of promoter regions in Arabidopsis could be established at 500 bp. Genes associated with high degrees of variability of their respective upstream regions are preferentially involved in environmental response and signaling processes, while low levels of promoter SNP density are common among housekeeping genes. Known cis-elements were found to exhibit a lower SNP density than sequence regions not associated with known motifs. For 15 known cis-element motifs, strong positional preferences relative to the transcription start site were detected based on their promoter SNP density profiles. Five novel candidate cis-element motifs were identified as consensus motifs of 17 sequence hexamers exhibiting increased sequence conservation combined with evidence of positional preferences, annotation information, and functional relevance for inducing correlated gene expression. Our study demonstrates that the currently available resolution of SNP data offers novel ways for the identification of functional genomic elements and the characterization of gene promoter sequences.
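
    A promoter SNP density profile of the kind described above can be sketched as a simple binned count. The function name, the input layout (SNP positions given as distances upstream of each gene's transcription start site) and the toy data are assumptions made for this example, not the study's pipeline.

```python
def snp_density_profile(snp_offsets_per_gene, promoter_length, bin_size):
    """Mean SNPs per bp per gene in consecutive bins upstream of the TSS.

    snp_offsets_per_gene: one list per gene; each offset is the distance
    upstream of the TSS in bp (1 = the base immediately upstream).
    """
    n_bins = promoter_length // bin_size
    counts = [0] * n_bins
    for offsets in snp_offsets_per_gene:
        for off in offsets:
            # Ignore SNPs outside the analysed promoter window
            if 1 <= off <= n_bins * bin_size:
                counts[(off - 1) // bin_size] += 1
    n_genes = len(snp_offsets_per_gene)
    return [c / (n_genes * bin_size) for c in counts]

# Two hypothetical genes, 500 bp promoters, 100 bp bins
genes = [[5, 120, 480], [130, 450, 460]]
profile = snp_density_profile(genes, promoter_length=500, bin_size=100)
print(profile)
```

    Bins with unusually low density relative to their neighbours are the candidates for conserved, and therefore possibly functional, sequence.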

  1. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
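
    The two ideas named in the abstract, words over the IUPAC alphabet and a branch length score, can be sketched in a few lines of Python. This is not BLSSpeller's implementation (which is in Java and uses MapReduce); it is a minimal alignment-free illustration that assumes the branch length score is the summed length of the branches connecting the species that contain the word, divided by the total tree length. The tree, its branch lengths, the species labels, and the promoter strings are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def enumerate_words(k, alphabet="ACGTN"):
    """Yield every word of length k over a (possibly degenerate) alphabet."""
    return ("".join(p) for p in product(alphabet, repeat=k))


def matches(word, site):
    """True if a concrete DNA site is an instance of the IUPAC word."""
    return len(word) == len(site) and all(b in IUPAC[w] for w, b in zip(word, site))


def occurs_in(word, sequence):
    """True if the word matches anywhere in the sequence (forward strand only)."""
    k = len(word)
    return any(matches(word, sequence[i:i + k]) for i in range(len(sequence) - k + 1))


def branch_length_score(tree, species_with_motif):
    """tree maps every non-root node to (parent, branch_length)."""
    hits = set(species_with_motif)
    if len(hits) < 2:
        return 0.0
    passes = {node: 0 for node in tree}
    for leaf in hits:  # walk from each hit leaf up to the root
        node = leaf
        while node in tree:
            passes[node] += 1
            node = tree[node][0]
    # A branch connects hit species iff some, but not all, hits lie below it.
    connecting = sum(length for node, (_, length) in tree.items()
                     if 0 < passes[node] < len(hits))
    total = sum(length for _, length in tree.values())
    return connecting / total


# Hypothetical four-species tree and promoters, for illustration only.
TREE = {
    "sp_a": ("n1", 0.20), "sp_b": ("n1", 0.25), "n1": ("root", 0.10),
    "sp_c": ("n2", 0.15), "sp_d": ("n2", 0.10), "n2": ("root", 0.12),
}
PROMOTERS = {
    "sp_a": "TTCACGTGAA", "sp_b": "GGCATGTGCC",
    "sp_c": "AACACGTGTT", "sp_d": "ACGTTTTTTT",
}

word = "CAYGTG"  # Y stands for C or T
conserved_in = [sp for sp, seq in PROMOTERS.items() if occurs_in(word, seq)]
print(conserved_in, branch_length_score(TREE, conserved_in))
```

    An exhaustive search would loop over enumerate_words(k), score each word this way for every gene family, and keep the words whose scores exceed a chosen threshold; that outer loop is what makes the problem a natural fit for a distributed model.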

  2. The search for cis-regulatory driver mutations in cancer genomes.

    PubMed

    Poulos, Rebecca C; Sloane, Mathew A; Hesson, Luke B; Wong, Jason W H

    2015-10-20

    With the advent of high-throughput and relatively inexpensive whole-genome sequencing technology, the focus of cancer research has begun to shift toward analyses of somatic mutations in non-coding cis-regulatory elements of the cancer genome. Cis-regulatory elements play an important role in gene regulation, with mutations in these elements potentially resulting in changes to the expression of linked genes. The recent discoveries of recurrent TERT promoter mutations in melanoma, and recurrent mutations that create a super-enhancer regulating TAL1 expression in T-cell acute lymphoblastic leukaemia (T-ALL), have sparked significant interest in the search for other somatic cis-regulatory mutations driving cancer development. In this review, we look more closely at the TERT promoter and TAL1 enhancer alterations and use these examples to ask whether other cis-regulatory mutations may play a role in cancer susceptibility. In doing so, we make observations from the data emerging from recent research in this field, and describe the experimental and analytical approaches which could be adopted in the hope of better uncovering the true functional significance of somatic cis-regulatory mutations in cancer.

  3. Global profiling of rice and poplar transcriptomes highlights key conserved circadian-controlled pathways and cis-regulatory modules.

    PubMed

    Filichkin, Sergei A; Breton, Ghislain; Priest, Henry D; Dharmawardhana, Palitha; Jaiswal, Pankaj; Fox, Samuel E; Michael, Todd P; Chory, Joanne; Kay, Steve A; Mockler, Todd C

    2011-01-01

    the plant circadian network: the morning (ME, GBOX), evening (EE, GATA), and midnight (PBX/TBX/SBX) modules. Identification of identical overrepresented motifs in the promoters of cycling genes from different species suggests that the core diurnal/circadian cis-regulatory network is deeply conserved between mono- and dicotyledonous species.

  4. Complex interactions between cis-regulatory modules in native conformation are critical for Drosophila snail expression

    PubMed Central

    Dunipace, Leslie; Ozdemir, Anil; Stathopoulos, Angelike

    2011-01-01

    It has been shown in several organisms that multiple cis-regulatory modules (CRMs) of a gene locus can be active concurrently to support similar spatiotemporal expression. To understand the functional importance of such seemingly redundant CRMs, we examined two CRMs from the Drosophila snail gene locus, which are both active in the ventral region of pre-gastrulation embryos. By performing a deletion series in a ∼25 kb DNA rescue construct using BAC recombineering and site-directed transgenesis, we demonstrate that the two CRMs are not redundant. The distal CRM is absolutely required for viability, whereas the proximal CRM is required only under extreme conditions such as high temperature. Consistent with their distinct requirements, the CRMs support distinct expression patterns: the proximal CRM exhibits an expanded expression domain relative to endogenous snail, whereas the distal CRM exhibits almost complete overlap with snail except at the anterior-most pole. We further show that the distal CRM normally limits the increased expression domain of the proximal CRM and that the proximal CRM serves as a 'damper' for the expression levels driven by the distal CRM. Thus, the two CRMs interact in cis in a non-additive fashion and these interactions may be important for fine-tuning the domains and levels of gene expression. PMID:21813571

  5. Recurrent modification of a conserved cis-regulatory element underlies fruit fly pigmentation diversity.

    PubMed

    Rogers, William A; Salomone, Joseph R; Tacy, David J; Camino, Eric M; Davis, Kristen A; Rebeiz, Mark; Williams, Thomas M

    2013-08-01

    The development of morphological traits occurs through the collective action of networks of genes connected at the level of gene expression. As any node in a network may be a target of evolutionary change, the recurrent targeting of the same node would indicate that the path of evolution is biased for the relevant trait and network. Although examples of parallel evolution have implicated recurrent modification of the same gene and cis-regulatory element (CRE), little is known about the mutational and molecular paths of parallel CRE evolution. In Drosophila melanogaster fruit flies, the Bric-à-brac (Bab) transcription factors control the development of a suite of sexually dimorphic traits on the posterior abdomen. Female-specific Bab expression is regulated by the dimorphic element, a CRE that possesses direct inputs from body plan (ABD-B) and sex-determination (DSX) transcription factors. Here, we find that the recurrent evolutionary modification of this CRE underlies both intraspecific and interspecific variation in female pigmentation in the melanogaster species group. By reconstructing the sequence and regulatory activity of the ancestral Drosophila melanogaster dimorphic element, we demonstrate that a handful of mutations were sufficient to create independent CRE alleles with differing activities. Moreover, intraspecific and interspecific dimorphic element evolution proceeded with little to no alterations to the known body plan and sex-determination regulatory linkages. Collectively, our findings represent an example where the paths of evolution appear biased to a specific CRE, and drastic changes in function were accompanied by deep conservation of key regulatory linkages.

  6. Cis-regulatory analysis of the sea urchin pigment cell gene polyketide synthase.

    PubMed

    Calestani, Cristina; Rogers, David J

    2010-04-15

    The Strongylocentrotus purpuratus polyketide synthase gene (SpPks) encodes an enzyme required for the biosynthesis of the larval pigment echinochrome. SpPks is expressed exclusively in pigment cells and their precursors starting at blastula stage. The 7th-9th cleavage Delta-Notch signaling, required for pigment cell development, positively regulates SpPks. In previous studies, the transcription factors glial cell missing (SpGcm), SpGataE and kruppel-like (SpKrl/z13) have been shown to positively regulate SpPks. To uncover the structure of the Gene Regulatory Network (GRN) regulating the specification and differentiation processes of pigment cells, we experimentally analyzed the putative SpPks cis-regulatory region. We established that the -1.5kb region is sufficient to recapitulate the correct spatial and temporal expression of SpPks. Predicted DNA-binding sites for SpGcm, SpGataE and SpKrl are located within this region. The mutagenesis of these DNA-binding sites indicated that SpGcm, SpGataE and SpKrl are direct positive regulators of SpPks. These results demonstrate that the sea urchin GRN for pigment cell development is quite shallow, which is typical of type I embryo development.

  7. Identification of distal cis-regulatory elements at mouse mitoferrin loci using zebrafish transgenesis.

    PubMed

    Amigo, Julio D; Yu, Ming; Troadec, Marie-Berengere; Gwynn, Babette; Cooney, Jeffrey D; Lambert, Amy J; Chi, Neil C; Weiss, Mitchell J; Peters, Luanne L; Kaplan, Jerry; Cantor, Alan B; Paw, Barry H

    2011-04-01

    Mitoferrin 1 (Mfrn1; Slc25a37) and mitoferrin 2 (Mfrn2; Slc25a28) function as essential mitochondrial iron importers for heme and Fe/S cluster biogenesis. A genetic deficiency of Mfrn1 results in a profound hypochromic anemia in vertebrate species. To map the cis-regulatory modules (CRMs) that control expression of the Mfrn genes, we utilized genome-wide chromatin immunoprecipitation (ChIP) datasets for the major erythroid transcription factor GATA-1. We identified the CRMs that faithfully drive the expression of Mfrn1 during blood and heart development and Mfrn2 ubiquitously. Through in vivo analyses of the Mfrn-CRMs in zebrafish and mouse, we demonstrate their functional and evolutionary conservation. Using knockdowns with morpholinos and cell sorting analysis in transgenic zebrafish embryos, we show that GATA-1 directly regulates the expression of Mfrn1. Mutagenesis of individual GATA-1 binding cis elements (GBE) demonstrated that at least two of the three GBE within this CRM are functionally required for GATA-mediated transcription of Mfrn1. Furthermore, ChIP assays demonstrate switching from GATA-2 to GATA-1 at these elements during erythroid maturation. Our results provide new insights into the genetic regulation of mitochondrial function and iron homeostasis and, more generally, illustrate the utility of genome-wide ChIP analysis combined with zebrafish transgenesis for identifying long-range transcriptional enhancers that regulate tissue development.

  8. [Role of genes and their cis-regulatory elements during animal morphological evolution].

    PubMed

    Sun, Boyuan; Tu, Jianbo; Li, Ying; Yang, Mingyao

    2014-06-01

    The cis-regulatory hypothesis is one of the most important theories in evolutionary developmental biology (evo-devo); it claims that the evolution of cis-regulatory elements (CREs) plays a key role in the evolution of morphology. However, an increasing number of experimental results show that the cis-regulatory hypothesis alone is not sufficient to explain the complexity of evo-devo processes. Other modifications, including mutations in protein-coding sequences, gene and genome duplications, and flexibility of homeodomains and CREs, also cause morphological changes in animals. In this review, we survey recent results on the evolution of CREs and of genes associated with CREs, and discuss new methods and trends for research in evo-devo.

  9. Barcoded DNA-Tag Reporters for Multiplex Cis-Regulatory Analysis

    PubMed Central

    Nam, Jongmin; Davidson, Eric H.

    2012-01-01

    Cis-regulatory DNA sequences causally mediate patterns of gene expression, but efficient experimental analysis of these control systems has remained challenging. Here we develop a new version of “barcoded” DNA-tag reporters, “Nanotags”, that permit simultaneous quantitative analysis of up to 130 distinct cis-regulatory modules (CRMs). The activities of these reporters are measured in single experiments by the NanoString RNA counting method and other quantitative procedures. We demonstrate the efficiency of the Nanotag method by simultaneously measuring hourly temporal activities of 126 CRMs from 46 genes in the developing sea urchin embryo, otherwise a virtually impossible task. Nanotags are also used in gene perturbation experiments to reveal cis-regulatory responses of many CRMs at once. Nanotag methodology can be applied to many research areas, ranging from gene regulatory networks to functional and evolutionary genomics. PMID:22563420

  10. Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors

    PubMed Central

    Andersson, Claes R; Hvidsten, Torgeir R; Isaksson, Anders; Gustafsson, Mats G; Komorowski, Jan

    2007-01-01

    Background: We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results: We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion: The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes. PMID:17939860

  11. cis-Regulatory Mutations Are a Genetic Cause of Human Limb Malformations

    PubMed Central

    VanderMeer, Julia E.; Ahituv, Nadav

    2011-01-01

    The underlying mutations that cause human limb malformations are often difficult to determine, particularly for limb malformations that occur as isolated traits. Evidence from a variety of studies shows that cis-regulatory mutations, specifically in enhancers, can lead to some of these isolated limb malformations. Here, we provide a review of human limb malformations that have been shown to be caused by enhancer mutations and propose that cis-regulatory mutations will continue to be identified as the cause of additional human malformations as our understanding of regulatory sequences improves. PMID:21509892

  12. Decoding a Signature-Based Model of Transcription Cofactor Recruitment Dictated by Cardinal Cis-Regulatory Elements in Proximal Promoter Regions

    PubMed Central

    Benner, Christopher; Hutt, Kasey R.; Stunnenberg, Rieka; Garcia-Bassets, Ivan

    2013-01-01

    Genome-wide maps of DNase I hypersensitive sites (DHSs) reveal that most human promoters contain perpetually active cis-regulatory elements between −150 bp and +50 bp (−150/+50 bp) relative to the transcription start site (TSS). Transcription factors (TFs) recruit cofactors (chromatin remodelers, histone/protein-modifying enzymes, and scaffold proteins) to these elements in order to organize the local chromatin structure and coordinate the balance of post-translational modifications nearby, contributing to the overall regulation of transcription. However, the rules of TF-mediated cofactor recruitment to the −150/+50 bp promoter regions remain poorly understood. Here, we provide evidence for a general model in which a series of cis-regulatory elements (here termed ‘cardinal’ motifs) prefer acting individually, rather than in fixed combinations, within the −150/+50 bp regions to recruit TFs that dictate cofactor signatures distinctive of specific promoter subsets. Subsequently, human promoters can be subclassified based on the presence of cardinal elements and their associated cofactor signatures. In this study, furthermore, we have focused on promoters containing the nuclear respiratory factor 1 (NRF1) motif as the cardinal cis-regulatory element and have identified the pervasive association of NRF1 with the cofactor lysine-specific demethylase 1 (LSD1/KDM1A). This signature might be distinctive of promoters regulating nuclear-encoded mitochondrial and other particular genes in at least some cells. Together, we propose that decoding a signature-based, expanded model of control at proximal promoter regions should lead to a better understanding of coordinated regulation of gene transcription. PMID:24244184

  13. Cis-regulatory mechanisms governing stem and progenitor cell transitions

    PubMed Central

    Johnson, Kirby D.; Kong, Guangyao; Gao, Xin; Chang, Yuan-I; Hewitt, Kyle J.; Sanalkumar, Rajendran; Prathibha, Rajalekshmi; Ranheim, Erik A.; Dewey, Colin N.; Zhang, Jing; Bresnick, Emery H.

    2015-01-01

    Cis-element encyclopedias provide information on phenotypic diversity and disease mechanisms. Although cis-element polymorphisms and mutations are instructive, deciphering function remains challenging. Mutation of an intronic GATA motif (+9.5) in GATA2, encoding a master regulator of hematopoiesis, underlies an immunodeficiency associated with myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). Whereas an inversion relocalizes another GATA2 cis-element (−77) to the proto-oncogene EVI1, inducing EVI1 expression and AML, whether this reflects ectopic or physiological activity is unknown. We describe a mouse strain that decouples −77 function from proto-oncogene deregulation. The −77−/− mice exhibited a novel phenotypic constellation including late embryonic lethality and anemia. The −77 established a vital sector of the myeloid progenitor transcriptome, conferring multipotentiality. Unlike the +9.5−/− embryos, hematopoietic stem cell genesis was unaffected in −77−/− embryos. These results illustrate a paradigm in which cis-elements in a locus differentially control stem and progenitor cell transitions, and therefore the individual cis-element alterations cause unique and overlapping disease phenotypes. PMID:26601269

  14. Characterization of the cis-regulatory region of the Drosophila homeotic gene Sex combs reduced

    SciTech Connect

    Gindhart, J.G. Jr.; King, N.A.; Kaufman, T.C.

    1995-02-01

    The Drosophila homeotic gene Sex combs reduced (Scr) controls the segmental identity of the labial and prothoracic segments in the embryo and adult. It encodes a sequence-specific transcription factor that controls, in concert with other gene products, differentiative pathways of tissues in which Scr is expressed. During embryogenesis, Scr accumulation is observed in a discrete spatiotemporal pattern that includes the labial and prothoracic ectoderm, the subesophageal ganglion of the ventral nerve cord and the visceral mesoderm of the anterior and posterior midgut. Previous analyses have demonstrated that breakpoint mutations located in a 75-kb interval, including the Scr transcription unit and 50 kb of upstream DNA, cause Scr misexpression during development, presumably because these mutations remove Scr cis-regulatory sequences from the proximity of the Scr promoter. To gain a better understanding of the regulatory interactions necessary for the control of Scr transcription during embryogenesis, we have begun a molecular analysis of the Scr regulatory interval. DNA fragments from this 75-kb region were subcloned into P-element vectors containing either an Scr-lacZ or hsp70-lacZ fusion gene, and patterns of reporter gene expression were assayed in transgenic embryos. Several fragments appear to contain Scr regulatory sequences, as they direct reporter gene expression in patterns similar to those normally observed for Scr, whereas other DNA fragments direct Scr reporter gene expression in developmentally interesting but non-Scr-like patterns during embryogenesis. Scr expression in some tissues appears to be controlled by multiple regulatory elements that are separated, in some cases, by more than 20 kb of intervening DNA. This analysis provides an entry point for the study of how Scr transcription is regulated at the molecular level. 60 refs., 7 figs., 1 tab.

  15. Complex patterns of cis-regulatory polymorphisms in ebony underlie standing pigmentation variation in Drosophila melanogaster.

    PubMed

    Miyagi, Ryutaro; Akiyama, Noriyoshi; Osada, Naoki; Takahashi, Aya

    2015-12-01

    Pigmentation traits in adult Drosophila melanogaster were used in this study to investigate how phenotypic variations in continuous ecological traits can be maintained in a natural population. First, pigmentation variation in the adult female was measured at seven different body positions in 20 strains from the Drosophila melanogaster Genetic Reference Panel (DGRP) originating from a natural population in North Carolina. Next, to assess the contributions of cis-regulatory polymorphisms of the genes involved in the melanin biosynthesis pathway, allele-specific expression levels of four genes were quantified by amplicon sequencing using a 454 GS Junior. Among those genes, ebony was significantly associated with pigmentation intensity of the thoracic segment. Detailed sequence analysis of the gene regulatory regions of this gene indicated that many different functional cis-regulatory alleles are segregating in the population and that variations outside the core enhancer element could potentially play important roles in the regulation of gene expression. In addition, a slight enrichment of distantly associated SNP pairs was observed in the ~10 kb cis-regulatory region of ebony, which suggested the presence of interacting elements scattered across the region. In contrast, sequence analysis in the core cis-regulatory region of tan indicated that SNPs within the region are significantly associated with allele-specific expression level of this gene. Collectively, the data suggest that the underlying genetic differences in the cis-regulatory regions that control intraspecific pigmentation variation can be more complex than those of interspecific pigmentation trait differences, where causal genetic changes are typically confined to modular enhancer elements.

  16. Motif-role-fingerprints: the building-blocks of motifs, clustering-coefficients and transitivities in directed networks.

    PubMed

    McDonnell, Mark D; Yaveroğlu, Ömer Nebil; Schmerl, Brett A; Iannella, Nicolangelo; Ward, Lawrence M

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are 'structural' (induced subgraphs) and 'functional' (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File.

  17. Motif-Role-Fingerprints: The Building-Blocks of Motifs, Clustering-Coefficients and Transitivities in Directed Networks

    PubMed Central

    McDonnell, Mark D.; Yaveroğlu, Ömer Nebil; Schmerl, Brett A.; Iannella, Nicolangelo; Ward, Lawrence M.

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are ‘structural’ (induced subgraphs) and ‘functional’ (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File. PMID:25486535
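
    To make the distinction between 'functional' and 'structural' per-node counts concrete, the sketch below counts, for every node of a small directed graph, how often it plays each role in one three-node motif, the feed-forward loop (a→b, b→c, a→c). It uses brute-force enumeration rather than the matrix formulation and conversion matrix described in the abstract, and it is written in Python rather than the Matlab the authors provide; the function name, role labels, and example graph are assumptions for illustration.

```python
from itertools import permutations


def ffl_role_counts(adj, induced=False):
    """Per-node role counts for feed-forward loops a->b, b->c, a->c.

    adj is a square 0/1 adjacency matrix with adj[i][j] == 1 for an edge i->j.
    induced=False counts 'functional' instances (extra edges among the three
    nodes are allowed); induced=True counts 'structural' instances (no other
    edge among the three nodes may be present).
    """
    n = len(adj)
    counts = [{"source": 0, "middle": 0, "target": 0} for _ in range(n)]
    for a, b, c in permutations(range(n), 3):
        if not (adj[a][b] and adj[b][c] and adj[a][c]):
            continue
        if induced and (adj[b][a] or adj[c][b] or adj[c][a]):
            continue
        counts[a]["source"] += 1
        counts[b]["middle"] += 1
        counts[c]["target"] += 1
    return counts


# Hypothetical graph: 0->1, 0->2, 0->3, 1->2, 2->1, 3->2.
ADJ = [
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]

functional = ffl_role_counts(ADJ)
structural = ffl_role_counts(ADJ, induced=True)
for node in range(len(ADJ)):
    print(node, functional[node], structural[node])
```

    In this toy graph the reciprocal edge between nodes 1 and 2 creates feed-forward loops that count as functional but not structural, which is why the two fingerprints differ for those nodes.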

  18. No Excess of Cis-Regulatory Variation Associated with Intraspecific Selection in Wild Pearl Millet (Cenchrus americanus)

    PubMed Central

    Rhoné, Bénédicte; Mariac, Cédric; Couderc, Marie; Berthouly-Salazar, Cécile; Ousseini, Issaka Salia

    2017-01-01

    Several studies suggest that cis-regulatory mutations are the favorite target of evolutionary changes, one reason being that cis-regulatory mutations might have fewer deleterious pleiotropic effects than protein-coding mutations. A review of the process also suggests that this bias towards adaptive cis-regulatory variation might be less pronounced at the intraspecific level compared with the interspecific level. In this study, we assessed the contribution of cis-regulatory variation to adaptation at the intraspecific level using populations of wild pearl millet (Cenchrus americanus ssp. monodii) sampled along an environmental gradient in Niger. From RNA sequencing of hybrids to assess allele-specific expression, we identified genes with cis-regulatory divergence between two parental accessions collected in contrasting environmental conditions. This revealed that ∼15% of transcribed genes showed cis-regulatory variation. Intersecting the gene set exhibiting cis-regulatory variation with the gene set identified as targets of selection revealed no excess of cis-acting mutations among the selected genes. We additionally found no excess of cis-regulatory variation among genes associated with adaptive traits. As our approach relied on methods identifying mainly genes subjected to strong selection pressure or with high phenotypic effect, the contribution of cis-regulatory changes to soft selection or polygenic adaptive traits remains to be tested. However, our results favor the hypothesis that enrichment of adaptive cis-regulatory divergence builds up over time. For short evolutionary time-scales, cis-acting mutations are not predominantly involved in adaptive evolution associated with strong selective signal. PMID:28137746
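
    One common way to ask whether two gene sets overlap more than expected by chance is a one-sided hypergeometric test, sketched below. The abstract does not say which test the authors used, so this is an assumed stand-in, and every count in the example is made up; only the rough 15% proportion of genes with cis-regulatory variation echoes the text.

```python
from math import comb


def hypergeom_upper_tail(total, n_cis, n_selected, overlap):
    """P(X >= overlap) when n_selected genes are drawn at random from
    `total` genes, of which n_cis show cis-regulatory variation."""
    upper = min(n_cis, n_selected)
    favorable = sum(
        comb(n_cis, k) * comb(total - n_cis, n_selected - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total, n_selected)


# Hypothetical counts: 1000 transcribed genes, 150 with cis-regulatory
# variation, 40 candidate targets of selection, 6 genes in both sets.
p_value = hypergeom_upper_tail(total=1000, n_cis=150, n_selected=40, overlap=6)
print(f"expected overlap: {150 * 40 / 1000:.1f}, P(X >= 6) = {p_value:.3f}")
```

    With an observed overlap equal to the expected overlap, as in this toy example, the tail probability is large and there is no evidence of an excess, which is the kind of outcome the abstract reports.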

  19. Functional Evolution of cis-Regulatory Modules at a Homeotic Gene in Drosophila

    PubMed Central

    Schiller, Benjamin J.; Bae, Esther; Tran, Diana A.; Shur, Andrey S.; Allen, John M.; Rau, Christoph; Bender, Welcome; Fisher, William W.; Celniker, Susan E.; Drewell, Robert A.

    2009-01-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules. PMID:19893611

  20. Recurrent Modification of a Conserved Cis-Regulatory Element Underlies Fruit Fly Pigmentation Diversity

    PubMed Central

    Rogers, William A.; Salomone, Joseph R.; Tacy, David J.; Camino, Eric M.; Davis, Kristen A.; Rebeiz, Mark; Williams, Thomas M.

    2013-01-01

    The development of morphological traits occurs through the collective action of networks of genes connected at the level of gene expression. As any node in a network may be a target of evolutionary change, the recurrent targeting of the same node would indicate that the path of evolution is biased for the relevant trait and network. Although examples of parallel evolution have implicated recurrent modification of the same gene and cis-regulatory element (CRE), little is known about the mutational and molecular paths of parallel CRE evolution. In Drosophila melanogaster fruit flies, the Bric-à-brac (Bab) transcription factors control the development of a suite of sexually dimorphic traits on the posterior abdomen. Female-specific Bab expression is regulated by the dimorphic element, a CRE that possesses direct inputs from body plan (ABD-B) and sex-determination (DSX) transcription factors. Here, we find that the recurrent evolutionary modification of this CRE underlies both intraspecific and interspecific variation in female pigmentation in the melanogaster species group. By reconstructing the sequence and regulatory activity of the ancestral Drosophila melanogaster dimorphic element, we demonstrate that a handful of mutations were sufficient to create independent CRE alleles with differing activities. Moreover, intraspecific and interspecific dimorphic element evolution proceeded with little to no alterations to the known body plan and sex-determination regulatory linkages. Collectively, our findings represent an example where the paths of evolution appear biased to a specific CRE, and drastic changes in function were accompanied by deep conservation of key regulatory linkages. PMID:24009528

  1. Cis-Regulatory Timers for Developmental Gene Expression

    PubMed Central

    Christiaen, Lionel

    2013-01-01

    How does a fertilized egg decode its own genome to eventually develop into a mature animal? Each developing cell must activate a battery of genes in a timely manner and according to the function it will ultimately perform, but how? During development of the notochord—a structure akin to the vertebrate spine—in a simple marine invertebrate, an essential protein called Brachyury binds to specific sites in its target genes. A study just published in PLOS Biology reports that if the target gene contains multiple Brachyury-binding sites it will be activated early in development but if it contains only one site it will be activated later. Genes that contain no binding site can still be activated by Brachyury, but only indirectly by an earlier Brachyury-dependent gene product, so later than the directly activated genes. Thus, this study shows how several genes can interpret the presence of a single factor differently to become active at distinct times in development. PMID:24204213

  2. Computational identification of new structured cis-regulatory elements in the 3'-untranslated region of human protein coding genes.

    PubMed

    Chen, Xiaowei Sylvia; Brown, Chris M

    2012-10-01

    Messenger ribonucleic acids (RNAs) contain a large number of cis-regulatory RNA elements that function in many types of post-transcriptional regulation. These cis-regulatory elements are often characterized by conserved structures and/or sequences. Although some classes are well known, given the wide range of RNA-interacting proteins in eukaryotes, it is likely that many new classes of cis-regulatory elements are yet to be discovered. An approach to this is to use computational methods that have the advantage of analysing genomic data, particularly comparative data on a large scale. In this study, a set of structural discovery algorithms was applied, followed by support vector machine (SVM) classification. We trained a new classification model (CisRNA-SVM) on a set of known structured cis-regulatory elements from 3'-untranslated regions (UTRs) and successfully distinguished these, and groups of cis-regulatory elements it had not been trained on, from control genomic and shuffled sequences. The new method outperformed previous methods in classification of cis-regulatory RNA elements. This model was then used to predict new elements from cross-species conserved regions of human 3'-UTRs. Clustering of these elements identified new classes of potential cis-regulatory elements. The model, training and testing sets and novel human predictions are available at: http://mRNA.otago.ac.nz/CisRNA-SVM.

  3. Single embryo-resolution quantitative analysis of reporters permits multiplex spatial cis-regulatory analysis.

    PubMed

    Guay, Catherine L; McQuade, Sean T; Nam, Jongmin

    2017-02-15

    Cis-regulatory modules (CRMs) control spatiotemporal gene expression patterns in embryos. While measurement of quantitative CRM activities has become efficient, measurement of spatial CRM activities still relies on slow, one-by-one methods. To overcome this bottleneck, we have developed a high-throughput method that can simultaneously measure quantitative and spatial CRM activities. The new method builds profiles of quantitative CRM activities measured at single-embryo resolution in many mosaic embryos and uses these profiles as a 'fingerprint' of spatial patterns. We show in sea urchin embryos that the new method, Multiplex and Mosaic Observation of Spatial Information encoded in Cis-regulatory modules (MMOSAIC), can efficiently predict spatial activities of new CRMs and can detect spatial responses of CRMs to gene perturbations. We anticipate that MMOSAIC will facilitate systems-wide functional analyses of CRMs in embryos. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Evolved tooth gain in sticklebacks is associated with a cis-regulatory allele of Bmp6

    PubMed Central

    Cleves, Phillip A.; Ellis, Nicholas A.; Jimenez, Monica T.; Nunez, Stephanie M.; Schluter, Dolph; Kingsley, David M.; Miller, Craig T.

    2014-01-01

    Developmental genetic studies of evolved differences in morphology have led to the hypothesis that cis-regulatory changes often underlie morphological evolution. However, because most of these studies focus on evolved loss of traits, the genetic architecture and possible association with cis-regulatory changes of gain traits are less understood. Here we show that a derived benthic freshwater stickleback population has evolved an approximate twofold gain in ventral pharyngeal tooth number compared with their ancestral marine counterparts. Comparing laboratory-reared developmental time courses of a low-toothed marine population and this high-toothed benthic population reveals that increases in tooth number and tooth plate area and decreases in tooth spacing arise at late juvenile stages. Genome-wide linkage mapping identifies largely separate sets of quantitative trait loci affecting different aspects of dental patterning. One large-effect quantitative trait locus controlling tooth number fine-maps to a genomic region containing an excellent candidate gene, Bone morphogenetic protein 6 (Bmp6). Stickleback Bmp6 is expressed in developing teeth, and no coding changes are found between the high- and low-toothed populations. However, quantitative allele-specific expression assays of Bmp6 in developing teeth in F1 hybrids show that cis-regulatory changes have elevated the relative expression level of the freshwater benthic Bmp6 allele at late, but not early, stages of stickleback development. Collectively, our data support a model where a late-acting cis-regulatory up-regulation of Bmp6 expression underlies a significant increase in tooth number in derived benthic sticklebacks. PMID:25205810

  5. Evolved tooth gain in sticklebacks is associated with a cis-regulatory allele of Bmp6.

    PubMed

    Cleves, Phillip A; Ellis, Nicholas A; Jimenez, Monica T; Nunez, Stephanie M; Schluter, Dolph; Kingsley, David M; Miller, Craig T

    2014-09-23

    Developmental genetic studies of evolved differences in morphology have led to the hypothesis that cis-regulatory changes often underlie morphological evolution. However, because most of these studies focus on evolved loss of traits, the genetic architecture and possible association with cis-regulatory changes of gain traits are less understood. Here we show that a derived benthic freshwater stickleback population has evolved an approximate twofold gain in ventral pharyngeal tooth number compared with their ancestral marine counterparts. Comparing laboratory-reared developmental time courses of a low-toothed marine population and this high-toothed benthic population reveals that increases in tooth number and tooth plate area and decreases in tooth spacing arise at late juvenile stages. Genome-wide linkage mapping identifies largely separate sets of quantitative trait loci affecting different aspects of dental patterning. One large-effect quantitative trait locus controlling tooth number fine-maps to a genomic region containing an excellent candidate gene, Bone morphogenetic protein 6 (Bmp6). Stickleback Bmp6 is expressed in developing teeth, and no coding changes are found between the high- and low-toothed populations. However, quantitative allele-specific expression assays of Bmp6 in developing teeth in F1 hybrids show that cis-regulatory changes have elevated the relative expression level of the freshwater benthic Bmp6 allele at late, but not early, stages of stickleback development. Collectively, our data support a model where a late-acting cis-regulatory up-regulation of Bmp6 expression underlies a significant increase in tooth number in derived benthic sticklebacks.

  6. Dynamic SPR monitoring of yeast nuclear protein binding to a cis-regulatory element

    SciTech Connect

    Mao, Grace; Brody, James P.

    2007-11-09

    Gene expression is controlled by protein complexes binding to short specific sequences of DNA, called cis-regulatory elements. Expression of most eukaryotic genes is controlled by dozens of these elements. Comprehensive identification and monitoring of these elements is a major goal of genomics. In pursuit of this goal, we are developing a surface plasmon resonance (SPR) based assay to identify and monitor cis-regulatory elements. To test whether we could reliably monitor protein binding to a regulatory element, we immobilized a 16 bp region of Saccharomyces cerevisiae chromosome 5 onto a gold surface. This 16 bp region of DNA is known to bind several proteins and thought to control expression of the gene RNR1, which varies through the cell cycle. We synchronized yeast cell cultures, and then sampled these cultures at a regular interval. These samples were processed to purify nuclear lysate, which was then exposed to the sensor. We found that nuclear protein binds this particular element of DNA at a significantly higher rate (as compared to unsynchronized cells) during G1 phase. Other time points show levels of DNA-nuclear protein binding similar to the unsynchronized control. We also measured the apparent association complex of the binding to be 0.014 s^-1. We conclude that (1) SPR-based assays can monitor DNA-nuclear protein binding and that (2) for this particular cis-regulatory element, maximum DNA-nuclear protein binding occurs during G1 phase.

  7. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

    PubMed Central

    O'Connor, Timothy R.; Bailey, Timothy L.

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088
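
    The linking step described above (cross-tissue correlation between a histone mark at a CRM and expression at genes within 1 Mbp) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the data layout, and the use of a plain Pearson correlation are assumptions made for illustration, and the numbers are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sd_x * sd_y)

def link_crm_to_genes(crm_pos, crm_signal, genes, window=1_000_000):
    """Rank genes within `window` bp of a CRM by cross-tissue correlation.

    crm_signal: histone-mark signal at the CRM, one value per tissue.
    genes: hypothetical dict of name -> (TSS position, expression per tissue).
    """
    scores = {}
    for name, (tss, expression) in genes.items():
        if abs(tss - crm_pos) <= window:  # only genes within 1 Mbp
            scores[name] = pearson(crm_signal, expression)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented toy data: four tissues, three candidate genes.
crm_signal = [1.0, 2.0, 3.0, 4.0]
genes = {
    "geneA": (500_000, [2.0, 4.0, 6.0, 8.0]),    # tracks the CRM signal
    "geneB": (900_000, [4.0, 3.0, 2.0, 1.0]),    # anti-correlated
    "geneC": (5_000_000, [1.0, 2.0, 3.0, 4.0]),  # outside the window
}
ranked = link_crm_to_genes(100_000, crm_signal, genes)
print(ranked)
```

    In this toy case the gene whose expression rises and falls with the mark ranks first, and the gene beyond the 1 Mbp window is never scored, however well it correlates.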

  8. Predominant contribution of cis-regulatory divergence in the evolution of mouse alternative splicing

    PubMed Central

    Gao, Qingsong; Sun, Wei; Ballegeer, Marlies; Libert, Claude; Chen, Wei

    2015-01-01

    Divergence of alternative splicing represents one of the major driving forces to shape phenotypic diversity during evolution. However, the extent to which these divergences could be explained by the evolving cis-regulatory versus trans-acting factors remains unresolved. To globally investigate the relative contributions of the two factors for the first time in mammals, we measured splicing difference between C57BL/6J and SPRET/EiJ mouse strains and allele-specific splicing pattern in their F1 hybrid. Out of 11,818 alternative splicing events expressed in the cultured fibroblast cells, we identified 796 with significant difference between the parental strains. After integrating allele-specific data from F1 hybrid, we demonstrated that these events could be predominately attributed to cis-regulatory variants, including those residing at and beyond canonical splicing sites. Contrary to previous observations in Drosophila, such predominant contribution was consistently observed across different types of alternative splicing. Further analysis of liver tissues from the same mouse strains and reanalysis of published datasets on other strains showed similar trends, implying in general the predominant contribution of cis-regulatory changes in the evolution of mouse alternative splicing. PMID:26134616
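
    The logic of separating cis from trans effects with an F1 hybrid can be sketched as follows. In the hybrid both alleles share the same trans-acting environment, so any allele-specific difference there is attributed to cis; whatever remains of the parental difference is attributed to trans. This is a simplified illustration, not the statistical procedure used in the study; the function name and the "percent spliced in" (PSI) values are hypothetical.

```python
def decompose_cis_trans(psi_parent_a, psi_parent_b, psi_f1_a, psi_f1_b):
    """Split a parental splicing difference into cis and trans components.

    psi_parent_*: inclusion level of the event in each parental strain.
    psi_f1_*:     inclusion level measured per allele in the F1 hybrid.
    """
    total = psi_parent_a - psi_parent_b  # divergence between the strains
    cis = psi_f1_a - psi_f1_b            # persists in a shared trans background
    trans = total - cis                  # remainder not explained by cis
    return {"total": total, "cis": cis, "trans": trans}

# Invented example: most of the parental difference persists between alleles in F1.
effects = decompose_cis_trans(0.80, 0.40, 0.75, 0.45)
print({name: round(value, 3) for name, value in effects.items()})
```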

  9. The identification of cis-regulatory elements: A review from a machine learning perspective.

    PubMed

    Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W

    2015-12-01

    The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field.

  10. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation.

    PubMed

    O'Connor, Timothy R; Bailey, Timothy L

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules-CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for 'other' tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a 'nearest neighbor' heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps.

  11. Allelic imbalance identifies novel tissue specific cis-regulatory variation for human UGT2B15

    PubMed Central

    Sun, Chang; Southard, Catherine; Witonsky, David B.; Olopade, Olufunmilayo I.; Di Rienzo, Anna

    2010-01-01

    Allelic imbalance (AI) is a powerful tool to identify cis-regulatory variation for gene expression. UGT2B15 is an important enzyme involved in the metabolism of multiple endobiotics and xenobiotics. In this study, we measured the relative expression of two alleles at this gene by using SNP rs1902023:G>T. An excess of the G over the T allele was consistently observed in liver (P<0.001), but not in breast (P=0.06) samples, suggesting that SNPs in strong linkage disequilibrium with G253T regulate UGT2B15 expression in liver. Seven such SNPs were identified by resequencing the promoter and exon 1, which define two distinct haplotypes. Reporter gene assays confirmed that one haplotype displayed ~20% higher promoter activity compared to the other major haplotype in liver HepG2 (P<0.001), but not in breast MCF-7 (P=0.540) cells. Reporter gene assays with additional constructs pointed to rs34010522:G>T and rs35513228:C>T as the cis-regulatory variants; both SNPs were also evaluated in LNCaP and Caco-2 cells. By ChIP, we showed that the transcription factor Nrf2 binds to the region spanning rs34010522:G>T in all four cell lines. Our results provide a good example for how AI can be used to identify cis-regulatory variation and gain insights into the tissue specific regulation of gene expression. PMID:19847790
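
    One common way to test for allelic imbalance, assuming allele-specific counts are available for a heterozygous sample, is an exact binomial test against the 50:50 ratio expected without cis-regulatory differences. The sketch below is an illustration only and is not necessarily the test used in this study; the function name and the counts are hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one (the usual "minimum likelihood" definition).
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so outcomes tied with the observed one are included.
    tail = sum(pr for pr in probs if pr <= observed * (1 + 1e-7))
    return min(1.0, tail)

# Invented counts: 70 measurements from the G allele out of 100 in total.
p_value = binomial_two_sided_p(70, 100)
print(f"p = {p_value:.3g}")
```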

  12. MyoD reprogramming requires Six1 and Six4 homeoproteins: genome-wide cis-regulatory module analysis

    PubMed Central

    Santolini, Marc; Sakakibara, Iori; Gauthier, Morgane; Ribas-Aulinas, Francesc; Takahashi, Hirotaka; Sawasaki, Tatsuya; Mouly, Vincent; Concordet, Jean-Paul; Defossez, Pierre-Antoine; Hakim, Vincent; Maire, Pascal

    2016-01-01

    Myogenic regulatory factors of the MyoD family have the ability to reprogram differentiated cells toward a myogenic fate. In this study, we demonstrate that Six1 or Six4 are required for the reprogramming by MyoD of mouse embryonic fibroblasts (MEFs). Using microarray experiments, we found 761 genes under the control of both Six and MyoD. Using MyoD ChIPseq data and a genome-wide search for Six1/4 MEF3 binding sites, we found significant co-localization of binding sites for MyoD and Six proteins on over a thousand mouse genomic DNA regions. The combination of both datasets yielded 82 genes which are synergistically activated by Six and MyoD, with 96 associated MyoD+MEF3 putative cis-regulatory modules (CRMs). Fourteen out of 19 of the CRMs that we tested demonstrated in Luciferase assays a synergistic action also observed for their cognate gene. We searched putative binding sites on these CRMs using available databases and de novo search of conserved motifs and demonstrated that the Six/MyoD synergistic activation takes place in a feedforward way. It involves the recruitment of these two families of transcription factors to their targets, together with partner transcription factors, encoded by genes that are themselves activated by Six and MyoD, including Mef2, Pbx-Meis and EBF. PMID:27302134

  13. MyoD reprogramming requires Six1 and Six4 homeoproteins: genome-wide cis-regulatory module analysis.

    PubMed

    Santolini, Marc; Sakakibara, Iori; Gauthier, Morgane; Ribas-Aulinas, Francesc; Takahashi, Hirotaka; Sawasaki, Tatsuya; Mouly, Vincent; Concordet, Jean-Paul; Defossez, Pierre-Antoine; Hakim, Vincent; Maire, Pascal

    2016-10-14

    Myogenic regulatory factors of the MyoD family have the ability to reprogram differentiated cells toward a myogenic fate. In this study, we demonstrate that Six1 or Six4 are required for the reprogramming by MyoD of mouse embryonic fibroblasts (MEFs). Using microarray experiments, we found 761 genes under the control of both Six and MyoD. Using MyoD ChIPseq data and a genome-wide search for Six1/4 MEF3 binding sites, we found significant co-localization of binding sites for MyoD and Six proteins on over a thousand mouse genomic DNA regions. The combination of both datasets yielded 82 genes which are synergistically activated by Six and MyoD, with 96 associated MyoD+MEF3 putative cis-regulatory modules (CRMs). Fourteen out of 19 of the CRMs that we tested demonstrated in Luciferase assays a synergistic action also observed for their cognate gene. We searched putative binding sites on these CRMs using available databases and de novo search of conserved motifs and demonstrated that the Six/MyoD synergistic activation takes place in a feedforward way. It involves the recruitment of these two families of transcription factors to their targets, together with partner transcription factors, encoded by genes that are themselves activated by Six and MyoD, including Mef2, Pbx-Meis and EBF. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Characterization of the Cis-Regulatory Region of the Drosophila Homeotic Gene Sex Combs Reduced

    PubMed Central

    Gindhart-Jr., J. G.; King, A. N.; Kaufman, T. C.

    1995-01-01

    The Drosophila homeotic gene Sex combs reduced (Scr) controls the segmental identity of the labial and prothoracic segments in the embryo and adult. It encodes a sequence-specific transcription factor that controls, in concert with other gene products, differentiative pathways of tissues in which Scr is expressed. During embryogenesis, Scr accumulation is observed in a discrete spatiotemporal pattern that includes the labial and prothoracic ectoderm, the subesophageal ganglion of the ventral nerve cord and the visceral mesoderm of the anterior and posterior midgut. Previous analyses have demonstrated that breakpoint mutations located in a 75-kb interval, including the Scr transcription unit and 50 kb of upstream DNA, cause Scr misexpression during development, presumably because these mutations remove Scr cis-regulatory sequences from the proximity of the Scr promoter. To gain a better understanding of the regulatory interactions necessary for the control of Scr transcription during embryogenesis, we have begun a molecular analysis of the Scr regulatory interval. DNA fragments from this 75-kb region were subcloned into P-element vectors containing either an Scr-lacZ or hsp70-lacZ fusion gene, and patterns of reporter gene expression were assayed in transgenic embryos. Several fragments appear to contain Scr regulatory sequences, as they direct reporter gene expression in patterns similar to those normally observed for Scr, whereas other DNA fragments direct Scr reporter gene expression in developmentally interesting but non-Scr-like patterns during embryogenesis. Scr expression in some tissues appears to be controlled by multiple regulatory elements that are separated, in some cases, by more than 20 kb of intervening DNA. Interestingly, regulatory sequences that direct reporter gene expression in an Scr-like pattern in the anterior and posterior midgut are imbedded in the regulatory region of the segmentation gene fushi tarazu (ftz), which is normally located

  15. The evolution of heat shock protein sequences, cis-regulatory elements, and expression profiles in the eusocial Hymenoptera.

    PubMed

    Nguyen, Andrew D; Gotelli, Nicholas J; Cahan, Sara Helms

    2016-01-19

    The eusocial Hymenoptera have radiated across a wide range of thermal environments, exposing them to significant physiological stressors. We reconstructed the evolutionary history of three families of Heat Shock Proteins (Hsp90, Hsp70, Hsp40), the primary molecular chaperones protecting against thermal damage, across 12 Hymenopteran species and four other insect orders. We also predicted and tested for thermal inducibility of eight Hsps from the presence of cis-regulatory heat shock elements (HSEs). We tested whether Hsp induction patterns in ants were associated with different thermal environments. We found evidence for duplications, losses, and cis-regulatory changes in two of the three gene families. One member of the Hsp90 gene family, hsp83, duplicated basally in the Hymenoptera, with shifts in HSE motifs in the novel copy. Both copies were retained in bees, but ants retained only the novel HSE copy. For Hsp70, Hymenoptera lack the primary heat-inducible orthologue from Drosophila melanogaster and instead induce the cognate form, hsc70-4, which also underwent an early duplication. Episodic diversifying selection was detected along the branch predating the duplication of hsc70-4 and continued along one of the paralogue branches after duplication. Four out of eight Hsp genes were heat-inducible and matched the predictions based on presence of conserved HSEs. For the inducible homologues, the more thermally tolerant species, Pogonomyrmex barbatus, had greater Hsp basal expression and induction in response to heat stress than did the less thermally tolerant species, Aphaenogaster picea. Furthermore, there was no trade-off between basal expression and induction. Our results highlight the unique evolutionary history of Hsps in eusocial Hymenoptera, which has been shaped by gains, losses, and changes in cis-regulation. Ants, and most likely other Hymenoptera, utilize lineage-specific heat inducible Hsps, whose expression patterns are associated with adaptive

  16. Expression, subcellular localization, and cis-regulatory structure of duplicated phytoene synthase genes in melon (Cucumis melo L.).

    PubMed

    Qin, Xiaoqiong; Coku, Ardian; Inoue, Kentaro; Tian, Li

    2011-10-01

    Carotenoids perform many critical functions in plants, animals, and humans. It is therefore important to understand carotenoid biosynthesis and its regulation in plants. Phytoene synthase (PSY) catalyzes the first committed and rate-limiting step in carotenoid biosynthesis. While PSY is present as a single copy gene in Arabidopsis, duplicated PSY genes have been identified in many economically important monocot and dicot crops. CmPSY1 was previously identified from melon (Cucumis melo L.), but was not functionally characterized. We isolated a second PSY gene, CmPSY2, from melon in this work. CmPSY2 possesses a unique intron/exon structure that has not been observed in other plant PSYs. Both CmPSY1 and CmPSY2 are functional in vitro, but exhibit distinct expression patterns in different melon tissues and during fruit development, suggesting differential regulation of the duplicated melon PSY genes. In vitro chloroplast import assays verified the plastidic localization of CmPSY1 and CmPSY2 despite the lack of an obvious plastid target peptide in CmPSY2. Promoter motif analysis of the duplicated melon and tomato PSY genes and the Arabidopsis PSY revealed distinctive cis-regulatory structures of melon PSYs and identified gibberellin-responsive motifs in all PSYs except for SlPSY1, which has not been reported previously. Overall, these data provide new insights into the evolutionary history of plant PSY genes and the regulation of PSY expression by developmental and environmental signals that may involve different regulatory networks.

  17. Divergence in cis-regulatory sequences surrounding the opsin gene arrays of African cichlid fishes

    PubMed Central

    2011-01-01

    Background: Divergence within cis-regulatory sequences may contribute to the adaptive evolution of gene expression, but functional alleles in these regions are difficult to identify without abundant genomic resources. Among African cichlid fishes, the differential expression of seven opsin genes has produced adaptive differences in visual sensitivity. Quantitative genetic analysis suggests that cis-regulatory alleles near the SWS2-LWS opsins may contribute to this variation. Here, we sequence BACs containing the opsin genes of two cichlids, Oreochromis niloticus and Metriaclima zebra. We use phylogenetic footprinting and shadowing to examine divergence in conserved non-coding elements, promoter sequences, and 3'-UTRs surrounding each opsin in search of candidate cis-regulatory sequences that influence cichlid opsin expression. Results: We identified 20 conserved non-coding elements surrounding the opsins of cichlids and other teleosts, including one known enhancer and a retinal microRNA. Most conserved elements contained computationally-predicted binding sites that correspond to transcription factors that function in vertebrate opsin expression; O. niloticus and M. zebra were significantly divergent in two of these. Similarly, we found a large number of relevant transcription factor binding sites within each opsin's proximal promoter, and identified five opsins that were considerably divergent in both expression and the number of transcription factor binding sites shared between O. niloticus and M. zebra. We also found several microRNA target sites within the 3'-UTR of each opsin, including two 3'-UTRs that differ significantly between O. niloticus and M. zebra. Finally, we examined interspecific divergence among 18 phenotypically diverse cichlids from Lake Malawi for one conserved non-coding element, two 3'-UTRs, and five opsin proximal promoters. We found that all regions were highly conserved with some evidence of CRX transcription factor binding site turnover. We

  18. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
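
    The idea of binding sites arising through local point mutations can be illustrated with a toy simulation. The sketch below follows a single random sequence, applies one point mutation per step, and counts the steps until a given motif first appears. It is a deliberate simplification under stated assumptions: it does not model a population or fixation, so it is not the simulation reported in the study, and the motif, sequence length, and function name are hypothetical.

```python
import random

BASES = "ACGT"

def steps_until_motif(motif, seq_len, rng, max_steps=1_000_000):
    """Mutate one random base per step until `motif` occurs in the sequence.

    Returns (steps, sequence). `steps` is 0 if the random starting sequence
    already contains the motif and None if `max_steps` is exhausted first.
    """
    seq = [rng.choice(BASES) for _ in range(seq_len)]
    for step in range(max_steps + 1):
        current = "".join(seq)
        if motif in current:
            return step, current
        pos = rng.randrange(seq_len)
        # Substitute a different base so that every step is a real mutation.
        seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return None, "".join(seq)

rng = random.Random(0)  # fixed seed so the run is reproducible
steps, final_seq = steps_until_motif("TGACTC", seq_len=200, rng=rng)
print(steps)
```

    Repeating the run with different seeds and motif lengths gives a rough feel for how the waiting time grows with motif length in this toy setting.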

  19. Quantitative functional interrelations within the cis-regulatory system of the S. purpuratus Endo16 gene.

    PubMed

    Yuh, C H; Moore, J G; Davidson, E H

    1996-12-01

    Embryonic expression of the Endo16 gene of Strongylocentrotus purpuratus is controlled by interactions with at least 13 different DNA-binding factors. These interactions occur within a cis-regulatory domain that extends about 2300 bp upstream from the transcription start site. A recent functional characterization of this domain reveals six different subregions, or cis-regulatory modules, each of which displays a specific regulatory subfunction when linked with the basal promoter and in some cases various other modules (C.-H. Yuh and E. Davidson (1996) Development 122, 1069-1082). In the present work, we analyzed quantitative time-course measurements of the CAT enzyme output of embryos bearing expression constructs controlled by various Endo16 regulatory modules, either singly or in combination. Three of these modules function positively in that, in isolation, each is capable of promoting expression in vegetal plate and adjacent cell lineages, though with different temporal profiles of activity. Models for the mode of interaction of the three positive modules with one another were tested by assuming mathematical relations that would generate, from the measured single module time courses, the experimentally observed profiles of activity obtained when the relevant modules are physically linked in the same construct. The generated and observed time functions were compared, and the differences were minimized by least squares adjustment of a scale parameter. When the modules were tested in context of the endogenous promoter region, one of the positive modules (A) was found to increase the output of the others (B and G), by a constant factor. In contrast, a solution in which the time-course data of modules A and B are multiplied by one another was required for the interrelations of the positive modules when a minimal SV40 promoter was used. One interpretation is that, in this construct, each module independently stimulates the basal transcription complex. We used a

  1. Divergence in cis-regulatory sequences surrounding the opsin gene arrays of African cichlid fishes.

    PubMed

    O'Quin, Kelly E; Smith, Daniel; Naseer, Zan; Schulte, Jane; Engel, Samuel D; Loh, Yong-Hwee E; Streelman, J Todd; Boore, Jeffrey L; Carleton, Karen L

    2011-05-09

    Divergence within cis-regulatory sequences may contribute to the adaptive evolution of gene expression, but functional alleles in these regions are difficult to identify without abundant genomic resources. Among African cichlid fishes, the differential expression of seven opsin genes has produced adaptive differences in visual sensitivity. Quantitative genetic analysis suggests that cis-regulatory alleles near the SWS2-LWS opsins may contribute to this variation. Here, we sequence BACs containing the opsin genes of two cichlids, Oreochromis niloticus and Metriaclima zebra. We use phylogenetic footprinting and shadowing to examine divergence in conserved non-coding elements, promoter sequences, and 3'-UTRs surrounding each opsin in search of candidate cis-regulatory sequences that influence cichlid opsin expression. We identified 20 conserved non-coding elements surrounding the opsins of cichlids and other teleosts, including one known enhancer and a retinal microRNA. Most conserved elements contained computationally-predicted binding sites that correspond to transcription factors that function in vertebrate opsin expression; O. niloticus and M. zebra were significantly divergent in two of these. Similarly, we found a large number of relevant transcription factor binding sites within each opsin's proximal promoter, and identified five opsins that were considerably divergent in both expression and the number of transcription factor binding sites shared between O. niloticus and M. zebra. We also found several microRNA target sites within the 3'-UTR of each opsin, including two 3'-UTRs that differ significantly between O. niloticus and M. zebra. Finally, we examined interspecific divergence among 18 phenotypically diverse cichlids from Lake Malawi for one conserved non-coding element, two 3'-UTRs, and five opsin proximal promoters. We found that all regions were highly conserved with some evidence of CRX transcription factor binding site turnover. We also found three

  2. Conservation and evolution of cis-regulatory systems in ascomycete fungi

    SciTech Connect

    Gasch, Audrey P.; Moses, Alan M.; Chiang, Derek Y.; Fraser, Hunter B.; Berardini, Mark; Eisen, Michael B.

    2004-03-15

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.

  3. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short.

  4. The evolution of cichlid fish egg-spots is linked with a cis-regulatory change.

    PubMed

    Santos, M Emília; Braasch, Ingo; Boileau, Nicolas; Meyer, Britta S; Sauteur, Loïc; Böhne, Astrid; Belting, Heinz-Georg; Affolter, Markus; Salzburger, Walter

    2014-10-09

    The origin of novel phenotypic characters is a key component in organismal diversification; yet, the mechanisms underlying the emergence of such evolutionary novelties are largely unknown. Here we examine the origin of egg-spots, an evolutionary innovation of the most species-rich group of cichlids, the haplochromines, where these conspicuous male fin colour markings are involved in mating. Applying a combination of RNAseq, comparative genomics and functional experiments, we identify two novel pigmentation genes, fhl2a and fhl2b, and show that especially the more rapidly evolving b-paralog is associated with egg-spot formation. We further find that egg-spot bearing haplochromines, but not other cichlids, feature a transposable element in the cis-regulatory region of fhl2b. Using transgenic zebrafish, we finally demonstrate that this region shows specific enhancer activities in iridophores, a type of pigment cells found in egg-spots, suggesting that a cis-regulatory change is causally linked to the gain of expression in egg-spot bearing haplochromines.

  5. Cis-regulatory elements affecting the Nanos gene promoter in the germline stem cells.

    PubMed

    Ali, Ijaz; ur Rehman, Muti; Rashid, Farzana; Khan, Sanaullah; Iqbal, Aqib; Laixin, Xia; ud din Ahmed, Naeem; Swati, A Zahoor

    2010-02-15

    The Drosophila Nanos gene plays an important role in stem cell maintenance and body patterning. To understand the cis-regulatory machinery involved in the transcription of the nanos gene in the germline stem cells, we examined its promoter fragment from +97 to -708 relative to the transcription start site and identified enhancer elements located between positions -108 and +97. Experiments with transgenic flies revealed that the minimal promoter (from -108 to +20) is sufficient for GFP expression in the germline stem cells of transgenic Drosophila. Moreover, the flag-tagged nanos protein blotting experiments revealed that a short promoter fragment plus some sequences of the nos 5'UTR spanning -108 to +97 could efficiently drive the expression of the flag-tagged [Nos-mRNA-nos3'UTR] transgene in transgenic flies, indicating that the cis-regulatory elements located between positions -108 and +97 of the nanos promoter are sufficient to fully transcribe the nanos mRNA. Deletion of the identified cis-acting sequences from the promoter rendered it non-functional, as it could no longer transcribe the nanos mRNA in transgenic flies, thus revealing the importance of these sequences for the transcription of the nanos gene. Copyright 2009 Elsevier B.V. All rights reserved.

  6. Autoregulatory feedback controls sequential action of cis-regulatory modules at the brinker locus.

    PubMed

    Dunipace, Leslie; Saunders, Abbie; Ashe, Hilary L; Stathopoulos, Angelike

    2013-09-16

    cis-regulatory modules (CRMs) act sequentially to regulate temporal expression of genes, but how the switch from one to the next is accomplished is not well understood. To provide insight, here we investigate the cis-regulatory system controlling brinker (brk) expression in the Drosophila embryo. Two distally located CRMs support expression at different times, while a promoter-proximal element (PPE) is required to support their action. In the absence of Brk protein itself or upon mutagenesis of Brk binding sites within the PPE, the late-acting CRM, specifically, is delayed. This block to late-acting CRM function appears to be removed when the early-acting CRM is also deleted. These results demonstrate that autoregulatory feedback is necessary for the early-acting CRM to disengage from the promoter so that the late-acting CRM may act. Autoregulation may be a commonly used mechanism to control sequential CRM action necessary for dynamic gene expression throughout the course of development. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  7. Autoregulatory Feedback Controls Sequential Action of cis-Regulatory Modules at the brinker Locus

    PubMed Central

    Dunipace, Leslie; Saunders, Abbie; Ashe, Hilary L.; Stathopoulos, Angelike

    2013-01-01

    Summary: cis-regulatory modules (CRMs) act sequentially to regulate temporal expression of genes, but how the switch from one to the next is accomplished is not well understood. To provide insight, here we investigate the cis-regulatory system controlling brinker (brk) expression in the Drosophila embryo. Two distally located CRMs support expression at different times, while a promoter-proximal element (PPE) is required to support their action. In the absence of Brk protein itself or upon mutagenesis of Brk binding sites within the PPE, the late-acting CRM, specifically, is delayed. This block to late-acting CRM function appears to be removed when the early-acting CRM is also deleted. These results demonstrate that autoregulatory feedback is necessary for the early-acting CRM to disengage from the promoter so that the late-acting CRM may act. Autoregulation may be a commonly used mechanism to control sequential CRM action necessary for dynamic gene expression throughout the course of development. PMID:24044892

  8. How petals change their spots: cis-regulatory re-wiring in Clarkia (Onagraceae).

    PubMed

    Martins, Talline R; Jiang, Peng; Rausher, Mark D

    2016-09-06

    A long-standing question in evolutionary developmental biology is how new traits evolve. Although most floral pigmentation studies have focused on how pigment intensity and composition diversify, few, if any, have explored how a pattern element can shift position. In the present study, we examine the genetic changes underlying shifts in the position of petal spots in Clarkia. Comparative transcriptome analyses were used to identify potential candidate genes responsible for spot formation. Co-segregation analyses in F2 individuals segregating for different spot positions, quantitative PCR, and pyrosequencing, were used to confirm the role of the candidate gene in determining spot position. Transient expression assays were used to identify the expression domain of different alleles. An R2R3Myb transcription factor (CgMyb1) activated spot formation, and different alleles of CgMyb1 were expressed in different domains, leading to spot formation in different petal locations. Reporter assays revealed that promoters from different alleles determine different locations of expression. The evolutionary shift in spot position is due to one or more cis-regulatory changes in the promoter of CgMyb1, indicating that shifts in pattern element position can be caused by changes in a single gene, and that cis-regulatory rewiring can be used to alter the relative position of an existing character.

  9. Cell-type specific cis-regulatory networks: insights from Hox transcription factors.

    PubMed

    Polychronidou, Maria; Lohmann, Ingrid

    2013-01-01

    Hox proteins are a prominent class of transcription factors that specify cell and tissue identities in animal embryos. In sharp contrast to tissue-specifically expressed transcription factors, which coordinate regulatory pathways leading to the differentiation of a selected tissue, Hox proteins are active in many different cell types but are nonetheless able to differentially regulate gene expression in a context-dependent manner. This particular feature makes Hox proteins ideal candidates for elucidating the mechanisms employed by transcription factors to achieve tissue-specific functions in multi-cellular organisms. Here we discuss how the recent genome-wide identification and characterization of Hox cis-regulatory elements has provided insight concerning the molecular mechanisms underlying the high spatiotemporal specificity of Hox proteins. In particular, it was shown that Hox transcriptional outputs depend on the cell-type specific interplay of the different Hox proteins with co-regulatory factors as well as with epigenetic modifiers. Based on these observations it becomes clear that cell-type specific approaches are required for dissecting the tissue-specific Hox regulatory code. Identification and comparative analysis of Hox cis-regulatory elements driving target gene expression in different cell types in combination with analyses on how cofactors, epigenetic modifiers and protein-protein interactions mediate context-dependent Hox function will elucidate the mechanistic basis of tissue-specific gene regulation.

  10. Relocation Facilitates the Acquisition of Short Cis-Regulatory Regions that Drive the Expression of Retrogenes during Spermatogenesis in Drosophila

    PubMed Central

    Sorourian, Mehran; Kunte, Mansi M.; Domingues, Susana; Gallach, Miguel; Özdil, Fulya; Río, Javier; Betrán, Esther

    2014-01-01

    Retrogenes are functional processed copies of genes that originate via the retrotranscription of an mRNA intermediate and often exhibit testis-specific expression. Although this expression pattern appears to be favored by selection, the origin of such expression bias remains unexplained. Here, we study the regulation of two young testis-specific Drosophila retrogenes, Dntf-2r and Pros28.1A, using genetic transformation and the enhanced green fluorescent protein reporter gene in Drosophila melanogaster. We show that two different short (<24 bp) regions upstream of the transcription start sites (TSSs) act as testis-specific regulatory motifs in these genes. The Dntf-2r regulatory region is similar to the known β2 tubulin 14-bp testis motif (β2-tubulin gene upstream element 1 [β2-UE1]). Comparative sequence analyses reveal that this motif was already present before the Dntf-2r insertion and was likely driving the transcription of a noncoding RNA. We also show that the β2-UE1 occurs in the regulatory regions of other testis-specific retrogenes, and is functional in either orientation. In contrast, the Pros28.1A testes regulatory region in D. melanogaster appears to be novel. Only Pros28.1B, an older paralog of the Pros28.1 gene family, seems to carry a similar regulatory sequence. It is unclear how the Pros28.1A regulatory region was acquired in D. melanogaster, but it might have evolved de novo from within a region that may have been preprimed for testes expression. We conclude that relocation is critical for the evolutionary origin of male germline-specific cis-regulatory regions of retrogenes because expression depends on either the site of the retrogene insertion or the sequence changes close to the TSS thereafter. As a consequence we infer that positive selection will play a role in the evolution of these regulatory regions and can often act from the moment of the retrocopy insertion. PMID:24855141

  11. Evolutionary analysis of the cis-regulatory region of the spicule matrix gene SM50 in strongylocentrotid sea urchins.

    PubMed

    Walters, Jenna; Binkley, Elaine; Haygood, Ralph; Romano, Laura A

    2008-03-15

    An evolutionary analysis of transcriptional regulation is essential to understanding the molecular basis of phenotypic diversity. The sea urchin is an ideal system in which to explore the functional consequence of variation in cis-regulatory sequences. We are particularly interested in the evolution of genes involved in the patterning and synthesis of its larval skeleton. This study focuses on the cis-regulatory region of SM50, which has already been characterized to a considerable extent in the purple sea urchin, Strongylocentrotus purpuratus. We have isolated the cis-regulatory region from 15 individuals of S. purpuratus as well as seven closely related species in the family Strongylocentrotidae. We have performed a variety of statistical tests and present evidence that the cis-regulatory elements upstream of the SM50 gene have been subject to positive selection along the lineage leading to S. purpuratus. In addition, we have performed electrophoretic mobility shift assays (EMSAs) and demonstrate that nucleotide substitutions within Element C affect the ability of nuclear proteins to bind to this cis-regulatory element among members of the family Strongylocentrotidae. We speculate that such changes in SM50 and other genes could accumulate to produce altered patterns of gene expression with functional consequences during skeleton formation.

  12. Profiling of conserved non-coding elements upstream of SHOX and functional characterisation of the SHOX cis-regulatory landscape

    PubMed Central

    Verdin, Hannah; Fernández-Miñán, Ana; Benito-Sanz, Sara; Janssens, Sandra; Callewaert, Bert; Waele, Kathleen De; Schepper, Jean De; François, Inge; Menten, Björn; Heath, Karen E.; Gómez-Skarmeta, José Luis; Baere, Elfride De

    2015-01-01

    Genetic defects such as copy number variations (CNVs) in non-coding regions containing conserved non-coding elements (CNEs) outside the transcription unit of their target gene can underlie genetic disease. An example of this is the short stature homeobox (SHOX) gene, regulated by seven CNEs located downstream and upstream of SHOX, with proven enhancer capacity in chicken limbs. CNVs of the downstream CNEs have been reported in many idiopathic short stature (ISS) cases; however, only recently have a few CNVs of the upstream enhancers been identified. Here, we set out to provide insight into: (i) the cis-regulatory role of these upstream CNEs in human cells, (ii) the prevalence of upstream CNVs in ISS, and (iii) the chromatin architecture of the SHOX cis-regulatory landscape in chicken and human cells. Firstly, luciferase assays in human U2OS cells, and 4C-seq both in chicken limb buds and human U2OS cells, demonstrated cis-regulatory enhancer capacities of the upstream CNEs. Secondly, CNVs of these upstream CNEs were found in three of 501 ISS patients. Finally, our 4C-seq interaction map of the SHOX region reveals a cis-regulatory domain spanning more than 1 Mb and harbouring putative new cis-regulatory elements. PMID:26631348

  13. Does Positive Selection Drive Transcription Factor Binding Site Turnover? A Test with Drosophila Cis-Regulatory Modules

    PubMed Central

    He, Bin Z.; Holloway, Alisha K.; Maerkl, Sebastian J.; Kreitman, Martin

    2011-01-01

    Transcription factor binding site(s) (TFBS) gain and loss (i.e., turnover) is a well-documented feature of cis-regulatory module (CRM) evolution, yet little attention has been paid to the evolutionary force(s) driving this turnover process. The predominant view, motivated by its widespread occurrence, emphasizes the importance of compensatory mutation and genetic drift. Positive selection, in contrast, although it has been invoked in specific instances of adaptive gene expression evolution, has not been considered as a general alternative to neutral compensatory evolution. In this study we evaluate the two hypotheses by analyzing patterns of single nucleotide polymorphism in the TFBS of well-characterized CRM in two closely related Drosophila species, Drosophila melanogaster and Drosophila simulans. An important feature of the analysis is classification of TFBS mutations according to the direction of their predicted effect on binding affinity, which allows gains and losses to be evaluated independently along the two phylogenetic lineages. The observed patterns of polymorphism and divergence are not compatible with neutral evolution for either class of mutations. Instead, multiple lines of evidence are consistent with contributions of positive selection to TFBS gain and loss as well as purifying selection in its maintenance. In discussion, we propose a model to reconcile the finding of selection driving TFBS turnover with constrained CRM function over long evolutionary time. PMID:21572512
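    The classification step mentioned in this abstract, sorting TFBS mutations by the direction of their predicted effect on binding affinity, can be sketched as follows. The abstract does not say which affinity predictor was used; this sketch assumes a log-odds position weight matrix score, and the 4-bp frequency matrix, the uniform background, and the function names are all hypothetical.

```python
import math

# Hypothetical position frequency matrix for a 4-bp site (one dict per position).
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base composition

def score(site):
    """Log-odds score of a site, used here as a stand-in for predicted affinity."""
    return sum(math.log2(PFM[i][base] / BACKGROUND) for i, base in enumerate(site))

def classify(ancestral, derived, tol=1e-9):
    """Label a substitution by the sign of its predicted effect on the score."""
    delta = score(derived) - score(ancestral)
    if delta > tol:
        return "affinity-increasing"
    if delta < -tol:
        return "affinity-decreasing"
    return "neutral"

if __name__ == "__main__":
    print(classify("ATCG", "ATCA"))  # affinity-decreasing
    print(classify("CTCG", "ATCG"))  # affinity-increasing
    print(classify("ATCG", "ATGG"))  # neutral
```

    Once each mutation carries a direction label, counts of polymorphic and fixed changes can be tallied separately for the two classes, which is the kind of independent evaluation of gains and losses the abstract describes.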

  14. Exome Sequencing and cis-Regulatory Mapping Identify Mutations in MAK, a Gene Encoding a Regulator of Ciliary Length, as a Cause of Retinitis Pigmentosa

    PubMed Central

    Özgül, Rıza Köksal; Siemiatkowska, Anna M.; Yücel, Didem; Myers, Connie A.; Collin, Rob W.J.; Zonneveld, Marijke N.; Beryozkin, Avigail; Banin, Eyal; Hoyng, Carel B.; van den Born, L. Ingeborgh; Bose, Ron; Shen, Wei; Sharon, Dror; Cremers, Frans P.M.; Klevering, B. Jeroen; den Hollander, Anneke I.; Corbo, Joseph C.

    2011-01-01

    A fundamental challenge in analyzing exome-sequence data is distinguishing pathogenic mutations from background polymorphisms. To address this problem in the context of a genetically heterogeneous disease, retinitis pigmentosa (RP), we devised a candidate-gene prioritization strategy called cis-regulatory mapping that utilizes ChIP-seq data for the photoreceptor transcription factor CRX to rank candidate genes. Exome sequencing combined with this approach identified a homozygous nonsense mutation in male germ cell-associated kinase (MAK) in the single affected member of a consanguineous Turkish family with RP. MAK encodes a cilium-associated mitogen-activated protein kinase whose function is conserved from the ciliated alga, Chlamydomonas reinhardtii, to humans. Mutations in MAK orthologs in mice and other model organisms result in abnormally long cilia and, in mice, rapid photoreceptor degeneration. Subsequent sequence analyses of additional individuals with RP identified five probands with missense mutations in MAK. Two of these mutations alter amino acids that are conserved in all known kinases, and an in vitro kinase assay indicates that these mutations result in a loss of kinase activity. Thus, kinase activity appears to be critical for MAK function in humans. This study highlights a previously underappreciated role for CRX as a direct transcriptional regulator of ciliary genes in photoreceptors. In addition, it demonstrates the effectiveness of CRX-based cis-regulatory mapping in prioritizing candidate genes from exome data and suggests that this strategy should be generally applicable to a range of retinal diseases. PMID:21835304

  15. Establishment of a Developmental Compartment Requires Interactions between Three Synergistic Cis-regulatory Modules

    PubMed Central

    Bieli, Dimitri; Kanca, Oguz; Requena, David; Hamaratoglu, Fisun; Gohl, Daryl; Schedl, Paul; Affolter, Markus; Slattery, Matthew; Müller, Martin; Estella, Carlos

    2015-01-01

    The subdivision of cell populations into compartments is a key event during animal development. In Drosophila, the gene apterous (ap) divides the wing imaginal disc into dorsal and ventral cell lineages and is required for wing formation. The function of ap as a dorsal selector gene has been extensively studied. However, the regulation of its expression during wing development is poorly understood. In this study, we analyzed ap transcriptional regulation at the endogenous locus and identified three cis-regulatory modules (CRMs) essential for wing development. Robust ap expression is obtained only when the three CRMs are combined. In addition, we genetically and molecularly analyzed the trans-factors that regulate these CRMs. Our results suggest a three-step mechanism for the cell lineage compartment expression of ap that includes initial activation, positive autoregulation and Trithorax-mediated maintenance through separable CRMs. PMID:26468882

  16. Massively parallel cis-regulatory analysis in the mammalian central nervous system.

    PubMed

    Shen, Susan Q; Myers, Connie A; Hughes, Andrew E O; Byrne, Leah C; Flannery, John G; Corbo, Joseph C

    2016-02-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell-derived organoids.
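    The barcode quantification step of an MPRA can be sketched in a few lines. The abstract does not give the authors' normalization; this sketch assumes the common practice of pooling barcodes per CRE and taking a log2 ratio of RNA to input DNA counts with a pseudocount. The function name, barcode strings, and counts are hypothetical.

```python
import math
from collections import defaultdict

def cre_activity(barcode_to_cre, rna_counts, dna_counts, pseudocount=1.0):
    """Return log2((RNA + pseudocount) / (DNA + pseudocount)) per CRE.

    Counts from all barcodes assigned to the same CRE are pooled first.
    Barcodes missing from a count table are treated as zero.
    """
    rna = defaultdict(float)
    dna = defaultdict(float)
    for barcode, cre in barcode_to_cre.items():
        rna[cre] += rna_counts.get(barcode, 0)
        dna[cre] += dna_counts.get(barcode, 0)
    return {
        cre: math.log2((rna[cre] + pseudocount) / (dna[cre] + pseudocount))
        for cre in dna
    }

if __name__ == "__main__":
    barcodes = {"AAA": "cre1", "CCC": "cre1", "GGG": "cre2"}
    rna = {"AAA": 30, "CCC": 33, "GGG": 3}
    dna = {"AAA": 7, "CCC": 8, "GGG": 15}
    print(cre_activity(barcodes, rna, dna))  # {'cre1': 2.0, 'cre2': -2.0}
```

    Pooling before taking the ratio keeps low-count barcodes from dominating; averaging per-barcode ratios is an equally plausible alternative design.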

  17. The Cis-regulatory Logic of the Mammalian Photoreceptor Transcriptional Network

    PubMed Central

    Hsiau, Timothy H.-C.; Diaconu, Claudiu; Myers, Connie A.; Lee, Jongwoo; Cepko, Constance L.; Corbo, Joseph C.

    2007-01-01

    The photoreceptor cells of the retina are subject to a greater number of genetic diseases than any other cell type in the human body. The majority of more than 120 cloned human blindness genes are highly expressed in photoreceptors. In order to establish an integrative framework in which to understand these diseases, we have undertaken an experimental and computational analysis of the network controlled by the mammalian photoreceptor transcription factors, Crx, Nrl, and Nr2e3. Using microarray and in situ hybridization datasets we have produced a model of this network which contains over 600 genes, including numerous retinal disease loci as well as previously uncharacterized photoreceptor transcription factors. To elucidate the connectivity of this network, we devised a computational algorithm to identify the photoreceptor-specific cis-regulatory elements (CREs) mediating the interactions between these transcription factors and their target genes. In vivo validation of our computational predictions resulted in the discovery of 19 novel photoreceptor-specific CREs near retinal disease genes. Examination of these CREs permitted the definition of a simple cis-regulatory grammar rule associated with high-level expression. To test the generality of this rule, we used an expanded form of it as a selection filter to evolve photoreceptor CREs from random DNA sequences in silico. When fused to fluorescent reporters, these evolved CREs drove strong, photoreceptor-specific expression in vivo. This study represents the first systematic identification and in vivo validation of CREs in a mammalian neuronal cell type and lays the groundwork for a systems biology of photoreceptor transcriptional regulation. PMID:17653270
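    The idea of evolving CREs from random DNA under a selection filter can be sketched as a simple mutate-and-filter loop. This is not the authors' algorithm or grammar rule: the filter below merely counts occurrences of a placeholder motif, and the motif string, sequence length, target count, and function names are all hypothetical.

```python
import random

def count_sites(seq, motif):
    """Number of (possibly overlapping) occurrences of `motif` in `seq`."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))

def evolve(motif="TAATCC", length=120, target=3, seed=0, max_steps=50_000):
    """Mutate a random sequence, keeping changes the filter does not penalize.

    Returns the final sequence and its site count. The loop stops early once
    `target` sites are present, or gives up after `max_steps` mutations.
    """
    rng = random.Random(seed)
    seq = [rng.choice("ACGT") for _ in range(length)]
    best = count_sites("".join(seq), motif)
    for _ in range(max_steps):
        if best >= target:
            break
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice("ACGT")
        new = count_sites("".join(seq), motif)
        if new >= best:
            best = new       # accept neutral or improving mutations
        else:
            seq[pos] = old   # revert mutations that destroy a site
    return "".join(seq), best

if __name__ == "__main__":
    sequence, n_sites = evolve()
    print(n_sites, sequence)
```

    A richer filter would score spacing, orientation, or combinations of different sites, which is closer in spirit to a grammar rule than a plain motif count.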

  18. Evolution of metal hyperaccumulation required cis-regulatory changes and triplication of HMA4.

    PubMed

    Hanikenne, Marc; Talke, Ina N; Haydon, Michael J; Lanz, Christa; Nolte, Andrea; Motte, Patrick; Kroymann, Juergen; Weigel, Detlef; Krämer, Ute

    2008-05-15

    Little is known about the types of mutations underlying the evolution of species-specific traits. The metal hyperaccumulator Arabidopsis halleri has the rare ability to colonize heavy-metal-polluted soils, and, as an extremophile sister species of Arabidopsis thaliana, it is a powerful model for research on adaptation. A. halleri naturally accumulates and tolerates leaf concentrations as high as 2.2% zinc and 0.28% cadmium in dry biomass. On the basis of transcriptomics studies, metal hyperaccumulation in A. halleri has been associated with more than 30 candidate genes that are expressed at higher levels in A. halleri than in A. thaliana. Some of these genes have been genetically mapped to broad chromosomal segments of between 4 and 24 cM co-segregating with Zn and Cd hypertolerance. However, the in planta loss-of-function approaches required to demonstrate the contribution of a given candidate gene to metal hyperaccumulation or hypertolerance have not been pursued to date. Using RNA interference to downregulate HMA4 (HEAVY METAL ATPASE 4) expression, we show here that Zn hyperaccumulation and full hypertolerance to Cd and Zn in A. halleri depend on the metal pump HMA4. Contrary to a postulated global trans regulatory factor governing high expression of numerous metal hyperaccumulation genes, we demonstrate that enhanced expression of HMA4 in A. halleri is attributable to a combination of modified cis-regulatory sequences and copy number expansion, in comparison to A. thaliana. Transfer of an A. halleri HMA4 gene to A. thaliana recapitulates Zn partitioning into xylem vessels and the constitutive transcriptional upregulation of Zn deficiency response genes characteristic of Zn hyperaccumulators. Our results demonstrate the importance of cis-regulatory mutations and gene copy number expansion in the evolution of a complex naturally selected extreme trait. The elucidation of a natural strategy for metal hyperaccumulation enables the rational design of technologies

  19. D-MATRIX: a web tool for constructing weight matrix of conserved DNA motifs.

    PubMed

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-07-27

    Despite considerable effort, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, users can access commonly used databases such as TFD, RegulonDB, DBTBS, and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the resulting matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. http://203.190.147.116/dmatrix/
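
    The abstract does not say which statistic D-MATRIX uses, so the following is only a minimal sketch of one common way to build a weight matrix from aligned sites of equal width: per-position counts converted to log-odds against a uniform background. The pseudocount, the background frequency, the function names, and the toy sites are all assumptions made for illustration; the toy sites are not real SOS-box sequences.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def count_matrix(sites):
    """Per-position base counts for aligned sites of equal width."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all aligned sites must have the same width")
    columns = []
    for i in range(width):
        counts = Counter(s[i].upper() for s in sites)
        columns.append({b: counts.get(b, 0) for b in ALPHABET})
    return columns

def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weights; pseudocount and uniform background are assumptions."""
    n = len(sites)
    matrix = []
    for col in count_matrix(sites):
        matrix.append({
            b: math.log2((col[b] + pseudocount) / (n + 4 * pseudocount) / background)
            for b in ALPHABET
        })
    return matrix

def score(matrix, window):
    """Sum of position weights for one window (A/C/G/T only)."""
    return sum(col[base] for col, base in zip(matrix, window.upper()))

def best_hit(matrix, sequence):
    """Return (score, start) of the highest-scoring window in a sequence."""
    width = len(matrix)
    hits = [(score(matrix, sequence[i:i + width]), i)
            for i in range(len(sequence) - width + 1)]
    return max(hits)

# Hypothetical toy alignment, for illustration only.
toy_sites = ["ACGT", "ACGT", "ACGA", "ACGT"]
toy_matrix = log_odds_matrix(toy_sites)
```

    Scanning a sequence with `best_hit(toy_matrix, "TTACGTTT")` would report the window starting at index 2. Ambiguity codes such as N are not handled in this sketch.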

  20. Cis-Regulatory Variants Affect CHRNA5 mRNA Expression in Populations of African and European Ancestry

    PubMed Central

    Wang, Jen-Chyong; Spiegel, Noah; Bertelsen, Sarah; Le, Nhung; McKenna, Nicholas; Budde, John P.; Harari, Oscar; Kapoor, Manav; Brooks, Andrew; Hancock, Dana; Tischfield, Jay; Foroud, Tatiana; Bierut, Laura J.; Steinbach, Joe Henry; Edenberg, Howard J.; Traynor, Bryan J.; Goate, Alison M.

    2013-01-01

    Variants within the gene cluster encoding α3, α5, and β4 nicotinic receptor subunits are major risk factors for substance dependence. The strongest impact on risk is associated with variation in the CHRNA5 gene, where at least two mechanisms are at work: amino acid variation and altered mRNA expression levels. The risk allele of the non-synonymous variant (rs16969968; D398N) primarily occurs on the haplotype containing the low mRNA expression allele. In populations of European ancestry, there are approximately 50 highly correlated variants in the CHRNA5-CHRNA3-CHRNB4 gene cluster and the adjacent PSMA4 gene region that are associated with CHRNA5 mRNA levels. It is not clear which of these variants contribute to the changes in CHRNA5 transcript level. Because populations of African ancestry have reduced linkage disequilibrium among variants spanning this gene cluster, eQTL mapping in subjects of African ancestry could potentially aid in defining the functional variants that affect CHRNA5 mRNA levels. We performed quantitative allele specific gene expression using frontal cortices derived from 49 subjects of African ancestry and 111 subjects of European ancestry. This method measures allele-specific transcript levels in the same individual, which eliminates other biological variation that occurs when comparing expression levels between different samples. This analysis confirmed that substance dependence associated variants have a direct cis-regulatory effect on CHRNA5 transcript levels in human frontal cortices of African and European ancestry and identified 10 highly correlated variants, located in a 9 kb region, that are potential functional variants modifying CHRNA5 mRNA expression levels. PMID:24303001

  1. Cis-regulatory variants affect CHRNA5 mRNA expression in populations of African and European ancestry.

    PubMed

    Wang, Jen-Chyong; Spiegel, Noah; Bertelsen, Sarah; Le, Nhung; McKenna, Nicholas; Budde, John P; Harari, Oscar; Kapoor, Manav; Brooks, Andrew; Hancock, Dana; Tischfield, Jay; Foroud, Tatiana; Bierut, Laura J; Steinbach, Joe Henry; Edenberg, Howard J; Traynor, Bryan J; Goate, Alison M

    2013-01-01

    Variants within the gene cluster encoding α3, α5, and β4 nicotinic receptor subunits are major risk factors for substance dependence. The strongest impact on risk is associated with variation in the CHRNA5 gene, where at least two mechanisms are at work: amino acid variation and altered mRNA expression levels. The risk allele of the non-synonymous variant (rs16969968; D398N) primarily occurs on the haplotype containing the low mRNA expression allele. In populations of European ancestry, there are approximately 50 highly correlated variants in the CHRNA5-CHRNA3-CHRNB4 gene cluster and the adjacent PSMA4 gene region that are associated with CHRNA5 mRNA levels. It is not clear which of these variants contribute to the changes in CHRNA5 transcript level. Because populations of African ancestry have reduced linkage disequilibrium among variants spanning this gene cluster, eQTL mapping in subjects of African ancestry could potentially aid in defining the functional variants that affect CHRNA5 mRNA levels. We performed quantitative allele specific gene expression using frontal cortices derived from 49 subjects of African ancestry and 111 subjects of European ancestry. This method measures allele-specific transcript levels in the same individual, which eliminates other biological variation that occurs when comparing expression levels between different samples. This analysis confirmed that substance dependence associated variants have a direct cis-regulatory effect on CHRNA5 transcript levels in human frontal cortices of African and European ancestry and identified 10 highly correlated variants, located in a 9 kb region, that are potential functional variants modifying CHRNA5 mRNA expression levels.

  2. Intronic Cis-Regulatory Modules Mediate Tissue-Specific and Microbial Control of angptl4/fiaf Transcription

    PubMed Central

    Camp, J. Gray; Jazwa, Amelia L.; Trent, Chad M.; Rawls, John F.

    2012-01-01

    The intestinal microbiota enhances dietary energy harvest leading to increased fat storage in adipose tissues. This effect is caused in part by the microbial suppression of intestinal epithelial expression of a circulating inhibitor of lipoprotein lipase called Angiopoietin-like 4 (Angptl4/Fiaf). To define the cis-regulatory mechanisms underlying intestine-specific and microbial control of Angptl4 transcription, we utilized the zebrafish system in which host regulatory DNA can be rapidly analyzed in a live, transparent, and gnotobiotic vertebrate. We found that zebrafish angptl4 is transcribed in multiple tissues including the liver, pancreatic islet, and intestinal epithelium, which is similar to its mammalian homologs. Zebrafish angptl4 is also specifically suppressed in the intestinal epithelium upon colonization with a microbiota. In vivo transgenic reporter assays identified discrete tissue-specific regulatory modules within angptl4 intron 3 sufficient to drive expression in the liver, pancreatic islet β-cells, or intestinal enterocytes. Comparative sequence analyses and heterologous functional assays of angptl4 intron 3 sequences from 12 teleost fish species revealed differential evolution of the islet and intestinal regulatory modules. High-resolution functional mapping and site-directed mutagenesis defined the minimal set of regulatory sequences required for intestinal activity. Strikingly, the microbiota suppressed the transcriptional activity of the intestine-specific regulatory module similar to the endogenous angptl4 gene. These results suggest that the microbiota might regulate host intestinal Angptl4 protein expression and peripheral fat storage by suppressing the activity of an intestine-specific transcriptional enhancer. This study provides a useful paradigm for understanding how microbial signals interact with tissue-specific regulatory networks to control the activity and evolution of host gene transcription. PMID:22479192

  3. PReMod: a database of genome-wide mammalian cis-regulatory module predictions.

    PubMed

    Ferretti, Vincent; Poitras, Christian; Bergeron, Dominique; Coulombe, Benoit; Robert, François; Blanchette, Mathieu

    2007-01-01

    We describe PReMod, a new database of genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes. The prediction algorithm, described previously in Blanchette et al. (2006) Genome Res., 16, 656-668, exploits the fact that many known CRMs are made of clusters of phylogenetically conserved and repeated transcription factors (TF) binding sites. Contrary to other existing databases, PReMod is not restricted to modules located proximal to genes, but in fact mostly contains distal predicted CRMs (pCRMs). Through its web interface, PReMod allows users to (i) identify pCRMs around a gene of interest; (ii) identify pCRMs that have binding sites for a given TF (or a set of TFs) or (iii) download the entire dataset for local analyses. Queries can also be refined by filtering for specific chromosomal regions, for specific regions relative to genes or for the presence of CpG islands. The output includes information about the binding sites predicted within the selected pCRMs, and a graphical display of their distribution within the pCRMs. It also provides a visual depiction of the chromosomal context of the selected pCRMs in terms of neighboring pCRMs and genes, all of which are linked to the UCSC Genome Browser and the NCBI. PReMod: http://genomequebec.mcgill.ca/PReMod.

  4. Genetic Analysis of Transvection Effects Involving Cis-Regulatory Elements of the Drosophila Ultrabithorax Gene

    PubMed Central

    Micol, J. L.; Castelli-Gair, J. E.; Garcia-Bellido, A.

    1990-01-01

    The Ultrabithorax (Ubx) gene of Drosophila melanogaster contains two functionally distinguishable regions: the protein-coding Ubx transcription unit and, upstream of it, the transcribed but non-protein-coding bxd region. Numerous recessive, partial loss-of-function mutations which appear to be regulatory mutations map within the bxd region and within the introns of the Ubx transcription unit. In addition, mutations within the Ubx unit exons are known and most of these behave as null alleles. Ubx(1) is one such allele. We have confirmed that, although the Ubx(1) allele does not produce detectable Ubx proteins (UBX), it does retain other genetic functions detectable by their effects on the expression of a paired, homologous Ubx allele, i.e., by transvection. We have extended previous analyses made by E. B. Lewis by mapping the critical elements of the Ubx gene which participate in transvection effects. Our results show that the Ubx(1) allele retains wild-type functions whose effectiveness can be reduced (1) by additional cis mutations in the bxd region or in introns of the Ubx transcription unit, as well as (2) by rearrangements disturbing pairing between homologous Ubx genes. Our results suggest that those remnant functions in Ubx(1) are able to modulate the activity of the allele located in the homologous chromosome. We discuss the normal cis regulatory role of these functions involved in trans interactions between homologous Ubx genes, as well as the implications of our results for the current models on transvection. PMID:2123161

  5. Dual functionality of cis-regulatory elements as developmental enhancers and Polycomb response elements.

    PubMed

    Erceg, Jelena; Pakozdi, Tibor; Marco-Ferreres, Raquel; Ghavi-Helm, Yad; Girardot, Charles; Bracken, Adrian P; Furlong, Eileen E M

    2017-03-15

    Developmental gene expression is tightly regulated through enhancer elements, which initiate dynamic spatio-temporal expression, and Polycomb response elements (PREs), which maintain stable gene silencing. These two cis-regulatory functions are thought to operate through distinct dedicated elements. By examining the occupancy of the Drosophila pleiohomeotic repressive complex (PhoRC) during embryogenesis, we revealed extensive co-occupancy at developmental enhancers. Using an established in vivo assay for PRE activity, we demonstrated that a subset of characterized developmental enhancers can function as PREs, silencing transcription in a Polycomb-dependent manner. Conversely, some classic Drosophila PREs can function as developmental enhancers in vivo, activating spatio-temporal expression. This study therefore uncovers elements with dual function: activating transcription in some cells (enhancers) while stably maintaining transcriptional silencing in others (PREs). Given that enhancers initiate spatio-temporal gene expression, reuse of the same elements by the Polycomb group (PcG) system may help fine-tune gene expression and ensure the timely maintenance of cell identities. © 2017 Erceg et al.; Published by Cold Spring Harbor Laboratory Press.

  6. Distal cis-regulatory elements are required for tissue-specific expression of enamelin (Enam)

    PubMed Central

    Hu, Yuanyuan; Papagerakis, Petros; Ye, Ling; Feng, Jerry Q.; Simmer, James P.; Hu, Jan C-C.

    2009-01-01

    Enamel formation is orchestrated by the sequential expression of genes encoding enamel matrix proteins; however, the mechanisms sustaining the spatio–temporal order of gene transcription during amelogenesis are poorly understood. The aim of this study was to characterize the cis-regulatory sequences necessary for normal expression of enamelin (Enam). Several enamelin transcription regulatory regions, showing high sequence homology among species, were identified. DNA constructs containing 5.2 or 3.9 kb regions upstream of the enamelin translation initiation site were linked to a LacZ reporter and used to generate transgenic mice. Only the 5.2-Enam–LacZ construct was sufficient to recapitulate the endogenous pattern of enamelin tooth-specific expression. The 3.9-Enam–LacZ transgenic lines showed no expression in dental cells, but ectopic β-galactosidase activity was detected in osteoblasts. Potential transcription factor-binding sites were identified that may be important in controlling enamelin basal promoter activity and in conferring enamelin tissue-specific expression. Our study provides new insights into regulatory mechanisms governing enamelin expression. PMID:18353004

  7. Directed Network Motifs in Alzheimer’s Disease and Mild Cognitive Impairment

    PubMed Central

    Friedman, Eric J.; Young, Karl; Tremper, Graham; Liang, Jason; Landsberg, Adam S.; Schuff, Norbert

    2015-01-01

    Directed network motifs are the building blocks of complex networks, such as human brain networks, and capture deep connectivity information that is not contained in standard network measures. In this paper we present the first application of directed network motifs in vivo to human brain networks, utilizing recently developed directed progression networks which are built upon rates of cortical thickness changes between brain regions. This is in contrast to previous studies which have relied on simulations and in vitro analysis of non-human brains. We show that frequencies of specific directed network motifs can be used to distinguish between patients with Alzheimer’s disease (AD) and normal control (NC) subjects. Especially interesting from a clinical standpoint, these motif frequencies can also distinguish between subjects with mild cognitive impairment who remained stable over three years (MCI) and those who converted to AD (CONV). Furthermore, we find that the entropy of the distribution of directed network motifs increased from MCI to CONV to AD, implying that the distribution of pathology is more structured in MCI but becomes less so as it progresses to CONV and further to AD. Thus, directed network motifs frequencies and distributional properties provide new insights into the progression of Alzheimer’s disease as well as new imaging markers for distinguishing between normal controls, stable mild cognitive impairment, MCI converters and Alzheimer’s disease. PMID:25879535
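
    As a rough illustration of the entropy comparison mentioned above, the sketch below computes the Shannon entropy of a motif frequency distribution. The abstract does not state which entropy definition or logarithm base the authors used; Shannon entropy in bits is an assumption, and the example counts are invented.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (bits) of a distribution given as raw motif counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# Invented counts: a concentrated distribution versus an even one.
concentrated = shannon_entropy([4, 0, 0, 0])
even = shannon_entropy([1, 1, 1, 1])
```

    Under this definition a distribution concentrated on a few motif types has lower entropy than one spread evenly across motif types, which is the sense in which a lower-entropy distribution is "more structured".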

  8. Genome-wide computational analysis reveals cardiomyocyte-specific transcriptional Cis-regulatory motifs that enable efficient cardiac gene therapy.

    PubMed

    Rincon, Melvin Y; Sarcar, Shilpita; Danso-Abeam, Dina; Keyaerts, Marleen; Matrai, Janka; Samara-Kuko, Ermira; Acosta-Sanchez, Abel; Athanasopoulos, Takis; Dickson, George; Lahoutte, Tony; De Bleser, Pieter; VandenDriessche, Thierry; Chuah, Marinee K

    2015-01-01

    Gene therapy is a promising emerging therapeutic modality for the treatment of cardiovascular diseases and hereditary diseases that afflict the heart. Hence, there is a need to develop robust cardiac-specific expression modules that allow for stable expression of the gene of interest in cardiomyocytes. We therefore explored a new approach based on a genome-wide bioinformatics strategy that revealed novel cardiac-specific cis-acting regulatory modules (CS-CRMs). These transcriptional modules contained evolutionary-conserved clusters of putative transcription factor binding sites that correspond to a "molecular signature" associated with robust gene expression in the heart. We then validated these CS-CRMs in vivo using an adeno-associated viral vector serotype 9 that drives a reporter gene from a quintessential cardiac-specific α-myosin heavy chain promoter. Most de novo designed CS-CRMs resulted in a >10-fold increase in cardiac gene expression. The most robust CRMs enhanced cardiac-specific transcription 70- to 100-fold. Expression was sustained and restricted to cardiomyocytes. We then combined the most potent CS-CRM4 with a synthetic heart and muscle-specific promoter (SPc5-12) and obtained a significant 20-fold increase in cardiac gene expression compared to the cytomegalovirus promoter. This study underscores the potential of rational vector design to improve the robustness of cardiac gene therapy.

  9. Genome-wide Computational Analysis Reveals Cardiomyocyte-specific Transcriptional Cis-regulatory Motifs That Enable Efficient Cardiac Gene Therapy

    PubMed Central

    Rincon, Melvin Y; Sarcar, Shilpita; Danso-Abeam, Dina; Keyaerts, Marleen; Matrai, Janka; Samara-Kuko, Ermira; Acosta-Sanchez, Abel; Athanasopoulos, Takis; Dickson, George; Lahoutte, Tony; De Bleser, Pieter; VandenDriessche, Thierry; Chuah, Marinee K

    2015-01-01

    Gene therapy is a promising emerging therapeutic modality for the treatment of cardiovascular diseases and hereditary diseases that afflict the heart. Hence, there is a need to develop robust cardiac-specific expression modules that allow for stable expression of the gene of interest in cardiomyocytes. We therefore explored a new approach based on a genome-wide bioinformatics strategy that revealed novel cardiac-specific cis-acting regulatory modules (CS-CRMs). These transcriptional modules contained evolutionary-conserved clusters of putative transcription factor binding sites that correspond to a “molecular signature” associated with robust gene expression in the heart. We then validated these CS-CRMs in vivo using an adeno-associated viral vector serotype 9 that drives a reporter gene from a quintessential cardiac-specific α-myosin heavy chain promoter. Most de novo designed CS-CRMs resulted in a >10-fold increase in cardiac gene expression. The most robust CRMs enhanced cardiac-specific transcription 70- to 100-fold. Expression was sustained and restricted to cardiomyocytes. We then combined the most potent CS-CRM4 with a synthetic heart and muscle-specific promoter (SPc5-12) and obtained a significant 20-fold increase in cardiac gene expression compared to the cytomegalovirus promoter. This study underscores the potential of rational vector design to improve the robustness of cardiac gene therapy. PMID:25195597

  10. Studying the functional conservation of cis-regulatory modules and their transcriptional output.

    PubMed

    Bauer, Denis C; Bailey, Timothy L

    2008-04-29

    Cis-regulatory modules (CRMs) are distinct genomic regions surrounding the target gene that can independently activate the promoter to drive transcription. The activation of a CRM is controlled by the binding of a certain combination of transcription factors (TFs). It would be of great benefit if the transcriptional output mediated by a specific CRM could be predicted. Of equal benefit would be identifying in silico a specific CRM as the driver of the expression in a specific tissue or situation. We extend a recently developed biochemical modeling approach to manage both prediction tasks. Given a set of TFs, their protein concentrations, and the positions and binding strengths of each of the TFs in a putative CRM, the model predicts the transcriptional output of the gene. Our approach predicts the location of the regulating CRM by using predicted TF binding sites in regions near the gene as input to the model and searching for the region that yields a predicted transcription rate most closely matching the known rate. Here we demonstrate the model on one of the CRMs regulating the eve gene, MSE2. A model trained on MSE2 in D. melanogaster was applied to the surrounding sequence of the eve gene in seven other Drosophila species. The model successfully predicts the correct MSE2 location and output in six out of eight Drosophila species we examine. The model is able to generalize from D. melanogaster to other Drosophila species and accurately predicts the location and transcriptional output of MSE2 in those species. However, we also show that the current model is not specific enough to function as a genome-wide CRM scanner, because it incorrectly predicts other genomic regions to be MSE2s.
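
    The search step described above (find the region whose predicted transcription rate best matches the known rate) can be sketched as a sliding-window scan. This is not the authors' biochemical model: `toy_predict` is a placeholder that simply sums signed binding strengths, and the window size, step, site list, and scalar rate are all hypothetical.

```python
def scan_for_crm(sites, region_length, window, step, predict, known_rate):
    """Return (error, start, end) of the window whose predicted output
    is closest to known_rate. `sites` is a list of (position, strength)."""
    best = None
    for start in range(0, region_length - window + 1, step):
        end = start + window
        in_window = [s for s in sites if start <= s[0] < end]
        error = abs(predict(in_window) - known_rate)
        # Strict '<' keeps the first window on ties.
        if best is None or error < best[0]:
            best = (error, start, end)
    return best

def toy_predict(sites_in_window):
    """Placeholder predictor (an assumption, not the published model):
    sum of strengths, negative for repressors, clipped at zero."""
    return max(0.0, sum(strength for _pos, strength in sites_in_window))

# Hypothetical predicted binding sites: (position, signed strength).
toy_sites = [(10, 1.0), (120, 2.0), (130, 1.5), (300, -1.0)]
best_window = scan_for_crm(toy_sites, region_length=400, window=100,
                           step=50, predict=toy_predict, known_rate=3.5)
```

    Any model that maps the binding sites in a window to a predicted output could be passed as `predict`; the scan itself only ranks windows by how closely they match the known rate.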

  11. Cis-regulatory elements are harbored in Intron5 of the RUNX1 gene

    PubMed Central

    2014-01-01

    Background: The human RUNX1 gene is one of the most frequent targets of chromosomal translocations associated with acute myeloid leukemia (AML) and acute lymphoid leukemia (ALL). The highest prevalence in AML is noted for the t(8;21) translocation, which represents 12 to 15% of all AML cases. Interestingly, all the breakpoints mapped to date in t(8;21) are clustered in intron 5 of the RUNX1 gene and intron 1 of the ETO gene. No homologous sequences have been found at the recombination regions, but DNase I hypersensitive sites (DHS) have been mapped to the areas of the genes involved in t(8;21). The presence of DHS sites is commonly associated with regulatory elements such as promoters, enhancers and silencers, among others. Results: In this study we used a combination of comparative genomics, cloning and transfection assays to evaluate potential regulatory elements located in intron 5 of the RUNX1 gene. Our genomic analysis identified nine non-coding sequences that are evolutionarily conserved among rat, mouse and human. We cloned two of these regions into the pGL-3 Promoter plasmid in order to analyze their transcriptional regulatory activity. Our results demonstrate that the identified regions can indeed regulate transcription of a reporter gene in a distance- and position-independent manner; moreover, their transcriptional effect is cell-type specific. Conclusions: We have identified nine conserved non-coding sequences that are harbored in intron 5 of the RUNX1 gene. We have also demonstrated that two of these regions can regulate transcriptional activity in vitro. Taken together, our results suggest that intron 5 of the RUNX1 gene contains multiple potential cis-regulatory elements. PMID:24655352

  12. Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models

    PubMed Central

    Svetlichnyy, Dmitry; Imrichova, Hana; Fiers, Mark; Kalender Atak, Zeynep; Aerts, Stein

    2015-01-01

    Cancer genomes contain vast amounts of somatic mutations, many of which are passenger mutations not involved in oncogenesis. Whereas driver mutations in protein-coding genes can be distinguished from passenger mutations based on their recurrence, non-coding mutations are usually not recurrent at the same position. Therefore, it is still unclear how to identify cis-regulatory driver mutations, particularly when chromatin data from the same patient is not available, thus relying only on sequence and expression information. Here we use machine-learning methods to predict functional regulatory regions using sequence information alone, and compare the predicted activity of the mutated region with the reference sequence. This way we define the Predicted Regulatory Impact of a Mutation in an Enhancer (PRIME). We find that the recently identified driver mutation in the TAL1 enhancer has a high PRIME score, representing a “gain-of-target” for MYB, whereas the highly recurrent TERT promoter mutation has a surprisingly low PRIME score. We trained Random Forest models for 45 cancer-related transcription factors, and used these to score variations in the HeLa genome and somatic mutations across more than five hundred cancer genomes. Each model predicts only a small fraction of non-coding mutations with a potential impact on the function of the encompassing regulatory region. Nevertheless, as these few candidate driver mutations are often linked to gains in chromatin activity and gene expression, they may contribute to the oncogenic program by altering the expression levels of specific oncogenes and tumor suppressor genes. PMID:26562774
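
    The core comparison described above, scoring the mutated sequence against the reference, can be sketched as a simple difference of two predicted activities. The published method uses trained Random Forest models; the scorer below is only a stand-in that measures the best match to an arbitrary made-up consensus, and the sign convention (positive means predicted gain) is likewise an assumption.

```python
def apply_snv(sequence, position, alt):
    """Return the sequence with a single-base substitution at `position`."""
    return sequence[:position] + alt + sequence[position + 1:]

def toy_activity(sequence, consensus="GATTACA"):
    """Stand-in scorer (NOT a trained model): best fraction of positions
    matching an arbitrary consensus over all windows."""
    width = len(consensus)
    best = 0
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        matches = sum(1 for a, b in zip(window, consensus) if a == b)
        best = max(best, matches)
    return best / width

def predicted_impact(reference, position, alt, scorer):
    """Mutant score minus reference score; positive means predicted gain."""
    return scorer(apply_snv(reference, position, alt)) - scorer(reference)

# Hypothetical reference sequence and substitution, for illustration only.
delta = predicted_impact("TTGATTAGATT", 7, "C", toy_activity)
```

    Replacing `toy_activity` with any function that maps a sequence to a predicted regulatory activity leaves the comparison unchanged.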

  13. Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition

    PubMed Central

    Deb, Arindam; Kundu, Sudip

    2015-01-01

    Combinations of cis-regulatory elements (CREs) present at the promoters facilitate the binding of several transcription factors (TFs), thereby altering the consequent gene expression. Due to the complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher orders of regulatory combinatorics. We have further applied this methodology to the differentially up-regulated gene sets of rice tissues under fungal (Magnaporthe) infection to demonstrate how it helps to understand CRE-mediated combinatorial gene regulation. Our analysis yields a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results on plant gene regulation and defense mechanisms and find evidence of auto- and cross-regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses point to a highly distributed nature of combinatorial gene regulation, facilitating an efficient alteration in response to fungal attack. Altogether, our proposed methodology could be an important approach to understanding combinatorial gene regulation. It can be further applied to unravel the tissue- and/or condition-specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic sequences and suitable
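
    The abstract does not specify the co-occurrence score or the three filtering steps, so the sketch below substitutes a single, simpler test: a hypergeometric tail probability for the number of promoters that carry both CREs, with significant pairs kept as network edges. The threshold, the CRE and gene names, and the absence of multiple-testing correction are all simplifying assumptions.

```python
from itertools import combinations
from math import comb

def hypergeom_tail(k, n_total, n_a, n_b):
    """P(overlap >= k) for two random promoter sets of sizes n_a and n_b
    drawn from n_total promoters."""
    upper = min(n_a, n_b)
    numer = sum(comb(n_a, i) * comb(n_total - n_a, n_b - i)
                for i in range(k, upper + 1))
    return numer / comb(n_total, n_b)

def cooccurrence_edges(promoter_to_cres, alpha=0.05):
    """Return (cre_a, cre_b, shared_promoters, p_value) for pairs whose
    co-occurrence is significant at `alpha` (no multiple-testing correction)."""
    n_total = len(promoter_to_cres)
    members = {}
    for promoter, cres in promoter_to_cres.items():
        for cre in set(cres):
            members.setdefault(cre, set()).add(promoter)
    edges = []
    for a, b in combinations(sorted(members), 2):
        both = len(members[a] & members[b])
        p = hypergeom_tail(both, n_total, len(members[a]), len(members[b]))
        if both > 0 and p <= alpha:
            edges.append((a, b, both, p))
    return edges

# Hypothetical promoters and CRE names, for illustration only.
toy_promoters = {
    "g1": ["CRE_A", "CRE_B"], "g2": ["CRE_A", "CRE_B"],
    "g3": ["CRE_A", "CRE_B"], "g4": ["CRE_A", "CRE_B"],
    "g5": ["CRE_C"], "g6": ["CRE_C"], "g7": ["CRE_C"], "g8": ["CRE_C"],
}
toy_edges = cooccurrence_edges(toy_promoters)
```

    The resulting edge list can be loaded into any graph library to view the co-occurring CRE pairs as a network.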

  14. cis-Regulatory Circuits Regulating NEK6 Kinase Overexpression in Transformed B Cells Are Super-Enhancer Independent.

    PubMed

    Huang, Yue; Koues, Olivia I; Zhao, Jiang-Yang; Liu, Regina; Pyfrom, Sarah C; Payton, Jacqueline E; Oltz, Eugene M

    2017-03-21

    Alterations in distal regulatory elements that control gene expression underlie many diseases, including cancer. Epigenomic analyses of normal and diseased cells have produced correlative predictions for connections between dysregulated enhancers and target genes involved in pathogenesis. However, with few exceptions, these predicted cis-regulatory circuits remain untested. Here, we dissect cis-regulatory circuits that lead to overexpression of NEK6, a mitosis-associated kinase, in human B cell lymphoma. We find that only a minor subset of predicted enhancers is required for NEK6 expression. Indeed, an annotated super-enhancer is dispensable for NEK6 overexpression and for maintaining the architecture of a B cell-specific regulatory hub. A CTCF cluster serves as a chromatin and architectural boundary to block communication of the NEK6 regulatory hub with neighboring genes. Our findings emphasize that validation of predicted cis-regulatory circuits and super-enhancers is needed to prioritize transcriptional control elements as therapeutic targets. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  15. Changes in cis-regulatory elements of a key floral regulator are associated with divergence of inflorescence architectures.

    PubMed

    Kusters, Elske; Della Pina, Serena; Castel, Rob; Souer, Erik; Koes, Ronald

    2015-08-15

    Higher plant species diverged extensively with regard to the moment (flowering time) and position (inflorescence architecture) at which flowers are formed. This seems largely caused by variation in the expression patterns of conserved genes that specify floral meristem identity (FMI), rather than changes in the encoded proteins. Here, we report a functional comparison of the promoters of homologous FMI genes from Arabidopsis, petunia, tomato and Antirrhinum. Analysis of promoter-reporter constructs in petunia and Arabidopsis, as well as complementation experiments, showed that the divergent expression of leafy (LFY) and the petunia homolog aberrant leaf and flower (ALF) results from alterations in the upstream regulatory network rather than cis-regulatory changes. The divergent expression of unusual floral organs (UFO) from Arabidopsis, and the petunia homolog double top (DOT), however, is caused by the loss or gain of cis-regulatory promoter elements, which respond to trans-acting factors that are expressed in similar patterns in both species. Introduction of pUFO:UFO causes no obvious defects in Arabidopsis, but in petunia it causes the precocious and ectopic formation of flowers. This provides an example of how a change in a cis-regulatory region can account for a change in the plant body plan. © 2015. Published by The Company of Biologists Ltd.

  16. Mapping Association between Long-Range Cis-Regulatory Regions and Their Target Genes Using Comparative Genomics

    NASA Astrophysics Data System (ADS)

    Mongin, Emmanuel; Dewar, Ken; Blanchette, Mathieu

    In chordates, long-range cis-regulatory regions are involved in the control of transcription initiation (either as repressors or enhancers). They can be located as far as 1 Mb from the transcription start site of the target gene and can regulate more than one gene. Therefore, proper characterization of functional interactions between long-range cis-regulatory regions and their target genes remains problematic. We present a novel method to predict such interactions based on the analysis of rearrangements between the human and 16 other vertebrate genomes. Our method is based on the assumption that genome rearrangements that would disrupt the functional interaction between a cis-regulatory region and its target gene are likely to be deleterious. Therefore, conservation of synteny through evolution would be an indication of a functional interaction. We use our algorithm to classify a set of 1,406,084 putative associations from the human genome. This genome-wide map of interactions has many potential applications, including the selection of candidate regions prior to in vivo experimental characterization, a better characterization of regulatory regions involved in position effect diseases, and an improved understanding of the mechanisms and importance of long-range regulation.
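
    The synteny-conservation idea in this abstract can be illustrated with a small sketch. This is not the authors' algorithm: the scoring rule (same chromosome and within a maximum distance), the function name, the 1 Mb default threshold and the toy coordinates are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Locus = Tuple[str, int]  # (chromosome or scaffold name, position in bp)


def synteny_conservation(
    region: Dict[str, Locus],
    gene: Dict[str, Locus],
    max_dist: int = 1_000_000,  # hypothetical threshold, not taken from the paper
) -> Optional[float]:
    """Fraction of species in which a region and a gene stay linked.

    A pair counts as linked in a species when both loci lie on the same
    chromosome and no more than max_dist bp apart. Species missing either
    locus are ignored; None is returned if no species has both.
    """
    shared = sorted(set(region) & set(gene))
    if not shared:
        return None
    conserved = 0
    for species in shared:
        (r_chrom, r_pos), (g_chrom, g_pos) = region[species], gene[species]
        if r_chrom == g_chrom and abs(r_pos - g_pos) <= max_dist:
            conserved += 1
    return conserved / len(shared)


# Toy coordinates (invented): the pair is split apart only in "chicken".
region = {
    "human": ("chr7", 1_000_000),
    "mouse": ("chr5", 2_000_000),
    "chicken": ("chr2", 500_000),
    "frog": ("scaf1", 10_000),
}
gene = {
    "human": ("chr7", 1_600_000),
    "mouse": ("chr5", 2_300_000),
    "chicken": ("chr9", 500_000),
    "frog": ("scaf1", 900_000),
}
score = synteny_conservation(region, gene)
print(score)
```

    A high score would only suggest, not prove, a functional association; a real analysis would also need a null model for how often linkage is retained by chance.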

  17. Genes associated with the cis-regulatory functions of intragenic LINE-1 elements.

    PubMed

    Wanichnopparat, Wachiraporn; Suwanwongse, Kulachanya; Pin-On, Piyapat; Aporntewan, Chatchawit; Mutirangura, Apiwat

    2013-03-27

    Thousands of intragenic long interspersed element 1 sequences (LINE-1 elements or L1s) reside within genes. These intragenic L1 sequences are conserved and regulate the expression of their host genes. When L1 methylation is decreased, either through chemical induction or in cancer, the intragenic L1 transcription is increased. The resulting L1 mRNAs form RISC complexes with pre-mRNA to degrade the complementary mRNA. In this study, we screened for genes that are involved in intragenic L1 regulation networks. Genes containing L1s were obtained from L1Base (http://l1base.molgen.mpg.de). The expression profiles of 205 genes in 516 gene knockdown experiments were obtained from the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo). The expression levels of the genes with and without L1s were compared using Pearson's chi-squared test. After a permutation based statistical analysis and a multiple hypothesis testing, 73 genes were found to induce significant regulatory changes (upregulation and/or downregulation) in genes with L1s. In detail, 5 genes were found to induce both the upregulation and downregulation of genes with L1s, whereas 27 and 37 genes induced the downregulation and upregulation, respectively, of genes with L1s. These regulations sometimes differed depending on the cell type and the orientation of the intragenic L1s. Moreover, the siRNA-regulating genes containing L1s possess a variety of molecular functions, are responsible for many cellular phenotypes and are associated with a number of diseases. Cells use intragenic L1s as cis-regulatory elements within gene bodies to modulate gene expression. There may be several mechanisms by which L1s mediate gene expression. Intragenic L1s may be involved in the regulation of several biological processes, including DNA damage and repair, inflammation, immune function, embryogenesis, cell differentiation, cellular response to external stimuli and hormonal responses. 
Furthermore, in addition to cancer

  18. Identification and characterization of a cis-regulatory element for zygotic gene expression in Chlamydomonas reinhardtii

    DOE PAGES

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; ...

    2016-03-26

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.

  19. Identification and Characterization of a cis-Regulatory Element for Zygotic Gene Expression in Chlamydomonas reinhardtii

    PubMed Central

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-01-01

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. We predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes. PMID:27172209

  20. Functional characterization of motif sequences under purifying selection.

    PubMed

    Chen, De-Hua; Chang, Andrew Ying-Fei; Liao, Ben-Yang; Yeang, Chen-Hsiang

    2013-02-01

    Diverse life forms are driven by the evolution of gene regulatory programs including changes in regulator proteins and cis-regulatory elements. Alterations of cis-regulatory elements are likely to dominate the evolution of the gene regulatory networks, as they are subjected to smaller selective constraints compared with proteins and hence may evolve quickly to adapt to the environment. Prior studies on cis-regulatory element evolution focus primarily on sequence substitutions of known transcription factor-binding motifs. However, evolutionary models for the dynamics of motif occurrence are relatively rare, and comprehensive characterization of the evolution of all possible motif sequences has not been pursued. In the present study, we propose an algorithm to estimate the strength of purifying selection of a motif sequence based on an evolutionary model capturing the birth and death of motif occurrences on promoters. We term this measure the 'evolutionary retention coefficient', as it is related to yet distinct from the canonical definition of selection coefficient in population genetics. Using this algorithm, we estimate and report the evolutionary retention coefficients of all possible 10-nucleotide sequences from the aligned promoter sequences of 27 748 orthologous gene families in 34 mammalian species. Intriguingly, the evolutionary retention coefficients of motifs are intimately associated with their functional relevance. Top-ranking motifs (sorted by evolutionary retention coefficients) are significantly enriched with transcription factor-binding sequences according to the curated knowledge from the TRANSFAC database and the ChIP-seq data generated from the ENCODE Consortium. Moreover, genes harbouring high-scoring motifs on their promoters retain significantly coherent expression profiles, and those genes are over-represented in the functional classes involved in gene regulation. The validation results reveal the dependencies between natural selection and

  1. An ancient yet flexible cis-regulatory architecture allows localized Hedgehog tuning by patched/Ptch1

    PubMed Central

    Lorberbaum, David S; Ramos, Andrea I; Peterson, Kevin A; Carpenter, Brandon S; Parker, David S; De, Sandip; Hillers, Lauren E; Blake, Victoria M; Nishi, Yuichi; McFarlane, Matthew R; Chiang, Ason CY; Kassis, Judith A; Allen, Benjamin L; McMahon, Andrew P; Barolo, Scott

    2016-01-01

    The Hedgehog signaling pathway is part of the ancient developmental-evolutionary animal toolkit. Frequently co-opted to pattern new structures, the pathway is conserved among eumetazoans yet flexible and pleiotropic in its effects. The Hedgehog receptor, Patched, is transcriptionally activated by Hedgehog, providing essential negative feedback in all tissues. Our locus-wide dissections of the cis-regulatory landscapes of fly patched and mouse Ptch1 reveal abundant, diverse enhancers with stage- and tissue-specific expression patterns. The seemingly simple, constitutive Hedgehog response of patched/Ptch1 is driven by a complex regulatory architecture, with batteries of context-specific enhancers engaged in promoter-specific interactions to tune signaling individually in each tissue, without disturbing patterning elsewhere. This structure—one of the oldest cis-regulatory features discovered in animal genomes—explains how patched/Ptch1 can drive dramatic adaptations in animal morphology while maintaining its essential core function. It may also suggest a general model for the evolutionary flexibility of conserved regulators and pathways. DOI: http://dx.doi.org/10.7554/eLife.13550.001 PMID:27146892

  2. Differential contribution of cis-regulatory elements to higher order chromatin structure and expression of the CFTR locus

    PubMed Central

    Yang, Rui; Kerschner, Jenny L.; Gosalia, Nehal; Neems, Daniel; Gorsic, Lidija K.; Safi, Alexias; Crawford, Gregory E.; Kosak, Steven T.; Leir, Shih-Hsing; Harris, Ann

    2016-01-01

    Higher order chromatin structure establishes domains that organize the genome and coordinate gene expression. However, the molecular mechanisms controlling transcription of individual loci within a topological domain (TAD) are not fully understood. The cystic fibrosis transmembrane conductance regulator (CFTR) gene provides a paradigm for investigating these mechanisms. CFTR occupies a TAD bordered by CTCF/cohesin binding sites within which are cell-type-selective cis-regulatory elements for the locus. We showed previously that intronic and extragenic enhancers, when occupied by specific transcription factors, are recruited to the CFTR promoter by a looping mechanism to drive gene expression. Here we use a combination of CRISPR/Cas9 editing of cis-regulatory elements and siRNA-mediated depletion of architectural proteins to determine the relative contribution of structural elements and enhancers to the higher order structure and expression of the CFTR locus. We found the boundaries of the CFTR TAD are conserved among diverse cell types and are dependent on CTCF and cohesin complex. Removal of an upstream CTCF-binding insulator alters the interaction profile, but has little effect on CFTR expression. Within the TAD, intronic enhancers recruit cell-type selective transcription factors and deletion of a pivotal enhancer element dramatically decreases CFTR expression, but has minor effect on its 3D structure. PMID:26673704

  3. cis regulatory requirements for hypodermal cell-specific expression of the Caenorhabditis elegans cuticle collagen gene dpy-7.

    PubMed Central

    Gilleard, J S; Barry, J D; Johnstone, I L

    1997-01-01

    The Caenorhabditis elegans cuticle collagens are encoded by a multigene family of between 50 and 100 members and are the major component of the nematode cuticular exoskeleton. They are synthesized in the hypodermis prior to secretion and incorporation into the cuticle and exhibit complex patterns of spatial and temporal expression. We have investigated the cis regulatory requirements for tissue- and stage-specific expression of the cuticle collagen gene dpy-7 and have identified a compact regulatory element which is sufficient to specify hypodermal cell reporter gene expression. This element appears to be a true tissue-specific promoter element, since it encompasses the dpy-7 transcription initiation sites and functions in an orientation-dependent manner. We have also shown, by interspecies transformation experiments, that the dpy-7 cis regulatory elements are functionally conserved between C. elegans and C. briggsae, and comparative sequence analysis supports the importance of the regulatory sequence that we have identified by reporter gene analysis. All of our data suggest that the spatial expression of the dpy-7 cuticle collagen gene is established essentially by a small tissue-specific promoter element and does not require upstream activator or repressor elements. In addition, we have found the DPY-7 polypeptide is very highly conserved between the two species and that the C. briggsae polypeptide can function appropriately within the C. elegans cuticle. This finding suggests a remarkably high level of conservation of individual cuticle components, and their interactions, between these two nematode species. PMID:9121480

  4. Modular Utilization of Distal cis-Regulatory Elements Controls Ifng Gene Expression in T Cells Activated by Distinct Stimuli

    PubMed Central

    Balasubramani, Anand; Shibata, Yoichiro; Crawford, Gregory E.; Baldwin, Albert S.; Hatton, Robin D.; Weaver, Casey T.

    2010-01-01

    Distal cis-regulatory elements play essential roles in the T lineage-specific expression of cytokine genes. We have mapped interactions of three trans-acting factors – NF-κB, STAT4 and T-bet – with cis elements in the Ifng locus. We find that RelA is critical for optimal Ifng expression and is differentially recruited to multiple elements contingent upon T cell receptor (TCR) or interleukin-12 (IL-12) plus IL-18 signaling. RelA recruitment to at least four elements is dependent on T-bet-dependent remodeling of the Ifng locus and co-recruitment of STAT4. STAT4 and NF-κB therefore cooperate at multiple cis elements to enable NF-κB–dependent enhancement of Ifng expression. RelA recruitment to distal elements was similar in Th1 and Tc1 effector cells, although T-bet was dispensable in CD8 effectors. These results support a model of Ifng regulation in which distal cis-regulatory elements differentially recruit key transcription factors in a modular fashion to initiate gene transcription induced by distinct activation signals. PMID:20643337

  5. Inheritance of gene expression level and selective constraints on trans- and cis-regulatory changes in yeast.

    PubMed

    Schaefke, Bernhard; Emerson, J J; Wang, Tzi-Yuan; Lu, Mei-Yeh Jade; Hsieh, Li-Ching; Li, Wen-Hsiung

    2013-09-01

    Gene expression evolution can be caused by changes in cis- or trans-regulatory elements or both. As cis and trans regulation operate through different molecular mechanisms, cis and trans mutations may show different inheritance patterns and may be subjected to different selective constraints. To investigate these issues, we obtained and analyzed gene expression data from two Saccharomyces cerevisiae strains and their hybrid, using high-throughput sequencing. Our data indicate that compared with other types of genes, those with antagonistic cis-trans interactions are more likely to exhibit over- or underdominant inheritance of expression level. Moreover, in accordance with previous studies, genes with trans variants tend to have a dominant inheritance pattern, whereas cis variants are enriched for additive inheritance. In addition, cis regulatory differences contribute more to expression differences between species than within species, whereas trans regulatory differences show a stronger association between divergence and polymorphism. Our data indicate that in the trans component of gene expression differences genes subjected to weaker selective constraints tend to have an excess of polymorphism over divergence compared with those subjected to stronger selective constraints. In contrast, in the cis component, this difference between genes under stronger and weaker selective constraint is mostly absent. To explain these observations, we propose that purifying selection more strongly shapes trans changes than cis changes and that positive selection may have significantly contributed to cis regulatory divergence.
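
    The separation of cis and trans effects from two parental strains and their hybrid can be sketched as follows. The abstract does not state the estimator used, so this assumes the commonly used log-ratio decomposition (the allelic ratio in the hybrid estimates the cis effect, and the remainder of the parental difference is attributed to trans); the function names and read counts are hypothetical.

```python
import math
from typing import Tuple


def cis_trans(
    parent_a: float, parent_b: float, hybrid_a: float, hybrid_b: float
) -> Tuple[float, float, float]:
    """Return (total, cis, trans) expression divergence as log2 ratios.

    total: difference between the two parental strains.
    cis:   difference between the two alleles inside the hybrid, where both
           alleles share the same trans environment.
    trans: the part of the parental difference not explained by cis.
    """
    total = math.log2(parent_a / parent_b)
    cis = math.log2(hybrid_a / hybrid_b)
    trans = total - cis
    return total, cis, trans


def is_antagonistic(cis: float, trans: float) -> bool:
    """True when cis and trans effects act in opposite directions."""
    return cis * trans < 0


# Hypothetical normalized read counts for one gene.
total, cis, trans = cis_trans(parent_a=400, parent_b=100, hybrid_a=200, hybrid_b=100)
print(total, cis, trans)

# A gene with equal parental expression but a strong allelic imbalance in the
# hybrid: cis and trans effects cancel each other out.
_, cis2, trans2 = cis_trans(parent_a=100, parent_b=100, hybrid_a=200, hybrid_b=50)
print(is_antagonistic(cis2, trans2))
```

    Real data would need replicate-aware statistical tests rather than point estimates before a gene is assigned to a cis, trans or antagonistic class.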

  6. Asymmetrically reduced expression of hand1 homeologs involving a single nucleotide substitution in a cis-regulatory element.

    PubMed

    Ochi, Haruki; Suzuki, Nanoka; Kawaguchi, Akane; Ogino, Hajime

    2017-03-28

    During vertebrate evolution, whole genome duplications resulted in a number of duplicated genes, some of which eventually changed their expression patterns and/or levels via alteration of cis-regulatory sequences. However, the initial process involved in such cis-regulatory changes remains unclear. Therefore, we investigated this process by analyzing the duplicated hand1 genes of Xenopus laevis (hand1.L and hand1.S), which were generated by allotetraploidization 17-18 million years ago, and compared these with their single ortholog in the ancestral-type diploid species X. tropicalis. A dN/dS analysis indicated that hand1.L and hand1.S are still under purifying selection, and thus, their products appear to retain ancestral functional properties. RNA-seq and in situ hybridization analyses revealed that hand1.L and hand1.S have similar expression patterns to each other and to X. tropicalis hand1, but the hand1.S expression level was much lower than the hand1.L expression level in the primordial heart. A comparative sequence analysis, luciferase reporter analysis, ChIP-PCR analysis, and transgenic reporter analysis showed that a single nucleotide substitution in the hand1.S promoter was responsible for the reduced expression in the heart. These findings demonstrated that a small change in the promoter sequence can trigger diversification of duplicated gene expression prior to diversification of their encoded protein functions in a young duplicated genome.

  7. MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

    PubMed

    Ozaki, Haruka; Iwasaki, Wataru

    2016-08-01

    As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
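
    The idea of examining every k-mer in ChIP-Seq-derived sequences can be illustrated with a toy sketch. This is not the MOCCS implementation (which is written in Perl and R); it only assumes that sequences are centred on peak positions and records where each k-mer falls relative to the centre. The function names and sequences are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def kmer_offsets(seqs: List[str], k: int) -> Dict[str, List[int]]:
    """Map each k-mer to the offsets of its midpoint from the sequence centre."""
    offsets: Dict[str, List[int]] = defaultdict(list)
    for seq in seqs:
        centre = len(seq) // 2
        for i in range(len(seq) - k + 1):
            offsets[seq[i:i + k]].append(i + k // 2 - centre)
    return offsets


def mean_abs_offset(offsets: Dict[str, List[int]]) -> Dict[str, float]:
    """Average distance from the centre; small values mean centrally enriched."""
    return {kmer: sum(abs(o) for o in pos) / len(pos) for kmer, pos in offsets.items()}


# Two invented peak-centred sequences that share a central CACGTG.
seqs = ["TTTTCACGTGAAAA", "GGGACACGTGTCCC"]
offsets = kmer_offsets(seqs, k=6)
scores = mean_abs_offset(offsets)
most_central = min(scores, key=scores.get)
print(most_central, offsets[most_central])
```

    With real data, a k-mer that clusters near peak centres far more tightly than expected by chance would be a candidate bound sequence; the statistic used here is a deliberately simple stand-in.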

  8. Long-range DNase I hypersensitivity mapping reveals the imprinted Igf2r and Air promoters share cis-regulatory elements

    PubMed Central

    Pauler, Florian M.; Stricker, Stefan H.; Warczok, Katarzyna E.; Barlow, Denise P.

    2005-01-01

    Epigenetic mechanisms restrict the expression of imprinted genes to one parental allele in diploid cells. At the Igf2r/Air imprinted cluster on mouse chromosome 17, paternal-specific expression of the Air noncoding RNA has been shown to silence three genes in cis: Igf2r, Slc22a2, and Slc22a3. By an unbiased mapping of DNase I hypersensitive sites (DHS) in a 192-kb region flanking Igf2r and Air, we identified 21 DHS, of which nine mapped to evolutionarily conserved sequences. Based on the hypothesis that silencing effects of Air would be directed towards cis regulatory elements used to activate genes, DHS are potential key players in the control of imprinted expression. However, in this 192-kb region only the two DHS mapping to the Igf2r and Air promoters show parental specificity. The remaining 19 DHS were present on both parental alleles and, thus, have the potential to activate Igf2r on the maternal allele and Air on the paternal allele. The possibility that the Igf2r and Air promoters share the same cis-acting regulatory elements, albeit on opposite parental chromosomes, was supported by the similar expression profiles of Igf2r and Air in vivo. These results refine our understanding of the onset of imprinted silencing at this cluster and indicate the Air noncoding RNA may specifically target silencing to the Igf2r promoter. PMID:16204191

  9. Group 1 Innate Lymphoid Cell Lineage Identity Is Determined by a cis-Regulatory Element Marked by a Long Non-coding RNA.

    PubMed

    Mowel, Walter K; McCright, Sam J; Kotzin, Jonathan J; Collet, Magalie A; Uyar, Asli; Chen, Xin; DeLaney, Alexandra; Spencer, Sean P; Virtue, Anthony T; Yang, EnJun; Villarino, Alejandro; Kurachi, Makoto; Dunagin, Margaret C; Pritchard, Gretchen Harms; Stein, Judith; Hughes, Cynthia; Fonseca-Pereira, Diogo; Veiga-Fernandes, Henrique; Raj, Arjun; Kambayashi, Taku; Brodsky, Igor E; O'Shea, John J; Wherry, E John; Goff, Loyal A; Rinn, John L; Williams, Adam; Flavell, Richard A; Henao-Mejia, Jorge

    2017-09-19

    Commitment to the innate lymphoid cell (ILC) lineage is determined by Id2, a transcriptional regulator that antagonizes T and B cell-specific gene expression programs. Yet how Id2 expression is regulated in each ILC subset remains poorly understood. We identified a cis-regulatory element demarcated by a long non-coding RNA (lncRNA) that controls the function and lineage identity of group 1 ILCs, while being dispensable for early ILC development and homeostasis of ILC2s and ILC3s. The locus encoding this lncRNA, which we termed Rroid, directly interacted with the promoter of its neighboring gene, Id2, in group 1 ILCs. Moreover, the Rroid locus, but not the lncRNA itself, controlled the identity and function of ILC1s by promoting chromatin accessibility and deposition of STAT5 at the promoter of Id2 in response to interleukin (IL)-15. Thus, non-coding elements responsive to extracellular cues unique to each ILC subset represent a key regulatory layer for controlling the identity and function of ILCs. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Shared Enhancer Activity in the Limbs and Phallus and Functional Divergence of a Limb-Genital cis-Regulatory Element in Snakes.

    PubMed

    Infante, Carlos R; Mihala, Alexandra G; Park, Sungdae; Wang, Jialiang S; Johnson, Kenji K; Lauderdale, James D; Menke, Douglas B

    2015-10-12

    The amniote phallus and limbs differ dramatically in their morphologies but share patterns of signaling and gene expression in early development. Thus far, the extent to which genital and limb transcriptional networks also share cis-regulatory elements has remained unexplored. We show that many limb enhancers are retained in snake genomes, suggesting that these elements may function in non-limb tissues. Consistent with this, our analysis of cis-regulatory activity in mice and Anolis lizards reveals that patterns of enhancer activity in embryonic limbs and genitalia overlap heavily. In mice, deletion of HLEB, an enhancer of Tbx4, produces defects in hindlimbs and genitalia, establishing the importance of this limb-genital enhancer for development of these different appendages. Further analyses demonstrate that the HLEB of snakes has lost hindlimb enhancer function while retaining genital activity. Our findings identify roles for Tbx4 in genital development and highlight deep similarities in cis-regulatory activity between limbs and genitalia.

  11. Numb directs the subcellular localization of EAAT3 through binding the YxNxxF motif.

    PubMed

    Su, Jin-Feng; Wei, Jian; Li, Pei-Shan; Miao, Hong-Hua; Ma, Yong-Chao; Qu, Yu-Xiu; Xu, Jie; Qin, Jie; Li, Bo-Liang; Song, Bao-Liang; Xu, Zheng-Ping; Luo, Jie

    2016-08-15

    Excitatory amino acid transporter type 3 (EAAT3, also known as SLC1A1) is a high-affinity, Na(+)-dependent glutamate carrier that localizes primarily within the cell and at the apical plasma membrane. Although previous studies have reported proteins and sequence regions involved in EAAT3 trafficking, the detailed molecular mechanism by which EAAT3 is distributed to the correct location still remains elusive. Here, we identify that the YVNGGF sequence in the C-terminus of EAAT3 is responsible for its intracellular localization and apical sorting in rat hepatoma cells CRL1601 and Madin-Darby canine kidney (MDCK) cells, respectively. We further demonstrate that Numb, a clathrin adaptor protein, directly binds the YVNGGF motif and regulates the localization of EAAT3. Mutation of Y503, N505 and F508 within the YVNGGF motif to alanine residues or silencing Numb by use of small interfering RNA (siRNA) results in the aberrant localization of EAAT3. Moreover, both Numb and the YVNGGF motif mediate EAAT3 endocytosis in CRL1601 cells. In summary, our study suggests that Numb is a pivotal adaptor protein that mediates the subcellular localization of EAAT3 through binding the YxNxxF (where x stands for any amino acid) motif.

  12. Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment.

    PubMed

    Hughes, Jim R; Roberts, Nigel; McGowan, Simon; Hay, Deborah; Giannoulatou, Eleni; Lynch, Magnus; De Gobbi, Marco; Taylor, Stephen; Gibbons, Richard; Higgs, Douglas R

    2014-02-01

    Gene expression during development and differentiation is regulated in a cell- and stage-specific manner by complex networks of intergenic and intragenic cis-regulatory elements whose numbers and representation in the genome far exceed those of structural genes. Using chromosome conformation capture, it is now possible to analyze in detail the interaction between enhancers, silencers, boundary elements and promoters at individual loci, but these techniques are not readily scalable. Here we present a high-throughput approach (Capture-C) to analyze cis interactions, interrogating hundreds of specific interactions at high resolution in a single experiment. We show how this approach will facilitate detailed, genome-wide analysis to elucidate the general principles by which cis-acting sequences control gene expression. In addition, we show how Capture-C will expedite identification of the target genes and functional effects of SNPs that are associated with complex diseases, which most frequently lie in intergenic cis-acting regulatory elements.

  13. Cis-Regulatory Evolution of Forkhead Box O1 (FOXO1), a Terminal Selector Gene for Decidual Stromal Cell Identity.

    PubMed

    Park, Yeonwoo; Nnamani, Mauris C; Maziarz, Jamie; Wagner, Günter P

    2016-12-01

    Studies in human and mouse have shown that decidual stromal cells (DSC), which develop in the innermost lining of uterus, mediate placentation by regulating maternal immune response against the fetus and the extent of fetal invasion. Investigating when and how DSC evolved is thus a key step to reconstructing the evolutionary history of mammalian pregnancy. We present molecular evidence placing the origin of DSC in the stem lineage of eutherians (extant placental mammals). The transcription factor forkhead box O1 (FOXO1) is a part of the core regulatory transcription factor complex (CoRC) that establishes the cell type identity of DSC. Decidualization, the process through which DSC differentiate from endometrial stromal fibroblasts, requires transcriptional upregulation of FOXO1. Contrary to other examples in mammals where gene recruitment is caused by the origin of an alternative promoter, FOXO1 is transcribed from the same promoter in DSC as in endometrial stromal fibroblasts. Comparing the activities of FOXO1 promoters from human, mouse, manatee (Afrotheria), and opossum (marsupial) revealed that FOXO1 promoter evolved responsiveness to decidualization signals in the stem lineage of eutherians. This eutherian vs. marsupial pattern of promoter activity was not observed in some other cell types expressing FOXO1, suggesting that this cis-regulatory evolution occurred specifically in the context of the origin of DSC. Sequence comparison revealed eutherian-specifically conserved nucleotides that contribute to the eutherian promoter activity. We conclude that the cis-regulatory activity of a terminal selector gene for decidual stromal cell identity evolved in the stem lineage of eutherians supporting a model where decidual cells are a eutherian innovation. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  14. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  15. RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements

    PubMed Central

    Chatagnon, Amandine; Veber, Philippe; Morin, Valérie; Bedo, Justin; Triqueneaux, Gérard; Sémon, Marie; Laudet, Vincent; d'Alché-Buc, Florence; Benoit, Gérard

    2015-01-01

    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding site occupancy dynamics and composition shows that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR bound regions are enriched in functional Sox17 binding sites and are characterized by a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view on the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status. PMID:25897113
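
    The DR0 and DR5 elements named in this abstract are direct repeats of a hexameric half-site separated by 0 or 5 nucleotides. A minimal sketch of scanning a sequence for such repeats follows. It assumes the commonly cited consensus half-site (A/G)G(G/T)TCA and scans the forward strand only; the function name and the test sequences are hypothetical, and the motif models actually used in the study are not given in the abstract.

```python
import re

# Assumed consensus half-site (A/G)G(G/T)TCA; the study's own motif model may differ.
HALF_SITE = "[AG]G[GT]TCA"


def find_direct_repeats(seq, spacer):
    """Return (start, matched_text) for every DR<spacer> element on the forward strand.

    A DR<spacer> element is two half-sites in the same orientation separated
    by exactly `spacer` nucleotides. A lookahead is used so that overlapping
    elements are all reported.
    """
    pattern = re.compile(f"(?=({HALF_SITE}[ACGT]{{{spacer}}}{HALF_SITE}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical sequences built by hand to contain one DR0 and one DR5.
    print(find_direct_repeats("TTAGGTCAAGGTCATT", 0))
    print(find_direct_repeats("GGAGGTCACCCCCAGGTCAGG", 5))
```

    A fuller scan would also search the reverse complement and would typically score matches with a position weight matrix rather than an exact pattern.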

  16. RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements.

    PubMed

    Chatagnon, Amandine; Veber, Philippe; Morin, Valérie; Bedo, Justin; Triqueneaux, Gérard; Sémon, Marie; Laudet, Vincent; d'Alché-Buc, Florence; Benoit, Gérard

    2015-05-26

    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding site occupancy dynamics and composition shows that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR bound regions are enriched in functional Sox17 binding sites and are characterized by a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view on the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Separate elements of the TERMINAL FLOWER 1 cis-regulatory region integrate pathways to control flowering time and shoot meristem identity.

    PubMed

    Serrano-Mislata, Antonio; Fernández-Nohales, Pedro; Doménech, María J; Hanzawa, Yoshie; Bradley, Desmond; Madueño, Francisco

    2016-09-15

    TERMINAL FLOWER 1 (TFL1) is a key regulator of Arabidopsis plant architecture that responds to developmental and environmental signals to control flowering time and the fate of shoot meristems. TFL1 expression is dynamic, being found in all shoot meristems, but not in floral meristems, with the level and distribution changing throughout development. Using a variety of experimental approaches we have analysed the TFL1 promoter to elucidate its functional structure. TFL1 expression is based on distinct cis-regulatory regions, the most important being located 3' of the coding sequence. Our results indicate that TFL1 expression in the shoot apical versus lateral inflorescence meristems is controlled through distinct cis-regulatory elements, suggesting that different signals control expression in these meristem types. Moreover, we identified a cis-regulatory region necessary for TFL1 expression in the vegetative shoot and required for a wild-type flowering time, supporting that TFL1 expression in the vegetative meristem controls flowering time. Our study provides a model for the functional organisation of TFL1 cis-regulatory regions, contributing to our understanding of how developmental pathways are integrated at the genomic level of a key regulator to control plant architecture. © 2016. Published by The Company of Biologists Ltd.

  18. Two RNA-binding motifs in eIF3 direct HCV IRES-dependent translation

    PubMed Central

    Sun, Chaomin; Querol-Audí, Jordi; Mortimer, Stefanie A.; Arias-Palomo, Ernesto; Doudna, Jennifer A.; Nogales, Eva; Cate, Jamie H. D.

    2013-01-01

    The initiation of protein synthesis plays an essential regulatory role in human biology. At the center of the initiation pathway, the 13-subunit eukaryotic translation initiation factor 3 (eIF3) controls access of other initiation factors and mRNA to the ribosome by unknown mechanisms. Using electron microscopy (EM), bioinformatics and biochemical experiments, we identify two highly conserved RNA-binding motifs in eIF3 that direct translation initiation from the hepatitis C virus internal ribosome entry site (HCV IRES) RNA. Mutations in the RNA-binding motif of subunit eIF3a weaken eIF3 binding to the HCV IRES and the 40S ribosomal subunit, thereby suppressing eIF2-dependent recognition of the start codon. Mutations in the eIF3c RNA-binding motif also reduce 40S ribosomal subunit binding to eIF3, and inhibit eIF5B-dependent steps downstream of start codon recognition. These results provide the first connection between the structure of the central translation initiation factor eIF3 and recognition of the HCV genomic RNA start codon, molecular interactions that likely extend to the human transcriptome. PMID:23766293

  19. cisMEP: an integrated repository of genomic epigenetic profiles and cis-regulatory modules in Drosophila

    PubMed Central

    2014-01-01

    Background: Cis-regulatory modules (CRMs), or the DNA sequences required for regulating gene expression, play a central role in biological research on transcriptional regulation in metazoan species. Currently, the systematic understanding of CRMs still relies mainly on computational methods because experimental methods are time-consuming and small in scale. However, the accuracy and reliability of different CRM prediction tools remain unclear. Without comparative cross-analysis of the results and combinatorial consideration of extra experimental information, there is no easy way to assess the confidence of the predicted CRMs. This limits the genome-wide understanding of CRMs. Description: It is known that transcription factor binding and epigenetic profiles tend to determine the functions of CRMs in gene transcriptional regulation. Thus integration of genome-wide epigenetic profiles with systematically predicted CRMs can greatly help researchers evaluate the prediction confidence and decipher the possible transcriptional regulatory functions of these potential CRMs. However, these data are still fragmentary in the literature. Here we performed a computational genome-wide screen for potential CRMs using different prediction tools and constructed a pioneering database, cisMEP (cis-regulatory module epigenetic profile database), to integrate these computationally identified CRMs with genomic epigenetic profile data. cisMEP collects literature-curated TFBS location data and nine types of epigenetic data for assessing the confidence of these potential CRMs and deciphering possible CRM functionality. Conclusions: cisMEP aims to provide a user-friendly interface for researchers to assess the confidence of different potential CRMs and to understand the functions of CRMs through experimentally identified epigenetic profiles. The deposited potential CRMs and experimental epigenetic profiles for confidence assessment provide experimentally testable

  20. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis-regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods, such as clustering, in motif discovery. We then present a study that extends to Caenorhabditis elegans a machine learning model of transcriptional regulation that predicts genetic regulatory response. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally, we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  1. Unveiling combinatorial regulation through the combination of ChIP information and in silico cis-regulatory module detection

    PubMed Central

    Sun, Hong; Guns, Tias; Fierro, Ana Carolina; Thorrez, Lieven; Nijssen, Siegfried; Marchal, Kathleen

    2012-01-01

    Computationally retrieving biologically relevant cis-regulatory modules (CRMs) is not straightforward. Because of the large number of candidates and the imperfection of the screening methods, many spurious CRMs are detected that score as highly as the biologically true ones. Using ChIP information not only reduces the regions in which the binding sites of the assayed transcription factor (TF) should be located, but also restricts the valid CRMs to those that contain the assayed TF (here referred to as applying CRM detection in a query-based mode). In this study, we show that exploiting ChIP information in a query-based way makes in silico CRM detection a much more feasible endeavor. To handle the large datasets, the query-based setting and other specificities of CRM detection on ChIP-Seq-based data, we developed a novel, powerful CRM detection method, ‘CPModule’. By applying it to a well-studied ChIP-Seq data set involved in self-renewal of mouse embryonic stem cells, we demonstrate how our tool can recover combinatorial regulation of five known TFs that are key in the self-renewal of mouse embryonic stem cells. Additionally, we make a number of new predictions on combinatorial regulation of these five key TFs with other TFs documented in TRANSFAC. PMID:22422841

  2. Comparative epigenomics in distantly related teleost species identifies conserved cis-regulatory nodes active during the vertebrate phylotypic period.

    PubMed

    Tena, Juan J; González-Aguilera, Cristina; Fernández-Miñán, Ana; Vázquez-Marín, Javier; Parra-Acero, Helena; Cross, Joe W; Rigby, Peter W J; Carvajal, Jaime J; Wittbrodt, Joachim; Gómez-Skarmeta, José L; Martínez-Morales, Juan R

    2014-07-01

    The complex relationship between ontogeny and phylogeny has been the subject of attention and controversy since von Baer's formulations in the 19th century. The classic concept that embryogenesis progresses from clade general features to species-specific characters has often been revisited. It has become accepted that embryos from a clade show maximum morphological similarity at the so-called phylotypic period (i.e., during mid-embryogenesis). According to the hourglass model, body plan conservation would depend on constrained molecular mechanisms operating at this period. More recently, comparative transcriptomic analyses have provided conclusive evidence that such molecular constraints exist. Examining cis-regulatory architecture during the phylotypic period is essential to understand the evolutionary source of body plan stability. Here we compare transcriptomes and key epigenetic marks (H3K4me3 and H3K27ac) from medaka (Oryzias latipes) and zebrafish (Danio rerio), two distantly related teleosts separated by an evolutionary distance of 115-200 Myr. We show that comparison of transcriptome profiles correlates with anatomical similarities and heterochronies observed at the phylotypic stage. Through comparative epigenomics, we uncover a pool of conserved regulatory regions (≈700), which are active during the vertebrate phylotypic period in both species. Moreover, we show that their neighboring genes encode mainly transcription factors with fundamental roles in tissue specification. We postulate that these regulatory regions, active in both teleost genomes, represent key constrained nodes of the gene networks that sustain the vertebrate body plan.

  3. Ancient polymorphism and functional variation in the primate MHC-DQA1 5' cis-regulatory region.

    PubMed

    Loisel, Dagan A; Rockman, Matthew V; Wray, Gregory A; Altmann, Jeanne; Alberts, Susan C

    2006-10-31

    Precise regulation of MHC gene expression is critical to vertebrate immune surveillance and response. Polymorphisms in the 5' proximal promoter region of the human class II gene HLA-DQA1 have been shown to influence its transcriptional regulation and may contribute to the pathogenesis of autoimmune diseases. We investigated the evolutionary history of this cis-regulatory region by sequencing the DQA1 5' proximal promoter region in eight nonhuman primate species. We observed unexpectedly high levels of sequence variation and multiple strong signatures of balancing selection in this region. Specifically, the considerable DQA1 promoter region diversity was characterized by abundant shared (or trans-species) polymorphism and a pronounced lack of fixed differences between species. The majority of transcription factor binding sites in the DQA1 promoter region were polymorphic within species, and these binding site polymorphisms were commonly shared among multiple species despite evidence for negative selection eliminating a significant fraction of binding site mutations. We assessed the functional consequences of intraspecific promoter region diversity using a cell line-based reporter assay and detected significant differences among baboon DQA1 promoter haplotypes in their ability to drive transcription in vitro. The functional differentiation of baboon promoter haplotypes, together with the significant deviations from neutral sequence evolution, suggests a role for balancing selection in the evolution of DQA1 transcriptional regulation in primates.

  4. Functional roles of Aves class-specific cis-regulatory elements on macroevolution of bird-specific features

    PubMed Central

    Seki, Ryohei; Li, Cai; Fang, Qi; Hayashi, Shinichi; Egawa, Shiro; Hu, Jiang; Xu, Luohao; Pan, Hailin; Kondo, Mao; Sato, Tomohiko; Matsubara, Haruka; Kamiyama, Namiko; Kitajima, Keiichi; Saito, Daisuke; Liu, Yang; Gilbert, M. Thomas P.; Zhou, Qi; Xu, Xing; Shiroishi, Toshihiko; Irie, Naoki; Tamura, Koji; Zhang, Guojie

    2017-01-01

    Unlike microevolutionary processes, little is known about the genetic basis of macroevolutionary processes. One of these magnificent examples is the transition from non-avian dinosaurs to birds that has created numerous evolutionary innovations such as self-powered flight and its associated wings with flight feathers. By analysing 48 bird genomes, we identified millions of avian-specific highly conserved elements (ASHCEs) that predominantly (>99%) reside in non-coding regions. Many ASHCEs show differential histone modifications that may participate in regulation of limb development. Comparative embryonic gene expression analyses across tetrapod species suggest ASHCE-associated genes have unique roles in developing avian limbs. In particular, we demonstrate how the ASHCE-driven avian-specific expression of the gene Sim1 may be associated with the evolution and development of flight feathers. Together, these findings demonstrate regulatory roles of ASHCEs in the creation of avian-specific traits, and further highlight the importance of cis-regulatory rewiring during macroevolutionary changes. PMID:28165450

  5. Retinal Expression of the Drosophila eyes absent Gene Is Controlled by Several Cooperatively Acting Cis-regulatory Elements.

    PubMed

    Weasner, Bonnie M; Weasner, Brandon P; Neuman, Sarah D; Bashirullah, Arash; Kumar, Justin P

    2016-12-01

    The eyes absent (eya) gene of the fruit fly, Drosophila melanogaster, is a member of an evolutionarily conserved gene regulatory network that controls eye formation in all seeing animals. The loss of eya leads to the complete elimination of the compound eye while forced expression of eya in non-retinal tissues is sufficient to induce ectopic eye formation. Within the developing retina eya is expressed in a dynamic pattern and is involved in tissue specification/determination, cell proliferation, apoptosis, and cell fate choice. In this report we explore the mechanisms by which eya expression is spatially and temporally governed in the developing eye. We demonstrate that multiple cis-regulatory elements function cooperatively to control eya transcription and that spacing between a pair of enhancer elements is important for maintaining correct gene expression. Lastly, we show that the loss of eya expression in sine oculis (so) mutants is the result of massive cell death and a progressive homeotic transformation of retinal progenitor cells into head epidermis.

  6. Functional roles of Aves class-specific cis-regulatory elements on macroevolution of bird-specific features.

    PubMed

    Seki, Ryohei; Li, Cai; Fang, Qi; Hayashi, Shinichi; Egawa, Shiro; Hu, Jiang; Xu, Luohao; Pan, Hailin; Kondo, Mao; Sato, Tomohiko; Matsubara, Haruka; Kamiyama, Namiko; Kitajima, Keiichi; Saito, Daisuke; Liu, Yang; Gilbert, M Thomas P; Zhou, Qi; Xu, Xing; Shiroishi, Toshihiko; Irie, Naoki; Tamura, Koji; Zhang, Guojie

    2017-02-06

    Unlike microevolutionary processes, little is known about the genetic basis of macroevolutionary processes. One of these magnificent examples is the transition from non-avian dinosaurs to birds that has created numerous evolutionary innovations such as self-powered flight and its associated wings with flight feathers. By analysing 48 bird genomes, we identified millions of avian-specific highly conserved elements (ASHCEs) that predominantly (>99%) reside in non-coding regions. Many ASHCEs show differential histone modifications that may participate in regulation of limb development. Comparative embryonic gene expression analyses across tetrapod species suggest ASHCE-associated genes have unique roles in developing avian limbs. In particular, we demonstrate how the ASHCE-driven avian-specific expression of the gene Sim1 may be associated with the evolution and development of flight feathers. Together, these findings demonstrate regulatory roles of ASHCEs in the creation of avian-specific traits, and further highlight the importance of cis-regulatory rewiring during macroevolutionary changes.

  7. Comparative epigenomics in distantly related teleost species identifies conserved cis-regulatory nodes active during the vertebrate phylotypic period

    PubMed Central

    Tena, Juan J.; González-Aguilera, Cristina; Fernández-Miñán, Ana; Vázquez-Marín, Javier; Parra-Acero, Helena; Cross, Joe W.; Rigby, Peter W.J.; Carvajal, Jaime J.; Wittbrodt, Joachim; Gómez-Skarmeta, José L.; Martínez-Morales, Juan R.

    2014-01-01

    The complex relationship between ontogeny and phylogeny has been the subject of attention and controversy since von Baer’s formulations in the 19th century. The classic concept that embryogenesis progresses from clade general features to species-specific characters has often been revisited. It has become accepted that embryos from a clade show maximum morphological similarity at the so-called phylotypic period (i.e., during mid-embryogenesis). According to the hourglass model, body plan conservation would depend on constrained molecular mechanisms operating at this period. More recently, comparative transcriptomic analyses have provided conclusive evidence that such molecular constraints exist. Examining cis-regulatory architecture during the phylotypic period is essential to understand the evolutionary source of body plan stability. Here we compare transcriptomes and key epigenetic marks (H3K4me3 and H3K27ac) from medaka (Oryzias latipes) and zebrafish (Danio rerio), two distantly related teleosts separated by an evolutionary distance of 115–200 Myr. We show that comparison of transcriptome profiles correlates with anatomical similarities and heterochronies observed at the phylotypic stage. Through comparative epigenomics, we uncover a pool of conserved regulatory regions (≈700), which are active during the vertebrate phylotypic period in both species. Moreover, we show that their neighboring genes encode mainly transcription factors with fundamental roles in tissue specification. We postulate that these regulatory regions, active in both teleost genomes, represent key constrained nodes of the gene networks that sustain the vertebrate body plan. PMID:24709821

  8. A comparative analysis of the evolution, expression, and cis-regulatory element of polygalacturonase genes in grasses and dicots.

    PubMed

    Liang, Ying; Yu, Youjian; Cui, Jinlong; Lyu, Meiling; Xu, Liai; Cao, Jiashu

    2016-11-01

    Cell walls are a distinguishing characteristic of plants essential to their survival. The pectin content of primary cell walls in grasses and dicots is distinctly different. Polygalacturonases (PGs) can degrade pectins and participate in multiple developmental processes of plants. This study comprehensively compared the evolution, expression, and cis-regulatory elements of PGs in grasses and dicots. A total of 577 PGs identified from five grasses and five dicots fell into seven clades. Evolutionary analysis demonstrated distinct differences between grasses and dicots in patterns of gene duplication and loss, and in evolutionary rates. Grasses generally contained far fewer clade C and F members than dicots. We found that this disparity was the result of less duplication and more gene losses in grasses. More duplications occurred in clades D and E, and expression analysis showed that most clade E members were expressed ubiquitously at a high overall level and clade D members were closely related to male reproduction in both grasses and dicots, suggesting their biological functions were highly conserved across species. In addition to the general role in reproductive development, PGs of clades C and F specifically played roles in root development in dicots, shedding light on organ differentiation between the two groups of plants. A regulatory element analysis of clade C and F members implied that possible functions of PGs in specific biological responses contributed to their expansion and preservation. This work can improve the knowledge of PGs in plants generally and in grasses specifically and is beneficial to functional studies.

  9. Retinal Expression of the Drosophila eyes absent Gene Is Controlled by Several Cooperatively Acting Cis-regulatory Elements

    PubMed Central

    Weasner, Bonnie M.; Weasner, Brandon P.; Neuman, Sarah D.; Bashirullah, Arash; Kumar, Justin P.

    2016-01-01

    The eyes absent (eya) gene of the fruit fly, Drosophila melanogaster, is a member of an evolutionarily conserved gene regulatory network that controls eye formation in all seeing animals. The loss of eya leads to the complete elimination of the compound eye while forced expression of eya in non-retinal tissues is sufficient to induce ectopic eye formation. Within the developing retina eya is expressed in a dynamic pattern and is involved in tissue specification/determination, cell proliferation, apoptosis, and cell fate choice. In this report we explore the mechanisms by which eya expression is spatially and temporally governed in the developing eye. We demonstrate that multiple cis-regulatory elements function cooperatively to control eya transcription and that spacing between a pair of enhancer elements is important for maintaining correct gene expression. Lastly, we show that the loss of eya expression in sine oculis (so) mutants is the result of massive cell death and a progressive homeotic transformation of retinal progenitor cells into head epidermis. PMID:27930646

  10. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

    PubMed Central

    Karnik, Rahul; Beer, Michael A.

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
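
    The abstract above contrasts full position weight matrix (PWM) models with k-mer words and describes using a threshold to classify sequences. The following is a minimal sketch of generic log-odds PWM scoring with a fixed threshold. It is not the MotifSpec algorithm: the count matrix, pseudocount, uniform background and threshold are all hypothetical values chosen for illustration, and MotifSpec learns its threshold and search space from the data.

```python
import math

# Hypothetical 4-column count matrix for a motif with consensus TGAC.
COUNTS = [
    {"A": 1, "C": 1, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 9, "T": 1},
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 0, "T": 1},
]


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_window(pwm, window):
    """Sum the log-odds score of each base in a window the same length as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window on the forward strand."""
    width = len(pwm)
    return max((score_window(pwm, seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))


def classify(pwm, seq, threshold):
    """Call a sequence bound if its best window scores at or above the threshold."""
    return best_hit(pwm, seq)[0] >= threshold


if __name__ == "__main__":
    pwm = build_pwm(COUNTS)
    print(best_hit(pwm, "AAATGACAA"))
    print(classify(pwm, "CCCCCCCC", threshold=0.0))
```

    In a discriminative setting the threshold would be chosen to best separate bound from unbound sequences, rather than fixed in advance as it is here.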

  11. Functional characterisation of cis-regulatory elements governing dynamic Eomes expression in the early mouse embryo.

    PubMed

    Simon, Claire S; Downes, Damien J; Gosden, Matthew E; Telenius, Jelena; Higgs, Douglas R; Hughes, Jim R; Costello, Ita; Bikoff, Elizabeth K; Robertson, Elizabeth J

    2017-02-07

    The T-box transcription factor (TF) Eomes is a key regulator of cell fate decisions during early mouse development. The cis-acting regulatory elements that direct expression in the anterior visceral endoderm (AVE), primitive streak (PS) and definitive endoderm (DE) have yet to be defined. Here, we identified three gene-proximal enhancer-like sequences (PSE_a, PSE_b and VPE) that faithfully activate tissue-specific expression in transgenic embryos. However, targeted deletion experiments demonstrate that PSE_a and PSE_b are dispensable and only the VPE is required for optimal Eomes expression in vivo. Embryos lacking this enhancer display variably penetrant defects in anterior-posterior axis orientation and DE formation. Chromosome conformation capture experiments reveal VPE-promoter interactions in embryonic stem cells (ESC), prior to gene activation. The locus resides in a large (500 kb) pre-formed compartment in ESC and activation during DE differentiation occurs in the absence of 3D structural changes. ATAC-seq analysis reveals that VPE, PSE_a, and four additional putative enhancers display increased chromatin accessibility in DE associated with Smad2/3 binding coincident with transcriptional activation. In contrast, activation of the Eomes target genes Foxa2 and Lhx1 is associated with higher order chromatin reorganisation. Thus diverse regulatory mechanisms govern activation of lineage specifying TFs during early development.

  12. Extensive cis-Regulatory Variation Robust to Environmental Perturbation in Arabidopsis

    PubMed Central

    Cubillos, Francisco A.; Stegle, Oliver; Grondin, Cécile; Canut, Matthieu; Tisné, Sébastien; Gy, Isabelle; Loudet, Olivier

    2014-01-01

    cis- and trans-acting factors affect gene expression and responses to environmental conditions. However, for most plant systems, we lack a comprehensive map of these factors and their interaction with environmental variation. Here, we examined allele-specific expression (ASE) in an F1 hybrid to study how alleles from two Arabidopsis thaliana accessions affect gene expression. To investigate the effect of the environment, we used drought stress and developed a variance component model to estimate the combined genetic contributions of cis- and trans-regulatory polymorphisms, environmental factors, and their interactions. We quantified ASE for 11,003 genes, identifying 3318 genes with consistent ASE in control and stress conditions, demonstrating that cis-acting genetic effects are essentially robust to changes in the environment. Moreover, we found 1618 genes with genotype x environment (GxE) interactions, mostly cis x E interactions with magnitude changes in ASE. We found fewer trans x E interactions, but these effects were relatively less robust across conditions, showing more changes in the direction of the effect between environments; this confirms that trans-regulation plays an important role in the response to environmental conditions. Our data provide a detailed map of cis- and trans-regulation and GxE interactions in A. thaliana, laying the ground for mechanistic investigations and studies in other plants and environments. PMID:25428981
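
    Allele-specific expression in an F1 hybrid is commonly screened gene by gene by asking whether reads from the two parental alleles depart from a 1:1 ratio. The sketch below uses an exact two-sided binomial test for that purpose. It is a simpler stand-in, not the variance component model developed in the study, and the read counts in the usage example are hypothetical.

```python
import math


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed count k out of n. Probabilities are computed in log space with
    lgamma so that large read counts do not underflow.
    """
    def prob(i):
        log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        return math.exp(log_choose + i * math.log(p) + (n - i) * math.log(1 - p))

    observed = prob(k)
    # Small relative tolerance so that symmetric outcomes count as ties.
    total = sum(q for q in map(prob, range(n + 1)) if q <= observed * (1 + 1e-7))
    return min(1.0, total)


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of allele-informative reads that come from the first parental allele."""
    return ref_reads / (ref_reads + alt_reads)


if __name__ == "__main__":
    # Hypothetical read counts for one gene: 70 reads from one allele, 30 from the other.
    print(allelic_ratio(70, 30), binomial_two_sided_p(70, 100))
```

    A per-gene test like this ignores overdispersion and mapping bias, and applying it across thousands of genes would also require a multiple-testing correction.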

  13. Extensive cis-regulatory variation robust to environmental perturbation in Arabidopsis.

    PubMed

    Cubillos, Francisco A; Stegle, Oliver; Grondin, Cécile; Canut, Matthieu; Tisné, Sébastien; Gy, Isabelle; Loudet, Olivier

    2014-11-01

    cis- and trans-acting factors affect gene expression and responses to environmental conditions. However, for most plant systems, we lack a comprehensive map of these factors and their interaction with environmental variation. Here, we examined allele-specific expression (ASE) in an F1 hybrid to study how alleles from two Arabidopsis thaliana accessions affect gene expression. To investigate the effect of the environment, we used drought stress and developed a variance component model to estimate the combined genetic contributions of cis- and trans-regulatory polymorphisms, environmental factors, and their interactions. We quantified ASE for 11,003 genes, identifying 3318 genes with consistent ASE in control and stress conditions, demonstrating that cis-acting genetic effects are essentially robust to changes in the environment. Moreover, we found 1618 genes with genotype x environment (GxE) interactions, mostly cis x E interactions with magnitude changes in ASE. We found fewer trans x E interactions, but these effects were relatively less robust across conditions, showing more changes in the direction of the effect between environments; this confirms that trans-regulation plays an important role in the response to environmental conditions. Our data provide a detailed map of cis- and trans-regulation and GxE interactions in A. thaliana, laying the ground for mechanistic investigations and studies in other plants and environments.

  14. Computation-based discovery of related transcriptional regulatory modules and motifs using an experimentally validated combinatorial model.

    PubMed

    Halfon, Marc S; Grad, Yonatan; Church, George M; Michelson, Alan M

    2002-07-01

    Gene expression is regulated by transcription factors that interact with cis-regulatory elements. Predicting these elements from sequence data has proven difficult. We describe here a successful computational search for elements that direct expression in a particular temporal-spatial pattern in the Drosophila embryo, based on a single well characterized enhancer model. The fly genome was searched to identify sequence elements containing the same combination of transcription factor binding sites as those found in the model. Experimental evaluation of the search results demonstrates that our method can correctly predict regulatory elements and highlights the importance of functional testing as a means of identifying false-positive results. We also show that the search results enable the identification of additional relevant sequence motifs whose functions can be empirically validated. This approach, combined with gene expression and phylogenetic sequence data, allows for genome-wide identification of related regulatory elements, an important step toward understanding the genetic regulatory networks involved in development.
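
    The kind of search described here, looking for stretches of sequence that carry every binding site of a model enhancer, can be sketched as a sliding-window scan. This is a minimal illustration and not the authors' method: the motif strings, the window size and the function name are hypothetical, and real binding sites are degenerate rather than exact strings.

```python
import re

# Hypothetical exact-match consensus strings; the model's actual binding
# sites are not given in the abstract.
MOTIFS = {"A": "GGAT", "B": "TTCA", "C": "CACG"}


def find_module_windows(sequence, motifs, window=20):
    """Return (start, end) of every window holding at least one hit of each motif."""
    # Start positions of every motif hit; the lookahead allows overlapping hits.
    hits = {
        name: [m.start() for m in re.finditer(f"(?={re.escape(pat)})", sequence)]
        for name, pat in motifs.items()
    }
    windows = []
    for start in range(0, max(1, len(sequence) - window + 1)):
        end = start + window
        # A motif counts only if a hit lies completely inside the window.
        if all(
            any(start <= pos and pos + len(motifs[name]) <= end for pos in positions)
            for name, positions in hits.items()
        ):
            windows.append((start, end))
    return windows
```

    Overlapping windows that pass the test would normally be merged into one candidate region before any experimental follow-up, which the abstract stresses is needed to weed out false positives.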

  15. Cis-regulatory underpinnings of human GLI3 expression in embryonic craniofacial structures and internal organs.

    PubMed

    Abbasi, Amir A; Minhas, Rashid; Schmidt, Ansgar; Koch, Sabine; Grzeschik, Karl-Heinz

    2013-10-01

    The zinc finger transcription factor Gli3 is an important mediator of Sonic hedgehog (Shh) signaling. During early embryonic development Gli3 participates in patterning and growth of the central nervous system, face, skeleton, limb, tooth and gut. Precise regulation of the temporal and spatial expression of Gli3 is crucial for the proper specification of these structures in mammals and other vertebrates. Previously we reported a set of human intronic cis-regulators controlling almost the entire known repertoire of endogenous Gli3 expression in mouse neural tube and limbs. However, the genetic underpinning of GLI3 expression in other embryonic domains such as craniofacial structures and internal organs remains elusive. Here we demonstrate in a transgenic mouse assay the potential of a subset of human/fish conserved non-coding sequences (CNEs) residing within GLI3 intronic intervals to induce reporter gene expression at known regions of endogenous Gli3 transcription in embryonic domains other than the central nervous system (CNS) and limbs. Highly specific reporter expression was observed in craniofacial structures, eye, gut, and genitourinary system. Moreover, the comparison of expression patterns directed by these intronic cis-acting regulatory elements in mouse and zebrafish embryos suggests that, in accordance with sequence conservation, the target site specificity of a subset of these elements remains preserved between these two lineages. Taken together with our recent investigations, it is proposed here that during vertebrate evolution the Gli3 expression control acquired multiple, independently acting, intronic enhancers for spatiotemporal patterning of CNS, limbs, craniofacial structures and internal organs.

  16. An arthropod cis-regulatory element functioning in sensory organ precursor development dates back to the Cambrian

    PubMed Central

    2010-01-01

    Background An increasing number of publications demonstrate conservation of function of cis-regulatory elements without sequence similarity. In invertebrates such functional conservation has only been shown for closely related species. Here we demonstrate the existence of an ancient arthropod regulatory element that functions during the selection of neural precursors. The activity of genes of the achaete-scute (ac-sc) family endows cells with neural potential. An essential, conserved characteristic of proneural genes is their ability to restrict their own activity to single or a small number of progenitor cells from their initially broad domains of expression. This is achieved through a process called lateral inhibition. A regulatory element, the sensory organ precursor enhancer (SOPE), is required for this process. First identified in Drosophila, the SOPE contains discrete binding sites for four regulatory factors. The SOPE of the Drosophila asense gene is situated in the 5' UTR. Results Through a manual comparison of consensus binding site sequences we have been able to identify a SOPE in UTR sequences of asense-like genes in species belonging to all four arthropod groups (Crustacea, Myriapoda, Chelicerata and Insecta). The SOPEs of the spider Cupiennius salei and the insect Tribolium castaneum are shown to be functional in transgenic Drosophila. This would place the origin of this regulatory sequence as far back as the last common ancestor of the Arthropoda, that is, in the Cambrian, 550 million years ago. Conclusions The SOPE is not detectable by inter-specific sequence comparison, raising the possibility that other ancient regulatory modules in invertebrates might have escaped detection. PMID:20868489

  17. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  18. High constitutive activity of a broad panel of housekeeping and tissue-specific cis-regulatory elements depends on a subset of ETS proteins.

    PubMed

    Curina, Alessia; Termanini, Alberto; Barozzi, Iros; Prosperini, Elena; Simonatto, Marta; Polletti, Sara; Silvola, Alessio; Soldi, Monica; Austenaa, Liv; Bonaldi, Tiziana; Ghisletti, Serena; Natoli, Gioacchino

    2017-02-15

    Enhancers and promoters that control the transcriptional output of terminally differentiated cells include cell type-specific and broadly active housekeeping elements. Whether the high constitutive activity of these two groups of cis-regulatory elements relies on entirely distinct or instead also on shared regulators is unknown. By dissecting the cis-regulatory repertoire of macrophages, we found that the ELF subfamily of ETS proteins selectively bound within 60 base pairs (bp) of the transcription start sites of highly active housekeeping genes. ELFs also bound constitutively active, but not poised, macrophage-specific enhancers and promoters. The role of ELFs in promoting high-level constitutive transcription was suggested by multiple lines of evidence: ELF sites enabled robust transcriptional activation by endogenous and minimal synthetic promoters, ELF recruitment was stabilized by the transcriptional machinery, and ELF proteins mediated recruitment of transcriptional and chromatin regulators to core promoters. These data suggest that the co-optation of a limited number of highly active transcription factors represents a broadly adopted strategy to equip both cell type-specific and housekeeping cis-regulatory elements with the ability to efficiently promote transcription.

  19. Multiple cis-regulatory elements are involved in the complex regulation of the sieve element-specific MtSEO-F1 promoter from Medicago truncatula.

    PubMed

    Bucsenez, M; Rüping, B; Behrens, S; Twyman, R M; Noll, G A; Prüfer, D

    2012-09-01

    The sieve element occlusion (SEO) gene family includes several members that are expressed specifically in immature sieve elements (SEs) in the developing phloem of dicotyledonous plants. To determine how this restricted expression profile is achieved, we analysed the SE-specific Medicago truncatula SEO-F1 promoter (PMtSEO-F1) by constructing deletion, substitution and hybrid constructs and testing them in transgenic tobacco plants using green fluorescent protein as a reporter. This revealed four promoter regions, each containing cis-regulatory elements that activate transcription in SEs. One of these segments also contained sufficient information to suppress PMtSEO-F1 transcription in the phloem companion cells (CCs). Subsequent in silico analysis revealed several candidate cis-regulatory elements that PMtSEO-F1 shares with other SEO promoters. These putative sieve element boxes (PSE boxes) are promising candidates for cis-regulatory elements controlling the SE-specific expression of PMtSEO-F1. © 2012 German Botanical Society and The Royal Botanical Society of the Netherlands.

  20. Kinetics of transcription initiation directed by multiple cis-regulatory elements on the glnAp2 promoter.

    PubMed

    Wang, Yaolai; Liu, Feng; Wang, Wei

    2016-12-15

    Transcription initiation is orchestrated by dynamic molecular interactions, with kinetic steps difficult to detect. Utilizing a hybrid method, we aim to unravel essential kinetic steps of transcriptional regulation on the glnAp2 promoter, whose regulatory region includes two enhancers (sites I and II) and three low-affinity sequences (sites III-V), to which the transcriptional activator NtrC binds. By structure reconstruction, we analyze all possible organization architectures of the transcription apparatus (TA). The main regulatory mode involves two NtrC hexamers: one at enhancer II transiently associates with site V such that the other at enhancer I can rapidly approach and catalyze the σ54-RNA polymerase holoenzyme. We build a kinetic model characterizing essential steps of the TA operation; with the known kinetics of the holoenzyme interacting with DNA, this model enables the kinetics beyond technical detection to be determined by fitting the input-output function of the wild-type promoter. The model further quantitatively reproduces transcriptional activities of various mutated promoters. These results reveal different roles played by two enhancers and interpret why the low-affinity elements conditionally enhance or repress transcription. This work presents an integrated dynamic picture of regulated transcription initiation and suggests an evolutionarily conserved characteristic guaranteeing reliable transcriptional response to regulatory signals. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Kinetics of transcription initiation directed by multiple cis-regulatory elements on the glnAp2 promoter

    PubMed Central

    Wang, Yaolai; Liu, Feng; Wang, Wei

    2016-01-01

    Transcription initiation is orchestrated by dynamic molecular interactions, with kinetic steps difficult to detect. Utilizing a hybrid method, we aim to unravel essential kinetic steps of transcriptional regulation on the glnAp2 promoter, whose regulatory region includes two enhancers (sites I and II) and three low-affinity sequences (sites III-V), to which the transcriptional activator NtrC binds. By structure reconstruction, we analyze all possible organization architectures of the transcription apparatus (TA). The main regulatory mode involves two NtrC hexamers: one at enhancer II transiently associates with site V such that the other at enhancer I can rapidly approach and catalyze the σ54-RNA polymerase holoenzyme. We build a kinetic model characterizing essential steps of the TA operation; with the known kinetics of the holoenzyme interacting with DNA, this model enables the kinetics beyond technical detection to be determined by fitting the input-output function of the wild-type promoter. The model further quantitatively reproduces transcriptional activities of various mutated promoters. These results reveal different roles played by two enhancers and interpret why the low-affinity elements conditionally enhance or repress transcription. This work presents an integrated dynamic picture of regulated transcription initiation and suggests an evolutionarily conserved characteristic guaranteeing reliable transcriptional response to regulatory signals. PMID:27899598

  2. Identification of Important Nodes in Directed Biological Networks: A Network Motif Approach

    PubMed Central

    Wang, Pei; Lü, Jinhu; Yu, Xinghuo

    2014-01-01

    Identification of important nodes in complex networks has attracted increasing attention over the last decade. Various measures have been proposed to characterize the importance of nodes in complex networks, such as the degree, betweenness and PageRank. Different measures consider different aspects of complex networks. Although there are numerous results reported on undirected complex networks, few results have been reported on directed biological networks. Based on network motifs and principal component analysis (PCA), this paper aims at introducing a new measure to characterize node importance in directed biological networks. Investigations on five real-world biological networks indicate that the proposed method can robustly identify actually important nodes in different networks, such as finding command interneurons, global regulators and non-hub but evolutionarily conserved important nodes in biological networks. Receiver Operating Characteristic (ROC) curves for the five networks indicate remarkable prediction accuracy of the proposed measure. The proposed index provides an alternative complex network metric. Potential implications of the related investigations include identifying network control and regulation targets, biological networks modeling and analysis, as well as networked medicine. PMID:25170616
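
    One ingredient of a motif-based importance measure is how often each node takes part in a given directed motif. The sketch below counts participation in a single motif type, the feed-forward loop, on a directed edge list. It is only a partial illustration: the paper's measure combines motif information with PCA, a step not shown here, and the function name and the choice of the feed-forward loop are assumptions.

```python
from itertools import permutations


def ffl_participation(edges):
    """Count, per node, the feed-forward loops (a->b, b->c, a->c) it takes part in."""
    edge_set = set(edges)
    nodes = {node for edge in edges for node in edge}
    counts = {node: 0 for node in nodes}
    # Brute force over ordered triples; acceptable for small example networks only.
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            for node in (a, b, c):
                counts[node] += 1
    return counts
```

    Repeating the count for several motif types gives each node a vector of participation counts, which is the sort of multi-column table a PCA step could then reduce to a single score.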

  3. Identification of important nodes in directed biological networks: a network motif approach.

    PubMed

    Wang, Pei; Lü, Jinhu; Yu, Xinghuo

    2014-01-01

    Identification of important nodes in complex networks has attracted increasing attention over the last decade. Various measures have been proposed to characterize the importance of nodes in complex networks, such as the degree, betweenness and PageRank. Different measures consider different aspects of complex networks. Although there are numerous results reported on undirected complex networks, few results have been reported on directed biological networks. Based on network motifs and principal component analysis (PCA), this paper aims at introducing a new measure to characterize node importance in directed biological networks. Investigations on five real-world biological networks indicate that the proposed method can robustly identify actually important nodes in different networks, such as finding command interneurons, global regulators and non-hub but evolutionarily conserved important nodes in biological networks. Receiver Operating Characteristic (ROC) curves for the five networks indicate remarkable prediction accuracy of the proposed measure. The proposed index provides an alternative complex network metric. Potential implications of the related investigations include identifying network control and regulation targets, biological networks modeling and analysis, as well as networked medicine.

  4. i-cisTarget 2015 update: generalized cis-regulatory enrichment analysis in human, mouse and fly.

    PubMed

    Imrichová, Hana; Hulselmans, Gert; Atak, Zeynep Kalender; Potier, Delphine; Aerts, Stein

    2015-07-01

    i-cisTarget is a web tool to predict regulators of a set of genomic regions, such as ChIP-seq peaks or co-regulated/similar enhancers. i-cisTarget can also be used to identify upstream regulators and their target enhancers starting from a set of co-expressed genes. Whereas the original version of i-cisTarget was focused on Drosophila data, the 2015 update also provides support for human and mouse data. i-cisTarget detects transcription factor motifs (position weight matrices) and experimental data tracks (e.g. from ENCODE, Roadmap Epigenomics) that are enriched in the input set of regions. As experimental data tracks we include transcription factor ChIP-seq data, histone modification ChIP-seq data and open chromatin data. The underlying processing method is based on a ranking-and-recovery procedure, allowing accurate determination of enrichment across heterogeneous datasets, while also discriminating direct from indirect target regions through a 'leading edge' analysis. We illustrate i-cisTarget on various Ewing sarcoma datasets to identify EWS-FLI1 targets starting from ChIP-seq, differential ATAC-seq, differential H3K27ac and differential gene expression data. Use of i-cisTarget is free and open to all, and there is no login requirement. Address: http://gbiomed.kuleuven.be/apps/lcb/i-cisTarget. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
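
    The idea behind a ranking-and-recovery procedure can be shown in a few lines: a feature (for example a motif) ranks all candidate regions, and enrichment is judged by how early the input regions are recovered along that ranking. The sketch below is a generic illustration under that reading, not i-cisTarget's implementation; the function names, the normalisation and the `top_fraction` parameter are assumptions.

```python
def recovery_curve(ranking, foreground):
    """Cumulative number of foreground regions recovered at each rank position."""
    foreground = set(foreground)
    curve, found = [], 0
    for region in ranking:
        if region in foreground:
            found += 1
        curve.append(found)
    return curve


def recovery_auc(ranking, foreground, top_fraction=1.0):
    """Area under the recovery curve over the top of the ranking, scaled so a perfect ranking scores 1."""
    cutoff = max(1, int(len(ranking) * top_fraction))
    curve = recovery_curve(ranking[:cutoff], foreground)
    k = min(len(set(foreground)), cutoff)
    # Area of the best possible curve: 1, 2, ..., k, then flat at k.
    best = k * (k + 1) // 2 + (cutoff - k) * k
    return sum(curve) / best if best else 0.0
```

    A feature whose ranking places the input regions near the top scores close to 1, while one that scatters them towards the bottom scores much lower, so features can be compared on a common scale.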

  5. HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing.

    PubMed

    Larsson, Erik; Lindahl, Per; Mostad, Petter

    2007-10-28

    Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These may have requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Due to the turning of the DNA helix, a small modification of the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10-11 bp) between cis elements may cause functionality to be restored. Recently, de novo motif discovery methods which incorporate organizational properties such as colocalization and order preferences have been developed, but there are no tools which incorporate periodic spacing into the model. We have developed a web-based motif discovery tool, HeliCis, which features a flexible model that allows de novo detection of motifs with periodic spacing. Depending on the parameter settings it may also be used for discovering colocalized motifs without periodicity or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides a simple-to-use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure. HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns which are not easily discovered using a sequential approach, i.e. first finding the binding sites and second analyzing the properties of their pairwise distances.
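
    Helical phasing can be made concrete with a simple score: two sites separated by a whole number of helical turns lie on the same face of the DNA, and sites half a turn apart lie on opposite faces. The sketch below scores site spacing with a cosine of the distance. It is not the HeliCis model; the 10.5 bp turn length, the `max_gap` cutoff and the function names are assumptions chosen for illustration.

```python
import math

HELICAL_TURN_BP = 10.5  # assumed average turn length; the abstract gives 10-11 bp


def phasing_score(distance, turn=HELICAL_TURN_BP):
    """Close to +1 for sites on the same helical face, close to -1 for opposite faces."""
    return math.cos(2 * math.pi * distance / turn)


def mean_pair_phasing(sites_a, sites_b, max_gap=60):
    """Average phasing score over all A/B site pairs no further apart than max_gap."""
    scores = [
        phasing_score(abs(b - a))
        for a in sites_a
        for b in sites_b
        if 0 < abs(b - a) <= max_gap
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

    A sequential analysis would find the sites first and apply a score like this afterwards; the abstract's point is that building the periodicity into the discovery model helps detect motifs too weak to be found on their own.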

  6. RNA recognition motif 2 directs the recruitment of SF2/ASF to nuclear stress bodies

    PubMed Central

    Chiodi, Ilaria; Corioni, Margherita; Giordano, Manuela; Valgardsdottir, Rut; Ghigna, Claudia; Cobianchi, Fabio; Xu, Rui-Ming; Riva, Silvano; Biamonti, Giuseppe

    2004-01-01

    Heat shock induces the transcriptional activation of large heterochromatic regions of the human genome composed of arrays of satellite III DNA repeats. A number of RNA-processing factors, among them splicing factor SF2/ASF, associate with these transcription factories giving rise to nuclear stress bodies (nSBs). Here, we show that the recruitment of SF2/ASF to these structures is mediated by its second RNA recognition motif. Amino acid substitutions in the first α-helix of this domain, but not in the β-strand regions, abrogate the association with nSBs. The same mutations drastically affect the in vivo activity of SF2/ASF in the alternative splicing of adenoviral E1A transcripts. Sequence analysis identifies four putative high-affinity binding sites for SF2/ASF in the transcribed strand of the satellite III DNA. We have verified by gel mobility shift assays that the second RNA-binding domain of SF2/ASF binds at least one of these sites. Our analysis suggests that the recruitment of SF2/ASF to nSBs is mediated by a direct interaction with satellite III transcripts and points to the second RNA-binding domain of the protein as the major determinant of this interaction. PMID:15302913

  7. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    PubMed Central

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. 
    Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs.

  8. Direct Activation of RhoA by Reactive Oxygen Species Requires a Redox-Sensitive Motif

    PubMed Central

    Campbell, Sharon L.; Burridge, Keith

    2009-01-01

    Background Rho family GTPases are critical regulators of the cytoskeleton and affect cell migration, cell-cell adhesion, and cell-matrix adhesion. As with all GTPases, their activity is determined by their guanine nucleotide-bound state. Understanding how Rho proteins are activated and inactivated has largely focused on regulatory proteins such as guanine nucleotide exchange factors (GEFs) and GTPase activating proteins (GAPs). However, recent in vitro studies have indicated that GTPases may also be directly regulated by redox agents. We hypothesized that this redox-based mechanism occurs in cells and affects cytoskeletal dynamics, and in this report we conclude this is indeed a novel mechanism of regulating the GTPase RhoA. Methodology/Principal Findings In this report, we show that RhoA can be directly activated by reactive oxygen species (ROS) in cells, and that this requires two critical cysteine residues located in a unique redox-sensitive motif within the phosphoryl binding loop. First, we show that ROS can reversibly activate RhoA and induce stress fiber formation, a well characterized readout of RhoA activity. To determine the role of cysteine residues in this mechanism of regulation, we generated cysteine to alanine RhoA mutants. Mutation of these cysteines abolishes ROS-mediated activation and stress fiber formation, indicating that these residues are critical for redox-regulation of RhoA. Importantly, these mutants maintain the ability to be activated by GEFs. Conclusions/Significance Our findings identify a novel mechanism for the regulation of RhoA in cells by ROS, which is independent of classical regulatory proteins. This mechanism of regulation may be particularly relevant in pathological conditions where ROS are generated and the cellular redox-balance altered, such as in asthma and ischemia-reperfusion injury. PMID:19956681

  9. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus.

    PubMed

    Damle, Sagar; Davidson, Eric H

    2011-09-15

    Deployment of the gene-regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an auto-repressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring.

  10. The autoimmunity-associated BLK haplotype exhibits cis-regulatory effects on mRNA and protein expression that are prominently observed in B cells early in development

    PubMed Central

    Simpfendorfer, Kim R.; Olsson, Lina M.; Manjarrez Orduño, Nataly; Khalili, Houman; Simeone, Alyssa M.; Katz, Matthew S.; Lee, Annette T.; Diamond, Betty; Gregersen, Peter K.

    2012-01-01

    The gene B lymphocyte kinase (BLK) is associated with rheumatoid arthritis, systemic lupus erythematosus and several other autoimmune disorders. The disease risk haplotype is known to be associated with reduced expression of BLK mRNA transcript in human B cell lines; however, little is known about cis-regulation of BLK message or protein levels in native cell types. Here, we show that in primary human B lymphocytes, cis-regulatory effects of disease-associated single nucleotide polymorphisms in BLK are restricted to naïve and transitional B cells. Cis-regulatory effects are not observed in adult B cells in later stages of differentiation. Allelic expression bias was also identified in primary human T cells from adult peripheral and umbilical cord blood (UCB), thymus and tonsil, although mRNA levels were reduced compared with B cells. Allelic regulation of Blk expression at the protein level was confirmed in UCB B cell subsets by intracellular staining and flow cytometry. Blk protein expression in CD4+ and CD8+ T cells was documented by western blot analysis; however, differences in protein expression levels by BLK genotype were not observed in any T cell subset. Blk allele expression differences at the protein level are thus restricted to early B cells, indicating that the involvement of Blk in the risk for autoimmune disease likely acts during the very early stages of B cell development. PMID:22678060

  11. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php.
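
    The two steps the abstract describes, finding genes whose promoter carries a submitted sequence and then ranking conditions by how those genes respond, can be sketched as follows. This is an illustration only and not the PathoPlant code: the function names, the dictionary layouts and the use of a mean expression change as the ranking statistic are all assumptions.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def genes_with_element(promoters, element):
    """Names of genes whose promoter carries the element on either strand."""
    element = element.upper()
    rc = reverse_complement(element)
    return sorted(
        gene
        for gene, seq in promoters.items()
        if element in seq.upper() or rc in seq.upper()
    )


def rank_conditions(genes, expression):
    """Order conditions by the mean expression change of the selected genes, highest first."""
    ranking = []
    for condition, changes in expression.items():
        values = [changes[g] for g in genes if g in changes]
        if values:
            ranking.append((condition, sum(values) / len(values)))
    return sorted(ranking, key=lambda item: item[1], reverse=True)
```

    Here `promoters` maps gene names to promoter sequences and `expression` maps each condition to per-gene expression changes; calling `genes_with_element(promoters, "CGACTTTT")` would select genes carrying the WT-box named in the abstract.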

  12. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  13. Whole-mount in situ hybridization of sectioned tissues of species hybrids to detect cis-regulatory changes in gene expression pattern.

    PubMed

    Futahashi, Ryo

    2011-01-01

    To distinguish whether differences in gene expression between species or between individuals of the same species are caused by cis-regulatory changes or by distribution differences in trans-regulatory proteins, comparison of species-specific mRNA expression in an F1 hybrid by whole-mount in situ hybridization is a rarely used yet very powerful tool. If asymmetric expression pattern is observed for the two alleles, this implies a cis-regulatory divergence of this gene. Alternatively, if symmetric expression pattern is observed for both alleles, the change in expression of this gene is probably caused by changes in the distribution of trans-regulatory proteins. In this chapter, I describe how to prepare RNA probes, tissue samples and how to detect mRNA expression pattern using in situ hybridization. Although I choose to present here the detection of yellow-related gene (YRG) expression pattern in the larval epidermis of swallowtail butterflies, this protocol can be adapted to other species and tissues. YRG mRNA expression is correlated with interspecific differences of yellow and green larval color pattern such as V-shaped markings in swallowtail butterflies. F1 hybrids show an intermediate color pattern between parental species. In this case, both species-specific YRG mRNA showed a similar expression pattern in F1 hybrids, suggesting that the change in expression of YRG is mainly caused by changes in the distribution of trans-regulatory proteins.
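
    The inference rule described above can be written down as a small decision function. It is only a sketch of the logic: the region labels are hypothetical, and a real comparison would be made by eye on stained tissue rather than on sets of labels.

```python
def classify_divergence(allele_a_regions, allele_b_regions):
    """Compare where each species-specific allele is expressed in one F1 hybrid.

    Both alleles share the same trans environment in the hybrid, so a
    difference between them points to cis-regulatory divergence.
    """
    if set(allele_a_regions) != set(allele_b_regions):
        return "cis-regulatory divergence"
    return "trans-regulatory change"


# Hypothetical region labels for the two alleles in a hybrid larva.
symmetric = classify_divergence({"V_marking", "dorsal"}, {"dorsal", "V_marking"})
asymmetric = classify_divergence({"V_marking"}, {"V_marking", "dorsal"})
```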

  14. Motif analysis in directed ordered networks and applications to food webs

    PubMed Central

    Paulau, Pavel V.; Feenders, Christoph; Blasius, Bernd

    2015-01-01

    The analysis of small recurrent substructures, so called network motifs, has become a standard tool of complex network science to unveil the design principles underlying the structure of empirical networks. In many natural systems network nodes are associated with an intrinsic property according to which they can be ordered and compared against each other. Here, we expand standard motif analysis to be able to capture the hierarchical structure in such ordered networks. Our new approach is based on the identification of all ordered 3-node substructures and the visualization of their significance profile. We present a technique to calculate the fine grained motif spectrum by resolving the individual members of isomorphism classes (sets of substructures formed by permuting node-order). We apply this technique to computer generated ensembles of ordered networks and to empirical food web data, demonstrating the importance of considering node order for food-web analysis. Our approach may not only be helpful to identify hierarchical patterns in empirical food webs and other natural networks, it may also provide the base for extending motif analysis to other types of multi-layered networks. PMID:26144248
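
    As a rough illustration of what identifying ordered 3-node substructures involves, the sketch below enumerates connected node triples of a directed graph in rank order and records which of the six possible directed links are present. It only counts patterns; the significance profile described in the abstract would additionally require comparison against randomized networks, which is omitted here. The function name, the pattern encoding and the toy food web are assumptions.

```python
from collections import Counter
from itertools import combinations


def ordered_triad_census(edges, rank):
    """Count connected 3-node patterns while keeping track of node order.

    `edges` is an iterable of (source, target) pairs and `rank` maps each
    node to its ordering value. Each pattern is encoded for the node pairs
    (low, mid), (low, high), (mid, high) as (forward link, backward link).
    """
    edge_set = set(edges)
    census = Counter()
    ordered_nodes = sorted(rank, key=rank.get)
    # combinations() preserves input order, so each triple is rank-sorted.
    for low, mid, high in combinations(ordered_nodes, 3):
        pairs = [(low, mid), (low, high), (mid, high)]
        code = tuple(((u, v) in edge_set, (v, u) in edge_set) for u, v in pairs)
        linked_pairs = sum(any(links) for links in code)
        if linked_pairs >= 2:  # at least two linked pairs: weakly connected
            census[code] += 1
    return census


# Toy food web (resource -> consumer), invented for illustration.
rank = {"plant": 0, "herbivore": 1, "carnivore": 2, "omnivore": 3}
edges = [
    ("plant", "herbivore"),
    ("herbivore", "carnivore"),
    ("plant", "omnivore"),
    ("herbivore", "omnivore"),
]
census = ordered_triad_census(edges, rank)
```

    Because the encoding is tied to rank position, two triples that are isomorphic as unordered graphs but differ in node order fall into different entries of `census`.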

  15. i-cisTarget: an integrative genomics method for the prediction of regulatory features and cis-regulatory modules.

    PubMed

    Herrmann, Carl; Van de Sande, Bram; Potier, Delphine; Aerts, Stein

    2012-08-01

    The field of regulatory genomics today is characterized by the generation of high-throughput data sets that capture genome-wide transcription factor (TF) binding, histone modifications, or DNAseI hypersensitive regions across many cell types and conditions. In this context, a critical question is how to make optimal use of these publicly available datasets when studying transcriptional regulation. Here, we address this question in Drosophila melanogaster for which a large number of high-throughput regulatory datasets are available. We developed i-cisTarget (where the 'i' stands for integrative), for the first time enabling the discovery of different types of enriched 'regulatory features' in a set of co-regulated sequences in one analysis, being either TF motifs or 'in vivo' chromatin features, or combinations thereof. We have validated our approach on 15 co-expressed gene sets, 21 ChIP data sets, 628 curated gene sets and multiple individual case studies, and show that meaningful regulatory features can be confidently discovered; that bona fide enhancers can be identified, both by in vivo events and by TF motifs; and that combinations of in vivo events and TF motifs further increase the performance of enhancer prediction.

  16. iRegulon: from a gene list to a gene regulatory network using large motif and track collections.

    PubMed

    Janky, Rekin's; Verfaillie, Annelien; Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-07-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org.
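
    A ranking-and-recovery score can be illustrated generically: given a genome-wide gene ranking for one motif, measure how early the genes of the input set are recovered. The sketch below is not iRegulon's code; the normalization, the `top_n` cutoff and the toy rankings are assumptions chosen to keep the example small.

```python
def recovery_auc(ranking, gene_set, top_n):
    """Area under the recovery curve within the top_n ranked genes.

    The area is divided by the best achievable area, so 1.0 means every
    recoverable gene of the set sits at the very top of the ranking.
    """
    targets = set(gene_set)
    top = ranking[:top_n]
    k = min(len(targets), len(top))
    if k == 0:
        return 0.0
    recovered = 0
    area = 0
    for gene in top:
        if gene in targets:
            recovered += 1
        area += recovered  # height of the recovery curve at this rank
    best_area = sum(min(i, k) for i in range(1, len(top) + 1))
    return area / best_area


# Hypothetical rankings of six genes for two motifs.
gene_set = {"g1", "g2"}
ranking_motif_x = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranking_motif_y = ["g3", "g4", "g1", "g5", "g6", "g2"]

score_x = recovery_auc(ranking_motif_x, gene_set, top_n=4)
score_y = recovery_auc(ranking_motif_y, gene_set, top_n=4)
```

    A motif whose ranking places the input genes near the top (`score_x`) scores higher than one that scatters them (`score_y`), which is the intuition behind calling a motif enriched.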

  17. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159

  18. T versus D in the MTCXXC motif of copper transport proteins plays a role in directional metal transport.

    PubMed

    Niemiec, Moritz S; Dingeldein, Artur P G; Wittung-Stafshede, Pernilla

    2014-08-01

    To avoid toxicity and control levels of metal ions, organisms have developed specific metal transport systems. In humans, the cytoplasmic Cu chaperone Atox1 delivers Cu to metal-binding domains of ATP7A/B in the Golgi, for incorporation into Cu-dependent proteins. The Cu-binding motif in Atox1, as well as in target Cu-binding domains of ATP7A/B, consists of a MX1CXXC motif where X1 = T. The same motif, with X1 = D, is found in metal-binding domains of bacterial zinc transporters, such as ZntA. The Asp is proposed to stabilize divalent over monovalent metals in the binding site, although metal selectivity in vivo appears predominantly governed by protein-protein interactions. To probe the role of T versus D at the X1 position for Cu transfer in vitro, we created MDCXXC variants of Atox1 and the fourth metal-binding domain of ATP7B, WD4. We find that the mutants bind Cu like the wild-type proteins, but when mixed, in contrast to the wild-type pair, the mutant pair favors Cu-dependent hetero-dimers over directional Cu transport from Atox1 to WD4. Notably, both wild-type and mutant proteins can bind Zn in the absence of competing reducing agents. In presence of zinc, hetero-complexes are strongly favored for both protein pairs. We propose that T is conserved in this motif of Cu-transport proteins to promote directional metal transfer toward ATP7B, without formation of energetic sinks. The ability of both Atox1 and WD4 to bind zinc ions may not be a problem in vivo due to the presence of specific transport chains for Cu and Zn ions.
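
    The MX1CXXC pattern is simple enough to express as a regular expression. The sketch below locates the motif in a protein sequence and reports the residue at X1; the sequences are made-up fragments, not the real Atox1 or ZntA sequences, and overlapping matches are not reported.

```python
import re

# M, any residue (X1, captured), C, any two residues, C.
MXCXXC = re.compile(r"M([A-Z])C[A-Z]{2}C")


def find_mxcxxc(protein):
    """Return (start index, matched motif, X1 residue) for each match."""
    return [
        (m.start(), m.group(0), m.group(1))
        for m in MXCXXC.finditer(protein.upper())
    ]


# Invented fragments for illustration only.
cu_like = find_mxcxxc("AAMTCGGCAA")  # X1 = T, as in the Cu-transport motif
zn_like = find_mxcxxc("GGMDCAACKK")  # X1 = D, as in the zinc-transporter motif
```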

  19. Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints

    PubMed Central

    2012-01-01

    Background: The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns. Results: The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis. Conclusions: Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage

  20. Two negative cis-regulatory regions involved in fruit-specific promoter activity from watermelon (Citrullus vulgaris S.).

    PubMed

    Yin, Tao; Wu, Hanying; Zhang, Shanglong; Lu, Hongyu; Zhang, Lingxiao; Xu, Yong; Chen, Daming; Liu, Jingmei

    2009-01-01

    A 1.8 kb 5'-flanking region of the large subunit of ADP-glucose pyrophosphorylase, isolated from watermelon (Citrullus vulgaris S.), has fruit-specific promoter activity in transgenic tomato plants. Two negative regulatory regions, from -986 to -959 and from -472 to -424, were identified in this promoter region by fine deletion analyses. Removal of both regions led to constitutive expression in epidermal cells. Gain-of-function experiments showed that these two regions were sufficient to inhibit RFP (red fluorescent protein) expression in transformed epidermal cells when fused to the cauliflower mosaic virus (CaMV) 35S minimal promoter. Gel mobility shift experiments demonstrated the presence of leaf nuclear factors that interact with these two elements. A TCCAAAA motif was identified in these two regions, as well as one in the reverse orientation, which was confirmed to be a novel specific cis-element. A quantitative beta-glucuronidase (GUS) activity assay of stable transgenic tomato plants showed that the activities of chimeric promoters harbouring only one of the two cis-elements, or both, were approximately 10-fold higher in fruits than in leaves. These data confirm that the TCCAAAA motif functions as a fruit-specific element by inhibiting gene expression in leaves.

  1. Two negative cis-regulatory regions involved in fruit-specific promoter activity from watermelon (Citrullus vulgaris S.)

    PubMed Central

    Yin, Tao; Wu, Hanying; Zhang, Shanglong; Liu, Jingmei; Lu, Hongyu; Zhang, Lingxiao; Xu, Yong; Chen, Daming

    2009-01-01

    A 1.8 kb 5′-flanking region of the large subunit of ADP-glucose pyrophosphorylase, isolated from watermelon (Citrullus vulgaris S.), has fruit-specific promoter activity in transgenic tomato plants. Two negative regulatory regions, from –986 to –959 and from –472 to –424, were identified in this promoter region by fine deletion analyses. Removal of both regions led to constitutive expression in epidermal cells. Gain-of-function experiments showed that these two regions were sufficient to inhibit RFP (red fluorescent protein) expression in transformed epidermal cells when fused to the cauliflower mosaic virus (CaMV) 35S minimal promoter. Gel mobility shift experiments demonstrated the presence of leaf nuclear factors that interact with these two elements. A TCCAAAA motif was identified in these two regions, as well as one in the reverse orientation, which was confirmed to be a novel specific cis-element. A quantitative β-glucuronidase (GUS) activity assay of stable transgenic tomato plants showed that the activities of chimeric promoters harbouring only one of the two cis-elements, or both, were ∼10-fold higher in fruits than in leaves. These data confirm that the TCCAAAA motif functions as a fruit-specific element by inhibiting gene expression in leaves. PMID:19073962

  2. Zebrafish enhancer detection (ZED) vector: a new tool to facilitate transgenesis and the functional analysis of cis-regulatory regions in zebrafish.

    PubMed

    Bessa, José; Tena, Juan J; de la Calle-Mustienes, Elisa; Fernández-Miñán, Ana; Naranjo, Silvia; Fernández, Almudena; Montoliu, Lluis; Akalin, Altuna; Lenhard, Boris; Casares, Fernando; Gómez-Skarmeta, José Luis

    2009-09-01

    The identification and characterization of the regulatory activity of genomic sequences is crucial for understanding how the information contained in genomes is translated into cellular function. The cis-regulatory sequences control when, where, and how much genes are transcribed and can activate (enhancers) or repress (silencers) gene expression. Here, we describe a novel Tol2 transposon-based vector for assessing enhancer activity in the zebrafish (Danio rerio). This Zebrafish Enhancer Detector (ZED) vector harbors several key improvements, among them a sensitive and specific minimal promoter chosen for optimal enhancer activity detection, insulator sequences to shield the minimal promoter from position effects, and a positive control for transgenesis. Additionally, we demonstrate that highly conserved noncoding sequences homologous between humans and zebrafish largely retain their tissue-specific enhancer activity during vertebrate evolution. More strikingly, insulator sequences from mouse and chicken, but not conserved in zebrafish, maintain their insulator capacity when tested in this model.

  3. Role of Direct Repeat and Stem-Loop Motifs in mtDNA Deletions: Cause or Coincidence?

    PubMed Central

    Lakshmanan, Lakshmi Narayanan; Gruber, Jan; Halliwell, Barry; Gunawan, Rudiyanto

    2012-01-01

    Deletion mutations within mitochondrial DNA (mtDNA) have been implicated in degenerative and aging related conditions, such as sarcopenia and neuro-degeneration. While the precise molecular mechanism of deletion formation in mtDNA is still not completely understood, genome motifs such as direct repeat (DR) and stem-loop (SL) have been observed in the neighborhood of deletion breakpoints and thus have been postulated to take part in mutagenesis. In this study, we have analyzed the mitochondrial genomes from four different mammals: human, rhesus monkey, mouse and rat, and compared them to randomly generated sequences to further elucidate the role of direct repeat and stem-loop motifs in aging associated mtDNA deletions. Our analysis revealed that in the four species, DR and SL structures are abundant and that their distributions in mtDNA are not statistically different from randomized sequences. However, the average distance between the reported age associated mtDNA breakpoints and their respective nearest DR motifs is significantly shorter than what is expected of random chance in human (p < 10^-4) and rhesus monkey (p = 0.0034), but not in mouse (p = 0.0719) and rat (p = 0.0437), indicating the existence of species specific difference in the relationship between DR motifs and deletion breakpoints. In addition, the frequencies of large DRs (>10 bp) tend to decrease with increasing lifespan among the four mammals studied here, further suggesting an evolutionary selection against stable mtDNA misalignments associated with long DRs in long-living animals. In contrast to the results on DR, the probability of finding SL motifs near a deletion breakpoint does not differ from random in any of the four mtDNA sequences considered. Taken together, the findings in this study give support for the importance of stable mtDNA misalignments, aided by long DRs, as a major mechanism of deletion formation in long-living, but not in short-living mammals. PMID:22529999
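
    The comparison of breakpoint-to-DR distances against chance can be sketched as a Monte Carlo test. The version below is a simplification under stated assumptions and not the authors' procedure: it keeps the DR positions fixed and draws random breakpoint positions on a linear genome, whereas the study compared against randomly generated sequences and mtDNA is circular. All positions are invented.

```python
import random
from bisect import bisect_left


def nearest_distance(pos, sorted_sites):
    """Distance from pos to the closest entry of a non-empty sorted list."""
    i = bisect_left(sorted_sites, pos)
    candidates = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(pos - site) for site in candidates)


def mean_nearest(positions, sorted_sites):
    """Average nearest-site distance over a list of positions."""
    return sum(nearest_distance(p, sorted_sites) for p in positions) / len(positions)


def empirical_p(breakpoints, dr_sites, genome_len, n_iter=10000, seed=0):
    """One-sided empirical p-value that breakpoints lie this close to DRs by chance."""
    rng = random.Random(seed)
    sites = sorted(dr_sites)
    observed = mean_nearest(breakpoints, sites)
    hits = 0
    for _ in range(n_iter):
        sample = [rng.randrange(genome_len) for _ in breakpoints]
        if mean_nearest(sample, sites) <= observed:
            hits += 1
    return (hits + 1) / (n_iter + 1)  # add-one correction avoids p = 0


# Invented positions: every breakpoint coincides with a DR site.
dr_sites = [100, 5000, 12000]
breakpoints = [100, 5000, 12000]
p_value = empirical_p(breakpoints, dr_sites, genome_len=16569, n_iter=2000)
```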

  4. PreCisIon: PREdiction of CIS-regulatory elements improved by gene’s positION

    PubMed Central

    Elati, Mohamed; Nicolle, Rémy; Junier, Ivan; Fernández, David; Fekih, Rim; Font, Julio; Képès, François

    2013-01-01

    Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases. PMID:23241390
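
    For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity and the area under the ROC curve are obtained from classifier scores. The scores and the threshold are invented, and this does not reproduce PreCisIon's cross-validation.

```python
def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Call a gene a target when its score reaches the threshold."""
    true_pos = sum(s >= threshold for s in pos_scores)
    true_neg = sum(s < threshold for s in neg_scores)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


def roc_auc(pos_scores, neg_scores):
    """Probability that a random true target outscores a random non-target."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for known targets and non-targets of one TF.
targets = [0.9, 0.8, 0.4]
non_targets = [0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(targets, non_targets, threshold=0.5)
auc = roc_auc(targets, non_targets)
```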

  5. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline

    PubMed Central

    Landeen, Emily L.; Muirhead, Christina A.; Meiklejohn, Colin D.; Presgraves, Daven C.

    2016-01-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower—approximately 3-fold or more—for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution. PMID:27404402

  6. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline.

    PubMed

    Landeen, Emily L; Muirhead, Christina A; Wright, Lori; Meiklejohn, Colin D; Presgraves, Daven C

    2016-07-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower (approximately 3-fold or more) for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution.

  7. Direct RNA motif definition and identification from multiple sequence alignments using secondary structure profiles.

    PubMed

    Gautheret, D; Lambert, A

    2001-11-09

    We present here a new approach to the problem of defining RNA signatures and finding their occurrences in sequence databases. The proposed method is based on "secondary structure profiles". An RNA sequence alignment with secondary structure information is used as an input. Two types of weight matrices/profiles are constructed from this alignment: single strands are represented by a classical lod-scores profile while helical regions are represented by an extended "helical profile" comprising 16 lod-scores per position, one for each of the 16 possible base-pairs. Database searches are then conducted using a simultaneous search for helical profiles and dynamic programming alignment of single strand profiles. The algorithm has been implemented into a new software, ERPIN, that performs both profile construction and database search. Applications are presented for several RNA motifs. The automated use of sequence information in both single-stranded and helical regions yields better sensitivity/specificity ratios than descriptor-based programs. Furthermore, since the translation of alignments into profiles is straightforward with ERPIN, iterative searches can easily be conducted to enrich collections of homologous RNAs. Copyright 2001 Academic Press.
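
    The two profile types can be illustrated with a short sketch that turns alignment columns into lod-scores: 4 scores per single-stranded position and 16 scores per base-paired position. This is not ERPIN's scoring code; the pseudocount, the uniform background and the toy alignment are assumptions, and gap characters are not handled.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGU"
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]  # 16 possible base pairs


def lod_profile(columns, alphabet, pseudocount=1.0):
    """Log2-odds of each symbol per column against a uniform background."""
    background = 1.0 / len(alphabet)
    profile = []
    for column in columns:
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        profile.append({
            symbol: math.log2(((counts[symbol] + pseudocount) / total) / background)
            for symbol in alphabet
        })
    return profile


def single_strand_profile(alignment, positions):
    """One dict of 4 lod-scores per unpaired alignment column."""
    return lod_profile([[seq[i] for seq in alignment] for i in positions], BASES)


def helical_profile(alignment, paired_positions):
    """One dict of 16 lod-scores per base-paired column pair (i, j)."""
    columns = [[seq[i] + seq[j] for seq in alignment] for i, j in paired_positions]
    return lod_profile(columns, PAIRS)


# Toy hairpin alignment: columns 0-6 and 1-5 pair, columns 2-4 form the loop.
alignment = ["GGAAACC", "GCAAAGC", "GGUAACC"]
helix = helical_profile(alignment, [(0, 6), (1, 5)])
loop = single_strand_profile(alignment, [2, 3, 4])
```

    Scoring pairs jointly lets covarying columns, such as G-C replaced by C-G in the second sequence, still score well, which a pair of independent single-column profiles would miss.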

  8. Revealing constitutively expressed resistance genes in Agrostis species using PCR-based motif-directed RNA fingerprinting.

    PubMed

    Budak, Hikmet; Su, Senem; Ergen, Neslihan

    2006-12-01

    Agrostis species are mainly used in athletic fields and golf courses. Their integrity is maintained by fungicides, which makes the development of disease-resistance varieties a high priority. However, there is a lack of knowledge about resistance (R) genes and their use for genetic improvement in Agrostis species. The objective of this study was to identify and clone constitutively expressed cDNAs encoding R gene-like (RGL) sequences from three Agrostis species (colonial bentgrass (A. capillaris L.), creeping bentgrass (A. stolonifera L.) and velvet bentgrass (A. canina L.)) by PCR-based motif-directed RNA fingerprinting towards relatively conserved nucleotide binding site (NBS) domains. Sixty-one constitutively expressed cDNA sequences were identified and characterized. Sequence analysis of ESTs and probable translation products revealed that RGLs are highly conserved among these three Agrostis species. Fifteen of them were shown to share conserved motifs found in other plant disease resistance genes such as MLA13, Xa1, YR6, YR23 and RPP5. The molecular evolutionary forces, analysed using the Ka/Ks ratio, reflected purifying selection both on NBS and leucine-rich repeat (LRR) intervening regions of discovered RGL sequences in these species. This study presents, for the first time, isolation and characterization of constitutively expressed RGL sequences from Agrostis species revealing the presence of TNL (TIR-NBS-LRR) type R genes in monocot plants. The characterized RGLs will further enhance knowledge on the molecular evolution of the R gene family in grasses.
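
    The interpretation of the Ka/Ks ratio mentioned above follows a simple rule: values below 1 indicate purifying selection, values near 1 neutrality, and values above 1 positive selection. A minimal sketch, in which the tolerance band around 1 and the example rates are assumptions:

```python
def selection_mode(ka, ks, tolerance=0.05):
    """Classify selection from nonsynonymous (Ka) and synonymous (Ks) rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    ratio = ka / ks
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


# Hypothetical rates for an NBS region.
mode = selection_mode(ka=0.08, ks=0.40)
```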

  9. Integrative Modeling of eQTLs and Cis-Regulatory Elements Suggests Mechanisms Underlying Cell Type Specificity of eQTLs

    PubMed Central

    Brown, Christopher D.; Mangravite, Lara M.; Engelhardt, Barbara E.

    2013-01-01

    Genetic variants in cis-regulatory elements or trans-acting regulators frequently influence the quantity and spatiotemporal distribution of gene transcription. Recent interest in expression quantitative trait locus (eQTL) mapping has paralleled the adoption of genome-wide association studies (GWAS) for the analysis of complex traits and disease in humans. Under the hypothesis that many GWAS associations tag non-coding SNPs with small effects, and that these SNPs exert phenotypic control by modifying gene expression, it has become common to interpret GWAS associations using eQTL data. To fully exploit the mechanistic interpretability of eQTL-GWAS comparisons, an improved understanding of the genetic architecture and causal mechanisms of cell type specificity of eQTLs is required. We address this need by performing an eQTL analysis in three parts: first we identified eQTLs from eleven studies on seven cell types; then we integrated eQTL data with cis-regulatory element (CRE) data from the ENCODE project; finally we built a set of classifiers to predict the cell type specificity of eQTLs. The cell type specificity of eQTLs is associated with eQTL SNP overlap with hundreds of cell type specific CRE classes, including enhancer, promoter, and repressive chromatin marks, regions of open chromatin, and many classes of DNA binding proteins. These associations provide insight into the molecular mechanisms generating the cell type specificity of eQTLs and the mode of regulation of corresponding eQTLs. Using a random forest classifier with cell specific CRE-SNP overlap as features, we demonstrate the feasibility of predicting the cell type specificity of eQTLs. We then demonstrate that CREs from a trait-associated cell type can be used to annotate GWAS associations in the absence of eQTL data for that cell type. 
    We anticipate that such integrative, predictive modeling of cell specificity will improve our ability to understand the mechanistic basis of human complex phenotypic

  10. A NOVEL DFNB1 DELETION ALLELE SUPPORTS THE EXISTENCE OF A DISTANT CIS-REGULATORY REGION THAT CONTROLS GJB2 AND GJB6 EXPRESSION

    PubMed Central

    Wilch, Ellen; Azaiez, Hela; Fisher, Rachel A.; Elfenbein, Jill; Murgia, Alessandra; Birkenhäger, Ralf; Bolz, Hanno; Costa, Sueli Matilde Silva; del Castillo, Ignacio; Haaf, Thomas; Hoefsloot, Lies; Kremer, Hannie; Kubisch, Christian; Le Marechal, Cedric; Pandya, Arti; Sartorato, Edi Lúcia; Schneider, Eberhard; Van Camp, Guy; Wuyts, Wim; Smith, Richard HJ; Friderici, Karen H.

    2010-01-01

    Eleven affected members of a large German-American family segregating recessively inherited, congenital, non-syndromic sensorineural hearing loss (SNHL) were found to be homozygous for the common 35delG mutation of GJB2, the gene encoding the gap junction protein Connexin 26. Surprisingly, four additional family members with bilateral profound SNHL carried only a single 35delG mutation. Previously, we demonstrated reduced expression of both GJB2 and GJB6 mRNA from the allele carried in trans with that bearing the 35delG mutation in these four persons. Using array comparative genome hybridization (arrayCGH), we have now identified on this allele a deletion of 131.4 kb whose proximal breakpoint lies more than 100 kb upstream of the transcriptional start sites of GJB2 and GJB6. This deletion, del(chr13:19,837,344-19,968,698), segregates as a completely penetrant DFNB1 allele in this family. It is not present in 528 persons with SNHL and monoallelic mutation of GJB2 or GJB6, nor have we identified any other candidate pathogenic copy number variation by arrayCGH in a subset of 10 such persons. Characterization of distant GJB2/GJB6 cis-regulatory regions evidenced by this allele may be required to find the ‘missing’ DFNB1 mutations that are believed to exist. PMID:20236118
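
    The reported deletion size can be checked directly from the coordinates quoted in the abstract. The short sketch below assumes the two coordinates are inclusive endpoints on the same assembly (the abstract does not say); the function name is illustrative only.

```python
# Coordinates as quoted in the abstract: del(chr13:19,837,344-19,968,698).
START = 19_837_344
END = 19_968_698


def deletion_size_bp(start: int, end: int, inclusive: bool = True) -> int:
    """Length of a genomic interval in base pairs.

    Assumption: both endpoints belong to the deleted segment (inclusive).
    """
    return end - start + (1 if inclusive else 0)


size_bp = deletion_size_bp(START, END)
size_kb = size_bp / 1000
print(f"{size_bp} bp = {size_kb:.1f} kb")
```

    Under either the inclusive or the exclusive convention the result rounds to the 131.4 kb stated in the abstract.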

  11. A cis-regulatory mutation in troponin-I of Drosophila reveals the importance of proper stoichiometry of structural proteins during muscle assembly.

    PubMed

    Firdaus, Hena; Mohan, Jayaram; Naz, Sarwat; Arathi, Prabhashankar; Ramesh, Saraf R; Nongthomba, Upendra

    2015-05-01

    Rapid and high wing-beat frequencies achieved during insect flight are powered by the indirect flight muscles, the largest group of muscles present in the thorax. Any anomaly during the assembly and/or structural impairment of the indirect flight muscles gives rise to a flightless phenotype. Multiple mutagenesis screens in Drosophila melanogaster for defective flight behavior have led to the isolation and characterization of mutations that have been instrumental in the identification of many proteins and residues that are important for muscle assembly, function, and disease. In this article, we present a molecular-genetic characterization of a flightless mutation, flightless-H (fliH), originally designated as heldup-a (hdp-a). We show that fliH is a cis-regulatory mutation of the wings up A (wupA) gene, which codes for the troponin-I protein, one of the troponin complex proteins, involved in regulation of muscle contraction. The mutation leads to reduced levels of troponin-I transcript and protein. In addition to this, there is also coordinated reduction in transcript and protein levels of other structural protein isoforms that are part of the troponin complex. The altered transcript and protein stoichiometry ultimately culminates in unregulated acto-myosin interactions and a hypercontraction muscle phenotype. Our results shed new insights into the importance of maintaining the stoichiometry of structural proteins during muscle assembly for proper function with implications for the identification of mutations and disease phenotypes in other species, including humans.

  12. Application of the cis-regulatory region of a heat-shock protein 70 gene to heat-inducible gene expression in the ascidian Ciona intestinalis.

    PubMed

    Kawaguchi, Akane; Utsumi, Nanami; Morita, Maki; Ohya, Aya; Wada, Shuichi

    2015-01-01

    Temporally controlled induction of gene expression is a useful technique for analyzing gene function. To make such a technique possible in Ciona intestinalis embryos, we employed the cis-regulatory region of the heat-shock protein 70 (HSP70) gene Ci-HSPA1/6/7-like for heat-inducible gene expression in C. intestinalis embryos. We showed that Ci-HSPA1/6/7-like becomes heat shock-inducible by the 32-cell stage during embryogenesis. The 5'-upstream region of Ci-HSPA1/6/7-like, which contains heat-shock elements indispensable for heat-inducible gene expression, induces the heat shock-dependent expression of a reporter gene in the whole embryo from the 32-cell to the middle gastrula stages and in progressively restricted areas of embryos in subsequent stages. We assessed the effects of heat-shock treatments in different conditions on the normality of embryos and induction of transgene expression. We evaluated the usefulness of this technique through overexpression experiments on the well-characterized, developmentally relevant gene, Ci-Bra, and showed that this technique is applicable for inferring the gene function in C. intestinalis.

  13. Annotated embryonic CNS expression patterns of 5000 GMR GAL4 lines: a resource for manipulating gene expression and analyzing cis-regulatory modules

    PubMed Central

    Manning, Laurina; Heckscher, Ellie S.; Purice, Maria D.; Roberts, Jourdain; Bennett, Alysha L.; Kroll, Jason R.; Pollard, Jill L.; Strader, Marie E.; Lupton, Josh R.; Dyukareva, Anna V.; Doan, Phuong Nam; Bauer, David M.; Wilbur, Allison N.; Tanner, Stephanie; Kelly, Jimmy J.; Lai, Sen-Lin; Tran, Khoa D.; Kohwi, Minoree; Laverty, Todd R.; Pearson, Joseph C.; Crews, Stephen T.; Rubin, Gerald M.; Doe, Chris Q.

    2012-01-01

    Here we describe the embryonic CNS expression of 5,000 GAL4 lines made using molecularly defined cis-regulatory DNA inserted into a single attP genomic location. We document and annotate the patterns in early embryos when neurogenesis is at its peak, and in older embryos where there is maximal neuronal diversity and the first neural circuits are established. We note expression in other tissues such as the lateral body wall (muscle, sensory neurons, trachea) and viscera. Companion papers report on the adult brain and larval imaginal discs, and the integrated datasets are available online (www.janelia.org/flylight/gal4-gen1). This collection of embryonically-expressed GAL4 lines will be valuable for determining neuronal morphology and function; the 1862 lines expressed in small subsets of neurons (<20/segment) will be especially valuable for characterizing interneuronal diversity and function, as interneurons comprise the majority of all CNS neurons, yet their gene expression profile and function remain virtually unexplored. PMID:23063363

  14. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes.

    PubMed

    Uhl, Juli D; Zandvakili, Arya; Gebelein, Brian

    2016-04-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints.

  15. Identification and characterization of promoters and cis-regulatory elements of genes involved in secondary metabolites production in hop (Humulus lupulus. L).

    PubMed

    Duraisamy, Ganesh Selvaraj; Mishra, Ajay Kumar; Kocabek, Tomas; Matoušek, Jaroslav

    2016-10-01

    Molecular and biochemical studies have shown that cis-acting regulatory elements, singly or in combination, actively control the transcription of their associated genes, and that the downstream effects modulate various biological pathways such as biotic/abiotic stress responses, hormonal responses to growth and development processes, and secondary metabolite production. Therefore, identifying promoters and their cis-regulatory elements is an intriguing way to study the dynamic, complex regulatory network of gene activities by integrating computational, comparative, structural and functional genomics. Several bioinformatics servers and databases have been established to predict the cis-acting elements present in the promoter region of a target gene and their association with the expression profiles of transcription factors (TFs). The aim of this study is to predict possible cis-acting regulatory elements that have a putative role in the transcriptional regulation of a dynamic network of metabolite gene activities controlling prenylflavonoid and bitter acid biosynthesis in hop (Humulus lupulus). The recent release of the hop draft genome enabled us to predict the possible cis-acting regulatory elements by extracting 2 kbp of the 5' regulatory regions of genes important for lupulin metabolome biosynthesis, using the PlantCARE, PLACE and Genomatix MatInspector professional databases. The results reveal a plausible role for cis-acting regulatory elements in regulating the expression of genes primarily involved in lupulin metabolome biosynthesis, including under various stress conditions.

  16. Autosomal recessive retinitis pigmentosa with homozygous rhodopsin mutation E150K and non-coding cis-regulatory variants in CRX-binding regions of SAMD7

    PubMed Central

    Van Schil, Kristof; Karlstetter, Marcus; Aslanidis, Alexander; Dannhausen, Katharina; Azam, Maleeha; Qamar, Raheel; Leroy, Bart P.; Depasse, Fanny; Langmann, Thomas; De Baere, Elfride

    2016-01-01

    The aim of this study was to unravel the molecular pathogenesis of an unusual retinitis pigmentosa (RP) phenotype observed in a Turkish consanguineous family. Homozygosity mapping revealed two candidate genes, SAMD7 and RHO. A homozygous RHO mutation c.448G > A, p.E150K was found in two affected siblings, while no coding SAMD7 mutations were identified. Interestingly, four non-coding homozygous variants were found in two SAMD7 genomic regions relevant for binding of the retinal transcription factor CRX (CRX-bound regions, CBRs) in these affected siblings. Three variants are located in a promoter CBR termed CBR1, while the fourth is located more downstream in CBR2. Transcriptional activity of these variants was assessed by luciferase assays and electroporation of mouse retinal explants with reporter constructs of wild-type and variant SAMD7 CBRs. The combined CBR2/CBR1 variant construct showed significantly decreased SAMD7 reporter activity compared to the wild-type sequence, suggesting a cis-regulatory effect on SAMD7 expression. As Samd7 is a recently identified Crx-regulated transcriptional repressor in retina, we hypothesize that these SAMD7 variants might contribute to the retinal phenotype observed here, characterized by unusual, recognizable pigment deposits, differing from the classic spicular intraretinal pigmentation observed in other individuals homozygous for p.E150K, and typically associated with RP in general. PMID:26887858

  17. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes

    PubMed Central

    Uhl, Juli D.; Zandvakili, Arya; Gebelein, Brian

    2016-01-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints. PMID:27058369

  18. Human ADA3 regulates RARα transcriptional activity through direct contact between LxxLL motifs and the receptor coactivator pocket

    PubMed Central

    Li, Chia-Wei; Ai, Ni; Dinh, Gia Khanh; Welsh, William J.; Chen, J. Don

    2010-01-01

    The alteration/deficiency in activation-3 (ADA3) protein is an essential component of the human p300/CBP-associated factor (PCAF) and yeast Spt-Ada-Gcn5-acetyltransferase (SAGA) histone acetyltransferase complexes. These complexes facilitate transactivation of target genes by association with transcription factors and modification of local chromatin structure. It is known that the yeast ADA3 is required for nuclear receptor (NR)-mediated transactivation in yeast cells; however, the role of mammalian ADA3 in NR signaling remains elusive. In this study, we have investigated how the human (h) ADA3 regulates retinoic acid receptor (RAR) α-mediated transactivation. We show that hADA3 interacts directly with RARα in a hormone-dependent manner and this interaction contributes to RARα transactivation. Intriguingly, this interaction involves classical LxxLL motifs in hADA3, as demonstrated by both ‘loss’ and ‘gain’ of function mutations, as well as a functional coactivator pocket of the receptor. Additionally, we show that hADA3 associates with the RARα target gene promoter in a hormone-dependent manner and ADA3 knockdown impairs RARβ2 expression. Furthermore, a structural model was established to illustrate an interaction network within the ADA3/RARα complex. These results suggest that hADA3 is a bona fide transcriptional coactivator for RARα, acting through a conserved mechanism involving direct contacts between NR boxes and the receptor’s co-activator pocket. PMID:20413580

  19. Vertebrate mRNAs with a 5'-terminal pyrimidine tract are candidates for translational repression in quiescent cells: characterization of the translational cis-regulatory element.

    PubMed Central

    Avni, D; Shama, S; Loreni, F; Meyuhas, O

    1994-01-01

    The translation of mammalian ribosomal protein (rp) mRNAs is selectively repressed in nongrowing cells. This response is mediated through a regulatory element residing in the 5' untranslated region of these mRNAs and includes a 5' terminal oligopyrimidine tract (5' TOP). To further characterize the translational cis-regulatory element, we monitored the translational behavior of various endogenous and heterologous mRNAs or hybrid transcripts derived from transfected chimeric genes. The translational efficiency of these mRNAs was assessed in cells that either were growing normally or were growth arrested under various physiological conditions. Our experiments have yielded the following results: (i) the translation of mammalian rp mRNAs is properly regulated in amphibian cells, and likewise, amphibian rp mRNA is regulated in mammalian cells, indicating that all of the elements required for translational control of rp mRNAs are conserved among vertebrate classes; (ii) selective translational control is not confined to rp mRNAs, as mRNAs encoding the naturally occurring ubiquitin-rp fusion protein and elongation factor 1 alpha, which contain a 5' TOP, also conform to this mode of regulation; (iii) rat rpP2 mRNA contains only five pyrimidines in its 5' TOP, yet this mRNA is translationally controlled in the same fashion as other rp mRNAs with a 5' TOP of eight or more pyrimidines; (iv) full manifestation of this mode of regulation seems to require both the 5' TOP and sequences immediately downstream; and (v) an intact translational regulatory element from rpL32 mRNA fails to exert its regulatory properties even when preceded by a single A residue. PMID:8196625
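
    Findings (iii) and (v) both turn on the length and position of the pyrimidine run at the very 5' end of the transcript. As a minimal sketch, the function below counts consecutive pyrimidines from the first nucleotide. The function name and the example sequences are hypothetical and are not taken from the paper, and a pyrimidine count alone is not a predictor of translational control (finding (iv) notes that downstream sequences also matter).

```python
PYRIMIDINES = set("CTU")  # accept DNA (T) or RNA (U) notation


def top_length(transcript_5p: str) -> int:
    """Count consecutive pyrimidines starting at the first transcribed nucleotide."""
    count = 0
    for base in transcript_5p.upper():
        if base not in PYRIMIDINES:
            break
        count += 1
    return count


# Hypothetical 5' ends, invented for illustration only.
examples = {
    "eight_pyrimidines": "CTTTTCCTGAGCA",
    "five_pyrimidines": "CCTTTGAGCAGC",
    "preceded_by_A": "ACTTTTCCTGAGC",
}
for name, seq in examples.items():
    print(name, top_length(seq))
```

    In the third example a single leading A reduces the 5'-terminal tract length to zero even though the same pyrimidine run follows it, which mirrors the positional sensitivity described in (v).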

  20. Direct Imaging of Hippocampal Epileptiform Calcium Motifs Following Kainic Acid Administration in Freely Behaving Mice

    PubMed Central

    Berdyyeva, Tamara K.; Frady, E. Paxon; Nassi, Jonathan J.; Aluisio, Leah; Cherkas, Yauheniya; Otte, Stephani; Wyatt, Ryan M.; Dugovic, Christine; Ghosh, Kunal K.; Schnitzer, Mark J.; Lovenberg, Timothy; Bonaventure, Pascal

    2016-01-01

    Prolonged exposure to abnormally high calcium concentrations is thought to be a core mechanism underlying hippocampal damage in epileptic patients; however, no prior study has characterized calcium activity during seizures in the live, intact hippocampus. We have directly investigated this possibility by combining whole-brain electroencephalographic (EEG) measurements with microendoscopic calcium imaging of pyramidal cells in the CA1 hippocampal region of freely behaving mice treated with the pro-convulsant kainic acid (KA). We observed that KA administration led to systematic patterns of epileptiform calcium activity: a series of large-scale, intensifying flashes of increased calcium fluorescence concurrent with a cluster of low-amplitude EEG waveforms. This was accompanied by a steady increase in cellular calcium levels (>5 fold increase relative to the baseline), followed by an intense spreading calcium wave characterized by a 218% increase in global mean intensity of calcium fluorescence (n = 8, range [114–349%], p < 10^-4; t-test). The wave had no consistent EEG phenotype and occurred before the onset of motor convulsions. Similar changes in calcium activity were also observed in animals treated with 2 different proconvulsant agents, N-methyl-D-aspartate (NMDA) and pentylenetetrazol (PTZ), suggesting the measured changes in calcium dynamics are a signature of seizure activity rather than a KA-specific pathology. Additionally, despite reducing the behavioral severity of KA-induced seizures, the anticonvulsant drug valproate (VA, 300 mg/kg) did not modify the observed abnormalities in calcium dynamics. These results confirm the presence of pathological calcium activity preceding convulsive motor seizures and support calcium as a candidate signaling molecule in a pathway connecting seizures to subsequent cellular damage. Integrating in vivo calcium imaging with traditional assessment of seizures could potentially increase translatability of pharmacological

  1. Novel applications of motif-directed profiling to identify disease resistance genes in plants.

    PubMed

    Vossen, Jack H; Dezhsetan, Sara; Esselink, Danny; Arens, Marjon; Sanz, Maria J; Verweij, Walter; Verzaux, Estelle; van der Linden, C Gerard

    2013-10-07

    Molecular profiling of gene families is a versatile tool to study diversity between individual genomes in sexual crosses and germplasm. Nucleotide binding site (NBS) profiling, in particular, targets conserved nucleotide binding site-encoding sequences of resistance gene analogs (RGAs), and is widely used to identify molecular markers for disease resistance (R) genes. In this study, we used NBS profiling to identify genome-wide locations of RGA clusters in the genome of potato clone RH. Positions of RGAs in the potato RH and DM genomes that were generated using profiling and genome sequencing, respectively, were compared. Largely overlapping results, but also interesting discrepancies, were found. Due to the clustering of RGAs, several parts of the genome are overexposed while others remain underexposed using NBS profiling. It is shown how the profiling of other gene families, i.e. protein kinases and different protein domain-coding sequences (i.e., TIR), can be used to achieve a better marker distribution. The power of profiling techniques is further illustrated using RGA cluster-directed profiling in a population of Solanum berthaultii. Multiple different paralogous RGAs within the Rpi-ber cluster could be genetically distinguished. Finally, an adaptation of the profiling protocol was made that allowed the parallel sequencing of profiling fragments using next generation sequencing. The types of RGAs that were tagged in this next-generation profiling approach largely overlapped with classical gel-based profiling. As a potential application of next-generation profiling, we showed how the R gene family associated with late blight resistance in the SH*RH population could be identified using a bulked segregant approach. In this study, we provide a comprehensive overview of previously described and novel profiling primers and their genomic targets in potato through genetic mapping and comparative genomics. 
    Furthermore, it is shown how genome-wide or fine mapping can be

  2. Novel applications of motif-directed profiling to identify disease resistance genes in plants

    PubMed Central

    2013-01-01

    Background: Molecular profiling of gene families is a versatile tool to study diversity between individual genomes in sexual crosses and germplasm. Nucleotide binding site (NBS) profiling, in particular, targets conserved nucleotide binding site-encoding sequences of resistance gene analogs (RGAs), and is widely used to identify molecular markers for disease resistance (R) genes. Results: In this study, we used NBS profiling to identify genome-wide locations of RGA clusters in the genome of potato clone RH. Positions of RGAs in the potato RH and DM genomes that were generated using profiling and genome sequencing, respectively, were compared. Largely overlapping results, but also interesting discrepancies, were found. Due to the clustering of RGAs, several parts of the genome are overexposed while others remain underexposed using NBS profiling. It is shown how the profiling of other gene families, i.e. protein kinases and different protein domain-coding sequences (i.e., TIR), can be used to achieve a better marker distribution. The power of profiling techniques is further illustrated using RGA cluster-directed profiling in a population of Solanum berthaultii. Multiple different paralogous RGAs within the Rpi-ber cluster could be genetically distinguished. Finally, an adaptation of the profiling protocol was made that allowed the parallel sequencing of profiling fragments using next generation sequencing. The types of RGAs that were tagged in this next-generation profiling approach largely overlapped with classical gel-based profiling. As a potential application of next-generation profiling, we showed how the R gene family associated with late blight resistance in the SH*RH population could be identified using a bulked segregant approach. Conclusions: In this study, we provide a comprehensive overview of previously described and novel profiling primers and their genomic targets in potato through genetic mapping and comparative genomics. Furthermore, it is shown how

  3. The limits of de novo DNA motif discovery.

    PubMed

    Simcha, David; Price, Nathan D; Geman, Donald

    2012-01-01

    A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery, that is, searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary. An implementation of the LR and ALR

  4. Evolution of an insect-specific GROUCHO-interaction motif in the ENGRAILED selector protein

    PubMed Central

    Hittinger, Chris Todd; Carroll, Sean B.

    2008-01-01

    Animal morphology evolves through alterations in the genetic regulatory networks that control development. Regulatory connections are commonly added, subtracted, or modified via mutations in cis-regulatory elements, but several cases are also known where transcription factors have gained or lost activity-modulating peptide motifs. In order to better assess the role of novel transcription factor peptide motifs in evolution, we searched for synapomorphic motifs in the homeotic selectors of Drosophila melanogaster and related insects. Here, we describe an evolutionarily novel GROUCHO (GRO)-interaction motif in the ENGRAILED (EN) selector protein. This “ehIFRPF” motif is not homologous to the previously characterized “engrailed homology 1” (eh1) GRO-interaction motif of EN. This second motif is an insect-specific “WRPW”-type motif that has been maintained by purifying selection in at least the dipteran/lepidopteran lineage. We demonstrate that this motif contributes to in vivo repression of the wingless (wg) target gene and to interaction with GRO in vitro. The acquisition and conservation of this auxiliary peptide motif shows how the number and activity of short peptide motifs can evolve in transcription factors while existing regulatory functions are maintained. PMID:18803772

  5. Specific binding of the replication protein of plasmid pPS10 to direct and inverted repeats is mediated by an HTH motif.

    PubMed Central

    García de Viedma, D; Serrano-López, A; Díaz-Orejas, R

    1995-01-01

    The initiator protein of the plasmid pPS10, RepA, has a putative helix-turn-helix (HTH) motif at its C-terminal end. RepA dimers bind to an inverted repeat at the repA promoter (repAP) to autoregulate RepA synthesis [D. García de Viedma et al. (1996) EMBO J., in press]. RepA monomers bind to four direct repeats at the origin of replication (oriV) to initiate pPS10 replication. This report shows that randomly generated mutations in RepA, associated with deficiencies in autoregulation, map either at the putative HTH motif or in its vicinity. These mutant proteins do not promote pPS10 replication and are severely affected in binding to both the repAP and oriV regions in vitro. Revertants of a mutant that maps in the vicinity of the HTH motif have been obtained and correspond to a second amino acid substitution far upstream of the motif. However, reversion of mutants that map in the helices of the motif occurs less frequently, at least by an order of magnitude. All these data indicate that the helices of the HTH motif play an essential role in specific RepA-DNA interactions, although additional regions also seem to be involved in DNA binding activity. Some mutations have slightly different effects in replication and autoregulation, suggesting that the role of the HTH motif in the interaction of RepA dimers or monomers with their respective DNA targets (IR or DR) is not the same. PMID:8559664

  6. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases

    PubMed Central

    Takács, Enikő; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G.

    2010-01-01

    dUTPases are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg2+-coordination. Our results on transient/steady-state kinetics, ligand-binding and a 1.80 Å-resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg2+ accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  7. Computation-Based Discovery of Related Transcriptional Regulatory Modules and Motifs Using an Experimentally Validated Combinatorial Model

    PubMed Central

    Halfon, Marc S.; Grad, Yonatan; Church, George M.; Michelson, Alan M.

    2002-01-01

    Gene expression is regulated by transcription factors that interact with cis-regulatory elements. Predicting these elements from sequence data has proven difficult. We describe here a successful computational search for elements that direct expression in a particular temporal-spatial pattern in the Drosophila embryo, based on a single well characterized enhancer model. The fly genome was searched to identify sequence elements containing the same combination of transcription factors as those found in the model. Experimental evaluation of the search results demonstrates that our method can correctly predict regulatory elements and highlights the importance of functional testing as a means of identifying false-positive results. We also show that the search results enable the identification of additional relevant sequence motifs whose functions can be empirically validated. This approach, combined with gene expression and phylogenetic sequence data, allows for genome-wide identification of related regulatory elements, an important step toward understanding the genetic regulatory networks involved in development. [Sequence data reported in this paper have been deposited in GenBank with accession nos. AF513981 (Eve MHE) and AF513982 (Hbr DME). Supplementary material is available online at http://www.genome.org. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: R. Blackman] PMID:12097338
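
    The search strategy described here, scanning a genome for regions that contain the same combination of binding sites as a model enhancer, can be sketched as a sliding-window co-occurrence scan. The code below is only an illustration under stated assumptions: the motif patterns, window size, step, and function name are all hypothetical, since the abstract does not give the binding-site models or search parameters used in the paper.

```python
import re

# Hypothetical consensus patterns standing in for real binding-site models.
MOTIFS = {
    "TF_A": re.compile(r"GGAA"),
    "TF_B": re.compile(r"CACGTG"),
    "TF_C": re.compile(r"TTGAT"),
}


def find_candidate_windows(seq, motifs, window=30, step=5):
    """Return (start, end) windows in which every motif matches at least once."""
    seq = seq.upper()
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        # A window qualifies only if all motifs co-occur inside it.
        if all(pattern.search(chunk) for pattern in motifs.values()):
            hits.append((start, start + len(chunk)))
    return hits


# Toy sequence: one cluster of all three sites flanked by poly-A filler.
toy = "A" * 20 + "GGAA" + "CC" + "CACGTG" + "CC" + "TTGAT" + "A" * 20
print(find_candidate_windows(toy, MOTIFS))
```

    A scan like this only nominates candidates. As the abstract stresses, functional testing is still needed to separate true regulatory elements from false positives.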

  8. Developmental appearance of factors that bind specifically to cis-regulatory sequences of a gene expressed in the sea urchin embryo.

    PubMed

    Calzone, F J; Thézé, N; Thiebaud, P; Hill, R L; Britten, R J; Davidson, E H

    1988-09-01

    Previous gene-transfer experiments have identified a 2500-nucleotide 5' domain of the CyIIIa cytoskeletal actin gene, which contains cis-regulatory sequences that are necessary and sufficient for spatial and temporal control of CyIIIa gene expression during embryogenesis. This gene is activated in late cleavage, exclusively in aboral ectoderm cell lineages. In this study, we focus on interactions demonstrated in vitro between sequences of the regulatory domain and proteins present in crude extracts derived from sea urchin embryo nuclei and from unfertilized eggs. Quantitative gel-shift measurements are utilized to estimate minimum numbers of factor molecules per embryo at 24 hr postfertilization, when the CyIIIa gene is active, at 7 hr, when it is still silent, and in the unfertilized egg. We also estimate the binding affinity preferences (Kr) of the various factors for their respective sites, relative to their affinity for synthetic DNA competitors. At least 14 different specific interactions occur within the regulatory regions, some of which produce multiple DNA-protein complexes. Values of Kr range from approximately 2 x 10(4) to approximately 2 x 10(6) for these factors under the conditions applied. With one exception, the minimum factor prevalences that we measured in the 400-cell 24-hr embryo nuclear extracts fell within the range of 2 x 10(5) to 2 x 10(6) molecules per embryo, i.e., a few hundred to a few thousand molecules per nucleus. 
    Three developmental patterns were observed with respect to factor prevalence: Factors reacting at one site were found in unfertilized egg cytoplasm at about the same level per egg or embryo as in 24-hr embryo nuclei; factors reacting with five other regions of the regulatory domain are not detectable in egg cytoplasm, but in 7-hr mid-cleavage-stage embryo nuclei they are already at or close to their concentrations in the 24-hr embryo nuclei; and factors reacting with five additional regions are not detectable in egg cytoplasm and
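    The per-nucleus figures quoted in this abstract follow from dividing the per-embryo prevalence by the number of nuclei. A short worked check, assuming one nucleus per cell and an even distribution of factor molecules among nuclei (neither assumption is stated in the abstract):

```python
NUCLEI_PER_EMBRYO = 400  # "400-cell 24-hr embryo" in the abstract


def per_nucleus(molecules_per_embryo, nuclei=NUCLEI_PER_EMBRYO):
    """Convert a per-embryo factor count to an average per-nucleus count."""
    return molecules_per_embryo / nuclei


low = per_nucleus(2e5)   # lower bound quoted in the abstract
high = per_nucleus(2e6)  # upper bound quoted in the abstract
print(low, high)         # 500.0 5000.0
```

    The result, roughly 500 to 5000 molecules per nucleus, agrees with the abstract's "a few hundred to a few thousand".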

  9. CSMET: Comparative Genomic Motif Detection via Multi-Resolution Phylogenetic Shadowing

    PubMed Central

    Kolar, Mladen; Xing, Eric P.

    2008-01-01

    Functional turnover of transcription factor binding sites (TFBSs), such as whole-motif loss or gain, is a common event during genome evolution. Conventional probabilistic phylogenetic shadowing methods model the evolution of genomes only at the nucleotide level, and lack the ability to capture the evolutionary dynamics of functional turnover of aligned sequence entities. As a result, comparative genomic search of non-conserved motifs across evolutionarily related taxa remains a difficult challenge, especially in higher eukaryotes, where the cis-regulatory regions containing motifs can be long and divergent; existing methods rely heavily on specialized pattern-driven heuristic search or sampling algorithms, which can be difficult to generalize and hard to interpret based on phylogenetic principles. We propose a new method: Conditional Shadowing via Multi-resolution Evolutionary Trees, or CSMET, which uses a context-dependent probabilistic graphical model that allows aligned sites from different taxa in a multiple alignment to be modeled by either a background or an appropriate motif phylogeny conditioning on the functional specifications of each taxon. The functional specifications themselves are the output of a phylogeny which models the evolution not of individual nucleotides, but of the overall functionality (e.g., functional retention or loss) of the aligned sequence segments over lineages. Combining this method with a hidden Markov model that autocorrelates evolutionary rates on successive sites in the genome, CSMET offers a principled way to take into consideration lineage-specific evolution of TFBSs during motif detection, and a readily computable analytical form of the posterior distribution of motifs under TFBS turnover. On both simulated and real Drosophila cis-regulatory modules, CSMET outperforms other state-of-the-art comparative genomic motif finders. PMID:18535663

  10. Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

    PubMed Central

    Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

    2014-01-01

    Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522

  11. 'Traffic light rules': Chromatin states direct miRNA-mediated network motifs running by integrating epigenome and regulatome.

    PubMed

    Zhao, Hongying; Zhang, Guanxiong; Pang, Lin; Lan, Yujia; Wang, Li; Yu, Fulong; Hu, Jing; Li, Feng; Zhao, Tingting; Xiao, Yun; Li, Xia

    2016-07-01

    Epigenetic marks can cooperatively regulate chromatin accessibility and in turn facilitate or impede the binding of regulatory factors to various elements, suggesting their important roles in regulatory circuits. However, it remains elusive as to how epigenetic marks cooperate in the operations of regulatory networks. Here, we systematically characterized chromatin states of 26 epigenetic marks on different elements of protein-coding genes and miRNAs. We comprehensively analyzed, by using an integrative regulatory network, how cooperation among epigenetic, transcriptional, and post-transcriptional regulations came about. We observed extensive cooperation of epigenetic marks on local functional elements and complex epigenetic patterns corresponding to different biological functions. By identifying motifs significantly modified by epigenetic states, we found that multiple combinations of epigenetic states were associated with a specific type of motif. Interestingly, miRNA-mediated motifs were linked to stable epigenetic states of downstream targets. Changes in epigenetic states of downstream targets in miRNA-mediated motifs can buffer the effects of upstream regulators on target genes, suggesting that miRNA-mediated motifs require the cooperation of epigenetic marks. Overall, epigenetic marks are involved in the running of regulatory motifs in the way traffic lights control traffic flows and hence should be part of the architecture of complex regulatory circuits. We demonstrated a detailed analysis of the cooperation of multiple epigenetic marks and how epigenetic regulation was organized into a human regulatory network. The findings form a basis for further understanding of the complicated roles of epigenetic marks on regulatory circuits.

  12. Genetic and biochemical analysis of cis regulatory elements within the keratinocyte enhancer region of the human papillomavirus type 31 upstream regulatory region during different stages of the viral life cycle.

    PubMed

    Sen, Ellora; Alam, Samina; Meyers, Craig

    2004-01-01

    Using linker scanning mutational analysis, we recently identified potential cis regulatory elements contained within the 5' upstream regulatory region (URR) domain and auxiliary enhancer (AE) region of the human papillomavirus type 31 (HPV31) URR involved in the regulation of E6/E7 promoter activity at different stages of the viral life cycle. For the present study, we extended the linker scanning mutational analysis to identify potential cis elements located in the keratinocyte enhancer (KE) region (nucleotides 7511 to 7762) of the HPV31 URR and to characterize cellular factors that bind to these elements under conditions representing different stages of the viral life cycle. The linker scanning mutational analysis identified viral cis elements located in the KE region that regulate transcription in the presence and absence of any viral gene products or viral DNA replication and determine the role of host tissue differentiation on viral transcriptional regulation. Using electrophoretic mobility shift assays, we illustrated defined reorganization in the composition of cellular transcription factors binding to the same cis regulatory elements at different stages of the HPV differentiation-dependent life cycle. Our studies provide an extensive map of functional elements in the KE region of the HPV31 URR, identify cis regulatory elements that exhibit significant transcription regulatory potential, and illustrate changes in specific protein-DNA interactions at different stages of the viral life cycle. The variable recruitment of transcription factors to the same cis element under different cellular conditions may represent a mechanism underlying the tight link between keratinocyte differentiation and E6/E7 expression.

  13. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar.

    PubMed

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-07-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at http://cic.cs.wustl.edu/wordspy.
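    The word-counting ingredient mentioned in this abstract can be sketched as follows. This is only an illustration of counting fixed-length words in a promoter set and comparing them with a background set; it does not implement WordSpy's dictionary and grammar model, and the function names, word length and pseudocount are assumptions made for the example.

```python
from collections import Counter


def count_words(sequences, k):
    """Count every overlapping word of length k across the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def rank_enriched_words(promoters, background, k=6, pseudocount=1.0):
    """Rank words by foreground/background frequency ratio (highest first).

    The pseudocount (an assumed value) avoids division by zero for words
    that never occur in the background set.
    """
    fg = count_words(promoters, k)
    bg = count_words(background, k)
    fg_total = sum(fg.values()) + pseudocount
    bg_total = sum(bg.values()) + pseudocount
    scores = {}
    for word, n in fg.items():
        fg_freq = (n + pseudocount) / fg_total
        bg_freq = (bg[word] + pseudocount) / bg_total
        scores[word] = fg_freq / bg_freq
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

    In this sketch the background set plays a role loosely comparable to the randomly selected promoters that the abstract uses to judge the significance of discovered motifs.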

  14. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar

    PubMed Central

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-01-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at . PMID:15980501

  15. Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis

    PubMed Central

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

    2012-01-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  16. Exploiting powder X-ray diffraction for direct structure determination in structural biology: the P2X4 receptor trafficking motif YEQGL.

    PubMed

    Fujii, Kotaro; Young, Mark T; Harris, Kenneth D M

    2011-06-01

    We report the crystal structure of the 5-residue peptide acetyl-YEQGL-amide, determined directly from powder X-ray diffraction data recorded on a conventional laboratory X-ray powder diffractometer. The YEQGL motif has a known biological role, as a trafficking motif in the C-terminus of mammalian P2X4 receptors. Comparison of the crystal structure of acetyl-YEQGL-amide determined here and that of a complex formed with the μ2 subunit of the clathrin adaptor protein complex AP2 reported previously, reveals differences in conformational properties, although there are nevertheless similarities concerning aspects of the hydrogen-bonding arrangement and the hydrophobic environment of the leucine sidechain. Our results demonstrate the potential for exploiting modern powder X-ray diffraction methodology to achieve complete structure determination of materials of biological interest that do not crystallize as single crystals of suitable size and quality for single-crystal X-ray diffraction.

  17. Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening

    PubMed Central

    Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride

    2009-01-01

    To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368

  18. The Significance of Multivalent Bonding Motifs and “Bond Order” in DNA-Directed Nanoparticle Crystallization

    SciTech Connect

    Thaner, Ryan V.; Eryazici, Ibrahim; Macfarlane, Robert J.; Brown, Keith A.; Lee, Byeongdu; Nguyen, SonBinh T.; Mirkin, Chad A.

    2016-05-18

    Multivalent oligonucleotide-based bonding elements have been synthesized and studied for the assembly and crystallization of gold nanoparticles. Through the use of organic branching points, divalent and trivalent DNA linkers were readily incorporated into the oligonucleotide shells that define DNA-nanoparticles and compared to monovalent linker systems. These multivalent bonding motifs enable the change of "bond strength" between particles and therefore modulate the effective "bond order." In addition, the improved accessibility of strands between neighboring particles, either due to multivalency or modifications to increase strand flexibility, gives rise to superlattices with less strain in the crystallites compared to traditional designs. Furthermore, the increased availability and number of binding modes also provide a new variable that allows previously unobserved crystal structures to be synthesized, as evidenced by the formation of a thorium phosphide superlattice.

  19. A novel sorting motif in the glutamate transporter excitatory amino acid transporter 3 directs its targeting in Madin-Darby canine kidney cells and hippocampal neurons.

    PubMed

    Cheng, Chialin; Glover, Greta; Banker, Gary; Amara, Susan G

    2002-12-15

    The glutamate transporter excitatory amino acid transporter 3 (EAAT3) is polarized to the apical surface in epithelial cells and localized to the dendritic compartment in hippocampal neurons, where it is clustered adjacent to postsynaptic sites. In this study, we analyzed the sequences in EAAT3 that are responsible for its polarized localization in Madin-Darby canine kidney (MDCK) cells and neurons. Confocal microscopy and cell surface biotinylation assays demonstrated that deletion of the EAAT3 C terminus or replacement of the C terminus of EAAT3 with the analogous region in EAAT1 eliminated apical localization in MDCK cells. The C terminus of EAAT3 was sufficient to redirect the basolateral-preferring EAAT1 and the nonpolarized EAAT2 to the apical surface. Using alanine substitution mutants, we identified a short peptide motif in the cytoplasmic C-terminal region of EAAT3 that directs its apical localization in MDCK cells. Mutation of this sequence also impairs dendritic targeting of EAAT3 in hippocampal neurons but does not interfere with the clustering of EAAT3 on dendritic spines and filopodia. These data provide the first evidence that an identical cytoplasmic motif can direct apical targeting in epithelia and somatodendritic targeting in neurons. Moreover, our results demonstrate that the two fundamental features of the localization of EAAT3 in neurons, its restriction to the somatodendritic domain and its clustering near postsynaptic sites, are mediated by distinct molecular mechanisms.

  20. Unique ζ-chain motifs mediate a direct TCR-actin linkage critical for immunological synapse formation and T-cell activation.

    PubMed

    Klieger, Yair; Almogi-Hazan, Osnat; Ish-Shalom, Eliran; Pato, Aviad; Pauker, Maor H; Barda-Saad, Mira; Wang, Lynn; Baniyash, Michal

    2014-01-01

    TCR-mediated activation induces receptor microclusters that evolve to a defined immune synapse (IS). Many studies showed that actin polymerization and remodeling, which create a scaffold critical to IS formation and stabilization, are TCR mediated. However, the mechanisms controlling simultaneous TCR and actin dynamic rearrangement in the IS are not yet fully understood. Herein, we identify two novel TCR ζ-chain motifs, mediating the TCR's direct interaction with actin and inducing actin bundling. While T cells expressing the ζ-chain mutated in these motifs lack cytoskeleton (actin) associated (cska)-TCRs, they express non-cska and surface TCRs at levels comparable to cells expressing wild-type ζ-chain. However, such mutant cells are unable to display activation-dependent TCR clustering, IS formation, or expression of CD25/CD69 activation markers, or to produce/secrete cytokines, effects also seen in the corresponding APCs. We are the first to show a direct TCR-actin linkage, providing the missing link between TCR-mediated Ag recognition, specific cytoskeleton orientation toward the T-cell-APC interacting pole and long-lived IS maintenance.

  1. Carbonyl-carbonyl interactions and amide π-stacking as the directing motifs of the supramolecular assembly of ethyl N-(2-acetylphenyl)oxalamate in a synperiplanar conformation.

    PubMed

    Cabrera-Pérez, Laura C; García-Báez, Efrén V; Franco-Hernández, Marina O; Martínez-Martínez, Francisco J; Padilla-Martínez, Itzia I

    2015-05-01

    The title compound, C12H13NO4, is one of the few examples that exhibits a syn conformation between the amide and ester carbonyl groups of the oxalyl group. This conformation allows the engagement of the amide H atom in an intramolecular three-centred hydrogen-bonding S(6)S(5) motif. The compound is self-assembled by C=O...C=O and amide-π interactions into stacked columns along the b-axis direction. The concurrence of both interactions seems to be responsible for stabilizing the observed syn conformation between the carbonyl groups. The second dimension, along the a-axis direction, is developed by soft C-H...O hydrogen bonding. Density functional theory (DFT) calculations at the B3LYP/6-31G(d,p) level of theory were performed to support the experimental findings.

  2. A systematic approach to identify functional motifs within vertebrate developmental enhancers

    PubMed Central

    Li, Qiang; Ritter, Deborah; Yang, Nan; Dong, Zhiqiang; Li, Hao; Chuang, Jeffrey H.; Guo, Su

    2012-01-01

    Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue-specificity based on the expression of nearby genes. Through a high throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms on a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discover important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function. PMID:19850031

  3. Control of Recombination Directionality by the Listeria Phage A118 Protein Gp44 and the Coiled-Coil Motif of Its Serine Integrase.

    PubMed

    Mandali, Sridhar; Gupta, Kushol; Dawson, Anthony R; Van Duyne, Gregory D; Johnson, Reid C

    2017-06-01

    The serine integrase of phage A118 catalyzes integrative recombination between attP on the phage and a specific attB locus on the chromosome of Listeria monocytogenes, but it is unable to promote excisive recombination between the hybrid attL and attR sites found on the integrated prophage without assistance by a recombination directionality factor (RDF). We have identified and characterized the phage-encoded RDF Gp44, which activates the A118 integrase for excision and inhibits integration. Gp44 binds to the C-terminal DNA binding domain of integrase, and we have localized the primary binding site to be within the mobile coiled-coil (CC) motif but distinct from the distal tip of the CC that is required for recombination. This interaction is sufficient to inhibit integration, but a second interaction involving the N-terminal end of Gp44 is also required to activate excision. We provide evidence that these two contacts modulate the trajectory of the CC motifs as they extend out from the integrase core in a manner dependent upon the identities of the four att sites. Our results support a model whereby Gp44 shapes the Int-bound complexes to control which att sites can synapse and recombine. IMPORTANCE: Serine integrases mediate directional recombination between bacteriophage and bacterial chromosomes. These highly regulated site-specific recombination reactions are integral to the life cycle of temperate phage and, in the case of Listeria monocytogenes lysogenized by A118 family phage, are an essential virulence determinant. Serine integrases are also utilized as tools for genetic engineering and synthetic biology because of their exquisite unidirectional control of the DNA exchange reaction. Here, we identify and characterize the recombination directionality factor (RDF) that activates excision and inhibits integration reactions by the phage A118 integrase. We provide evidence that the A118 RDF binds to and modulates the trajectory of the long coiled-coil motif that

  4. Arabidopsis Flower and Embryo Developmental Genes are Repressed in Seedlings by Different Combinations of Polycomb Group Proteins in Association with Distinct Sets of Cis-regulatory Elements

    PubMed Central

    Liu, Jian; Zhang, Lei; He, Chongsheng; Shen, Wen-Hui; Jin, Hong; Xu, Lin; Zhang, Yijing

    2016-01-01

    Polycomb repressive complexes (PRCs) play crucial roles in transcriptional repression and developmental regulation in both plants and animals. In plants, depletion of different members of PRCs causes both overlapping and unique phenotypic defects. However, the underlying molecular mechanism determining the target specificity and functional diversity is not sufficiently characterized. Here, we quantitatively compared changes of tri-methylation at H3K27 in Arabidopsis mutants deprived of various key PRC components. We show that CURLY LEAF (CLF), a major catalytic subunit of PRC2, coordinates with different members of PRC1 in suppression of distinct plant developmental programs. We found that expression of flower development genes is repressed in seedlings preferentially via non-redundant role of CLF, which specifically associated with LIKE HETEROCHROMATIN PROTEIN1 (LHP1). In contrast, expression of embryo development genes is repressed by PRC1-catalytic core subunits AtBMI1 and AtRING1 in common with PRC2-catalytic enzymes CLF or SWINGER (SWN). This context-dependent role of CLF corresponds well with the change in H3K27me3 profiles, and is remarkably associated with differential co-occupancy of binding motifs of transcription factors (TFs), including MADS box and ABA-related factors. We propose that different combinations of PRC members distinctively regulate different developmental programs, and their target specificity is modulated by specific TFs. PMID:26760036

  5. Arabidopsis Flower and Embryo Developmental Genes are Repressed in Seedlings by Different Combinations of Polycomb Group Proteins in Association with Distinct Sets of Cis-regulatory Elements.

    PubMed

    Wang, Hua; Liu, Chunmei; Cheng, Jingfei; Liu, Jian; Zhang, Lei; He, Chongsheng; Shen, Wen-Hui; Jin, Hong; Xu, Lin; Zhang, Yijing

    2016-01-01

    Polycomb repressive complexes (PRCs) play crucial roles in transcriptional repression and developmental regulation in both plants and animals. In plants, depletion of different members of PRCs causes both overlapping and unique phenotypic defects. However, the underlying molecular mechanism determining the target specificity and functional diversity is not sufficiently characterized. Here, we quantitatively compared changes of tri-methylation at H3K27 in Arabidopsis mutants deprived of various key PRC components. We show that CURLY LEAF (CLF), a major catalytic subunit of PRC2, coordinates with different members of PRC1 in suppression of distinct plant developmental programs. We found that expression of flower development genes is repressed in seedlings preferentially via non-redundant role of CLF, which specifically associated with LIKE HETEROCHROMATIN PROTEIN1 (LHP1). In contrast, expression of embryo development genes is repressed by PRC1-catalytic core subunits AtBMI1 and AtRING1 in common with PRC2-catalytic enzymes CLF or SWINGER (SWN). This context-dependent role of CLF corresponds well with the change in H3K27me3 profiles, and is remarkably associated with differential co-occupancy of binding motifs of transcription factors (TFs), including MADS box and ABA-related factors. We propose that different combinations of PRC members distinctively regulate different developmental programs, and their target specificity is modulated by specific TFs.

  6. QGRS-H Predictor: a web server for predicting homologous quadruplex forming G-rich sequence motifs in nucleotide sequences

    PubMed Central

    Menendez, Camille; Frees, Scott; Bagga, Paramjeet S.

    2012-01-01

    Naturally occurring G-quadruplex structural motifs, formed by guanine-rich nucleic acids, have been reported in telomeric, promoter and transcribed regions of mammalian genomes. G-quadruplex structures have received significant attention because of growing evidence for their role in important biological processes, human disease and as therapeutic targets. Lately, there has been much interest in the potential roles of RNA G-quadruplexes as cis-regulatory elements of post-transcriptional gene expression. Large-scale computational genomics studies on G-quadruplexes have difficulty validating their predictions without laborious testing in ‘wet’ labs. We have developed a bioinformatics tool, QGRS-H Predictor that can map and analyze conserved putative Quadruplex forming 'G'-Rich Sequences (QGRS) in mRNAs, ncRNAs and other nucleotide sequences, e.g. promoter, telomeric and gene flanking regions. Identifying conserved regulatory motifs helps validate computations and enhances accuracy of predictions. The QGRS-H Predictor is particularly useful for mapping homologous G-quadruplex forming sequences as cis-regulatory elements in the context of 5′- and 3′-untranslated regions, and CDS sections of aligned mRNA sequences. QGRS-H Predictor features highly interactive graphic representation of the data. It is a unique and user-friendly application that provides many options for defining and studying G-quadruplexes. The QGRS-H Predictor can be freely accessed at: http://quadruplex.ramapo.edu/qgrs/app/start. PMID:22576365
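    Putative quadruplex-forming G-rich sequences are often located with a simple pattern search. The sketch below assumes a widely used simplified rule (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); this rule, the function name and the loop limits are assumptions for illustration and are not taken from the QGRS-H Predictor itself, which offers its own options for defining G-quadruplexes.

```python
import re

# Assumed simplified rule: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (DNA or RNA).
# This is an illustrative convention, not the server's scoring scheme.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGTU]{1,7}G{3,}){3}")


def find_g4_candidates(sequence):
    """Return (start, end, matched_sequence) for non-overlapping candidates."""
    sequence = sequence.upper()
    return [(m.start(), m.end(), m.group(0))
            for m in G4_PATTERN.finditer(sequence)]


# Example with four telomere-like GGGTTA repeats flanked by adenines.
print(find_g4_candidates("AAGGGTTAGGGTTAGGGTTAGGGAA"))
```

    A search of this kind reports only non-overlapping matches on the given strand, so applying the same search to aligned homologous sequences, as the abstract describes, would require running it on each sequence and comparing the positions.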

  7. [Personal motif in art].

    PubMed

    Gerevich, József

    2015-01-01

    One of the basic questions of the art psychology is whether a personal motif is to be found behind works of art and if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allow us to conclude that the personal motif that can be identified by the viewer through symbols, at times easily at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication to the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusionary, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy.

  8. Redundant ERF-VII Transcription Factors Bind to an Evolutionarily Conserved cis-Motif to Regulate Hypoxia-Responsive Gene Expression in Arabidopsis

    PubMed Central

    Gasch, Philipp; Fundinger, Moritz; Müller, Jana T.; Lee, Travis; Mustroph, Angelika

    2016-01-01

    The response of Arabidopsis thaliana to low-oxygen stress (hypoxia), such as during shoot submergence or root waterlogging, includes increasing the levels of ∼50 hypoxia-responsive gene transcripts, many of which encode enzymes associated with anaerobic metabolism. Upregulation of over half of these mRNAs involves stabilization of five group VII ethylene response factor (ERF-VII) transcription factors, which are routinely degraded via the N-end rule pathway of proteolysis in an oxygen- and nitric oxide-dependent manner. Despite their importance, neither the quantitative contribution of individual ERF-VIIs nor the cis-regulatory elements they govern are well understood. Here, using single- and double-null mutants, the constitutively synthesized ERF-VIIs RELATED TO APETALA2.2 (RAP2.2) and RAP2.12 are shown to act redundantly as principal activators of hypoxia-responsive genes; constitutively expressed RAP2.3 contributes to this redundancy, whereas the hypoxia-induced HYPOXIA RESPONSIVE ERF1 (HRE1) and HRE2 play minor roles. An evolutionarily conserved 12-bp cis-regulatory motif that binds to and is sufficient for activation by RAP2.2 and RAP2.12 is identified through a comparative phylogenetic motif search, promoter dissection, yeast one-hybrid assays, and chromatin immunopurification. This motif, designated the hypoxia-responsive promoter element, is enriched in promoters of hypoxia-responsive genes in multiple species. PMID:26668304

  9. Nucleosomes, Linker DNA, and Linker Histone form a Unique Structural Motif that Directs the Higher-Order Folding and Compaction of Chromatin

    NASA Astrophysics Data System (ADS)

    Bednar, Jan; Horowitz, Rachel A.; Grigoryev, Sergei A.; Carruthers, Lenny M.; Hansen, Jeffrey C.; Koster, Abraham J.; Woodcock, Christopher L.

    1998-11-01

    The compaction level of arrays of nucleosomes may be understood in terms of the balance between the self-repulsion of DNA (principally linker DNA) and countering factors including the ionic strength and composition of the medium, the highly basic N termini of the core histones, and linker histones. However, the structural principles that come into play during the transition from a loose chain of nucleosomes to a compact 30-nm chromatin fiber have been difficult to establish, and the arrangement of nucleosomes and linker DNA in condensed chromatin fibers has never been fully resolved. Based on images of the solution conformation of native chromatin and fully defined chromatin arrays obtained by electron cryomicroscopy, we report a linker histone-dependent architectural motif beyond the level of the nucleosome core particle that takes the form of a stem-like organization of the entering and exiting linker DNA segments. DNA completes ≈ 1.7 turns on the histone octamer in the presence and absence of linker histone. When linker histone is present, the two linker DNA segments become juxtaposed ≈ 8 nm from the nucleosome center and remain apposed for 3-5 nm before diverging. We propose that this stem motif directs the arrangement of nucleosomes and linker DNA within the chromatin fiber, establishing a unique three-dimensional zigzag folding pattern that is conserved during compaction. Such an arrangement with peripherally arranged nucleosomes and internal linker DNA segments is fully consistent with observations in intact nuclei and also allows dramatic changes in compaction level to occur without a concomitant change in topology.

  10. Parametric bootstrapping for biological sequence motifs.

    PubMed

    O'Neill, Patrick K; Erill, Ivan

    2016-10-06

    between biological motifs and their null distributions. In particular, we observe that biological sequence motifs show an unusual distribution of IGC, presumably due to biochemical constraints on the mechanisms of direct read-out.

  11. An Intronic cis-Regulatory Element Is Crucial for the Alpha Tubulin Pl-Tuba1a Gene Activation in the Ciliary Band and Animal Pole Neurogenic Domains during Sea Urchin Development

    PubMed Central

    Cuttitta, Angela; Gianguzza, Fabrizio; Ragusa, Maria Antonietta

    2017-01-01

    In sea urchin development, structures derived from neurogenic territory control the swimming and feeding responses of the pluteus as well as the process of metamorphosis. We have previously isolated an alpha tubulin family member of Paracentrotus lividus (Pl-Tuba1a, formerly known as Pl-Talpha2) that is specifically expressed in the ciliary band and animal pole neurogenic domains of the sea urchin embryo. In order to identify cis-regulatory elements controlling its spatio-temporal expression, we conducted gene transfer experiments, transgene deletions and site specific mutagenesis. Thus, a genomic region of about 2.6 Kb of Pl-Tuba1a, containing four Interspecifically Conserved Regions (ICRs), was identified as responsible for proper gene expression. An enhancer role was ascribed to ICR1 and ICR2, while ICR3 exerted a pivotal role in basal expression, restricting Tuba1a expression to the proper territories of the embryo. Additionally, the mutation of the forkhead box consensus sequence binding site in ICR3 prevented Pl-Tuba1a expression. PMID:28141828

  12. Incorporating Motif Analysis into Gene Co-expression Networks Reveals Novel Modular Expression Pattern and New Signaling Pathways

    PubMed Central

    Ma, Shisong; Shah, Smit; Bohnert, Hans J.; Snyder, Michael; Dinesh-Kumar, Savithramma P.

    2013-01-01

    Understanding of gene regulatory networks requires discovery of expression modules within gene co-expression networks and identification of promoter motifs and corresponding transcription factors that regulate their expression. A commonly used method for this purpose is a top-down approach based on clustering the network into a range of densely connected segments, treating these segments as expression modules, and extracting promoter motifs from these modules. Here, we describe a novel bottom-up approach to identify gene expression modules driven by known cis-regulatory motifs in the gene promoters. For a specific motif, genes in the co-expression network are ranked according to their probability of belonging to an expression module regulated by that motif. The ranking is conducted via motif enrichment or motif position bias analysis. Our results indicate that motif position bias analysis is an effective tool for genome-wide motif analysis. Sub-networks containing the top ranked genes are extracted and analyzed for inherent gene expression modules. This approach identified novel expression modules for the G-box, W-box, site II, and MYB motifs from an Arabidopsis thaliana gene co-expression network based on the graphical Gaussian model. The novel expression modules include those involved in house-keeping functions, primary and secondary metabolism, and abiotic and biotic stress responses. In addition to confirmation of previously described modules, we identified modules that include new signaling pathways. To associate transcription factors that regulate genes in these co-expression modules, we developed a novel reporter system. Using this approach, we evaluated MYB transcription factor-promoter interactions within MYB motif modules. PMID:24098147
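
    The motif enrichment step mentioned above can be sketched as a one-sided hypergeometric test: given how many promoters in the whole network carry a motif, how surprising is the number found in a candidate gene set? This is a minimal sketch, not the authors' ranking procedure. The function name, the toy promoter sequences and the use of CACGTG as the G-box core are assumptions made for illustration.

```python
import re
from math import comb


def enrichment_p(promoters, gene_set, motif_regex):
    """Return (k, p): motif-bearing genes in gene_set and P(X >= k).

    promoters: dict mapping gene id -> promoter sequence (the background).
    gene_set:  iterable of gene ids, assumed to be a subset of promoters.
    The p-value is the upper tail of a hypergeometric distribution.
    """
    with_motif = {g for g, s in promoters.items() if re.search(motif_regex, s)}
    genes = set(gene_set)
    M = len(promoters)          # background size
    n = len(with_motif)         # background genes carrying the motif
    N = len(genes)              # size of the candidate gene set
    k = len(with_motif & genes) # motif-bearing genes in the candidate set
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, min(n, N) + 1))
    return k, tail / comb(M, N)


# Toy background of ten hypothetical promoters; g1..g4 contain CACGTG.
GBOX = "CACGTG"  # assumed G-box core, for illustration only
promoters = {
    "g1": "TTCACGTGAA", "g2": "CACGTGTTTT", "g3": "AAACACGTGA",
    "g4": "GGCACGTGCC", "g5": "ATATATATAT", "g6": "TTTTAAAACC",
    "g7": "GGGGCCCCAA", "g8": "ACACACACAC", "g9": "TGTGTGTGTG",
    "g10": "AATTCCGGAA",
}
k, p = enrichment_p(promoters, ["g1", "g2", "g3"], GBOX)
print(k, p)
```

    In a network setting, a test of this kind could be applied to each gene's neighbourhood to produce a per-gene score; the position bias analysis described in the abstract would need motif coordinates relative to the transcription start site, which this sketch does not model.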

  13. Drosophila melanogaster Hox Transcription Factors Access the RNA Polymerase II Machinery through Direct Homeodomain Binding to a Conserved Motif of Mediator Subunit Med19

    PubMed Central

    Boube, Muriel; Hudry, Bruno; Immarigeon, Clément; Carrier, Yannick; Bernat-Fabre, Sandra; Merabet, Samir; Graba, Yacine; Bourbon, Henri-Marc; Cribbs, David L.

    2014-01-01

    Hox genes in species across the metazoa encode transcription factors (TFs) containing highly-conserved homeodomains that bind target DNA sequences to regulate batteries of developmental target genes. DNA-bound Hox proteins, together with other TF partners, induce an appropriate transcriptional response by RNA Polymerase II (PolII) and its associated general transcription factors. How the evolutionarily conserved Hox TFs interface with this general machinery to generate finely regulated transcriptional responses remains obscure. One major component of the PolII machinery, the Mediator (MED) transcription complex, is composed of roughly 30 protein subunits organized in modules that bridge the PolII enzyme to DNA-bound TFs. Here, we investigate the physical and functional interplay between Drosophila melanogaster Hox developmental TFs and MED complex proteins. We find that the Med19 subunit directly binds Hox homeodomains, in vitro and in vivo. Loss-of-function Med19 mutations act as dose-sensitive genetic modifiers that synergistically modulate Hox-directed developmental outcomes. Using clonal analysis, we identify a role for Med19 in Hox-dependent target gene activation. We identify a conserved, animal-specific motif that is required for Med19 homeodomain binding, and for activation of a specific Ultrabithorax target. These results provide the first direct molecular link between Hox homeodomain proteins and the general PolII machinery. They support a role for Med19 as a PolII holoenzyme-embedded “co-factor” that acts together with Hox proteins through their homeodomains in regulated developmental transcription. PMID:24786462

  14. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    PubMed

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    The Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identifies the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development.

  15. Tissue- and stage-specific Wnt target gene expression is controlled subsequent to β-catenin recruitment to cis-regulatory modules

    PubMed Central

    Nakamura, Yukio; de Paiva Alves, Eduardo; Veenstra, Gert Jan C.; Hoppler, Stefan

    2016-01-01

    Key signalling pathways, such as canonical Wnt/β-catenin signalling, operate repeatedly to regulate tissue- and stage-specific transcriptional responses during development. Although recruitment of nuclear β-catenin to target genomic loci serves as the hallmark of canonical Wnt signalling, mechanisms controlling stage- or tissue-specific transcriptional responses remain elusive. Here, a direct comparison of genome-wide occupancy of β-catenin with a stage-matched Wnt-regulated transcriptome reveals that only a subset of β-catenin-bound genomic loci are transcriptionally regulated by Wnt signalling. We demonstrate that Wnt signalling regulates β-catenin binding to Wnt target genes not only when they are transcriptionally regulated, but also in contexts in which their transcription remains unaffected. The transcriptional response to Wnt signalling depends on additional mechanisms, such as BMP or FGF signalling for the particular genes we investigated, which do not influence β-catenin recruitment. Our findings suggest a more general paradigm for Wnt-regulated transcriptional mechanisms, which is relevant for tissue-specific functions of Wnt/β-catenin signalling in embryonic development but also for stem cell-mediated homeostasis and cancer. Chromatin association of β-catenin, even to functional Wnt-response elements, can no longer be considered a proxy for identifying transcriptionally Wnt-regulated genes. Context-dependent mechanisms are crucial for transcriptional activation of Wnt/β-catenin target genes subsequent to β-catenin recruitment. Our conclusions therefore also imply that Wnt-regulated β-catenin binding in one context can mark Wnt-regulated transcriptional target genes for different contexts. PMID:27068107

  16. Characterization of the human lipoprotein lipase (LPL) promoter: Evidence of two cis-regulatory regions, LP-[alpha] and LP-[beta] of importance for the differentiation-linked induction of the LPL gene during adipogenesis

    SciTech Connect

    Enerbaeck, S.; Ohlsson, B.G.; Samuelsson, L.; Bjursell, G.

    1992-10-01

    When preadipocytes differentiate into adipocytes, several differentiation-linked genes are activated. Lipoprotein lipase (LPL) is one of the first genes induced during this process. To investigate early events in adipocyte development, we have focused on the transcriptional activation of the LPL gene. For this purpose, we have cloned and fused different parts of intragenic and flanking sequences with a chloramphenicol acetyltransferase reporter gene. Transient transfection experiments and DNase I hypersensitivity assays indicate that several positive as well as negative elements contribute to transcriptional regulation of the LPL gene. When reporter gene constructs were stably introduced into preadipocytes, we were able to monitor and compare the activation patterns of different promoter deletion mutants at selected time points representing the process of adipocyte development. We could delimit two cis-regulatory elements important for gradual activation of the LPL gene during adipocyte development in vitro. These elements, LP-[alpha] (-702 to -666) and LP-[beta] (-468 to -430), contain a striking similarity to a consensus sequence known to bind the transcription factors HNF-3 and fork head. Results of gel mobility shift assays and DNase I and exonuclease III in vitro protection assays indicate that factors with DNA-binding properties similar to those of the HNF-3/fork head family of transcription factors are present in adipocytes and interact with LP-[alpha] and LP-[beta]. We also demonstrate that LP-[alpha] and LP-[beta] were both capable of conferring a differentiation-linked expression pattern to a heterologous promoter, thus mimicking the expression of the endogenous LPL gene during adipocyte differentiation. These findings indicate that interactions with LP-[alpha] and LP-[beta] could be a part of a differentiation switch governing induction of the LPL gene during adipocyte differentiation. 48 refs., 11 figs.

  17. Characterization of the human lipoprotein lipase (LPL) promoter: evidence of two cis-regulatory regions, LP-alpha and LP-beta, of importance for the differentiation-linked induction of the LPL gene during adipogenesis.

    PubMed Central

    Enerbäck, S; Ohlsson, B G; Samuelsson, L; Bjursell, G

    1992-01-01

    When preadipocytes differentiate into adipocytes, several differentiation-linked genes are activated. Lipoprotein lipase (LPL) is one of the first genes induced during this process. To investigate early events in adipocyte development, we have focused on the transcriptional activation of the LPL gene. For this purpose, we have cloned and fused different parts of intragenic and flanking sequences with a chloramphenicol acetyltransferase reporter gene. Transient transfection experiments and DNase I hypersensitivity assays indicate that several positive as well as negative elements contribute to transcriptional regulation of the LPL gene. When reporter gene constructs were stably introduced into preadipocytes, we were able to monitor and compare the activation patterns of different promoter deletion mutants at selected time points representing the process of adipocyte development. We could delimit two cis-regulatory elements important for gradual activation of the LPL gene during adipocyte development in vitro. These elements, LP-alpha (-702 to -666) and LP-beta (-468 to -430), contain a striking similarity to a consensus sequence known to bind the transcription factors HNF-3 and fork head. Results of gel mobility shift assays and DNase I and exonuclease III in vitro protection assays indicate that factors with DNA-binding properties similar to those of the HNF-3/fork head family of transcription factors are present in adipocytes and interact with LP-alpha and LP-beta. We also demonstrate that LP-alpha and LP-beta were both capable of conferring a differentiation-linked expression pattern to a heterologous promoter, thus mimicking the expression of the endogenous LPL gene during adipocyte differentiation. These findings indicate that interactions with LP-alpha and LP-beta could be a part of a differentiation switch governing induction of the LPL gene during adipocyte differentiation. PMID:1406652

  18. Gene expression profiling of cultured human NF1 heterozygous (NF1+/-) melanocytes reveals downregulation of a transcriptional cis-regulatory network mediating activation of the melanocyte-specific dopachrome tautomerase (DCT) gene.

    PubMed

    Boucneau, Joachim; De Schepper, Sofie; Vuylsteke, Marnik; Van Hummelen, Paul; Naeyaert, Jean-Marie; Lambert, Jo

    2005-08-01

    Among the major primary features of the neurocutaneous genetic disorder Neurofibromatosis type 1 are the hyperpigmentary café-au-lait macules, in which dysregulation of melanocyte biology is thought to play a key etiopathogenic role. To gain better insight into the possible role of the tumor suppressor gene NF1, a transcriptomic microarray analysis was performed on human NF1 heterozygous (NF1+/-) melanocytes of a Neurofibromatosis type 1 patient and NF1 wild type (NF1+/+) melanocytes of a healthy control patient, both cultured from normally pigmented skin and hyperpigmented lesional café-au-lait skin. From the magnitude of gene effects, we found that gene expression was affected most strongly by genotype and less so by lesional type. A total of 137 genes had a significant twofold or more up- (72) or downregulated (65) expression in NF1+/- melanocytes compared with NF1+/+ melanocytes. Melanocytes cultured from hyperpigmented café-au-lait skin showed 37 upregulated genes whereas only 14 were downregulated compared with normal skin melanocytes. In addition, significant genotype × lesional type interactions were observed for 465 genes. Differentially expressed genes were mainly involved in regulating cell proliferation and cell adhesion. A high number of transcription factor genes, among which a specific subset important in melanocyte lineage development, were downregulated in the cis-regulatory network governing the activation of the melanocyte-specific dopachrome tautomerase (DCT) gene. Although the results presented have been obtained with a restricted number of patients (one NF1 patient and one control) and using cDNA microarrays that may limit their interpretation, the data nevertheless address for the first time the effect of a heterozygous NF1 gene on the expression of the human melanocyte transcriptome and have generated several interesting candidate genes helpful in elucidating the etiopathology of café-au-lait macules in NF1 patients.

  19. Genome-wide upstream motif analysis of Cryptosporidium parvum genes clustered by expression profile

    PubMed Central

    2013-01-01

    Background: There are very few molecular genetic tools available to study the apicomplexan parasite Cryptosporidium parvum. The organism is not amenable to continuous in vitro cultivation or transfection, and purification of intracellular developmental stages in sufficient numbers for most downstream molecular applications is difficult and expensive since animal hosts are required. As such, very little is known about gene regulation in C. parvum. Results: We have clustered whole-genome gene expression profiles generated from a previous study of seven post-infection time points of 3,281 genes to identify genes that show similar expression patterns throughout the first 72 hours of in vitro epithelial cell culture. We used the algorithms MEME, AlignACE and FIRE to identify conserved, overrepresented DNA motifs in the upstream promoter region of genes with similar expression profiles. The most overrepresented motifs were E2F (5′-TGGCGCCA-3′); G-box (5′-G.GGGG-3′); a well-documented ApiAP2 binding motif (5′-TGCAT-3′), and an unknown motif (5′-[A/C] AACTA-3′). We generated a recombinant C. parvum DNA-binding protein domain from a putative ApiAP2 transcription factor [CryptoDB: cgd8_810] and determined its binding specificity using protein-binding microarrays. We demonstrate that cgd8_810 can putatively bind the overrepresented G-box motif, implicating this ApiAP2 in the regulation of many gene clusters. Conclusion: Several DNA motifs were identified in the upstream sequences of gene clusters that might serve as potential cis-regulatory elements. These motifs, in concert with protein DNA binding site data, establish for the first time the beginnings of a global C. parvum gene regulatory map that will contribute to our understanding of the development of this zoonotic parasite. PMID:23895416

  20. Genome-wide upstream motif analysis of Cryptosporidium parvum genes clustered by expression profile.

    PubMed

    Oberstaller, Jenna; Joseph, Sandeep J; Kissinger, Jessica C

    2013-07-29

    There are very few molecular genetic tools available to study the apicomplexan parasite Cryptosporidium parvum. The organism is not amenable to continuous in vitro cultivation or transfection, and purification of intracellular developmental stages in sufficient numbers for most downstream molecular applications is difficult and expensive since animal hosts are required. As such, very little is known about gene regulation in C. parvum. We have clustered whole-genome gene expression profiles generated from a previous study of seven post-infection time points of 3,281 genes to identify genes that show similar expression patterns throughout the first 72 hours of in vitro epithelial cell culture. We used the algorithms MEME, AlignACE and FIRE to identify conserved, overrepresented DNA motifs in the upstream promoter region of genes with similar expression profiles. The most overrepresented motifs were E2F (5'-TGGCGCCA-3'); G-box (5'-G.GGGG-3'); a well-documented ApiAP2 binding motif (5'-TGCAT-3'), and an unknown motif (5'-[A/C] AACTA-3'). We generated a recombinant C. parvum DNA-binding protein domain from a putative ApiAP2 transcription factor [CryptoDB: cgd8_810] and determined its binding specificity using protein-binding microarrays. We demonstrate that cgd8_810 can putatively bind the overrepresented G-box motif, implicating this ApiAP2 in the regulation of many gene clusters. Several DNA motifs were identified in the upstream sequences of gene clusters that might serve as potential cis-regulatory elements. These motifs, in concert with protein DNA binding site data, establish for the first time the beginnings of a global C. parvum gene regulatory map that will contribute to our understanding of the development of this zoonotic parasite.
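
    The four consensus patterns reported above can be used directly to count motif occurrences in an upstream region. The sketch below does this with regular expressions, writing the unknown motif as [AC]AACTA. It is only an illustration: the upstream sequence is invented, only the given strand is scanned, and the function name is hypothetical. It is not a reimplementation of MEME, AlignACE or FIRE, which discover motifs rather than count known ones.

```python
import re

# Consensus patterns as given in the abstract; "." matches any base.
MOTIFS = {
    "E2F": "TGGCGCCA",
    "G-box": "G.GGGG",
    "ApiAP2": "TGCAT",
    "unknown": "[AC]AACTA",
}


def count_motifs(seq, motifs=MOTIFS):
    """Count occurrences of each motif in seq, including overlapping ones.

    A zero-width lookahead lets matches that share bases (for example two
    TGCAT sites sharing a T) both be counted. Forward strand only.
    """
    seq = seq.upper()
    return {name: len(re.findall("(?=%s)" % pat, seq))
            for name, pat in motifs.items()}


# Invented upstream sequence used purely to exercise the function.
upstream = "TTTGGCGCCAAATGCATGCATTTGAGGGGTTCAACTA"
counts = count_motifs(upstream)
print(counts)
```

    Counts like these, computed per gene and compared between an expression cluster and the rest of the genome, are the raw input an overrepresentation test would work from.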

  1. Redox active motifs in selenoproteins.

    PubMed

    Li, Fei; Lutz, Patricia B; Pepelyayeva, Yuliya; Arnér, Elias S J; Bayse, Craig A; Rozovsky, Sharon

    2014-05-13

    Selenoproteins use the rare amino acid selenocysteine (Sec) to act as the first line of defense against oxidants, which are linked to aging, cancer, and neurodegenerative diseases. Many selenoproteins are oxidoreductases in which the reactive Sec is connected to a neighboring Cys and able to form a ring. These Sec-containing redox motifs govern much of the reactivity of selenoproteins. To study their fundamental properties, we have used (77)Se NMR spectroscopy in concert with theoretical calculations to determine the conformational preferences and mobility of representative motifs. This use of (77)Se as a probe enables the direct recording of the properties of Sec as its environment is systematically changed. We find that all motifs have several ring conformations in their oxidized state. These ring structures are most likely stabilized by weak, nonbonding interactions between the selenium and the amide carbon. To examine how the presence of selenium and ring geometric strain governs the motifs' reactivity, we measured the redox potentials of Sec-containing motifs and their corresponding Cys-only variants. The comparisons reveal that for C-terminal motifs the redox potentials increased by 20-25 mV when the selenenylsulfide bond was changed to a disulfide bond. Changes of similar magnitude arose when we varied ring size or the motifs' flanking residues. This suggests that the presence of Sec is not tied to unusually low redox potentials. The unique roles of selenoproteins in human health and their chemical reactivities may therefore not necessarily be explained by lower redox potentials, as has often been claimed.

  2. G4 motifs affect origin positioning and efficiency in two vertebrate replicators

    PubMed Central

    Valton, Anne-Laure; Hassan-Zadeh, Vahideh; Lema, Ingrid; Boggetto, Nicole; Alberti, Patrizia; Saintomé, Carole; Riou, Jean-François; Prioleau, Marie-Noëlle

    2014-01-01

    DNA replication ensures the accurate duplication of the genome at each cell cycle. It begins at specific sites called replication origins. Genome-wide studies in vertebrates have recently identified a consensus G-rich motif potentially able to form G-quadruplexes (G4) in most replication origins. However, there is no experimental evidence to demonstrate that G4 are actually required for replication initiation. We show here, with two model origins, that G4 motifs are required for replication initiation. Two G4 motifs cooperate in one of our model origins. The other contains only one critical G4, and its orientation determines the precise position of the replication start site. Point mutations affecting the stability of this G4 in vitro also impair origin function. Finally, this G4 is not sufficient for origin activity and must cooperate with a 200-bp cis-regulatory element. In conclusion, our study strongly supports the predicted essential role of G4 in replication initiation. PMID:24521668

  3. Gene regulation during late embryogenesis: the RY motif of maturation-specific gene promoters is a direct target of the FUS3 gene product.

    PubMed

    Reidt, W; Wohlfarth, T; Ellerström, M; Czihal, A; Tewes, A; Ezcurra, I; Rask, L; Bäumlein, H

    2000-03-01

    The Arabidopsis mutants fus3 and abi3 show pleiotropic effects during embryogenesis including reduced levels of transcripts encoding embryo-specific seed proteins. To investigate the interaction between the B3-domain-containing transcription factors FUS3 and ABI3 with the RY cis-motif, conserved in many seed-specific promoters, a promoter analysis as well as band-shift experiments were performed. The analysis of promoter mutants revealed the structural requirements for the function of the RY cis-element. It is shown that both the nucleotide sequence and the alternation of purine and pyrimidine nucleotides (RY character) are essential for the activity of the motif. Further, it was shown that FUS3 and ABI3 can act independently of each other in controlling promoter activity and that the RY cis-motif is a target for both transcription factors. For FUS3, which is so far the smallest known member of the B3-domain family, a physical interaction with the RY motif was established. The functional and biochemical data demonstrate that the regulators FUS3 and ABI3 are essential components of a regulatory network acting in concert through the RY-promoter element to control gene expression during late embryogenesis and seed development.
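
    The "RY character" described above, a strict alternation of purines (R: A, G) and pyrimidines (Y: C, T), is easy to check programmatically. The sketch below is illustrative only: the abstract does not give the motif sequence, so CATGCATG is used as an assumed example of an RY repeat, and the mutant sequences and function names are hypothetical.

```python
PURINES = frozenset("AG")      # R
PYRIMIDINES = frozenset("CT")  # Y


def ry_string(seq):
    """Translate a DNA sequence into its purine/pyrimidine (R/Y) string."""
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            raise ValueError("unexpected base: %r" % base)
    return "".join(out)


def is_ry_alternating(seq):
    """True if purines and pyrimidines strictly alternate along seq."""
    ry = ry_string(seq)
    return all(a != b for a, b in zip(ry, ry[1:]))


# Assumed example sequences, not taken from the paper.
wild_type = "CATGCATG"       # alternates: YRYRYRYR
keeps_ry = "TGCACGTA"        # different bases, alternation preserved
breaks_ry = "CATGGATG"       # C->G change creates two adjacent purines
for s in (wild_type, keeps_ry, breaks_ry):
    print(s, ry_string(s), is_ry_alternating(s))
```

    The second sequence shows why the abstract distinguishes the two requirements: a mutant can keep the RY alternation while changing the nucleotide sequence, so the two properties can be tested separately.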

  4. PhyloGibbs-MP: module prediction and discriminative motif-finding by Gibbs sampling.

    PubMed

    Siddharthan, Rahul

    2008-08-29

    PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules-tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that use prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other "discriminative motif-finders" have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites and compares with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use "informative priors" on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data.

  5. FastMotif: spectral sequence motif discovery.

    PubMed

    Colombo, Nicoló; Vlassis, Nikos

    2015-08-15

    Sequence discovery tools play a central role in several fields of computational biology. In the framework of transcription factor binding studies, most of the existing motif finding algorithms are computationally demanding, and they may not be able to support the increasingly large datasets produced by modern high-throughput sequencing technologies. We present FastMotif, a new motif discovery algorithm that is built on a recent machine learning technique referred to as Method of Moments. Based on spectral decompositions, our method is robust to model misspecifications and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. On HT-Selex data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm's robustness and discuss its sensitivity with respect to the free parameters. Availability: The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics. Contact: vlassis@adobe.com Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  6. DISCOVER: a feature-based discriminative method for motif search in complex genomes

    PubMed Central

    Fu, Wenjie; Ray, Pradipta; Xing, Eric P.

    2009-01-01

    Motivation: Identifying transcription factor binding sites (TFBSs) encoding complex regulatory signals in metazoan genomes remains a challenging problem in computational genomics. Due to degeneracy of nucleotide content among binding site instances or motifs, and intricate ‘grammatical organization’ of motifs within cis-regulatory modules (CRMs), extant pattern matching-based in silico motif search methods often suffer from impractically high false positive rates, especially in the context of analyzing large genomic datasets, and noisy position weight matrices which characterize binding sites. Here, we try to address this problem by using a framework to maximally utilize the information content of the genomic DNA in the region of query, taking cues from values of various biologically meaningful genetic and epigenetic factors in the query region such as clade-specific evolutionary parameters, presence/absence of nearby coding regions, etc. We present a new method for TFBS prediction in metazoan genomes that utilizes both the CRM architecture of sequences and a variety of features of individual motifs. Our proposed approach is based on a discriminative probabilistic model known as conditional random fields that explicitly optimizes the predictive probability of motif presence in large sequences, based on the joint effect of all such features. Results: This model overcomes weaknesses in earlier methods based on less effective statistical formalisms that are sensitive to spurious signals in the data. We evaluate our method on both simulated CRMs and real Drosophila sequences in comparison with a wide spectrum of existing models, and outperform the state of the art by 22% in F1 score. Availability and Implementation: The code is publicly available at http://www.sailing.cs.cmu.edu/discover.html. Contact: epxing@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19478006

  7. [Psychopathological study of lie motif in schizophrenia].

    PubMed

    Otsuka, Koichiro; Kato, Satoshi

    2006-01-01

    The theme of a statement is called "lie motif" by the authors when schizophrenic patients say "I have lied to anybody". We tried to analyse the psychopathological characteristics and anthropological meanings of the lie motifs in schizophrenia, which have not been thematically examined until now, based on 4 cases, and contrasting with the lie motif (Lügenmotiv) in depression taken up by A. Kraus (1989). We classified the lie motifs in schizophrenia into the following two types: a) the past directive lie motif: the patients speak about their real lie regarding it as a 'petty fault' in their distant past with feelings of guilt, b) the present directive lie motif: the patients say repeatedly 'I have lied' (about their present speech and behavior), retreating from their previous commitments. The observed false confessions of innocent fault by the patients seem to belong to the present directive lie motif. In comparison with the lie motif in depression, it is characteristic for the lie motif in schizophrenia that the patients feel themselves to already have been caught out by others before they confess the lie. The lie motif in schizophrenia seems to come into being through the attribution process of taking the others' blame on one's own shoulders, which has been pointed out to be common in the guilt experience in schizophrenia. The others' blame on this occasion is due to "the others' gaze" in the experience of the initial self-centralization (i.e. non-delusional self-referential experience) in the early stage of schizophrenia (S. Kato 1999). The others' gaze is supposed to bring about the feeling of amorphous self-revelation, which could also be regarded as the guilt feeling without content, to the patients. When the guilt feeling is bound with a past concrete fault, the patients tell the past directive lie motif. On the other hand, when the patients cannot find a past fixed content, and feel their present actions as uncertain and experience them as lies, the

  8. Temporal motifs in time-dependent networks

    NASA Astrophysics Data System (ADS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-11-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  9. Large Putative PEST-like Sequence Motif at the Carboxyl Tail of Human Calcium Receptor Directs Lysosomal Degradation and Regulates Cell Surface Receptor Level*

    PubMed Central

    Zhuang, Xiaolei; Northup, John K.; Ray, Kausik

    2012-01-01

    A deletion between amino acid residues Ser895 and Val1075 in the carboxyl terminus of the human calcium receptor (hCaR), which causes autosomal dominant hypocalcemia, showed enhanced signaling activity and increased cell surface expression in HEK293 cells (Lienhardt, A., Garabédian, M. G., Bai, M., Sinding, C., Zhang, Z., Lagarde, J. P., Boulesteix, J., Rigaud, M., Brown, E. M., and Kottler, M. L. (2000) J. Clin. Endocrinol. Metab. 85, 1695–1702). To identify the underlying mechanism(s) for these increases, we investigated the effects of carboxyl tail truncation and deletion in hCaR mutants using a combination of biochemical and cell imaging approaches to define motifs that participate in regulating cell surface numbers of this G protein-coupled receptor. Our data indicate a rapid constitutive receptor internalization of the cell surface hCaR, accumulating in early (Rab7 positive) and late endosomal (LAMP1 positive) sorting compartments, before targeting to lysosomes for degradation. Recycling of hCaR back to the cell surface was also evident. Truncation and deletion mapping defined a 51-amino acid sequence between residues 920 and 970 that is required for targeting to lysosomes and degradation but not for internalization or recycling of the receptor. No single sequence motif was identified; instead, the required sequence elements seem to be distributed throughout this entire interval. This interval includes a high proportion of acidic and hydroxylated amino acid residues, suggesting a similarity to a PEST-like degradation motif (PESTfind score of +10), and several glutamine repeats. The results define a novel large PEST-like sequence that participates in the sorting of internalized hCaR routed to the lysosomal/degradation pathway that regulates cell surface receptor numbers. PMID:22158862

  10. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs

    PubMed Central

    Gromadzka, Agnieszka M.; Steckelberg, Anna-Lena; Singh, Kusum K.; Hofmann, Kay; Gehring, Niels H.

    2016-01-01

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. PMID:26773052

  11. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs.

    PubMed

    Gromadzka, Agnieszka M; Steckelberg, Anna-Lena; Singh, Kusum K; Hofmann, Kay; Gehring, Niels H

    2016-03-18

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs.

  12. Protospacer recognition motifs

    PubMed Central

    Shah, Shiraz A.; Erdmann, Susanne; Mojica, Francisco J.M.; Garrett, Roger A.

    2013-01-01

    Protospacer adjacent motifs (PAMs) were originally characterized for CRISPR-Cas systems that were classified on the basis of their CRISPR repeat sequences. A few short 2–5 bp sequences were identified adjacent to one end of the protospacers. Experimental and bioinformatical results linked the motif to the excision of protospacers and their insertion into CRISPR loci. Subsequently, evidence accumulated from different virus- and plasmid-targeting assays, suggesting that these motifs were also recognized during DNA interference, at least for the recently classified type I and type II CRISPR-based systems. The two processes, spacer acquisition and protospacer interference, employ different molecular mechanisms, and there is increasing evidence to suggest that the sequence motifs that are recognized, while overlapping, are unlikely to be identical. In this article, we consider the properties of PAM sequences and summarize the evidence for their dual functional roles. It is proposed to use the terms protospacer associated motif (PAM) for the conserved DNA sequence and to employ spacer acquisition motif (SAM) and target interference motif (TIM), respectively, for acquisition and interference recognition sites. PMID:23403393

  13. Transcription factors that directly regulate the expression of CSLA9 encoding mannan synthase in Arabidopsis thaliana.

    PubMed

    Kim, Won-Chan; Reca, Ida-Barbara; Kim, Yongsig; Park, Sunchung; Thomashow, Michael F; Keegstra, Kenneth; Han, Kyung-Hwan

    2014-03-01

    Mannans are hemicellulosic polysaccharides that have a structural role and serve as storage reserves during plant growth and development. Previous studies led to the conclusion that mannan synthase enzymes in several plant species are encoded by members of the cellulose synthase-like A (CSLA) gene family. Arabidopsis has nine members of the CSLA gene family. Earlier work has shown that CSLA9 is responsible for the majority of glucomannan synthesis in both primary and secondary cell walls of Arabidopsis inflorescence stems. Little is known about how expression of the CSLA9 gene is regulated. Sequence analysis of the CSLA9 promoter region revealed the presence of multiple copies of a cis-regulatory motif (M46RE) recognized by transcription factor MYB46, leading to the hypothesis that MYB46 (At5g12870) is a direct regulator of the mannan synthase CSLA9. We obtained several lines of experimental evidence in support of this hypothesis. First, the expression of CSLA9 was substantially upregulated by MYB46 overexpression. Second, electrophoretic mobility shift assay (EMSA) was used to demonstrate the direct binding of MYB46 to the promoter of CSLA9 in vitro. This interaction was further confirmed in vivo by a chromatin immunoprecipitation assay. Finally, over-expression of MYB46 resulted in a significant increase in mannan content. Considering the multifaceted nature of MYB46-mediated transcriptional regulation of secondary wall biosynthesis, we reasoned that additional transcription factors are involved in the CSLA9 regulation. This hypothesis was tested by carrying out yeast one-hybrid screening, which identified ANAC041 and bZIP1 as direct regulators of CSLA9. Transcriptional activation assays and EMSA were used to confirm the yeast one-hybrid results. Taken together, we report that transcription factors ANAC041, bZIP1 and MYB46 directly regulate the expression of CSLA9.

  14. Stochastic motif extraction using hidden Markov model

    SciTech Connect

    Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko

    1994-12-31

    In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directive reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Compared with a symbolic pattern representation, which had an accuracy of 14.8 percent, the HMM achieved 79.3 percent in prediction. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of the protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination is plausible.
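
    As an illustration of what comparing sequences "in terms of likelihood" under an HMM can look like, the following is a minimal Python sketch of the standard forward algorithm. It is not the authors' iterative duplication method, and it does no topology learning. The function name `forward_likelihood`, the two-state toy model and the two-letter alphabet are all assumptions made for this example.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | HMM) using the forward algorithm.

    obs   : non-empty sequence of symbols (e.g. a string of residues)
    start : start[s] is the probability of beginning in state s
    trans : trans[p][s] is the probability of moving from state p to s
    emit  : emit[s] is a dict mapping a symbol to its emission probability
    """
    states = range(len(start))
    # Initialise with the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in states]
    # Propagate through the remaining symbols.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in states) * emit[s][symbol]
            for s in states
        ]
    return sum(alpha)


# Hypothetical toy model (values chosen only for illustration).
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
EMIT = [{"L": 0.7, "A": 0.3}, {"L": 0.1, "A": 0.9}]

if __name__ == "__main__":
    print(forward_likelihood("LAL", START, TRANS, EMIT))
```

    For realistic protein lengths the probabilities underflow, so a practical version would work in log space or rescale `alpha` at each step.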

  15. Prenatal Exposure of Mice to Diethylstilbestrol Disrupts T-Cell Differentiation by Regulating Fas/Fas Ligand Expression through Estrogen Receptor Element and Nuclear Factor-κB Motifs

    PubMed Central

    Singh, Narendra P.; Singh, Udai P.; Nagarkatti, Prakash S.

    2012-01-01

    Prenatal exposure to diethylstilbestrol (DES) is known to cause altered immune functions and increased susceptibility to autoimmune disease in humans. In the current study, we investigated the effect of prenatal exposure to DES on thymocyte differentiation involving apoptotic pathways. Prenatal DES exposure caused thymic atrophy, apoptosis, and up-regulation of Fas and Fas ligand (FasL) expression in thymocytes. To examine the mechanism underlying DES-mediated regulation of Fas and FasL, we performed luciferase assays using T cells transfected with luciferase reporter constructs containing full-length Fas or FasL promoters. There was significant luciferase induction in the presence of Fas or FasL promoters after DES exposure. Further analysis demonstrated the presence of several cis-regulatory motifs on both Fas and FasL promoters. When DES-induced transcription factors were analyzed, estrogen receptor element (ERE), nuclear factor κB (NF-κB), nuclear factor of activated T cells (NF-AT), and activator protein-1 motifs on the Fas promoter, as well as ERE, NF-κB, and NF-AT motifs on the FasL promoter, showed binding affinity with the transcription factors. Electrophoretic mobility-shift assays were performed to verify the binding affinity of cis-regulatory motifs of Fas or FasL promoters with transcription factors. There was a shift in mobility of probes (ERE or NF-κB2) of both Fas and FasL in the presence of nuclear proteins from DES-treated cells, and the shift was specific to DES because these probes failed to shift their mobility in the presence of nuclear proteins from vehicle-treated cells. Together, the current study demonstrates that prenatal exposure to DES triggers significant alterations in apoptotic molecules expressed on thymocytes, which may affect T-cell differentiation and cause long-term effects on the immune functions. PMID:22888145

  16. Motif enrichment tool.

    PubMed

    Blatti, Charles; Sinha, Saurabh

    2014-07-01

    The Motif Enrichment Tool (MET) provides an online interface that enables users to find major transcriptional regulators of their gene sets of interest. MET searches the appropriate regulatory region around each gene and identifies which transcription factor DNA-binding specificities (motifs) are statistically overrepresented. Motif enrichment analysis is currently available for many metazoan species including human, mouse, fruit fly, planaria and flowering plants. MET also leverages high-throughput experimental data such as ChIP-seq and DNase-seq from ENCODE and ModENCODE to identify the regulatory targets of a transcription factor with greater precision. The results from MET are produced in real time and are linked to a genome browser for easy follow-up analysis. Use of the web tool is free and open to all, and there is no login requirement. ADDRESS: http://veda.cs.uiuc.edu/MET/.
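
    The abstract does not say which statistical test MET uses to decide that a motif is overrepresented. As a hedged illustration of the general idea only, the Python sketch below computes a one-sided hypergeometric p-value, a common choice for gene-set enrichment. The function name `hypergeom_enrichment_p` and the example counts are assumptions and are not taken from MET.

```python
from math import comb


def hypergeom_enrichment_p(population, motif_total, sample, motif_in_sample):
    """One-sided hypergeometric p-value for motif overrepresentation.

    population      : number of genes in the background
    motif_total     : background genes whose regulatory region has the motif
    sample          : number of genes in the user's gene set
    motif_in_sample : genes in the set whose regulatory region has the motif

    Returns P(X >= motif_in_sample) when `sample` genes are drawn
    without replacement from the background.
    """
    upper = min(motif_total, sample)
    total = comb(population, sample)
    tail = sum(
        comb(motif_total, k) * comb(population - motif_total, sample - k)
        for k in range(motif_in_sample, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 20,000 background genes, 500 with the motif,
    # a gene set of 100 genes, 12 of which carry the motif.
    print(hypergeom_enrichment_p(20000, 500, 100, 12))
```

    When many motifs are tested against the same gene set, the resulting p-values would normally be corrected for multiple testing.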

  17. Cross-disciplinary detection and analysis of network motifs.

    PubMed

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three- and four-node level, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. The Pearson correlation coefficient showed a strong positive relationship between some undirected networks but an inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. The Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein-protein interaction network, primary school contact network, Zachary's karate club network, and co-purchase of political books network can be classified into a superfamily.
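
    To make the three-node case concrete, here is a small Python sketch that counts the two connected three-node subgraphs of an undirected network: triangles and open paths. It is an assumption-laden illustration rather than the tool used in the study, and the function name `count_three_node_subgraphs` is hypothetical. It only counts subgraphs; deciding that a subgraph is a motif additionally requires comparing these counts against randomized networks.

```python
from itertools import combinations


def count_three_node_subgraphs(edges):
    """Count triangles and open three-node paths in an undirected graph.

    edges : iterable of (u, v) pairs; self-loops and repeated edges are ignored.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    triangle_corners = 0
    open_paths = 0
    # Every pair of neighbours of a node forms a path centred on that node.
    for neighbours in adjacency.values():
        for a, b in combinations(neighbours, 2):
            if b in adjacency[a]:
                triangle_corners += 1  # each triangle is seen from 3 corners
            else:
                open_paths += 1
    return {"triangle": triangle_corners // 3, "open_path": open_paths}


if __name__ == "__main__":
    # Hypothetical toy network: one triangle (1-2-3) with a pendant node 4.
    print(count_three_node_subgraphs([(1, 2), (2, 3), (1, 3), (3, 4)]))
```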

  18. No tradeoff between versatility and robustness in gene circuit motifs

    NASA Astrophysics Data System (ADS)

    Payne, Joshua L.

    2016-05-01

    Circuit motifs are small directed subgraphs that appear in real-world networks significantly more often than in randomized networks. In the Boolean model of gene circuits, most motifs are realized by multiple circuit genotypes. Each of a motif's constituent circuit genotypes may have one or more functions, which are embodied in the expression patterns the circuit forms in response to specific initial conditions. Recent enumeration of a space of nearly 17 million three-gene circuit genotypes revealed that all circuit motifs have more than one function, with the number of functions per motif ranging from 12 to nearly 30,000. This indicates that some motifs are more functionally versatile than others. However, the individual circuit genotypes that constitute each motif are less robust to mutation if they have many functions, hinting that functionally versatile motifs may be less robust to mutation than motifs with few functions. Here, I explore the relationship between versatility and robustness in circuit motifs, demonstrating that functionally versatile motifs are robust to mutation despite the inherent tradeoff between versatility and robustness at the level of an individual circuit genotype.

  19. A G-Box-Like Motif Is Necessary for Transcriptional Regulation by Circadian Pseudo-Response Regulators in Arabidopsis.

    PubMed

    Liu, Tiffany L; Newton, Linsey; Liu, Ming-Jung; Shiu, Shin-Han; Farré, Eva M

    2016-01-01

    PSEUDO-RESPONSE REGULATORs (PRRs) play overlapping and distinct roles in maintaining circadian rhythms and regulating diverse biological processes, including the photoperiodic control of flowering, growth, and abiotic stress responses. PRRs act as transcriptional repressors and associate with chromatin via their conserved C-terminal CCT (CONSTANS, CONSTANS-like, and TIMING OF CAB EXPRESSION 1 [TOC1/PRR1]) domains by a still-poorly understood mechanism. Here, we identified genome-wide targets of PRR9 using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) and compared them with PRR7, PRR5, and TOC1/PRR1 ChIP-seq data. We found that PRR binding sites are located within genomic regions of low nucleosome occupancy and high DNase I hypersensitivity. Moreover, conserved noncoding regions among Brassicaceae species are enriched around PRR binding sites, indicating that PRRs associate with functionally relevant cis-regulatory regions. The PRRs shared a significant number of binding regions, and our results indicate that they coordinately restrict the expression of target genes to around dawn. A G-box-like motif was overrepresented at PRR binding regions, and we showed that this motif is necessary for mediating transcriptional regulation of CIRCADIAN CLOCK ASSOCIATED 1 and PRR9 by the PRRs. Our results further our understanding of how PRRs target specific promoters and provide an extensive resource for studying circadian regulatory networks in plants. © 2016 American Society of Plant Biologists. All Rights Reserved.

  20. Motifs from the deep

    PubMed Central

    Hwang, Tony W; Codrea, Vlad; Ellington, Andrew D

    2009-01-01

    Because of the increasing recognition of the importance of non-coding RNAs in gene regulation, there is considerable interest in identifying RNA motifs in genomic data. In a recent report in BMC Genomics, Breaker and colleagues describe a new algorithm for identifying functional noncoding RNAs in metagenomic sequences of marine organisms, a strategy that may be particularly effective for discovering new and unique riboswitches. PMID:19735583

  1. Structural Motif-Based Homology Modeling of CYP27A1 and Site-Directed Mutational Analyses Affecting Vitamin D Hydroxylation

    PubMed Central

    Prosser, David E.; Guo, YuDing; Jia, Zongchao; Jones, Glenville

    2006-01-01

    Human CYP27A1 is a mitochondrial cytochrome P450, which is principally found in the liver and plays important roles in the biological activation of vitamin D3 and in the biosynthesis of bile acids. We have applied a systematic analysis of hydrogen bonding patterns in 11 prokaryotic and mammalian CYP crystal structures to construct a homology-based model of CYP27A1. Docking of vitamin D3 structures into the active site of this model identified potential substrate contact residues in the F-helix, the β-3 sheet, and the β-5 sheet. Site-directed mutagenesis and expression in COS-1 cells confirmed that these positions affect enzymatic activity, in some cases shifting metabolism of 1α-hydroxyvitamin D3 to favor 25- or 27-hydroxylation. The results suggest that conserved hydrophobic residues in the β-5 hairpin help define the shape of the substrate binding cavity and that this structure interacts with Phe-248 in the F-helix. Mutations directed toward the β-3a strand suggested a possible heme-binding interaction centered on Asn-403 and a structural role for substrate contact residues Thr-402 and Ser-404. PMID:16500955

  2. Novel blocking human IgG directed against the pentapeptide repeat motifs of Neisseria meningitidis Lip/H.8 and Laz lipoproteins.

    PubMed

    Ray, Tathagat Dutta; Lewis, Lisa A; Gulati, Sunita; Rice, Peter A; Ram, Sanjay

    2011-04-15

    Ab-initiated, complement-dependent killing contributes to host defenses against invasive meningococcal disease. Sera from nonimmunized individuals vary widely in their bactericidal activity against group B meningococci. We show that IgG isolated from select individuals can block killing of group B meningococci by human sera that are otherwise bactericidal. This IgG also reduced the bactericidal efficacy of Abs directed against the group B meningococcal protein vaccine candidates factor H-binding protein currently undergoing clinical trials and Neisserial surface protein A. Immunoblots revealed that the blocking IgG was directed against a meningococcal Ag called H.8. Killing of meningococci in reactions containing bactericidal mAbs and human blocking Abs was restored when binding of blocking Ab to meningococci was inhibited using either synthetic peptides corresponding to H.8 or a nonblocking mAb against H.8. Furthermore, genetic deletion of H.8 from target organisms abrogated blocking. The Fc region of the blocking IgG was required for blocking because F(ab')(2) fragments were ineffective. Blocking required IgG glycosylation because deglycosylation with peptide:N-glycanase eliminated blocking. C4b deposition mediated by an anti-factor H-binding protein mAb was reduced by intact blocking IgG, but not by peptide:N-glycanase-treated blocking IgG, suggesting that blocking resulted from inhibition of classical pathway of complement. In conclusion, we have identified H.8 as a meningococcal target for novel blocking Abs in human serum. Such blocking Abs may reduce the efficacy of select antigroup B meningococcal protein vaccines. We also propose that outer membrane vesicle-containing meningococcal vaccines may be more efficacious if purged of subversive immunogens such as H.8.

  3. Motif-based embedding for graph clustering

    NASA Astrophysics Data System (ADS)

    Lim, Sungsu; Lee, Jae-Gil

    2016-12-01

    Community detection in complex networks is a fundamental problem that has been extensively studied owing to its wide range of applications. However, because community detection methods typically rely on the relations between vertices in networks, they may fail to discover higher-order graph substructures, called the network motifs. In this paper, we propose a novel embedding method for graph clustering that considers higher-order relationships involving multiple vertices. We show that our embedding method, which we call motif-based embedding, is more effective in detecting communities than existing graph embedding methods, spectral embedding and force-directed embedding, both theoretically and experimentally.

  4. Network motifs modulate druggability of cellular targets

    PubMed Central

    Wu, Fan; Ma, Cong; Tan, Cheemeng

    2016-01-01

    Druggability refers to the capacity of a cellular target to be modulated by a small-molecule drug. To date, druggability is mainly studied by focusing on direct binding interactions between a drug and its target. However, druggability is impacted by cellular networks connected to a drug target. Here, we use computational approaches to reveal basic principles of network motifs that modulate druggability. Through quantitative analysis, we find that inhibiting self-positive feedback loop is a more robust and effective treatment strategy than inhibiting other regulations, and adding direct regulations to a drug-target generally reduces its druggability. The findings are explained through analytical solution of the motifs. Furthermore, we find that a consensus topology of highly druggable motifs consists of a negative feedback loop without any positive feedback loops, and consensus motifs with low druggability have multiple positive direct regulations and positive feedback loops. Based on the discovered principles, we predict potential genetic targets in Escherichia coli that have either high or low druggability based on their network context. Our work establishes the foundation toward identifying and predicting druggable targets based on their network topology. PMID:27824147

  5. Dynamic motifs of strategies in prisoner's dilemma games

    NASA Astrophysics Data System (ADS)

    Kim, Young Jin; Roh, Myungkyoon; Jeong, Seon-Young; Son, Seung-Woo

    2014-12-01

    We investigate the win-lose relations between strategies of iterated prisoner's dilemma games by using a directed network concept to display the replicator dynamics results. In the giant strongly-connected component of the win/lose network, we find win-lose circulations similar to rock-paper-scissors and analyze the fixed point and its stability. Applying the network motif concept, we introduce dynamic motifs, which describe the population dynamics relations among the three strategies. Through exact enumeration, we find 22 dynamic motifs and display their phase portraits. Visualization using directed networks and motif analysis is a useful method to make complex dynamic behavior simple in order to understand it more intuitively. Dynamic motifs can be building blocks for dynamic behavior among strategies when they are applied to other types of games.
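
    The rock-paper-scissors-like circulation and its fixed point can be illustrated with a short Python sketch of the replicator equation, dx_i/dt = x_i (f_i - f_mean), integrated with a simple Euler step. The payoff matrix `RPS`, the step size and the function name `replicator_step` are assumptions for illustration and are not the strategies or parameters analysed in the paper.

```python
def replicator_step(x, payoff, dt=0.01):
    """Advance strategy frequencies x by one Euler step of the replicator equation.

    x      : list of strategy frequencies (non-negative, summing to 1)
    payoff : payoff[i][j] is the payoff to strategy i when it meets strategy j
    """
    n = len(x)
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]


# Hypothetical zero-sum rock-paper-scissors payoffs: each strategy beats one
# rival (+1) and loses to the other (-1).
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

if __name__ == "__main__":
    state = [0.5, 0.3, 0.2]
    for _ in range(5):
        state = replicator_step(state, RPS)
    print(state)
```

    With these payoffs the uniform mixture (1/3, 1/3, 1/3) is a fixed point, because every strategy then has the same fitness as the population mean. The Euler scheme is only a rough integrator; a smaller step or a higher-order method would be needed for faithful long-run trajectories.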

  6. The transcriptional complex between the BCL2 i-motif and hnRNP LL is a molecular switch for control of gene expression that can be modulated by small molecules.

    PubMed

    Kang, Hyun-Jin; Kendrick, Samantha; Hecht, Sidney M; Hurley, Laurence H

    2014-03-19

    In a companion paper (DOI: 10.1021/ja410934b) we demonstrate that the C-rich strand of the cis-regulatory element in the BCL2 promoter element is highly dynamic in nature and can form either an i-motif or a flexible hairpin. Under physiological conditions these two secondary DNA structures are found in an equilibrium mixture, which can be shifted by the addition of small molecules that trap out either the i-motif (IMC-48) or the flexible hairpin (IMC-76). In cellular experiments we demonstrate that the addition of these molecules has opposite effects on BCL2 gene expression and furthermore that these effects are antagonistic. In this contribution we have identified a transcriptional factor that recognizes and binds to the BCL2 i-motif to activate transcription. The molecular basis for the recognition of the i-motif by hnRNP LL is determined, and we demonstrate that the protein unfolds the i-motif structure to form a stable single-stranded complex. In subsequent experiments we show that IMC-48 and IMC-76 have opposite, antagonistic effects on the formation of the hnRNP LL-i-motif complex as well as on the transcription factor occupancy at the BCL2 promoter. For the first time we propose that the i-motif acts as a molecular switch that controls gene expression and that small molecules that target the dynamic equilibrium of the i-motif and the flexible hairpin can differentially modulate gene expression.

  7. Identification of conserved splicing motifs in mutually exclusive exons of 15 insect species.

    PubMed

    Buendia, Patricia; Tyree, John; Loredo, Robert; Hsu, Shu-Ning

    2012-04-12

    During alternative splicing, the inclusion of an exon in the final mRNA molecule is determined by nuclear proteins that bind cis-regulatory sequences in a target pre-mRNA molecule. A recent study suggested that the regulatory codes of individual RNA-binding proteins may be nearly immutable between very diverse species such as mammals and insects. The model system Drosophila melanogaster therefore presents an excellent opportunity for the study of alternative splicing due to the availability of quality EST annotations in FlyBase. In this paper, we describe an in silico analysis pipeline to extract putative exonic splicing regulatory sequences from a multiple alignment of 15 species of insects. Our method, ESTs-to-ESRs (E2E), uses graph analysis of EST splicing graphs to identify mutually exclusive (ME) exons and combines phylogenetic measures, a sliding window approach along the multiple alignment and the Welch's t statistic to extract conserved ESR motifs. The most frequent 100% conserved word of length 5 bp in different insect exons was "ATGGA". We identified 799 statistically significant "spike" hexamers, 218 motifs with either a left or right FDR corrected spike magnitude p-value < 0.05 and 83 with both left and right uncorrected p < 0.01. 11 genes were identified with highly significant motifs in one ME exon but not in the other, suggesting regulation of ME exon splicing through these highly conserved hexamers. The majority of these genes have been shown to have regulated spatiotemporal expression. 10 elements were found to match three mammalian splicing regulator databases. A putative ESR motif, GATGCAG, was identified in the ME-13b but not in the ME-13a of Drosophila N-Cadherin, a gene that has been shown to have a distinct spatiotemporal expression pattern of spliced isoforms in a recent study. Analysis of phylogenetic relationships and variability of sequence conservation as implemented in the E2E spikes method may lead to improved identification of ESRs
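    Illustrative sketch (not the E2E implementation): the combination of a sliding window with Welch's t statistic can be pictured as comparing per-column conservation scores inside a hexamer-sized window with those in the flank to its left. The window and flank widths, the use of the left flank only, and the names `welch_t` and `spike_scores` are assumptions made for this example.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both samples are constant: no spread to scale the difference by.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def spike_scores(conservation, width=6, flank=6):
    """Slide a window along per-column conservation scores.

    Returns (start, t) pairs, where t compares the window with the
    `flank` columns immediately to its left.
    """
    scores = []
    for start in range(flank, len(conservation) - width + 1):
        window = conservation[start:start + width]
        left = conservation[start - flank:start]
        scores.append((start, welch_t(window, left)))
    return scores


if __name__ == "__main__":
    columns = [0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.9, 1.0, 0.9, 1.0, 0.9, 1.0]
    print(spike_scores(columns))
```

    A large positive t marks a window that is markedly more conserved than its flank, which is the intuition behind a conservation "spike"; multiple-testing correction such as the FDR step mentioned above is left out of the sketch.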

  8. Exclusion of RNA strands from a purine motif triple helix.

    PubMed Central

    Semerad, C L; Maher, L J

    1994-01-01

    Research concerning oligonucleotide-directed triple helix formation has mainly focused on the binding of DNA oligonucleotides to duplex DNA. The participation of RNA strands in triple helices is also of interest. For the pyrimidine motif (pyrimidine.purine.pyrimidine triplets), systematic substitution of RNA for DNA in one, two, or all three triplex strands has previously been reported. For the purine motif (purine.purine.pyrimidine triplets), studies have shown only that RNA cannot bind to duplex DNA. To extend this result, we created a DNA triple helix in the purine motif and systematically replaced one, two, or all three strands with RNA. In dramatic contrast to the general accommodation of RNA strands in the pyrimidine triple helix motif, a stable triplex forms in the purine motif only when all three of the substituent strands are DNA. The lack of triplex formation among any of the other seven possible strand combinations involving RNA suggests that: (i) duplex structures containing RNA cannot be targeted by DNA oligonucleotides in the purine motif; (ii) RNA strands cannot be employed to recognize duplex DNA in the purine motif; and (iii) RNA tertiary structures are likely to contain only isolated base triplets in the purine motif. PMID:7529405

  9. Conserved stem-loop structures in the HIV-1 RNA region containing the A3 3' splice site and its cis-regulatory element: possible involvement in RNA splicing.

    PubMed

    Jacquenet, S; Ropers, D; Bilodeau, P S; Damier, L; Mougin, A; Stoltzfus, C M; Branlant, C

    2001-01-15

    The HIV-1 transcript is alternatively spliced to over 30 different mRNAs. Whether RNA secondary structure can influence HIV-1 RNA alternative splicing has not previously been examined. Here we have determined the secondary structure of the HIV-1/BRU RNA segment, containing the alternative A3, A4a, A4b, A4c and A5 3' splice sites. Site A3, required for tat mRNA production, is contained in the terminal loop of a stem-loop structure (SLS2), which is highly conserved in HIV-1 and related SIVcpz strains. The exon splicing silencer (ESS2) acting on site A3 is located in a long irregular stem-loop structure (SLS3). Two SLS3 domains were protected by nuclear components under splicing condition assays. One contains the A4c branch points and a putative SR protein binding site. The other one is adjacent to ESS2. Unexpectedly, only the 3' A residue of ESS2 was protected. The suboptimal A3 polypyrimidine tract (PPT) is base paired. Using site-directed mutagenesis and transfection of a mini-HIV-1 cDNA into HeLa cells, we found that, in a wild-type PPT context, a mutation of the A3 downstream sequence that reinforced SLS2 stability decreased site A3 utilization. This was not the case with an optimized PPT. Hence, sequence and secondary structure of the PPT may cooperate in limiting site A3 utilization.

  10. A G-Box-Like Motif Is Necessary for Transcriptional Regulation by Circadian Pseudo-Response Regulators in Arabidopsis

    PubMed Central

    Newton, Linsey; Liu, Ming-Jung

    2016-01-01

    PSEUDO-RESPONSE REGULATORs (PRRs) play overlapping and distinct roles in maintaining circadian rhythms and regulating diverse biological processes, including the photoperiodic control of flowering, growth, and abiotic stress responses. PRRs act as transcriptional repressors and associate with chromatin via their conserved C-terminal CCT (CONSTANS, CONSTANS-like, and TIMING OF CAB EXPRESSION 1 [TOC1/PRR1]) domains by a still-poorly understood mechanism. Here, we identified genome-wide targets of PRR9 using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) and compared them with PRR7, PRR5, and TOC1/PRR1 ChIP-seq data. We found that PRR binding sites are located within genomic regions of low nucleosome occupancy and high DNase I hypersensitivity. Moreover, conserved noncoding regions among Brassicaceae species are enriched around PRR binding sites, indicating that PRRs associate with functionally relevant cis-regulatory regions. The PRRs shared a significant number of binding regions, and our results indicate that they coordinately restrict the expression of target genes to around dawn. A G-box-like motif was overrepresented at PRR binding regions, and we showed that this motif is necessary for mediating transcriptional regulation of CIRCADIAN CLOCK ASSOCIATED 1 and PRR9 by the PRRs. Our results further our understanding of how PRRs target specific promoters and provide an extensive resource for studying circadian regulatory networks in plants. PMID:26586835

  11. Cis-regulatory control of corticospinal system development and evolution

    PubMed Central

    Shim, Sungbo; Kwan, Kenneth Y.; Li, Mingfeng; Lefebvre, Veronique; Šestan, Nenad

    2012-01-01

    The co-emergence of a six-layered cerebral neocortex and its corticospinal output system is one of the evolutionary hallmarks of mammals. However, the genetic programs that underlie their development and evolution remain poorly understood. Here we identify a conserved non-exonic element (E4) that acts as a cortex-specific enhancer for the nearby Fezf2, which is required for the specification of corticospinal neuron identity and connectivity. We find that SOX4 and SOX11 functionally compete with the repressor SOX5 in the trans-activation of E4. Cortex-specific double deletion of Sox4 and Sox11 leads to the loss of Fezf2 expression and failed specification of corticospinal neurons and, independent of Fezf2, a reeler-like inversion of layers. We show evidence supporting the emergence of functional SOX binding sites in E4 during tetrapod evolution and their subsequent stabilization in mammals and possibly amniotes. These findings reveal that SOX transcription factors converge onto a cis-acting element of Fezf2 and form critical components of a regulatory network controlling the identity and connectivity of corticospinal neurons. PMID:22678282

  12. Quantitative imaging of cis-regulatory reporters in living embryos

    PubMed Central

    Dmochowski, Ivan J.; Dmochowski, Jane E.; Oliveri, Paola; Davidson, Eric H.; Fraser, Scott E.

    2002-01-01

    A confocal laser scanning microscopy method has been developed for the quantitation of green fluorescent protein (GFP) as a reporter of gene activity in living three-dimensional structures such as sea urchin and starfish embryos. This method is between 2 and 50 times more accurate than conventional confocal microscopy procedures depending on the localization of GFP within an embryo. By using coinjected Texas red dextran as an internal fluorescent standard, the observed GFP intensity is corrected for variations in laser excitation and fluorescence collection efficiency. To relate the recorded image intensity to the number of GFP molecules, the embryos were lysed gently, and a fluorometric analysis of their contents was performed. Confocal laser scanning microscopy data collection from a single sea urchin blastula required less than 2 min, thereby allowing gene expression in dozens of embryos to be monitored in parallel with high spatial and temporal resolution. PMID:12237411

  13. Cis-regulatory RNA elements that regulate specialized ribosome activity

    PubMed Central

    Xue, Shifeng; Barna, Maria

    2015-01-01

    Recent evidence has shown that the ribosome itself can play a highly regulatory role in the specialized translation of specific subpools of mRNAs, in particular at the level of ribosomal proteins (RP). However, the mechanism(s) by which this selection takes place has remained poorly understood. In our recent study, we discovered a combination of unique RNA elements in the 5′UTRs of mRNAs that allows for such control by the ribosome. These mRNAs contain a Translation Inhibitory Element (TIE) that inhibits general cap-dependent translation, and an Internal Ribosome Entry Site (IRES) that relies on a specific RP for activation. The unique combination of an inhibitor of general translation and an activator of specialized translation is key to ribosome-mediated control of gene expression. Here we discuss how these RNA regulatory elements provide a new level of control to protein expression and their implications for gene expression, organismal development and evolution. PMID:26327194

  14. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-10

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.
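    Illustrative sketch (not the authors' method): counting one triadic motif in a directed network. Here a "unidirectional loop" is taken to mean three nodes joined as A->B->C->A with no reciprocated edge, which is an assumption about the motif labelled M9 above; the function name `count_unidirectional_loops` is hypothetical, and the brute-force enumeration only suits small graphs.

```python
from itertools import combinations


def count_unidirectional_loops(edges):
    """Count node triples that form a directed 3-cycle with no mutual edges.

    `edges` is an iterable of (source, target) pairs.
    """
    edge_set = set(edges)
    nodes = sorted({node for edge in edge_set for node in edge})

    def one_way(u, v):
        # True when u -> v exists and v -> u does not.
        return (u, v) in edge_set and (v, u) not in edge_set

    count = 0
    for a, b, c in combinations(nodes, 3):
        clockwise = one_way(a, b) and one_way(b, c) and one_way(c, a)
        counter = one_way(a, c) and one_way(c, b) and one_way(b, a)
        if clockwise or counter:
            count += 1
    return count


if __name__ == "__main__":
    loop = [("a", "b"), ("b", "c"), ("c", "a")]
    print(count_unidirectional_loops(loop))
```

    Judging whether a motif is over- or underrepresented additionally requires comparing such counts with those from randomized networks, which the sketch does not attempt.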

  15. Triadic motifs in the dependence networks of virtual societies

    PubMed Central

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755

  16. Motif Yggdrasil: sampling sequence motifs from a tree mixture model.

    PubMed

    Andersson, Samuel A; Lagergren, Jens

    2007-06-01

    In phylogenetic foot-printing, putative regulatory elements are found in upstream regions of orthologous genes by searching for common motifs. Motifs in different upstream sequences are subject to mutations along the edges of the corresponding phylogenetic tree; consequently, taking advantage of the tree in the motif search is an appealing idea. We describe the Motif Yggdrasil sampler, the first Gibbs sampler based on a general tree that uses unaligned sequences. Previous tree-based Gibbs samplers have assumed a star-shaped tree or partially aligned upstream regions. We give a probabilistic model (MY model) describing upstream sequences with regulatory elements and build a Gibbs sampler with respect to this model. The model allows toggling, i.e., the restriction of a position to a subset of nucleotides, but requires neither aligned sequences nor edge lengths, which may be difficult to come by. We apply the collapsing technique to eliminate the need to sample nuisance parameters, and give a derivation of the predictive update formula. We show that the MY model improves the modeling of difficult motif instances and that the use of the tree achieves a substantial increase in nucleotide level correlation coefficient both for synthetic data and 37 bacterial lexA genes. We investigate the sensitivity to errors in the tree and show that, using random trees, the MY sampler still has a performance similar to the original version.

  17. MEME-ChIP: motif analysis of large DNA datasets.

    PubMed

    Machanick, Philip; Bailey, Timothy L

    2011-06-15

    Advances in high-throughput sequencing have resulted in rapid growth in large, high-quality datasets including those arising from transcription factor (TF) ChIP-seq experiments. While there are many existing tools for discovering TF binding site motifs in such datasets, most web-based tools cannot directly process such large datasets. The MEME-ChIP web service is designed to analyze ChIP-seq 'peak regions'--short genomic regions surrounding declared ChIP-seq 'peaks'. Given a set of genomic regions, it performs (i) ab initio motif discovery, (ii) motif enrichment analysis, (iii) motif visualization, (iv) binding affinity analysis and (v) motif identification. It runs two complementary motif discovery algorithms on the input data--MEME and DREME--and uses the motifs they discover in subsequent visualization, binding affinity and identification steps. MEME-ChIP also performs motif enrichment analysis using the AME algorithm, which can detect very low levels of enrichment of binding sites for TFs with known DNA-binding motifs. Importantly, unlike with the MEME web service, there is no restriction on the size or number of uploaded sequences, allowing very large ChIP-seq datasets to be analyzed. The analyses performed by MEME-ChIP provide the user with a varied view of the binding and regulatory activity of the ChIP-ed TF, as well as the possible involvement of other DNA-binding TFs. MEME-ChIP is available as part of the MEME Suite at http://meme.nbcr.net.

  18. MEME-ChIP: motif analysis of large DNA datasets

    PubMed Central

    Machanick, Philip; Bailey, Timothy L.

    2011-01-01

    Motivation: Advances in high-throughput sequencing have resulted in rapid growth in large, high-quality datasets including those arising from transcription factor (TF) ChIP-seq experiments. While there are many existing tools for discovering TF binding site motifs in such datasets, most web-based tools cannot directly process such large datasets. Results: The MEME-ChIP web service is designed to analyze ChIP-seq ‘peak regions’—short genomic regions surrounding declared ChIP-seq ‘peaks’. Given a set of genomic regions, it performs (i) ab initio motif discovery, (ii) motif enrichment analysis, (iii) motif visualization, (iv) binding affinity analysis and (v) motif identification. It runs two complementary motif discovery algorithms on the input data—MEME and DREME—and uses the motifs they discover in subsequent visualization, binding affinity and identification steps. MEME-ChIP also performs motif enrichment analysis using the AME algorithm, which can detect very low levels of enrichment of binding sites for TFs with known DNA-binding motifs. Importantly, unlike with the MEME web service, there is no restriction on the size or number of uploaded sequences, allowing very large ChIP-seq datasets to be analyzed. The analyses performed by MEME-ChIP provide the user with a varied view of the binding and regulatory activity of the ChIP-ed TF, as well as the possible involvement of other DNA-binding TFs. Availability: MEME-ChIP is available as part of the MEME Suite at http://meme.nbcr.net. Contact: t.bailey@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21486936

  19. [Prediction of Promoter Motifs in Virophages].

    PubMed

    Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

    2015-07-01

    Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation (MEME). Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motifs in mavirus; motifs containing the ATCT box were the potential late promoter motifs in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motif 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses.

  20. Knowledge discovery of multilevel protein motifs

    SciTech Connect

    Conklin, D.; Glasgow, J.; Fortier, S.

    1994-12-31

    A new category of protein motif is introduced. This type of motif captures, in addition to global structure, the nested structure of its component parts. A dataset of four proteins is represented using this scheme. A structured machine discovery procedure is used to discover recurrent amino acid motifs and this knowledge is utilized for the expression of subsequent protein motif discoveries. Examples of discovered multilevel motifs are presented.

  1. Unravelling daily human mobility motifs

    PubMed Central

    Schneider, Christian M.; Belik, Vitaly; Couronné, Thomas; Smoreda, Zbigniew; González, Marta C.

    2013-01-01

    Human mobility is differentiated by time scales. While the mechanism for long time scales has been studied, the underlying mechanism on the daily scale is still unrevealed. Here, we uncover the mechanism responsible for the daily mobility patterns by analysing the temporal and spatial trajectories of thousands of persons as individual networks. Using the concept of motifs from network theory, we find only 17 unique networks are present in daily mobility and they follow simple rules. These networks, called here motifs, are sufficient to capture up to 90 per cent of the population in surveys and mobile phone datasets for different countries. Each individual exhibits a characteristic motif, which seems to be stable over several months. Consequently, daily human mobility can be reproduced by an analytically tractable framework for Markov chains by modelling periods of high-frequency trips followed by periods of lower activity as the key ingredient. PMID:23658117

  2. Sequential visibility-graph motifs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-04-01

    Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
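    Illustrative sketch (assumptions labelled): a motif profile can be computed by sliding a window of n consecutive points along a series and recording which visibility subgraph each window induces. The sketch uses the horizontal visibility criterion (two points see each other when every point strictly between them is lower than both), which is an assumption about the variant intended; the names `hvg_window_edges` and `motif_profile` are hypothetical.

```python
from collections import Counter


def hvg_window_edges(window):
    """Edges of the horizontal visibility graph on one window of values.

    Nodes are positions 0..n-1; i and j are linked when every value
    strictly between them is lower than both window[i] and window[j].
    """
    n = len(window)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            lower = min(window[i], window[j])
            if all(window[k] < lower for k in range(i + 1, j)):
                edges.append((i, j))
    return tuple(edges)


def motif_profile(series, n=4):
    """Relative frequency of each n-node sequential motif in the series."""
    counts = Counter(
        hvg_window_edges(series[start:start + n])
        for start in range(len(series) - n + 1)
    )
    total = sum(counts.values())
    return {motif: count / total for motif, count in counts.items()}


if __name__ == "__main__":
    print(motif_profile([0.3, 0.9, 0.1, 0.5, 0.7, 0.2, 0.8], n=4))
```

    Because visibility between two points depends only on the values lying between them, each window can be processed independently of the rest of the series, which is what keeps the profile cheap to compute.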

  3. The distribution of RNA motifs in natural sequences.

    PubMed

    Bourdeau, V; Ferbeyre, G; Pageau, M; Paquin, B; Cedergren, R

    1999-11-15

    Functional analysis of genome sequences has largely ignored RNA genes and their structures. We introduce here the notion of 'ribonomics' to describe the search for the distribution of and eventually the determination of the physiological roles of these RNA structures found in the sequence databases. The utility of this approach is illustrated here by the identification in the GenBank database of RNA motifs having known binding or chemical activity. The frequency of these motifs indicates that most have originated from evolutionary drift and are selectively neutral. On the other hand, their distribution among species and their location within genes suggest that the destiny of these motifs may be more elaborate. For example, the hammerhead motif has a skewed organismal presence, is phylogenetically stable and recent work on a schistosome version confirms its in vivo biological activity. The under-representation of the valine-binding motif and the Rev-binding element in GenBank hints at a detrimental effect on cell growth or viability. Data on the presence and the location of these motifs may provide critical guidance in the design of experiments directed towards the understanding and the manipulation of RNA complexes and activities in vivo.

  4. Neural Circuits: Male Mating Motifs.

    PubMed

    Benton, Richard

    2015-09-02

    Characterizing microcircuit motifs in intact nervous systems is essential to relate neural computations to behavior. In this issue of Neuron, Clowney et al. (2015) identify recurring, parallel feedforward excitatory and inhibitory pathways in male Drosophila's courtship circuitry, which might explain decisive mate choice.

  5. Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs

    SciTech Connect

    Joslyn, Cliff A.; al-Saffar, Sinan; Haglin, David J.; Holder, Larry

    2011-06-14

    Given an arbitrary semantic graph data set, perhaps one lacking in explicit ontological information, we wish to first identify its significant semantic structures, and then measure the extent of their significance. Casting a semantic graph dataset as an edge-labeled, directed graph, this task can be built on the ability to mine frequent labeled subgraphs in edge-labeled, directed graphs. We begin by considering the fundamentals of the enumerative combinatorics of subgraph motif structures in edge-labeled directed graphs. We identify its frequent labeled, directed subgraph motif patterns, and measure the significance of the resulting motifs by the information gain relative to the expected value of the motif based on the empirical frequency distribution of the link types which compose them, assuming independence. We illustrate the method on a small test graph, and discuss results obtained for small linear motifs (link type bigrams and trigrams) in a larger graph structure.
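    Illustrative sketch (not the authors' code): for link-type bigrams, one way to express information gain relative to an independence baseline is the log ratio of the observed bigram frequency to the product of the frequencies of its two link types. Estimating the link-type frequencies from the bigram instances themselves, and the name `bigram_information`, are assumptions made for this example.

```python
from collections import Counter
from math import log2


def bigram_information(paths):
    """Score each link-type bigram by log2(observed / expected).

    `paths` is a list of (label1, label2) pairs, one per observed
    two-edge path. The expected frequency assumes the two link types
    occur independently with their empirical frequencies.
    """
    bigrams = Counter(paths)
    total_bigrams = sum(bigrams.values())
    labels = Counter(label for pair in paths for label in pair)
    total_labels = sum(labels.values())

    scores = {}
    for (first, second), count in bigrams.items():
        observed = count / total_bigrams
        expected = (labels[first] / total_labels) * (labels[second] / total_labels)
        scores[(first, second)] = log2(observed / expected)
    return scores


if __name__ == "__main__":
    observed_paths = [("cites", "authoredBy")] * 3 + [("authoredBy", "cites")]
    print(bigram_information(observed_paths))
```

    A positive score means the bigram occurs more often than its link types would suggest under independence, and a score near zero means the motif carries little information beyond the individual link-type frequencies.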

  6. Observability of Neuronal Network Motifs

    PubMed Central

    Whalen, Andrew J.; Brennan, Sean N.; Sauer, Timothy D.; Schiff, Steven J.

    2014-01-01

    We quantify observability in small (3 node) neuronal networks as a function of 1) the connection topology and symmetry, 2) the measured nodes, and 3) the nodal dynamics (linear and nonlinear). We find that typical observability metrics for 3 neuron motifs range over several orders of magnitude, depending upon topology, and for motifs containing symmetry the network observability decreases when observing from particularly confounded nodes. Nonlinearities in the nodal equations generally decrease the average network observability and full network information becomes available only in limited regions of the system phase space. Our findings demonstrate that such networks are partially observable, and suggest their potential efficacy in reconstructing network dynamics from limited measurement data. How well such strategies can be used to reconstruct and control network dynamics in experimental settings is a subject for future experimental work. PMID:25909092
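    Illustrative sketch (a simplification, not the metric used in the record above): for linear nodal dynamics dx/dt = A x with a single measured node, the classical Kalman test asks whether the rows C, CA, CA^2 span the full state space. This yields a yes/no answer rather than a graded observability index, and the convention A[i][j] = influence of node j on node i and the function names are assumptions.

```python
from fractions import Fraction


def row_times_matrix(row, matrix):
    """Multiply a row vector by a square matrix."""
    n = len(matrix)
    return [sum(row[k] * matrix[k][j] for k in range(n)) for j in range(n)]


def matrix_rank(rows):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    m = [[Fraction(value) for value in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def observability_rank(A, measured):
    """Rank of [C; CA; ...; CA^(n-1)] when only node `measured` is observed."""
    n = len(A)
    row = [1 if j == measured else 0 for j in range(n)]
    rows = []
    for _ in range(n):
        rows.append(row)
        row = row_times_matrix(row, A)
    return matrix_rank(rows)


if __name__ == "__main__":
    # Assumed 3-node chain: node 0 drives node 1, node 1 drives node 2.
    chain = [[0, 0, 0],
             [1, 0, 0],
             [0, 1, 0]]
    print(observability_rank(chain, measured=2), observability_rank(chain, measured=0))
```

    In this chain the downstream node carries information about all three states while the upstream node reveals only itself, a simple instance of observability depending on both the topology and the measured node.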

  7. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR](sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30) within the PP-domain, and EX(sub 4)(G)PhiXXGPhi within the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  8. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR](sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30) within the PP-domain, and EX(sub 4)(G)PhiXXGPhi within the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  9. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)(sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30)NN in the PP-domain, and the EX(sub 4)(G)PhiXXGPhi in the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  10. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)(sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30)NN in the PP-domain, and the EX(sub 4)(G)PhiXXGPhi in the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  11. A comprehensive analysis of the La-motif protein superfamily

    PubMed Central

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-01-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits. PMID:19299548

  12. A comprehensive analysis of the La-motif protein superfamily.

    PubMed

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-05-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits.

  13. NetMODE: Network Motif Detection without Nauty

    PubMed Central

    Wang, Haidong; Deng, Hualiang; Liu, Xiaoguang; Wang, Gang

    2012-01-01

    A motif in a network is a connected graph that occurs significantly more frequently as an induced subgraph than would be expected in a similar randomized network. By virtue of being atypical, it is thought that motifs might play a more important role than arbitrary subgraphs. Recently, a flurry of advances in the study of network motifs has created demand for faster computational means for identifying motifs in increasingly larger networks. Motif detection is typically performed by enumerating subgraphs in an input network and in an ensemble of comparison networks; this poses a significant computational problem. Classifying the subgraphs encountered, for instance, is typically performed using a graph canonical labeling package, such as Nauty, and will typically be called billions of times. In this article, we describe an implementation of a network motif detection package, which we call NetMODE. NetMODE can only perform motif detection for k-node subgraphs when k ≤ 6, but does so without the use of Nauty. To avoid using Nauty, NetMODE has an initial pretreatment phase, where k-node graph data is stored in memory (k ≤ 5). For k = 6 we take a novel approach, which relates to the Reconstruction Conjecture for directed graphs. We find that NetMODE can perform up to around 30 times faster than its predecessors when k ≤ 5 and up to around 3 times faster when k = 6 (the exact improvement varies considerably). NetMODE also (a) includes a method for generating comparison graphs uniformly at random, (b) can interface with external packages (e.g. R), and (c) can utilize multi-core architectures. NetMODE is available from netmode.sf.net. PMID:23272055
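    The detection scheme this abstract describes (count a subgraph in the input network, then compare against an ensemble of randomized networks) can be sketched as follows. This is only an illustration, not NetMODE's implementation: it handles a single 3-node pattern (the feed-forward loop), the function names are hypothetical, and the edge-swap randomization used here preserves in- and out-degrees but is not claimed to sample comparison graphs uniformly.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count induced feed-forward loops: a->b, b->c, a->c and no other
    edges among the three distinct nodes a, b, c."""
    edge_set = set(edges)
    succ = {}
    for u, v in edge_set:
        succ.setdefault(u, set()).add(v)
    count = 0
    for a, outs in succ.items():
        for b in outs:
            if b == a:
                continue
            for c in succ.get(b, ()):
                if c in (a, b) or (a, c) not in edge_set:
                    continue
                # Induced: none of the reverse edges may be present.
                if not {(b, a), (c, b), (c, a)} & edge_set:
                    count += 1
    return count


def randomize(edges, n_swaps, rng):
    """Degree-preserving randomization by repeated edge swaps:
    (a->b, c->d) becomes (a->d, c->b) when that creates no self-loop
    and no duplicate edge."""
    edges = list(set(edges))
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def motif_zscore(edges, n_random=100, seed=0):
    """Return the observed count and its z-score against the null ensemble."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    null = [count_ffl(randomize(edges, 10 * len(edges), rng))
            for _ in range(n_random)]
    sd = pstdev(null)
    z = (observed - mean(null)) / sd if sd > 0 else float("nan")
    return observed, z
```

    A large positive z-score would suggest the pattern is over-represented relative to the randomized ensemble; real tools classify all k-node subgraphs rather than testing one pattern at a time.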

  14. Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs.

    PubMed

    Zheng, Yiyu; Li, Xiaoman; Hu, Haiyan

    2015-01-01

    Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5' distal regions were often enriched in 3' distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder

    PubMed Central

    Sharov, Alexei A.; Ko, Minoru S.H.

    2009-01-01

    We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences. PMID:19740934

  16. The biological function of some human transcription factor binding motifs varies with position relative to the transcription start site.

    PubMed

    Tharakaraman, Kannan; Bodenreider, Olivier; Landsman, David; Spouge, John L; Mariño-Ramírez, Leonardo

    2008-05-01

    A number of previous studies have predicted transcription factor binding sites (TFBSs) by exploiting the position of genomic landmarks like the transcriptional start site (TSS). The studies' methods are generally too computationally intensive for genome-scale investigation, so the full potential of 'positional regulomics' to discover TFBSs and determine their function remains unknown. Because databases often annotate the genomic landmarks in DNA sequences, the methodical exploitation of positional regulomics has become increasingly urgent. Accordingly, we examined a set of 7914 human putative promoter regions (PPRs) with a known TSS. Our methods identified 1226 eight-letter DNA words with significant positional preferences with respect to the TSS, of which only 608 of the 1226 words matched known TFBSs. Many groups of genes whose PPRs contained a common word displayed similar expression profiles and related biological functions, however. Most interestingly, our results included 78 words, each of which clustered significantly in two or three different positions relative to the TSS. Often, the gene groups corresponding to different positional clusters of the same word corresponded to diverse functions, e.g. activation or repression in different tissues. Thus, different clusters of the same word likely reflect the phenomenon of 'positional regulation', i.e. a word's regulatory function can vary with its position relative to a genomic landmark, a conclusion inaccessible to methods based purely on sequence. Further integrative analysis of words co-occurring in PPRs also yielded 24 different groups of genes, likely identifying cis-regulatory modules de novo. Whereas comparative genomics requires precise sequence alignments, positional regulomics exploits genomic landmarks to provide a 'poor man's alignment'. By exploiting the phenomenon of positional regulation, it uses position to differentiate the biological functions of subsets of TFBSs sharing a common sequence motif.

  17. Detecting correlations among functional-sequence motifs

    NASA Astrophysics Data System (ADS)

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features.

  18. Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

    PubMed

    Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

    2017-02-01

    An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Protein structural motifs in prediction and design.

    PubMed

    Mackenzie, Craig O; Grigoryan, Gevorg

    2017-06-01

    The Protein Data Bank (PDB) has been an integral resource for shaping our fundamental understanding of protein structure and for the advancement of such applications as protein design and structure prediction. Over the years, information from the PDB has been used to generate models ranging from specific structural mechanisms to general statistical potentials. With accumulating structural data, it has become possible to mine for more complete and complex structural observations, deducing more accurate generalizations. Motif libraries, which capture recurring structural features along with their sequence preferences, have exposed modularity in the structural universe and found successful application in various problems of structural biology. Here we summarize recent achievements in this arena, focusing on subdomain level structural patterns and their applications to protein design and structure prediction, and suggest promising future directions as the structural database continues to grow. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. The Thiamine-Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Ciszak, Ewa; Dominiak, Paulina

    2004-01-01

    Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.

  1. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although motif pattern databases and tools for genome-level prediction exist, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts different types of weight matrix based on a user-defined set of aligned motif sequences and a motif width. To retrieve known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It makes it possible to identify conserved motifs in co-regulated genes or in a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861
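    As a rough illustration of building a weight matrix from aligned motif sequences of a fixed width, the sketch below computes a position frequency matrix and a log-odds weight matrix. The abstract does not specify D-MATRIX's statistical approach, so the pseudocount, the uniform background and all function names here are assumptions, and the example sites in the usage note are made up rather than real LexA sites.

```python
import math


def position_frequency_matrix(sites, alphabet="ACGT"):
    """Count each base at each column of equally long, aligned sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all aligned sites must have the same width")
    return [{b: sum(s[i] == b for s in sites) for b in alphabet}
            for i in range(width)]


def log_odds_matrix(pfm, background=None, pseudocount=1.0):
    """Convert counts to log2(observed frequency / background frequency).

    Assumption: a uniform background and an add-one pseudocount, a common
    textbook choice rather than the tool's documented method.
    """
    alphabet = list(pfm[0])
    background = background or {b: 1.0 / len(alphabet) for b in alphabet}
    pwm = []
    for col in pfm:
        total = sum(col.values()) + pseudocount * len(alphabet)
        pwm.append({b: math.log2((col[b] + pseudocount) / total / background[b])
                    for b in alphabet})
    return pwm


def score(pwm, word):
    """Sum the per-position weights of a candidate site."""
    return sum(col[ch] for col, ch in zip(pwm, word))
```

    For example, `log_odds_matrix(position_frequency_matrix(["TACGT", "TACGA", "TACGT", "AACGT"]))` gives a matrix under which `"TACGT"` scores higher than an unrelated word such as `"GGGGG"`.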

  2. Combinatorial motif analysis of regulatory gene expression in Mafb deficient macrophages

    PubMed Central

    2011-01-01

    Background Deficiency of the transcription factor MafB, which is normally expressed in macrophages, can underlie cellular dysfunction associated with a range of autoimmune diseases and arteriosclerosis. MafB has important roles in cell differentiation and regulation of target gene expression; however, the mechanisms of this regulation and the identities of other transcription factors with which MafB interacts remain uncertain. Bioinformatics methods provide a valuable approach for elucidating the nature of these interactions with transcriptional regulatory elements from a large number of DNA sequences. In particular, identification of patterns of co-occurrence of regulatory cis-elements (motifs) offers a robust approach. Results Here, the directional relationships among several functional motifs were evaluated using the Log-linear Graphical Model (LGM) after extraction and search for evolutionarily conserved motifs. This analysis highlighted GATA-1 motifs and 5’AT-rich half Maf recognition elements (MAREs) in promoter regions of 18 genes that were down-regulated in Mafb deficient macrophages. GATA-1 motifs and MafB motifs could regulate expression of these genes in both a negative and positive manner, respectively. The validity of this conclusion was tested with data from a luciferase assay that used a C1qa promoter construct carrying both the GATA-1 motifs and MAREs. GATA-1 was found to inhibit the activity of the C1qa promoter with the GATA-1 motifs and MafB motifs. Conclusions These observations suggest that both the GATA-1 motifs and MafB motifs are important for lineage specific expression of C1qa. In addition, these findings show that analysis of combinations of evolutionarily conserved motifs can be successfully used to identify patterns of gene regulation. PMID:22784578

  3. rMotifGen: random motif generator for DNA and protein sequences.

    PubMed

    Rouchka, Eric C; Hardin, C Timothy

    2007-08-07

    Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/.
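    The idea of generating random sequences that carry a motif instance drawn from a PSSM can be sketched as below. This is not rMotifGen's code: the function names are hypothetical, the background is assumed to be uniform, and insertions and substitution-matrix mutations are left out.

```python
import random


def sample_from_pssm(pssm, rng):
    """Draw one motif instance; pssm is a list of {residue: probability}."""
    return "".join(
        rng.choices(list(col), weights=list(col.values()))[0] for col in pssm
    )


def implant_motif(length, pssm, rng, alphabet="ACGT"):
    """Return a random sequence of the given length with one motif
    instance implanted at a random position, plus that position."""
    motif = sample_from_pssm(pssm, rng)
    if len(motif) > length:
        raise ValueError("sequence is shorter than the motif")
    seq = rng.choices(alphabet, k=length)  # uniform background (assumption)
    start = rng.randrange(length - len(motif) + 1)
    seq[start:start + len(motif)] = motif
    return "".join(seq), start
```

    Passing a seeded `random.Random` instance makes the generated test sets reproducible, which helps when comparing motif finders on the same data.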

  4. Discovering novel sequence motifs with MEME.

    PubMed

    Bailey, Timothy L

    2002-11-01

    This unit illustrates how to use MEME to discover motifs in a group of related nucleotide or peptide sequences. A MEME motif is a sequence pattern that occurs repeatedly in one or more sequences in the input group. MEME can be used to discover novel patterns because it bases its discoveries only on the input sequences, not on any prior knowledge (such as databases of known motifs). The input to MEME is a set of unaligned sequences of the same type (peptide or nucleotide). For each motif it discovers, MEME reports the occurrences (sites), consensus sequence, and the level of conservation (information content) at each position in the pattern. MEME also produces block diagrams showing where all of the discovered motifs occur in the training set sequences. MEME's hypertext (HTML) output also contains buttons that allow for the convenient use of the motifs in other searches.

  5. rMotifGen: random motif generator for DNA and protein sequences

    PubMed Central

    Rouchka, Eric C; Hardin, C Timothy

    2007-01-01

    Background Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Results Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. Conclusion rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/. PMID:17683637

  6. BayesMotif: de novo protein sorting motif discovery from impure datasets

    PubMed Central

    2010-01-01

    Background Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals is amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. Methods We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with a less conserved motif region. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Results Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. Conclusion We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which

  7. MSDmotif: exploring protein sites and motifs

    PubMed Central

    Golovin, Adel; Henrick, Kim

    2008-01-01

    Background Protein structures have conserved features – motifs, which have a sufficient influence on the protein function. These motifs can be found in sequence as well as in 3D space. Understanding of these fragments is essential for 3D structure prediction, modelling and drug-design. The Protein Data Bank (PDB) is the source of this information; however, present search tools have limited 3D options to integrate protein sequence with its 3D structure. Results We describe here a web application for querying the PDB for ligands, binding sites, small 3D structural and sequence motifs and the underlying database. Novel algorithms for chemical fragments, 3D motifs, ϕ/ψ sequences, super-secondary structure motifs and for small 3D structural motif associations searches are incorporated. The interface provides functionality for visualization, search criteria creation, sequence and 3D multiple alignment options. MSDmotif is an integrated system where a results page is also a search form. A set of motif statistics is available for analysis. This set includes molecule and motif binding statistics, distribution of motif sequences, occurrence of an amino-acid within a motif, correlation of amino-acids side-chain charges within a motif and Ramachandran plots for each residue. The binding statistics are presented in association with properties that include a ligand fragment library. Access is also provided through the Distributed Annotation System (DAS) protocol. An additional entry point facilitates XML requests with XML responses. Conclusion MSDmotif is unique in combining chemical, sequence and 3D data in a single search engine with a range of search and visualisation options. It provides multiple views of data found in the PDB archive for exploring protein structures. PMID:18637174

  8. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-02-20

    ChIP-Seq (chromatin immunoprecipitation sequencing) offers an advantage for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms in order to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provide suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data.

  9. Active motif finder - a bio-tool based on mutational structures in DNA sequences

    PubMed Central

    Udayakumar, Mani; Shanmuga-priya, Palaniyandi; Hemavathi, Kamalakannan; Seenivasagam, Rengasamy

    2011-01-01

    Active Motif Finder (AMF) is a novel algorithmic tool, designed based on mutations in DNA sequences. Tools available at present for finding motifs are based on matching a given motif in the query sequence. AMF describes a new algorithm that identifies the occurrences of patterns which possess all kinds of mutations, such as insertion, deletion and mismatch. The algorithm is mainly based on the Alignment Score Matrix (ASM) computation, which compares the input motif with the full-length sequence. Much of the effort in bioinformatics is directed to identifying these motifs in the sequences of newly discovered genes. The proposed bio-tool serves as an open resource for analysis and is useful for studying polymorphisms in DNA sequences. AMF can be searched via a user-friendly interface. This tool is intended to serve the scientific community working in the areas of chemical and structural biology, and is freely available to all users at http://www.sastra.edu/scbt/amf/. PMID:23554723
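    Finding occurrences of a motif that tolerate insertions, deletions and mismatches can be illustrated with a standard dynamic-programming matrix that compares the motif against the full sequence. The sketch below uses the well-known semi-global edit-distance recurrence (often attributed to Sellers); it is a stand-in for, not a reproduction of, AMF's Alignment Score Matrix, and the function name and unit costs are assumptions.

```python
def approximate_occurrences(motif, sequence, max_edits):
    """Return (end, edits) pairs where some substring of `sequence` ending
    at index `end` (exclusive) matches `motif` within `max_edits`
    insertions, deletions or mismatches."""
    # Row 0 is all zeros: a match may start at any sequence position.
    prev = [0] * (len(sequence) + 1)
    for i, m in enumerate(motif, start=1):
        curr = [i] + [0] * len(sequence)
        for j, s in enumerate(sequence, start=1):
            curr[j] = min(
                prev[j - 1] + (m != s),  # match or mismatch
                prev[j] + 1,             # motif character missing in sequence
                curr[j - 1] + 1,         # extra character in sequence
            )
        prev = curr
    return [(j, d) for j, d in enumerate(prev) if j > 0 and d <= max_edits]
```

    Only two rows of the matrix are kept, so memory grows with the sequence length rather than with the product of motif and sequence lengths. Neighbouring end positions often report the same hit, so a real tool would typically merge them.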

  10. Motif-based construction of a functional map for mammalian olfactory receptors.

    PubMed

    Liu, Agatha H; Zhang, Xinmin; Stolovitzky, Gustavo A; Califano, Andrea; Firestein, Stuart J

    2003-05-01

    We applied an automatic and unsupervised system to a nearly complete database of mammalian odor receptor genes. The generated motifs and gene classification were subjected to extensive and systematic downstream analysis to obtain biological insights. Two major results from this analysis were: (1) a map of sequence motifs that may correlate with function and (2) the corresponding receptor classes in which members of each class are likely to share specific functions. We have discovered motifs that have been implicated in structural integrity and posttranslational modification, as well as motifs very likely to be directly involved in ligand binding. We further propose a combinatorial molecular hypothesis, based on unique combinations of the observed motifs, that provides a foundation for understanding the generation of a large number of ligand binding sites.

  11. Mining, compressing and classifying with extensible motifs

    PubMed Central

    Apostolico, Alberto; Comin, Matteo; Parida, Laxmi

    2006-01-01

    Background Motif patterns of maximal saturation emerged originally in contexts of pattern discovery in biomolecular sequences and have recently proven a valuable notion also in the design of data compression schemes. Informally, a motif is a string of intermittently solid and wild characters that recurs more or less frequently in an input sequence or family of sequences. Motif discovery techniques and tools tend to be computationally imposing, however, special classes of "rigid" motifs have been identified of which the discovery is affordable in low polynomial time. Results In the present work, "extensible" motifs are considered such that each sequence of gaps comes endowed with some elasticity, whereby the same pattern may be stretched to fit segments of the source that match all the solid characters but are otherwise of different lengths. A few applications of this notion are then described. In applications of data compression by textual substitution, extensible motifs are seen to bring savings on the size of the codebook, and hence to improve compression. In germane contexts, in which compressibility is used in its dual role as a basis for structural inference and classification, extensible motifs are seen to support unsupervised classification and phylogeny reconstruction. Conclusion Off-line compression based on extensible motifs can be used advantageously to compress and classify biological sequences. PMID:16722593
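
    The notion of an extensible motif, solid characters separated by gaps that may stretch within bounds, can be illustrated by translating such a pattern into a regular expression. This is only a sketch of the pattern semantics; the token representation (strings for solid runs, `(min, max)` tuples for elastic gaps) is an assumption made for this example and is not the authors' notation or discovery algorithm.

```python
import re


def extensible_to_regex(tokens):
    """Build a regex source string from solid strings and (min, max) gap tuples."""
    parts = []
    for tok in tokens:
        if isinstance(tok, str):
            parts.append(re.escape(tok))         # solid characters must match exactly
        else:
            lo, hi = tok
            parts.append(".{%d,%d}" % (lo, hi))  # elastic run of wild characters
    return "".join(parts)


def occurrences(tokens, text):
    """Return (start, matched_text) pairs, one per start position.

    A lookahead is used so that overlapping occurrences are all reported;
    for each start the greedy (longest-gap-first) realization is returned."""
    look = re.compile("(?=(%s))" % extensible_to_regex(tokens))
    return [(m.start(), m.group(1)) for m in look.finditer(text)]


if __name__ == "__main__":
    # "AC", then one or two wild characters, then "T" (hypothetical motif).
    print(occurrences(["AC", (1, 2), "T"], "ACGTACGGT"))
```

    The same pattern matches segments of different lengths, which is the elasticity the abstract describes.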

  12. Sampling Motif-Constrained Ensembles of Networks

    NASA Astrophysics Data System (ADS)

    Fischer, Rico; Leitão, Jorge C.; Peixoto, Tiago P.; Altmann, Eduardo G.

    2015-10-01

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but which often fail in practice, either due to model inconsistency or due to the impossibility to sample networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this Letter we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motifs counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.

  13. The EDLL motif: a potent plant transcriptional activation domain from AP2/ERF transcription factors.

    PubMed

    Tiwari, Shiv B; Belachew, Alemu; Ma, Siu Fong; Young, Melinda; Ade, Jules; Shen, Yu; Marion, Colleen M; Holtan, Hans E; Bailey, Adina; Stone, Jeffrey K; Edwards, Leslie; Wallace, Andreah D; Canales, Roger D; Adam, Luc; Ratcliffe, Oliver J; Repetti, Peter P

    2012-06-01

    In plants, the ERF/EREBP family of transcriptional regulators plays a key role in adaptation to various biotic and abiotic stresses. These proteins contain a conserved AP2 DNA-binding domain and several uncharacterized motifs. Here, we describe a short motif, termed 'EDLL', that is present in AtERF98/TDR1 and other clade members from the same AP2 sub-family. We show that the EDLL motif, which has a unique arrangement of acidic amino acids and hydrophobic leucines, functions as a strong activation domain. The motif is transferable to other proteins, and is active at both proximal and distal positions of target promoters. As such, the EDLL motif is able to partly overcome the repression conferred by the AtHB2 transcription factor, which contains an ERF-associated amphiphilic repression (EAR) motif. We further examined the activation potential of EDLL by analysis of the regulation of flowering time by NF-Y (nuclear factor Y) proteins. Genetic evidence indicates that NF-Y protein complexes potentiate the action of CONSTANS in regulation of flowering in Arabidopsis; we show that the transcriptional activation function of CONSTANS can be substituted by direct fusion of the EDLL activation motif to NF-YB subunits. The EDLL motif represents a potent plant activation domain that can be used as a tool to confer transcriptional activation potential to heterologous DNA-binding proteins.

  14. Proximity of Radiation Desiccation Response Motif to the core promoter is essential for basal repression as well as gamma radiation-induced gyrB gene expression in Deinococcus radiodurans.

    PubMed

    Anaganti, Narasimha; Basu, Bhakti; Mukhopadhyaya, Rita; Apte, Shree Kumar

    2017-03-02

    The radioresistant D. radiodurans regulates its DNA damage regulon (DDR) through interaction between a 17bp palindromic cis-regulatory element called the Radiation Desiccation Response Motif (RDRM), the DdrO repressor and a protease IrrE. The role of RDRM in regulation of DDR was dissected by constructing RDRM sequence-, position- or deletion-variants of Deinococcal gyrB gene (DR0906) promoter and by RDRM insertion in the non-RDRM groESL gene (DR0606) promoter, and monitoring the effect of such modifications on the basal as well as gamma radiation inducible promoter activity by quantifying fluorescence of a GFP reporter. RDRM sequence-variants revealed that the conservation of sequence at the 5th and 13th position and the ends of RDRM is essential for basal repression by interaction with DdrO. RDRM position-variants showed that the sequence acts as a negative regulatory element only when located around transcription start site (TSS) and within the span of RNA polymerase (RNAP) binding region. RDRM deletion-variants indicated that the 5' sequence of RDRM possibly possesses an enhancer-like element responsible for higher expression yields upon repressor clearance post-irradiation. The results suggest that RDRM plays both a negative and a positive role in the regulation of DDR in D. radiodurans.

  15. MotifNet: a web-server for network motif analysis.

    PubMed

    Smoly, Ilan Y; Lerman, Eugene; Ziv-Ukelson, Michal; Yeger-Lotem, Esti

    2017-06-15

    Network motifs are small topological patterns that recur in a network significantly more often than expected by chance. Their identification emerged as a powerful approach for uncovering the design principles underlying complex networks. However, available tools for network motif analysis typically require download and execution of computationally intensive software on a local computer. We present MotifNet, the first open-access web-server for network motif analysis. MotifNet allows researchers to analyze integrated networks, where nodes and edges may be labeled, and to search for motifs of up to eight nodes. The output motifs are presented graphically and the user can interactively filter them by their significance, number of instances, node and edge labels, and node identities, and view their instances. MotifNet also allows the user to distinguish between motifs that are centered on specific nodes and motifs that recur in distinct parts of the network. MotifNet is freely available at http://netbio.bgu.ac.il/motifnet. The website was implemented using ReactJs and supports all major browsers. The server interface was implemented in Python with data stored on a MySQL database. Contact: estiyl@bgu.ac.il or michaluz@cs.bgu.ac.il. Supplementary data are available at Bioinformatics online.
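
    As a small illustration of what counting instances of a network motif involves, the sketch below counts feed-forward loops (a regulates b, b regulates c, and a regulates c) in a directed graph by brute force. It is not MotifNet's method: the function name is hypothetical, labels are ignored, and the comparison against randomized networks that establishes significance is omitted.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered node triples (a, b, c) with edges a->b, b->c and a->c."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    # Each feed-forward loop has one source, one intermediate and one target,
    # so every instance is counted exactly once over ordered triples.
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical toy network.
    toy = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
    print(count_feed_forward_loops(toy))
```

    Brute-force enumeration grows cubically with the number of nodes for three-node motifs and far faster for larger ones, which is one reason motif analysis is computationally intensive.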

  16. Efficient motif search in ranked lists and applications to variable gap motifs.

    PubMed

    Leibovich, Limor; Yakhini, Zohar

    2012-07-01

    Sequence elements at all levels (DNA, RNA and protein) play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs (two half sites with a flexible length gap in between) and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation.
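
    To make the notion of a variable gap motif concrete, the sketch below tallies how often two fixed half sites occur at each gap length in a set of sequences. It does not implement the suffix-tree search or the ranked-list statistics described in the abstract; the function name, the half-site strings and the `max_gap` bound are placeholders chosen for illustration.

```python
import re
from collections import Counter


def gap_length_profile(sequences, left, right, max_gap):
    """Count occurrences of left + <gap> + right for every gap length 0..max_gap."""
    profile = Counter()
    for gap in range(max_gap + 1):
        # Lookahead so that overlapping occurrences are all counted.
        pattern = re.compile(
            "(?=%s.{%d}%s)" % (re.escape(left), gap, re.escape(right))
        )
        for seq in sequences:
            profile[gap] += sum(1 for _ in pattern.finditer(seq))
    return profile


if __name__ == "__main__":
    # Placeholder sequences and half sites, not data from the paper.
    seqs = ["AGGTCACAGTGACC", "AGGTCATGACC"]
    print(dict(gap_length_profile(seqs, "AGGTCA", "TGACC", 4)))
```

    A profile of this kind shows which spacings between the half sites actually occur, which is the sort of observation that motivates allowing the gap length to vary.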

  17. CompleteMOTIFs: DNA motif discovery platform for transcription factor binding experiments.

    PubMed

    Kuttippurathu, Lakshmi; Hsing, Michael; Liu, Yongchao; Schmidt, Bertil; Maskell, Douglas L; Lee, Kyungjoon; He, Aibin; Pu, William T; Kong, Sek Won

    2011-03-01

    CompleteMOTIFs (cMOTIFs) is an integrated web tool developed to facilitate systematic discovery of overrepresented transcription factor binding motifs from high-throughput chromatin immunoprecipitation experiments. Comprehensive annotations and Boolean logic operations on multiple peak locations enable users to focus on genomic regions of interest for de novo motif discovery using tools such as MEME, Weeder and ChIPMunk. The pipeline incorporates a scanning tool for known motifs from TRANSFAC and JASPAR databases, and performs an enrichment test using local or precalculated background models that significantly improve the motif scanning result. Furthermore, using the cMOTIFs pipeline, we demonstrated that multiple transcription factors could cooperatively bind to the upstream of important stem cell differentiation regulators. http://cmotifs.tchlab.org.

  18. VARUN: discovering extensible motifs under saturation constraints.

    PubMed

    Apostolico, Alberto; Comin, Matteo; Parida, Laxmi

    2010-01-01

    The discovery of motifs in biosequences is frequently torn between the rigidity of the model on one hand and the abundance of candidates on the other hand. In particular, motifs that include wild cards or "don't cares" escalate exponentially with their number, and this gets only worse if a don't care is allowed to stretch up to some prescribed maximum length. In this paper, a notion of extensible motif in a sequence is introduced and studied, which tightly combines the structure of the motif pattern, as described by its syntactic specification, with the statistical measure of its occurrence count. It is shown that a combination of appropriate saturation conditions and the monotonicity of probabilistic scores over regions of constant frequency afford us significant parsimony in the generation and testing of candidate overrepresented motifs. A suite of software programs called Varun is described, implementing the discovery of extensible motifs of the type considered. The merits of the method are then documented by results obtained in a variety of experiments primarily targeting protein sequence families. Of equal importance seems the fact that the sets of all surprising motifs returned in each experiment are extracted faster and come in much more manageable sizes than would be obtained in the absence of saturation constraints.

  19. A cis-regulatory module activating transcription in the suspensor contains five cis-regulatory elements

    SciTech Connect

    Henry, Kelli F.; Kawashima, Tomokazu; Goldberg, Robert B.

    2015-03-22

    Little is known about the molecular mechanisms by which the embryo proper and suspensor of plant embryos activate specific gene sets shortly after fertilization. We analyzed the upstream region of the Scarlet Runner Bean (Phaseolus coccineus) G564 gene in order to understand how genes are activated specifically in the suspensor during early embryo development. Previously, we showed that a 54-bp fragment of the G564 upstream region is sufficient for suspensor transcription and contains at least three required cis-regulatory sequences, including the 10-bp motif (5'-GAAAAGCGAA-3'), the 10-bp-like motif (5'-GAAAAACGAA-3'), and Region 2 motif (partial sequence 5'-TTGGT-3'). Here, we use site-directed mutagenesis experiments in transgenic tobacco globular-stage embryos to identify two additional cis-regulatory elements within the 54-bp cis-regulatory module that are required for G564 suspensor transcription: the Fifth motif (5'-GAGTTA-3') and a third 10-bp-related sequence (5'-GAAAACCACA-3'). Further deletion of the 54-bp fragment revealed that a 47-bp fragment containing the five motifs (the 10-bp, 10-bp-like, 10-bp-related, Region 2 and Fifth motifs) is sufficient for suspensor transcription, and represents a cis-regulatory module. A consensus sequence for each type of motif was determined by comparing motif sequences shown to activate suspensor transcription. Phylogenetic analyses suggest that the regulation of G564 is evolutionarily conserved. Lastly, a homologous cis-regulatory module was found upstream of the G564 ortholog in the Common Bean (Phaseolus vulgaris), indicating that the regulation of G564 is evolutionarily conserved in closely related bean species.

  20. Efficient motif search in ranked lists and applications to variable gap motifs

    PubMed Central

    Leibovich, Limor; Yakhini, Zohar

    2012-01-01

    Sequence elements, at all levels—DNA, RNA and protein, play a central role in mediating molecular recognition and thereby molecular regulation and signaling. Studies that focus on measuring and investigating sequence-based recognition make use of statistical and computational tools, including approaches to searching sequence motifs. State-of-the-art motif searching tools are limited in their coverage and ability to address large motif spaces. We develop and present statistical and algorithmic approaches that take as input ranked lists of sequences and return significant motifs. The efficiency of our approach, based on suffix trees, allows searches over motif spaces that are not covered by existing tools. This includes searching variable gap motifs—two half sites with a flexible length gap in between—and searching long motifs over large alphabets. We used our approach to analyze several high-throughput measurement data sets and report some validation results as well as novel suggested motifs and motif refinements. We suggest a refinement of the known estrogen receptor 1 motif in humans, where we observe gaps other than three nucleotides that also serve as significant recognition sites, as well as a variable length motif related to potential tyrosine phosphorylation. PMID:22416066

  1. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets.

    PubMed

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-02-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128,000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.

  2. RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets

    PubMed Central

    Thomas-Chollier, Morgane; Herrmann, Carl; Defrance, Matthieu; Sand, Olivier; Thieffry, Denis; van Helden, Jacques

    2012-01-01

    ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128 000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks. PMID:22156162

  3. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas

    PubMed Central

    Petrov, Anton I.; Zirbel, Craig L.; Leontis, Neocles B.

    2013-01-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson–Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access. PMID:23970545

  4. Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas.

    PubMed

    Petrov, Anton I; Zirbel, Craig L; Leontis, Neocles B

    2013-10-01

    The analysis of atomic-resolution RNA three-dimensional (3D) structures reveals that many internal and hairpin loops are modular, recurrent, and structured by conserved non-Watson-Crick base pairs. Structurally similar loops define RNA 3D motifs that are conserved in homologous RNA molecules, but can also occur at nonhomologous sites in diverse RNAs, and which often vary in sequence. To further our understanding of RNA motif structure and sequence variability and to provide a useful resource for structure modeling and prediction, we present a new method for automated classification of internal and hairpin loop RNA 3D motifs and a new online database called the RNA 3D Motif Atlas. To classify the motif instances, a representative set of internal and hairpin loops is automatically extracted from a nonredundant list of RNA-containing PDB files. Their structures are compared geometrically, all-against-all, using the FR3D program suite. The loops are clustered into motif groups, taking into account geometric similarity and structural annotations and making allowance for a variable number of bulged bases. The automated procedure that we have implemented identifies all hairpin and internal loop motifs previously described in the literature. All motif instances and motif groups are assigned unique and stable identifiers and are made available in the RNA 3D Motif Atlas (http://rna.bgsu.edu/motifs), which is automatically updated every four weeks. The RNA 3D Motif Atlas provides an interactive user interface for exploring motif diversity and tools for programmatic data access.

  5. Identification of a putative nuclear export signal motif in human NANOG homeobox domain

    SciTech Connect

    Park, Sung-Won; Do, Hyun-Jin; Huh, Sun-Hyung; Sung, Boreum; Uhm, Sang-Jun; Song, Hyuk; Kim, Nam-Hyung; Kim, Jae-Hwan

    2012-05-11

    Highlights: (1) We found the putative nuclear export signal motif within human NANOG homeodomain. (2) Leucine-rich residues are important for human NANOG homeodomain nuclear export. (3) CRM1-specific inhibitor LMB blocked the potent human NANOG NES-mediated nuclear export. Abstract: NANOG is a homeobox-containing transcription factor that plays an important role in pluripotent stem cells and tumorigenic cells. To understand how nuclear localization of human NANOG is regulated, the NANOG sequence was examined and a leucine-rich nuclear export signal (NES) motif (¹²⁵MQELSNILNL¹³⁴) was found in the homeodomain (HD). To functionally validate the putative NES motif, deletion and site-directed mutants were fused to an EGFP expression vector and transfected into COS-7 cells, and the localization of the proteins was examined. While hNANOG HD exclusively localized to the nucleus, a mutant with both NLSs deleted and only the putative NES motif contained (hNANOG HD-ΔNLSs) was predominantly cytoplasmic, as observed by nucleo/cytoplasmic fractionation and Western blot analysis as well as confocal microscopy. Furthermore, site-directed mutagenesis of the putative NES motif in a partial hNANOG HD only containing either one of the two NLS motifs led to localization in the nucleus, suggesting that the NES motif may play a functional role in nuclear export. Furthermore, CRM1-specific nuclear export inhibitor LMB blocked the hNANOG potent NES-mediated export, suggesting that the leucine-rich motif may function in CRM1-mediated nuclear export of hNANOG. Collectively, a NES motif is present in the hNANOG HD and may be functionally involved in CRM1-mediated nuclear export pathway.

  6. ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data.

    PubMed

    Heller, David; Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

    2017-08-30

    RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM's model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Network motif identification in stochastic networks

    NASA Astrophysics Data System (ADS)

    Jiang, Rui; Tu, Zhidong; Chen, Ting; Sun, Fengzhu

    2006-06-01

    Network motifs have been identified in a wide range of networks across many scientific disciplines and are suggested to be the basic building blocks of most complex networks. Nonetheless, many networks come with intrinsic and/or experimental uncertainties and should be treated as stochastic networks. The building blocks in these networks thus may also have stochastic properties. In this article, we study stochastic network motifs derived from families of mutually similar but not necessarily identical patterns of interconnections. We establish a finite mixture model for stochastic networks and develop an expectation-maximization algorithm for identifying stochastic network motifs. We apply this approach to the transcriptional regulatory networks of Escherichia coli and Saccharomyces cerevisiae, as well as the protein-protein interaction networks of seven species, and identify several stochastic network motifs that are consistent with current biological knowledge. Keywords: expectation-maximization algorithm; mixture model; transcriptional regulatory network; protein-protein interaction network

  8. New type of starch-binding domain: the direct repeat motif in the C-terminal region of Bacillus sp. no. 195 alpha-amylase contributes to starch binding and raw starch degrading.

    PubMed

    Sumitani, J; Tottori, T; Kawaguchi, T; Arai, M

    2000-09-01

    The alpha-amylase from Bacillus sp. no. 195 (BAA) consists of two domains: one is the catalytic domain similar to alpha-amylases from animals and Streptomyces in the N-terminal region; the other is the functionally unknown domain composed of an approx. 90-residue direct repeat in the C-terminal region. The gene coding for BAA was expressed in Streptomyces lividans TK24. Three active forms of the gene products were found. The pH and thermal profiles of BAAs, and their catalytic activities for p-nitrophenyl maltopentaoside and soluble starch, showed almost the same behaviours. The largest, 69 kDa, form (BAA-alpha) was of the same molecular mass as that of the mature protein estimated from the nucleotide sequence, and had raw-starch-binding and -degrading abilities. The second largest, 60 kDa, form (BAA-beta), whose molecular mass was the same as that of the natural enzyme from Bacillus sp. no. 195, was generated by proteolytic processing between the two repeat sequences in the C-terminal region, and had lower activities for raw starch binding and degrading than those of BAA-alpha. The smallest, 50 kDa, form (BAA-gamma) contained only the N-terminal catalytic domain as a result of removal of the C-terminal repeat sequence, which led to loss of binding and degradation of insoluble starches. Thus the starch adsorption capacity and raw-starch-degrading activity of BAAs depends on the existence of the repeat sequence in the C-terminal region. BAA-alpha was specifically adsorbed on starch or dextran (alpha-1,4 or alpha-1,6 glucan), and specifically desorbed with maltose or beta-cyclodextrin. These observations indicated that the repeat sequence of the enzyme was functional in the starch-binding domain (SBD). We propose the designation of the homologues to the SBD of glucoamylase from Aspergillus niger as family I SBDs, the homologues to that of glucoamylase from Rhizopus oryzae as family II, and the homologues of this repeat sequence of BAA as family III.

  9. New type of starch-binding domain: the direct repeat motif in the C-terminal region of Bacillus sp. no. 195 alpha-amylase contributes to starch binding and raw starch degrading.

    PubMed Central

    Sumitani, J; Tottori, T; Kawaguchi, T; Arai, M

    2000-01-01

    The alpha-amylase from Bacillus sp. no. 195 (BAA) consists of two domains: one is the catalytic domain similar to alpha-amylases from animals and Streptomyces in the N-terminal region; the other is the functionally unknown domain composed of an approx. 90-residue direct repeat in the C-terminal region. The gene coding for BAA was expressed in Streptomyces lividans TK24. Three active forms of the gene products were found. The pH and thermal profiles of BAAs, and their catalytic activities for p-nitrophenyl maltopentaoside and soluble starch, showed almost the same behaviours. The largest, 69 kDa, form (BAA-alpha) was of the same molecular mass as that of the mature protein estimated from the nucleotide sequence, and had raw-starch-binding and -degrading abilities. The second largest, 60 kDa, form (BAA-beta), whose molecular mass was the same as that of the natural enzyme from Bacillus sp. no. 195, was generated by proteolytic processing between the two repeat sequences in the C-terminal region, and had lower activities for raw starch binding and degrading than those of BAA-alpha. The smallest, 50 kDa, form (BAA-gamma) contained only the N-terminal catalytic domain as a result of removal of the C-terminal repeat sequence, which led to loss of binding and degradation of insoluble starches. Thus the starch adsorption capacity and raw-starch-degrading activity of BAAs depends on the existence of the repeat sequence in the C-terminal region. BAA-alpha was specifically adsorbed on starch or dextran (alpha-1,4 or alpha-1,6 glucan), and specifically desorbed with maltose or beta-cyclodextrin. These observations indicated that the repeat sequence of the enzyme was functional in the starch-binding domain (SBD). We propose the designation of the homologues to the SBD of glucoamylase from Aspergillus niger as family I SBDs, the homologues to that of glucoamylase from Rhizopus oryzae as family II, and the homologues of this repeat sequence of BAA as family III. PMID:10947962

  10. DNA Motif Databases and Their Uses.

    PubMed

    Stormo, Gary D

    2015-09-03

    Transcription factors (TFs) recognize and bind to specific DNA sequences. The specificity of a TF is usually represented as a position weight matrix (PWM). Several databases of DNA motifs exist and are used in biological research to address important biological questions. This overview describes PWMs and some of the most commonly used motif databases, as well as a few of their common applications. Copyright © 2015 John Wiley & Sons, Inc.
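
    A position weight matrix can be illustrated with a short sketch that builds log-odds scores from aligned binding sites and scans a sequence for the best-scoring window. The pseudocount of 1, the uniform background of 0.25 and the function names are assumptions made for this example; motif databases differ in how their matrices are stored and normalized.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one dict per position mapping each base to a log2-odds score."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, window):
    """Sum the per-position scores of a window as wide as the matrix."""
    return sum(col[base] for col, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    # Hypothetical aligned sites, used only to show the mechanics.
    sites = ["TATAAT", "TATAAT", "TACAAT", "TATACT"]
    print(best_hit(build_pwm(sites), "GGGTATAATGGG"))
```

    Positive scores mark windows that resemble the aligned sites more than the background; in practice a threshold on the score decides which windows are reported as putative binding sites.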

  11. Chaotic Motifs in Gene Regulatory Networks

    PubMed Central

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs. PMID:22792171

  12. Chaotic motifs in gene regulatory networks.

    PubMed

    Zhang, Zhaoyang; Ye, Weiming; Qian, Yu; Zheng, Zhigang; Huang, Xuhui; Hu, Gang

    2012-01-01

    Chaos should occur often in gene regulatory networks (GRNs) which have been widely described by nonlinear coupled ordinary differential equations, if their dimensions are no less than 3. It is therefore puzzling that chaos has never been reported in GRNs in nature and is also extremely rare in models of GRNs. On the other hand, the topic of motifs has attracted great attention in studying biological networks, and network motifs are suggested to be elementary building blocks that carry out some key functions in the network. In this paper, chaotic motifs (subnetworks with chaos) in GRNs are systematically investigated. The conclusion is that: (i) chaos can only appear through competitions between different oscillatory modes with rivaling intensities. Conditions required for chaotic GRNs are found to be very strict, which make chaotic GRNs extremely rare. (ii) Chaotic motifs are explored as the simplest few-node structures capable of producing chaos, and serve as the intrinsic source of chaos of random few-node GRNs. Several optimal motifs causing chaos with atypically high probability are figured out. (iii) Moreover, we discovered that a number of special oscillators can never produce chaos. These structures bring some advantages on rhythmic functions and may help us understand the robustness of diverse biological rhythms. (iv) The methods of dominant phase-advanced driving (DPAD) and DPAD time fraction are proposed to quantitatively identify chaotic motifs and to explain the origin of chaotic behaviors in GRNs.

  13. Helix-packing motifs in membrane proteins.

    PubMed

    Walters, R F S; DeGrado, W F

    2006-09-12

    The fold of a helical membrane protein is largely determined by interactions between membrane-imbedded helices. To elucidate recurring helix-helix interaction motifs, we dissected the crystallographic structures of membrane proteins into a library of interacting helical pairs. The pairs were clustered according to their three-dimensional similarity (rmsd motifs whose structural features can be understood in terms of simple principles of helix-helix packing. Thus, the universe of common transmembrane helix-pairing motifs is relatively simple. The largest cluster, which comprises 29% of the library members, consists of an antiparallel motif with left-handed packing angles, and it is frequently stabilized by packing of small side chains occurring every seven residues in the sequence. Right-handed parallel and antiparallel structures show a similar tendency to segregate small residues to the helix-helix interface but spaced at four-residue intervals. Position-specific sequence propensities were derived for the most populated motifs. These structural and sequential motifs should be quite useful for the design and structural prediction of membrane proteins.

  14. MotifHyades: Expectation Maximization for de novo DNA Motif Pair Discovery on Paired Sequences.

    PubMed

    Wong, Ka-Chun

    2017-06-13

    In higher eukaryotes, protein-DNA binding interactions are the central activities in gene regulation. In particular, DNA motifs such as transcription factor binding sites are the key components in gene transcription. Harnessing the recently available chromatin interaction data, computational methods are desired for systematically identifying the coupling DNA motif pairs enriched on long-range chromatin-interacting sequence pairs (e.g. promoter-enhancer pairs). To fill the void, a novel probabilistic model (namely, MotifHyades) is proposed and developed for de novo DNA motif pair discovery on paired sequences. In particular, two expectation maximization algorithms are derived for efficient model training with linear computational complexity. Under diverse scenarios, MotifHyades is demonstrated to be faster and more accurate than the existing ad hoc computational pipeline. In addition, MotifHyades is applied to discover thousands of DNA motif pairs with higher gold standard motif matching ratio, higher DNase accessibility, and higher evolutionary conservation than the previous ones in the human K562 cell line. Lastly, it has been run on five other human cell lines (i.e. GM12878, HeLa-S3, HUVEC, IMR90, and NHEK), revealing thousands more novel DNA motif pairs, which are characterized across a broad spectrum of genomic features on long-range promoter-enhancer pairs. Availability: The matrix-algebra-optimized versions of MotifHyades and the discovered DNA motif pairs can be found at http://bioinfo.cs.cityu.edu.hk/MotifHyades. Contact: kc.w@cityu.edu.hk. Supplementary data are available at Bioinformatics online.

  15. iMotifs: an integrated sequence motif visualization and analysis environment

    PubMed Central

    Piipari, Matias; Down, Thomas A.; Saini, Harpreet; Enright, Anton; Hubbard, Tim J.P.

    2010-01-01

    Motivation: Short sequence motifs are an important class of models in molecular biology, used most commonly for describing transcription factor binding site specificity patterns. High-throughput methods have been recently developed for detecting regulatory factor binding sites in vivo and in vitro and consequently high-quality binding site motif data are becoming available for an increasing number of organisms and regulatory factors. Development of intuitive tools for the study of sequence motifs is therefore important. iMotifs is a graphical motif analysis environment that allows visualization of annotated sequence motifs and scored motif hits in sequences. It also offers motif inference with the sensitive NestedMICA algorithm, as well as overrepresentation and pairwise motif matching capabilities. All of the analysis functionality is provided without the need to convert between file formats or learn different command line interfaces. The application includes a bundled and graphically integrated version of the NestedMICA motif inference suite that has no outside dependencies. Problems associated with local deployment of software are therefore avoided. Availability: iMotifs is licensed with the GNU Lesser General Public License v2.0 (LGPL 2.0). The software and its source are available at http://wiki.github.com/mz2/imotifs and can be run on Mac OS X Leopard (Intel/PowerPC). We also provide a cross-platform (Linux, OS X, Windows) LGPL 2.0 licensed library libxms for the Perl, Ruby, R and Objective-C programming languages for input and output of XMS formatted annotated sequence motif set files. Contact: matias.piipari@gmail.com; imotifs@googlegroups.com PMID:20106815

  16. Distal Regions of the Human IFNG Locus Direct Cell Type-Specific Expression

    PubMed Central

    Collins, Patrick L.; Chang, Shaojing; Henderson, Melodie; Soutto, Mohammed; Davis, Georgia M.; McLoed, Allyson G.; Townsend, Michael J.; Glimcher, Laurie H.; Mortlock, Douglas P.; Aune, Thomas M.

    2010-01-01

    Genes, such as IFNG, which are expressed in multiple cell lineages of the immune system, may employ a common set of regulatory elements to direct transcription in multiple cell types or individual regulatory elements to direct expression in individual cell lineages. By employing a bacterial artificial chromosome transgenic system, we demonstrate that IFNG employs unique regulatory elements to achieve lineage-specific transcriptional control. Specifically, a 1-kb element 30 kb upstream of IFNG activates transcription in T cells and NKT cells but not in NK cells. This distal regulatory element is a Runx3 binding site in Th1 cells and is needed for RNA polymerase II recruitment to IFNG, but it is not absolutely required for histone acetylation of the IFNG locus. These results support a model whereby IFNG utilizes cis-regulatory elements with cell type-restricted function. PMID:20574006

  17. Caveats in modeling a common motif in genetic circuits

    NASA Astrophysics Data System (ADS)

    Labavić, Darka; Nagel, Hannes; Janke, Wolfhard; Meyer-Ortmanns, Hildegard

    2013-06-01

    From a coarse-grained perspective, the motif of a self-activating species, activating a second species that acts as its own repressor, is widely found in biological systems, in particular in genetic systems with inherent oscillatory behavior. Here we consider a specific realization of this motif as a genetic circuit, termed the bistable frustrated unit, in which genes are described as directly producing proteins. Upon an improved resolution in time, we focus on the effect that inherent time scales on the underlying scale can have on the bifurcation patterns on a coarser scale. Time scales are set by the binding and unbinding rates of the transcription factors to the promoter regions of the genes. Depending on the ratio of these rates to the decay times of both proteins, the appropriate averaging procedure for obtaining a coarse-grained description changes and leads to sets of deterministic equations, which considerably differ in their bifurcation structure. In particular, the desired intermediate range of regular limit cycles fades away when the binding rates of genes are not fast as compared to the decay time of the proteins. Our analysis illustrates that the common topology of the widely found motif alone does not imply universal features in the dynamics.

  18. Characteristic motifs for families of allergenic proteins

    PubMed Central

    Ivanciuc, Ovidiu; Garcia, Tzintzuni; Torres, Miguel; Schein, Catherine H.; Braun, Werner

    2008-01-01

    The identification of potential allergenic proteins is usually done by scanning a database of allergenic proteins and locating known allergens with a high sequence similarity. However, there is no universally accepted cut-off value for sequence similarity to indicate potential IgE cross-reactivity. Further, overall sequence similarity may be less important than discrete areas of similarity in proteins with homologous structure. To identify such areas, we first classified all allergens and their subdomains in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/) to their closest protein families as defined in Pfam, and identified conserved physicochemical property motifs characteristic of each group of sequences. Allergens populate only a small subset of all known Pfam families, as all allergenic proteins in SDAP could be grouped to only 130 (of 9318 total) Pfams, and 31 families contain more than four allergens. Conserved physicochemical property motifs for the aligned sequences of the most populated Pfam families were identified with the PCPMer program suite and catalogued in the webserver Motif-Mate (http://born.utmb.edu/motifmate/summary.php). We also determined specific motifs for allergenic members of a family that could distinguish them from non-allergenic ones. These allergen specific motifs should be most useful in database searches for potential allergens. We found that sequence motifs unique to the allergens in three families (seed storage proteins, Bet v 1, and tropomyosin) overlap with known IgE epitopes, thus providing evidence that our motif based approach can be used to assess the potential allergenicity of novel proteins. PMID:18951633

  19. Modeling gene regulatory network motifs using statecharts

    PubMed Central

    2012-01-01

    Background: Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results: We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistable nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions: We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
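One of the temporal properties listed in this record, the activation delay in the coherent type-1 feedforward loop, can be illustrated with a plain synchronous Boolean model. This is a hedged sketch and not the Statecharts model of the paper: the AND gate on the output gene, the one-step update rule, and the names `simulate` and `first_on` are all assumptions made for illustration.

```python
def simulate(x_input):
    """Synchronous Boolean updates for X -> Y, (X AND Y) -> Z_ffl, and X -> Z_simple.

    Returns one (x, y, z_ffl, z_simple) tuple per time step.
    """
    y = z_ffl = z_simple = False
    trace = []
    for x in x_input:
        trace.append((x, y, z_ffl, z_simple))
        # Every gene reads the state of the current step and switches at the next one.
        y, z_ffl, z_simple = x, (x and y), x
    return trace

def first_on(trace, column):
    """Index of the first time step at which the given column is True."""
    return next(t for t, row in enumerate(trace) if row[column])

# Input signal: off for one step, on for four steps, then off again.
signal = [False, True, True, True, True, False, False, False]
trace = simulate(signal)
for t, row in enumerate(trace):
    print(t, row)

# Z under the feedforward loop switches on one step later than under simple regulation.
print(first_on(trace, 2) - first_on(trace, 3))
```

In this toy AND-gate version only the activation is delayed; both outputs switch off at the same step once the input disappears. Reproducing a deactivation delay would need a different gate or a richer model.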

  20. P-value-based regulatory motif discovery using positional weight matrices.

    PubMed

    Hartmann, Holger; Guthöhrlein, Eckhart W; Siebert, Matthias; Luehr, Sebastian; Söding, Johannes

    2013-01-01

    To analyze gene regulatory networks, the sequence-dependent DNA/RNA binding affinities of proteins and noncoding RNAs are crucial. Often, these are deduced from sets of sequences enriched in factor binding sites. Two classes of computational approaches exist. The first describes binding motifs by sequence patterns and searches for the patterns with the highest statistical significance of enrichment. The second class uses the more powerful position weight matrices (PWMs). Instead of maximizing the statistical significance of enrichment, these methods maximize a likelihood. Here we present XXmotif (eXhaustive evaluation of matriX motifs), the first PWM-based motif discovery method that can optimize PWMs by directly minimizing their P-values of enrichment. Optimization requires computing millions of enrichment P-values for thousands of PWMs. For a given PWM, the enrichment P-value is calculated efficiently from the match P-values of all possible motif placements in the input sequences using order statistics. The approach can naturally combine P-values for motif enrichment, conservation, and localization. On ChIP-chip/seq, miRNA knock-down, and coexpression data sets from yeast and metazoans, XXmotif outperformed state-of-the-art tools, both in numbers of correctly identified motifs and in the quality of PWMs. In segmentation modules of D. melanogaster, we detect the known key regulators and several new motifs. In human core promoters, XXmotif reports most previously described and eight novel motifs sharply peaked around the transcription start site, among them an Initiator motif similar to the fly and yeast versions. XXmotif's sensitivity, reliability, and usability will help to leverage the quickly accumulating wealth of functional genomics data.

  1. A Combinatorial Code for Splicing Silencing: UAGG and GGGG Motifs

    PubMed Central

    An, Ping; Burge, Christopher B

    2005-01-01

    Alternative pre-mRNA splicing is widely used to regulate gene expression by tuning the levels of tissue-specific mRNA isoforms. Few regulatory mechanisms are understood at the level of combinatorial control despite numerous sequences, distinct from splice sites, that have been shown to play roles in splicing enhancement or silencing. Here we use molecular approaches to identify a ternary combination of exonic UAGG and 5′-splice-site-proximal GGGG motifs that functions cooperatively to silence the brain-region-specific CI cassette exon (exon 19) of the glutamate NMDA R1 receptor (GRIN1) transcript. Disruption of three components of the motif pattern converted the CI cassette into a constitutive exon, while predominant skipping was conferred when the same components were introduced, de novo, into a heterologous constitutive exon. Predominant exon silencing was directed by the motif pattern in the presence of six competing exonic splicing enhancers, and this effect was retained after systematically repositioning the two exonic UAGGs within the CI cassette. In this system, hnRNP A1 was shown to mediate silencing while hnRNP H antagonized silencing. Genome-wide computational analysis combined with RT-PCR testing showed that a class of skipped human and mouse exons can be identified by searches that preserve the sequence and spatial configuration of the UAGG and GGGG motifs. This analysis suggests that the multi-component silencing code may play an important role in the tissue-specific regulation of the CI cassette exon, and that it may serve more generally as a molecular language to allow for intricate adjustments and the coordination of splicing patterns from different genes. PMID:15828859

  2. A combinatorial code for splicing silencing: UAGG and GGGG motifs.

    PubMed

    Han, Kyoungha; Yeo, Gene; An, Ping; Burge, Christopher B; Grabowski, Paula J

    2005-05-01

    Alternative pre-mRNA splicing is widely used to regulate gene expression by tuning the levels of tissue-specific mRNA isoforms. Few regulatory mechanisms are understood at the level of combinatorial control despite numerous sequences, distinct from splice sites, that have been shown to play roles in splicing enhancement or silencing. Here we use molecular approaches to identify a ternary combination of exonic UAGG and 5'-splice-site-proximal GGGG motifs that functions cooperatively to silence the brain-region-specific CI cassette exon (exon 19) of the glutamate NMDA R1 receptor (GRIN1) transcript. Disruption of three components of the motif pattern converted the CI cassette into a constitutive exon, while predominant skipping was conferred when the same components were introduced, de novo, into a heterologous constitutive exon. Predominant exon silencing was directed by the motif pattern in the presence of six competing exonic splicing enhancers, and this effect was retained after systematically repositioning the two exonic UAGGs within the CI cassette. In this system, hnRNP A1 was shown to mediate silencing while hnRNP H antagonized silencing. Genome-wide computational analysis combined with RT-PCR testing showed that a class of skipped human and mouse exons can be identified by searches that preserve the sequence and spatial configuration of the UAGG and GGGG motifs. This analysis suggests that the multi-component silencing code may play an important role in the tissue-specific regulation of the CI cassette exon, and that it may serve more generally as a molecular language to allow for intricate adjustments and the coordination of splicing patterns from different genes.

  3. A motif rich in charged residues determines product specificity in isomaltulose synthase.

    PubMed

    Zhang, Daohai; Li, Nan; Swaminathan, Kunchithapadam; Zhang, Lian Hui

    2003-01-16

    Isomaltulose synthase (PalI) catalyzes hydrolysis of sucrose and formation of alpha-1,6 and alpha-1,1 bonds to produce isomaltulose (alpha-D-glucosylpyranosyl-1,6-D-fructofuranose) and a small amount of trehalulose (alpha-D-glucosylpyranosyl-1,1-D-fructofuranose). A potential isomaltulose synthase-specific motif ((325)RLDRD(329)), which contains a 'DxD' motif conserved in many glycosyltransferases, was identified based on sequence comparison with reference to the secondary structural features of PalI and homologs. Site-directed mutagenesis analysis of the motif showed that the four charged amino acid residues (Arg(325), Arg(328), Asp(327) and Asp(329)) influence the enzyme kinetics and determine the product specificity. Mutation of these four residues increased trehalulose formation by 17-61% and decreased isomaltulose by 26-67%. We conclude that the 'RLDRD' motif controls the product specificity of PalI.

  4. Unsupervised statistical discovery of spaced motifs in prokaryotic genomes.

    PubMed

    Tong, Hao; Schliekelman, Paul; Mrázek, Jan

    2017-01-05

    DNA sequences contain repetitive motifs which have various functions in the physiology of the organism. A number of methods have been developed for discovery of such sequence motifs with a primary focus on detection of regulatory motifs and particularly transcription factor binding sites. Most motif-finding methods apply probabilistic models to detect motifs characterized by unusually high number of copies of the motif in the analyzed sequences. We present a novel method for detection of pairs of motifs separated by spacers of variable nucleotide sequence but conserved length. Unlike existing methods for motif discovery, the motifs themselves are not required to occur at unusually high frequency but only to exhibit a significant preference to occur at a specific distance from each other. In the present implementation of the method, motifs are represented by pentamers and all pairs of pentamers are evaluated for statistically significant preference for a specific distance. An important step of the algorithm eliminates motif pairs where the spacers separating the two motifs exhibit a high degree of sequence similarity; such motif pairs likely arise from duplications of the whole segment including the motifs and the spacer rather than due to selective constraints indicative of a functional importance of the motif pair. The method was used to scan 569 complete prokaryotic genomes for novel sequence motifs. Some motifs detected were previously known but other motifs found in the search appear to be novel. Selected motif pairs were subjected to further investigation and in some cases their possible biological functions were proposed. We present a new motif-finding technique that is applicable to scanning complete genomes for sequence motifs. The results from analysis of 569 genomes suggest that the method detects previously known motifs that are expected to be found as well as new motifs that are unlikely to be discovered by traditional motif-finding methods. 
We conclude
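The core counting step described in this record (tabulating, for every ordered pair of pentamers, how often each spacer length occurs) can be sketched as follows. This is an illustrative simplification under stated assumptions: the spacer cap `max_spacer`, the function names, and the use of a raw fraction instead of a significance test are hypothetical, and the filter that removes pairs with highly similar spacers is not implemented.

```python
from collections import Counter, defaultdict

def spacer_histograms(sequence, k=5, max_spacer=20):
    """Map each ordered k-mer pair to a Counter of the spacer lengths separating it."""
    hist = defaultdict(Counter)
    n = len(sequence)
    for i in range(n - k + 1):
        left = sequence[i:i + k]
        for spacer in range(max_spacer + 1):
            j = i + k + spacer          # start of the right-hand k-mer
            if j + k > n:
                break
            hist[(left, sequence[j:j + k])][spacer] += 1
    return hist

def preferred_spacer(counter):
    """Return the most frequent spacer length and the fraction of occurrences it covers."""
    spacer, count = counter.most_common(1)[0]
    return spacer, count / sum(counter.values())

# Toy sequence: three AAAAA ... CCCCC pairs, each with a different 7-nt spacer.
toy = (
    "AAAAAGTGTGTGCCCCC" + "TTGTT"
    + "AAAAATGTGTTGCCCCC" + "TGTTG"
    + "AAAAAGGTTGGTCCCCC"
)
hist = spacer_histograms(toy)
print(preferred_spacer(hist[("AAAAA", "CCCCC")]))
```

On real genomes the preference for one distance would have to be judged against a null model before calling a pair significant; the fraction printed here is only a descriptive summary.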

  5. WildSpan: mining structured motifs from protein sequences

    PubMed Central

    2011-01-01

    Background: Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. Results: WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan.
The protein-based mining mode of WildSpan is developed for

  6. WildSpan: mining structured motifs from protein sequences.

    PubMed

    Hsu, Chen-Ming; Chen, Chien-Yu; Liu, Baw-Jhiune

    2011-03-31

    Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. The protein-based mining mode of WildSpan is developed for discovering

  7. Sequential motif profile of natural visibility graphs.

    PubMed

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-11-01

    The concept of sequential visibility graph motifs (subgraphs appearing with characteristic frequencies in the visibility graphs associated to time series) has been advanced recently along with a theoretical framework to compute analytically the motif profiles associated to horizontal visibility graphs (HVGs). Here we develop a theory to compute the profile of sequential visibility graph motifs in the context of natural visibility graphs (VGs). This theory gives exact results for deterministic aperiodic processes with a smooth invariant density or stochastic processes that fulfill the Markov property and have a continuous marginal distribution. The framework also allows for a linear time numerical estimation in the case of empirical time series. A comparison between the HVG and the VG case (including evaluation of their robustness for short series polluted with measurement noise) is also presented.
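For readers unfamiliar with the two graph constructions compared in this record, here is a minimal sketch using the standard visibility criteria. The brute-force loops are for clarity only and are not the linear-time estimation mentioned in the abstract. Treating a sequential motif as the subgraph induced on a window of consecutive nodes is an assumption of this sketch, as are the function names.

```python
from collections import Counter

def natural_visibility_edges(series):
    """Connect (a, b) if every intermediate point lies strictly below the line joining them."""
    edges = set()
    n = len(series)
    for a in range(n):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            if all(series[c] < yb + (ya - yb) * (b - c) / (b - a)
                   for c in range(a + 1, b)):
                edges.add((a, b))
    return edges

def horizontal_visibility_edges(series):
    """Connect (a, b) if every intermediate point is strictly lower than both end points."""
    edges = set()
    n = len(series)
    for a in range(n):
        for b in range(a + 1, n):
            if all(series[c] < min(series[a], series[b]) for c in range(a + 1, b)):
                edges.add((a, b))
    return edges

def motif_profile(edges, n_nodes, size=3):
    """Relative frequency of each induced subgraph over windows of consecutive nodes."""
    counts = Counter()
    for start in range(n_nodes - size + 1):
        pattern = frozenset(
            (a - start, b - start)
            for a, b in edges
            if start <= a and b < start + size
        )
        counts[pattern] += 1
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}

series = [3, 1, 2, 4]
vg = natural_visibility_edges(series)
hvg = horizontal_visibility_edges(series)
print(sorted(vg))
print(sorted(hvg))
print(motif_profile(hvg, len(series)))
```

Every horizontal visibility edge is also a natural visibility edge, so the HVG is always a subgraph of the VG for the same series.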

  8. The telomere repeat motif of basal Metazoa.

    PubMed

    Traut, Walther; Szczepanowski, Monika; Vítková, Magda; Opitz, Christian; Marec, Frantisek; Zrzavý, Jan

    2007-01-01

    In most eukaryotes the telomeres consist of short DNA tandem repeats and associated proteins. Telomeric repeats are added to the chromosome ends by telomerase, a specialized reverse transcriptase. We examined telomerase activity and telomere repeat sequences in representatives of basal metazoan groups. Our results show that the 'vertebrate' telomere motif (TTAGGG)n is present in all basal metazoan groups, i.e. sponges, Cnidaria, Ctenophora, and Placozoa, and also in the unicellular metazoan sister group, the Choanozoa. Thus it can be considered the ancestral telomere repeat motif of Metazoa. It has been conserved from the metazoan radiation in most animal phylogenetic lineages, and replaced by other motifs, according to our present knowledge, only in two major lineages, Arthropoda and Nematoda.

  9. A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif.

    PubMed

    Kandimalla, Ekambar R; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X; Tang, Jin-Yan; Knetter, Cathrine F; Lien, Egil; Agrawal, Sudhir

    2003-11-25

    Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2'-deoxy-beta-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3'-3-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of RpG motif. In J774 macrophages, RpG motifs activated NF-kappa B and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-gamma, but lower IL-6 in time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9.

  10. A dinucleotide motif in oligonucleotides shows potent immunomodulatory activity and overrides species-specific recognition observed with CpG motif

    PubMed Central

    Kandimalla, Ekambar R.; Bhagat, Lakshmi; Zhu, Fu-Gang; Yu, Dong; Cong, Yan-Ping; Wang, Daqing; Tang, Jimmy X.; Tang, Jin-Yan; Knetter, Cathrine F.; Lien, Egil; Agrawal, Sudhir

    2003-01-01

    Bacterial and synthetic DNAs containing CpG dinucleotides in specific sequence contexts activate the vertebrate immune system through Toll-like receptor 9 (TLR9). In the present study, we used a synthetic nucleoside with a bicyclic heterobase [1-(2′-deoxy-β-d-ribofuranosyl)-2-oxo-7-deaza-8-methyl-purine; R] to replace the C in CpG, resulting in an RpG dinucleotide. The RpG dinucleotide was incorporated in mouse- and human-specific motifs in oligodeoxynucleotides (oligos) and 3′-3-linked oligos, referred to as immunomers. Oligos containing the RpG motif induced cytokine secretion in mouse spleen-cell cultures. Immunomers containing RpG dinucleotides showed activity in transfected-HEK293 cells stably expressing mouse TLR9, suggesting direct involvement of TLR9 in the recognition of RpG motif. In J774 macrophages, RpG motifs activated NF-κB and mitogen-activated protein kinase pathways. Immunomers containing the RpG dinucleotide induced high levels of IL-12 and IFN-γ, but lower IL-6 in time- and concentration-dependent fashion in mouse spleen-cell cultures costimulated with IL-2. Importantly, immunomers containing GTRGTT and GARGTT motifs were recognized to a similar extent by both mouse and human immune systems. Additionally, both mouse- and human-specific RpG immunomers potently stimulated proliferation of peripheral blood mononuclear cells obtained from diverse vertebrate species, including monkey, pig, horse, sheep, goat, rat, and chicken. An immunomer containing GTRGTT motif prevented conalbumin-induced and ragweed allergen-induced allergic inflammation in mice. We show that a synthetic bicyclic nucleotide is recognized in the C position of a CpG dinucleotide by immune cells from diverse vertebrate species without bias for flanking sequences, suggesting a divergent nucleotide motif recognition pattern of TLR9. PMID:14610275

  11. Calendar motifs on Getashen hydria

    NASA Astrophysics Data System (ADS)

    Vrtanesyan, Garegin

    2015-07-01

    The Getashen hydria was found in Middle Bronze Age tombs (the first third of the second millennium B.C.) in Armenia (Lake Sevan). It shows a scene consisting of three friezes: the lower frieze depicts six zoomorphic figures, the middle frieze six waterfowl, and the top frieze graphic signs. The calendar motifs of this composition have a numeric expression: six figures on each of the lower and middle friezes. Division of the annual cycle into two parts is known from ancient Indo-Iranian calendars (the "great summer" and the "great winter"). The animals on the lower frieze mark the second, "winter" road of the Sun, because the events most important for the reproduction of the society's economy fall in this period, namely the rut of wild (deer) and domestic (goats) ungulates. Moreover, the rut of goats ends in December, almost coinciding with the winter solstice. A pair of dogs on the lower frieze marks a version of the myth of the hero imprisoned in the rock, the Sun (Mihr - Artavazd), whose dogs must chew through his chains in anticipation of his release at the winter solstice. This is indicated by the direction of their movement: the Sun moves from left to right for an observer only when it is in the southern part of the sky (i.e., beginning with the autumnal equinox). The most important event of the "summer road" of the Sun is the vernal equinox, which coincides with the arrival of waterfowl (ducks, geese). Their direction on the second frieze (left to right) corresponds to the position of an observer facing north.

  12. Motif formation and industry specific topologies in the Japanese business firm network

    NASA Astrophysics Data System (ADS)

    Maluck, Julian; Donner, Reik V.; Takayasu, Hideki; Takayasu, Misako

    2017-05-01

    Motifs and roles are basic quantities for the characterization of interactions among 3-node subsets in complex networks. In this work, we investigate how the distribution of 3-node motifs can be influenced by modifying the rules of an evolving network model while keeping the statistics of simpler network characteristics, such as the link density and the degree distribution, invariant. We exemplify this problem for the special case of the Japanese Business Firm Network, where a well-studied and relatively simple yet realistic evolving network model is available, and compare the resulting motif distribution in the real-world and simulated networks. To better approximate the motif distribution of the real-world network in the model, we introduce both subgraph dependent and global additional rules. We find that a specific rule that allows only for the merging process between nodes with similar link directionality patterns reduces the observed excess of densely connected motifs with bidirectional links. Our study improves the mechanistic understanding of motif formation in evolving network models to better describe the characteristic features of real-world networks with a scale-free topology.

  13. MEME SUITE: tools for motif discovery and searching.

    PubMed

    Bailey, Timothy L; Boden, Mikael; Buske, Fabian A; Frith, Martin; Grant, Charles E; Clementi, Luca; Ren, Jingyuan; Li, Wilfred W; Noble, William S

    2009-07-01

    The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms--MAST, FIMO and GLAM2SCAN--allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm TOMTOM. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and TOMTOM), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net.

  14. MEME Suite: tools for motif discovery and searching

    PubMed Central

    Bailey, Timothy L.; Boden, Mikael; Buske, Fabian A.; Frith, Martin; Grant, Charles E.; Clementi, Luca; Ren, Jingyuan; Li, Wilfred W.; Noble, William S.

    2009-01-01

    The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net. PMID:19458158

  15. The Verrucomicrobia LexA-Binding Motif: Insights into the Evolutionary Dynamics of the SOS Response

    PubMed Central

    Erill, Ivan; Campoy, Susana; Kılıç, Sefa; Barbé, Jordi

    2016-01-01

    The SOS response is the primary bacterial mechanism to address DNA damage, coordinating multiple cellular processes that include DNA repair, cell division, and translesion synthesis. In contrast to other regulatory systems, the composition of the SOS genetic network and the binding motif of its transcriptional repressor, LexA, have been shown to vary greatly across bacterial clades, making it an ideal system to study the co-evolution of transcription factors and their regulons. Leveraging comparative genomics approaches and prior knowledge on the core SOS regulon, here we define the binding motif of the Verrucomicrobia, a recently described phylum of emerging interest due to its association with eukaryotic hosts. Site directed mutagenesis of the Verrucomicrobium spinosum recA promoter confirms that LexA binds a 14 bp palindromic motif with consensus sequence TGTTC-N4-GAACA. Computational analyses suggest that recognition of this novel motif is determined primarily by changes in base-contacting residues of the third alpha helix of the LexA helix-turn-helix DNA binding motif. In conjunction with comparative genomics analysis of the LexA regulon in the Verrucomicrobia phylum, electrophoretic shift assays reveal that LexA binds to operators in the promoter region of DNA repair genes and a mutagenesis cassette in this organism, and identify previously unreported components of the SOS response. The identification of tandem LexA-binding sites generating instances of other LexA-binding motifs in the lexA gene promoter of Verrucomicrobia species leads us to postulate a novel mechanism for LexA-binding motif evolution. This model, based on gene duplication, successfully addresses outstanding questions in the intricate co-evolution of the LexA protein, its binding motif and the regulatory network it controls. PMID:27489856
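The consensus quoted above, TGTTC-N4-GAACA, can be turned into a simple exact-match scan. The following is a minimal sketch, not the authors' method: it assumes an uppercase or lowercase DNA string, ignores the mismatches that real binding sites tolerate, and uses a made-up toy sequence. The names `find_consensus_sites` and `reverse_complement` are illustrative only.

```python
import re

# Consensus from the abstract: TGTTC, four arbitrary bases, GAACA (14 bp).
# The lookahead lets overlapping matches be reported as well.
LEXA_CONSENSUS = re.compile(r"(?=(TGTTC[ACGT]{4}GAACA))")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_consensus_sites(seq):
    """Return (0-based start, site) for every exact consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in LEXA_CONSENSUS.finditer(seq)]


# Hypothetical toy sequence, not a real promoter.
toy = "AAATGTTCACGTGAACATTT"
sites = find_consensus_sites(toy)
```

Because the two half-sites are reverse complements of each other, scanning one strand also covers the other strand for the half-sites; only the four-base spacer can differ between strands.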

  16. CytoKavosh: A Cytoscape Plug-In for Finding Network Motifs in Large Biological Networks

    PubMed Central

    Razaghi Moghadam Kashani, Zahra; Salehzadeh-Yazdi, Ali; Khakabimamaghani, Sahand

    2012-01-01

    Network motifs are small connected sub-graphs that have recently attracted much attention as a means of discovering structural behaviors of large and complex networks. Finding motifs of any size is one of the most important problems in complex and large networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm applies some well-known algorithmic features and includes tricky aspects, which make it an efficient algorithm in this field. CytoKavosh is a Cytoscape plug-in that supports finding motifs of a given size in a network (directed or undirected) that has already been loaded into the Cytoscape workspace. The high performance of CytoKavosh is achieved by dynamically linking highly optimized functions of Kavosh's C++ code to the Cytoscape Java program, which makes this plug-in suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation limit on motif size. CytoKavosh is implemented in the visual environment of Cytoscape, which makes it convenient for users to interact with and to create visual options for analyzing the structural behavior of a network. This plug-in can work on any given network, is very simple to use, and generates graphical results of discovered motifs with any required details. No Cytoscape plug-in dedicated to finding network motifs, based on the original concept, previously existed. We therefore introduce CytoKavosh as the first such plug-in, and we hope that it can be improved to cover other options to make it the best motif-analyzing tool. PMID:22952659

  17. The Motif of Meeting in Digital Education

    ERIC Educational Resources Information Center

    Sheail, Philippa

    2015-01-01

    This article draws on theoretical work which considers the composition of meetings, in order to think about the form of the meeting in digital environments for higher education. To explore the motif of meeting, I undertake a "compositional interpretation" (Rose, 2012) of the default interface offered by "Collaborate", an…

  19. Motifs and structural blocks retrieval by GHT

    NASA Astrophysics Data System (ADS)

    Cantoni, Virginio; Ferone, Alessio; Petrosino, Alfredo; Polat, Ozlem

    2014-06-01

    The structure of a protein gives more insight into the protein function than its amino acid sequence. Protein structure analysis and comparison are important for understanding the evolutionary relationships among proteins, predicting protein functions, and predicting protein folding. Proteins are formed by two basic regular 3D structural patterns, called Secondary Structures (SSs): helices and sheets. A structural motif is a compact 3D protein block referring to a small specific combination of secondary structural elements, which appears in a variety of molecules. In this paper we compare a few approaches for motif retrieval based on the Generalized Hough Transform (GHT). A primary technique is to adopt a single SS as the structural primitive; alternatives are to adopt an SS pair as the primitive structural element, or an SS triplet, and so on up to an entire motif. The richer the primitive, the higher the time for pre-analysis and search, and the simpler the inspection process on the parameter space for analyzing the peaks. Performance comparisons, in terms of precision and computation time, are here presented considering the retrieval of motifs composed of three to five SSs for more than 15 million searches. The approach can be easily applied to the retrieval of larger blocks, up to protein domains, or even entire proteins.

  20. DNA motif elucidation using belief propagation.

    PubMed

    Wong, Ka-Chun; Chan, Tak-Ming; Peng, Chengbin; Li, Yue; Zhang, Zhaolei

    2013-09-01

    Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k=8∼10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors' websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM.

  1. Unitary circular code motifs in genomes of eukaryotes.

    PubMed

    El Soufi, Karim; Michel, Christian J

    A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes is an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, allow to generate unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D(+) motifs), repeated trinucleotides (T(+) motifs) and repeated tetranucleotides (T(+) motifs). Thus, the D(+), T(+) and T(+) motifs allow to retrieve, synchronize and maintain a frame modulo 2, modulo 3 and modulo 4, respectively, and their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C(2), C(3) and C(4) properties, respectively) in the DNA sequences. The statistical distribution of the D(+), T(+) and T(+) motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T(+) motifs. The longest D(+), T(+) and T(+) motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T(+) motifs) in the large eukaryotic genomes is observed compared to the D(+) and T(+) motifs. This result has been investigated and may be explained by two outcomes. Repeated trinucleotides (T(+) motifs
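The search for the longest repeated dinucleotide, trinucleotide or tetranucleotide described above can be illustrated with a naive scan. This is only a sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the toy sequences are invented, and units that are themselves repeats of a shorter word (such as AA or ACAC) are skipped on the assumption that a one-word set is a circular code only when that word is not a power of a shorter word.

```python
def is_primitive(unit):
    """True if `unit` is not a repetition of a shorter word (e.g. 'AC' yes, 'AA' no)."""
    return unit not in (unit + unit)[1:-1]


def longest_unit_repeat(seq, k):
    """Find the longest tandem run of one k-letter unit.

    Returns (start, unit, copies); (0, '', 0) if no primitive unit exists.
    Ties are resolved in favour of the leftmost run.
    """
    best = (0, "", 0)
    for start in range(len(seq) - k + 1):
        unit = seq[start:start + k]
        if not is_primitive(unit):
            continue
        copies = 1
        pos = start + k
        # Extend the run while the next k letters repeat the unit.
        while seq[pos:pos + k] == unit:
            copies += 1
            pos += k
        if copies > best[2]:
            best = (start, unit, copies)
    return best


# Invented toy sequences for illustration.
di = longest_unit_repeat("GGACACACACTTAGAGT", 2)
tri = longest_unit_repeat("TTCAGCAGCAGCAGT", 3)
```

The scan is quadratic in the worst case, which is fine for short examples but would need a linear-time pass for whole genomes.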

  2. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data

    PubMed Central

    2014-01-01

    ChIP-Seq (chromatin immunoprecipitation sequencing) has provided an advantage for finding motifs, as ChIP-Seq experiments narrow down the motif finding to binding site locations. Recent motif finding tools facilitate motif detection by providing a user-friendly Web interface. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended that users use multiple motif finding Web tools that implement different algorithms for obtaining significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided our suggestions for future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. Reviewers: This article was reviewed by Prof. Sandor Pongor, Dr. Yuriy Gusev, and Dr. Shyam Prabhakar (nominated by Prof. Limsoon Wong). PMID:24555784

  3. Discovering interacting domains and motifs in protein-protein interactions.

    PubMed

    Hugo, Willy; Sung, Wing-Kin; Ng, See-Kiong

    2013-01-01

    Many important biological processes, such as the signaling pathways, require protein-protein interactions (PPIs) that are designed for fast response to stimuli. These interactions are usually transient, easily formed, and disrupted, yet specific. Many of these transient interactions involve the binding of a protein domain to a short stretch (3-10) of amino acid residues, which can be characterized by a sequence pattern, i.e., a short linear motif (SLiM). We call these interacting domains and motifs domain-SLiM interactions. Existing methods have focused on discovering SLiMs in the interacting proteins' sequence data. With the recent increase in protein structures, we have a new opportunity to detect SLiMs directly from the proteins' 3D structures instead of their linear sequences. In this chapter, we describe a computational method called SLiMDIet to directly detect SLiMs on domain interfaces extracted from 3D structures of PPIs. SLiMDIet comprises two steps: (1) interaction interfaces belonging to the same domain are extracted and grouped together using structural clustering and (2) the extracted interaction interfaces in each cluster are structurally aligned to extract the corresponding SLiM. Using SLiMDIet, de novo SLiMs interacting with protein domains can be computationally detected from structurally clustered domain-SLiM interactions for PFAM domains which have available 3D structures in the PDB database.

  4. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks is becoming a current major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphic detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
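To make "counting non-induced occurrences of tree-shaped subgraphs by combinatorial techniques" concrete, the simplest case is a star: a centre of degree d contributes C(d, m) stars with m leaves, so no subgraph enumeration is needed. The sketch below illustrates that one idea only; it is not the CombiMotif algorithm, the function name is hypothetical, and it assumes an undirected simple graph (no self-loops or duplicate edges) with at least two leaves per star.

```python
from math import comb


def count_noninduced_stars(edges, leaves):
    """Count non-induced stars with `leaves` leaves (leaves >= 2).

    Each vertex of degree d is the centre of comb(d, leaves) stars,
    regardless of whether the chosen leaves are adjacent to each other.
    """
    if leaves < 2:
        raise ValueError("leaves must be at least 2")
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(comb(d, leaves) for d in degree.values())


# Toy graph: a triangle a-b-c with a pendant vertex d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
paths3 = count_noninduced_stars(toy_edges, 2)  # 3-node paths
stars4 = count_noninduced_stars(toy_edges, 3)  # 4-node stars
```

With two leaves the star is the 3-node path, so the count includes the three paths that lie inside the triangle; an induced count would exclude them.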

  5. A Discriminative Approach for Unsupervised Clustering of DNA Sequence Motifs

    PubMed Central

    Stegmaier, Philip; Kel, Alexander; Wingender, Edgar; Borlak, Jürgen

    2013-01-01

    Algorithmic comparison of DNA sequence motifs is a problem in bioinformatics that has received increased attention during the last years. Its main applications concern characterization of potentially novel motifs and clustering of a motif collection in order to remove redundancy. Despite growing interest in motif clustering, the question which motif clusters to aim at has so far not been systematically addressed. Here we analyzed motif similarities in a comprehensive set of vertebrate transcription factor classes. For this we developed enhanced similarity scores by inclusion of the information coverage (IC) criterion, which evaluates the fraction of information an alignment covers in aligned motifs. A network-based method enabled us to identify motif clusters with high correspondence to DNA-binding domain phylogenies and prior experimental findings. Based on this analysis we derived a set of motif families representing distinct binding specificities. These motif families were used to train a classifier which was further integrated into a novel algorithm for unsupervised motif clustering. Application of the new algorithm demonstrated its superiority to previously published methods and its ability to reproduce entrained motif families. As a result, our work proposes a probabilistic approach to decide whether two motifs represent common or distinct binding specificities. PMID:23555204

  6. Regulatory motif finding by logic regression.

    PubMed

    Keles, Sündüz; van der Laan, Mark J; Vulpe, Chris

    2004-11-01

    Multiple transcription factors coordinately control transcriptional regulation of genes in eukaryotes. Although many computational methods consider the identification of individual transcription factor binding sites (TFBSs), very few focus on the interactions between these sites. We consider finding TFBSs and their context specific interactions using microarray gene expression data. We devise a hybrid approach called LogicMotif composed of a TFBS identification method combined with the new regression methodology logic regression. LogicMotif has two steps: First, potential binding sites are identified from transcription control regions of genes of interest. Various available methods can be used in this step when the genes of interest can be divided into groups such as up- and downregulated. For this step, we also develop a simple univariate regression and extension method MFURE to extract candidate TFBSs from a large number of genes in the availability of microarray gene expression data. MFURE provides an alternative method for this step when partitioning of the genes into disjoint groups is not preferred. This first step aims to identify individual sites within gene groups of interest or sites that are correlated with the gene expression outcome. In the second step, logic regression is used to build a predictive model of outcome of interest (either gene expression or up- and down-regulation) using these potential sites. This 2-fold approach creates a rich diverse set of potential binding sites in the first step and builds regression or classification models in the second step using logic regression that is particularly good at identifying complex interactions. LogicMotif is applied to two publicly available datasets. A genome-wide gene expression data set of Saccharomyces cerevisiae is used for validation. The regression models obtained are interpretable and the biological implications are in agreement with the known results. This analysis suggests that LogicMotif

  7. A Review of Functional Motifs Utilized by Viruses

    PubMed Central

    Sobhy, Haitham

    2016-01-01

    Short linear motifs (SLiM) are short peptides that facilitate protein function and protein-protein interactions. Viruses utilize these motifs to enter into the host, interact with cellular proteins, or egress from host cells. Studying functional motifs may help to predict protein characteristics, interactions, or the putative cellular role of a protein. In virology, it may reveal aspects of the virus tropism and help find antiviral therapeutics. This review highlights the recent understanding of functional motifs utilized by viruses. Special attention was paid to the function of proteins harboring these motifs, and viruses encoding these proteins. The review highlights motifs involved in (i) immune response and post-translational modifications (e.g., ubiquitylation, SUMOylation or ISGylation); (ii) virus-host cell interactions, including virus attachment, entry, fusion, egress and nuclear trafficking; (iii) virulence and antiviral activities; (iv) virion structure; and (v) low-complexity regions (LCRs) or motifs enriched with residues (Xaa-rich motifs). PMID:28248213

  8. The Assembly Motif of a Bacterial Small Multidrug Resistance Protein

    PubMed Central

    Poulsen, Bradley E.; Rath, Arianna; Deber, Charles M.

    2009-01-01

    Multidrug transporters such as the small multidrug resistance (SMR) family of bacterial integral membrane proteins are capable of conferring clinically significant resistance to a variety of common therapeutics. As antiporter proteins of ∼100 amino acids, SMRs must self-assemble into homo-oligomeric structures for efflux of drug molecules. Oligomerization centered at transmembrane helix four (TM4) has been implicated in SMR assembly, but the full complement of residues required to mediate its self-interaction remains to be characterized. Here, we use Hsmr, the 110-residue SMR family member of the archaebacterium Halobacterium salinarum, to determine the TM4 residue motif required to mediate drug resistance and SMR self-association. Twelve single point mutants that scan the central portion of the TM4 helix (residues 85–104) were constructed and were tested for their ability to confer resistance to the cytotoxic compound ethidium bromide. Six residues were found to be individually essential for drug resistance activity (Gly90, Leu91, Leu93, Ile94, Gly97, and Val98), defining a minimum activity motif of 90GLXLIXXGV98 within TM4. When the propensity of these mutants to dimerize on SDS-PAGE was examined, replacements of all but Ile resulted in ∼2-fold reduction of dimerization versus the wild-type antiporter. Our work defines a minimum activity motif of 90GLXLIXXGV98 within TM4 and suggests that this sequence mediates TM4-based SMR dimerization along a single helix surface, stabilized by a small residue heptad repeat sequence. These TM4-TM4 interactions likely constitute the highest affinity locus for disruption of SMR function by directly targeting its self-assembly mechanism. PMID:19224913

  9. Sequential motif profile of natural visibility graphs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-11-01

    The concept of sequential visibility graph motifs—subgraphs appearing with characteristic frequencies in the visibility graphs associated to time series—has been advanced recently along with a theoretical framework to compute analytically the motif profiles associated to horizontal visibility graphs (HVGs). Here we develop a theory to compute the profile of sequential visibility graph motifs in the context of natural visibility graphs (VGs). This theory gives exact results for deterministic aperiodic processes with a smooth invariant density or stochastic processes that fulfill the Markov property and have a continuous marginal distribution. The framework also allows for a linear time numerical estimation in the case of empirical time series. A comparison between the HVG and the VG case (including evaluation of their robustness for short series polluted with measurement noise) is also presented.
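For readers unfamiliar with the two graph constructions compared above, the sketch below builds the edge lists of a natural visibility graph (VG) and a horizontal visibility graph (HVG) from a short series, using the standard visibility criteria. It does not compute motif profiles and is not the authors' code; the function names are hypothetical, the series is invented, and samples are assumed to be equally spaced at integer times.

```python
def natural_visibility_edges(series):
    """Edges (i, j), i < j, where the straight line between the two samples
    passes strictly above every intermediate sample."""
    edges = []
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            ):
                edges.append((i, j))
    return edges


def horizontal_visibility_edges(series):
    """Edges (i, j), i < j, where every intermediate sample is strictly lower
    than both end samples."""
    edges = []
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if all(series[k] < min(series[i], series[j]) for k in range(i + 1, j)):
                edges.append((i, j))
    return edges


# Invented toy series.
toy_series = [3, 1, 2, 4]
vg = natural_visibility_edges(toy_series)
hvg = horizontal_visibility_edges(toy_series)
```

Every HVG edge is also a VG edge, so the HVG is always a subgraph of the VG built from the same series; this brute-force construction is cubic in the series length and is meant only for small examples.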

  10. Chiral Alkyl Halides: Underexplored Motifs in Medicine

    PubMed Central

    Gál, Bálint; Bucher, Cyril; Burns, Noah Z.

    2016-01-01

    While alkyl halides are valuable intermediates in synthetic organic chemistry, their use as bioactive motifs in drug discovery and medicinal chemistry is rare in comparison. This is likely attributable to the common misconception that these compounds are merely non-specific alkylators in biological systems. A number of chlorinated compounds in the pharmaceutical and food industries, as well as a growing number of halogenated marine natural products showing unique bioactivity, illustrate the role that chiral alkyl halides can play in drug discovery. Through a series of case studies, we demonstrate in this review that these motifs can indeed be stable under physiological conditions, and that halogenation can enhance bioactivity through both steric and electronic effects. Our hope is that, by placing such compounds in the minds of the chemical community, they may gain more traction in drug discovery and inspire more synthetic chemists to develop methods for selective halogenation. PMID:27827902

  11. On the Kernelization Complexity of Colorful Motifs

    NASA Astrophysics Data System (ADS)

    Ambalath, Abhimanyu M.; Balasundaram, Radheshyam; Rao H., Chintan; Koppula, Venkata; Misra, Neeldhara; Philip, Geevarghese; Ramanujan, M. S.

    The Colorful Motif problem asks if, given a vertex-colored graph G, there exists a subset S of vertices of G such that the graph induced by G on S is connected and contains every color in the graph exactly once. The problem is motivated by applications in computational biology and is also well-studied from the theoretical point of view. In particular, it is known to be NP-complete even on trees of maximum degree three [Fellows et al, ICALP 2007]. In their pioneering paper that introduced the color-coding technique, Alon et al. [STOC 1995] show, inter alia, that the problem is FPT on general graphs. More recently, Cygan et al. [WG 2010] showed that Colorful Motif is NP-complete on comb graphs, a special subclass of the set of trees of maximum degree three. They also showed that the problem is not likely to admit polynomial kernels on forests.
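
    The problem statement above can be made concrete with a brute-force check. This is only an illustrative sketch with exponential running time, not the kernelization or color-coding techniques discussed in the paper; the graph encoding (an adjacency dictionary plus a color dictionary) and the function names are assumptions.

```python
from itertools import combinations

def is_connected(vertices, adj):
    """Depth-first search restricted to the chosen vertex subset."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def colorful_motif(adj, color):
    """Return a connected vertex set that contains every color exactly
    once, or None if no such set exists (exhaustive search)."""
    if not adj:
        return None
    colors = set(color.values())
    # A subset of size len(colors) that covers all colors necessarily
    # uses each color exactly once.
    for subset in combinations(adj, len(colors)):
        if {color[v] for v in subset} == colors and is_connected(subset, adj):
            return set(subset)
    return None
```

    For a path a-b-c colored red, red, blue, the call returns the set containing b and c, since a and c are not adjacent.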

  12. Anticipated synchronization in neuronal network motifs

    NASA Astrophysics Data System (ADS)

    Matias, F. S.; Gollo, L. L.; Carelli, P. V.; Copelli, M.; Mirasso, C. R.

    2013-01-01

    Two identical dynamical systems coupled unidirectionally (in a so-called master-slave configuration) exhibit anticipated synchronization (AS) if the one which receives the coupling (the slave) also receives a negative delayed self-feedback. In oscillatory neuronal systems AS is characterized by a phase-locking with negative time delay τ between the spikes of the master and of the slave (slave fires before the master), while in the usual delayed synchronization (DS) regime τ is positive (slave fires after the master). A 3-neuron motif in which the slave self-feedback is replaced by a feedback loop mediated by an interneuron can exhibit both AS and DS regimes. Here we show that AS is robust in the presence of noise in a 3 Hodgkin-Huxley type neuronal motif. We also show that AS is stable for large values of τ in a chain of connected slaves-interneurons.

  13. Functional Motifs in Biochemical Reaction Networks

    PubMed Central

    Tyson, John J.; Novák, Béla

    2013-01-01

    The signal-response characteristics of a living cell are determined by complex networks of interacting genes, proteins, and metabolites. Understanding how cells respond to specific challenges, how these responses are contravened in diseased cells, and how to intervene pharmacologically in the decision-making processes of cells requires an accurate theory of the information-processing capabilities of macromolecular regulatory networks. Adopting an engineer’s approach to control systems, we ask whether realistic cellular control networks can be decomposed into simple regulatory motifs that carry out specific functions in a cell. We show that such functional motifs exist and review the experimental evidence that they control cellular responses as expected. PMID:20055671

  14. A Basic Set of Homeostatic Controller Motifs

    PubMed Central

    Drengstig, T.; Jolma, I.W.; Ni, X.Y.; Thorsen, K.; Xu, X.M.; Ruoff, P.

    2012-01-01

    Adaptation and homeostasis are essential properties of all living systems. However, our knowledge about the reaction kinetic mechanisms leading to robust homeostatic behavior in the presence of environmental perturbations is still poor. Here, we describe, and provide physiological examples of, a set of two-component controller motifs that show robust homeostasis. This basic set of controller motifs, which can be considered as complete, divides into two operational work modes, termed as inflow and outflow control. We show how controller combinations within a cell can integrate uptake and metabolization of a homeostatic controlled species and how pathways can be activated and lead to the formation of alternative products, as observed, for example, in the change of fermentation products by microorganisms when the supply of the carbon source is altered. The antagonistic character of hormonal control systems can be understood by a combination of inflow and outflow controllers. PMID:23199928

  15. The TCT motif, a key component of an RNA polymerase II transcription system for the translational machinery

    PubMed Central

    Parry, Trevor J.; Theisen, Joshua W.M.; Hsu, Jer-Yuan; Wang, Yuan-Liang; Corcoran, David L.; Eustice, Moriah; Ohler, Uwe; Kadonaga, James T.

    2010-01-01

    The TCT motif (polypyrimidine initiator) encompasses the transcription start site of nearly all ribosomal protein genes in Drosophila and mammals. The TCT motif is required for transcription of ribosomal protein gene promoters. The TCT element resembles the Inr (initiator), but is not recognized by TFIID and cannot function in lieu of an Inr. However, a single T-to-A substitution converts the TCT element into a functionally active Inr. Thus, the TCT motif is a novel transcriptional element that is distinct from the Inr. These findings reveal a specialized TCT-based transcription system that is directed toward the synthesis of ribosomal proteins. PMID:20801935

  16. Analyzing network reliability using structural motifs.

    PubMed

    Khorramzadeh, Yasamin; Youssef, Mina; Eubank, Stephen; Mowlaei, Shahir

    2015-04-01

    This paper uses the reliability polynomial, introduced by Moore and Shannon in 1956, to analyze the effect of network structure on diffusive dynamics such as the spread of infectious disease. We exhibit a representation for the reliability polynomial in terms of what we call structural motifs that is well suited for reasoning about the effect of a network's structural properties on diffusion across the network. We illustrate by deriving several general results relating graph structure to dynamical phenomena.

  17. Motif mining based on network space compression.

    PubMed

    Zhang, Qiang; Xu, Yuan

    2015-01-01

    A network motif is a recurring subnetwork within a network, and it takes on certain functions in practical biological macromolecule applications. Previous algorithms have focused on the computational efficiency of network motif detection, but some problems in storage space and searching time manifested during earlier studies. The considerable computational and spatial complexity also presents a significant challenge. In this paper, we provide a new approach for motif mining based on compressing the searching space. According to the characteristic of the parity nodes, we cut down the searching space and storage space in real graphs and random graphs, thereby reducing the computational cost of verifying the isomorphism of sub-graphs. We obtain a new network with smaller size after removing parity nodes and the "repeated edges" connected with the parity nodes. Random graph structure and sub-graph searching are based on the Back Tracking Method; all sub-graphs can be searched for by adding edges progressively. Experimental results show that this algorithm has higher speed and better stability than its alternatives.

  18. Dynamic motifs in socio-economic networks

    NASA Astrophysics Data System (ADS)

    Zhang, Xin; Shao, Shuai; Stanley, H. Eugene; Havlin, Shlomo

    2014-12-01

    Socio-economic networks are of central importance in economic life. We develop a method of identifying and studying motifs in socio-economic networks by focusing on “dynamic motifs,” i.e., evolutionary connection patterns that, because of “node acquaintances” in the network, occur much more frequently than random patterns. We examine two evolving bipartite networks: i) the world-wide commercial ship chartering market and ii) the ship build-to-order market. We find similar dynamic motifs in both bipartite networks, even though they describe different economic activities. We also find that “influence” and “persistence” are strong factors in the interaction behavior of organizations. When two companies are doing business with the same customer, it is highly probable that another customer who currently has a business relationship with only one of these two companies will become a customer of the second in the future. This is the effect of influence. Persistence means that companies with close business ties to customers tend to maintain their relationships over a long period of time.

  19. Absolute Phosphorylation Stoichiometry Analysis by Motif-Targeting Quantitative Mass Spectrometry.

    PubMed

    Tsai, Chia-Feng; Ku, Wei-Chi; Chen, Yu-Ju; Ishihama, Yasushi

    2017-01-01

    Direct measurement of site-specific phosphorylation stoichiometry can unambiguously distinguish whether the degree of phosphorylation is regulated by upstream kinase/phosphatase activity or by transcriptional regulation to alter protein expression level. Here, we describe a motif-targeting quantitative proteomic approach that integrates dephosphorylation, isotope tag labeling, and enzymatic kinase reaction for large-scale phosphorylation stoichiometry measurement of the human proteome.

  20. Identifying promoter features of co-regulated genes with similar network motifs.

    PubMed

    Harari, Oscar; del Val, Coral; Romero-Zaliz, Rocío; Shin, Dongwoo; Huang, Henry; Groisman, Eduardo A; Zwir, Igor

    2009-04-29

    A large amount of computational and experimental work has been devoted to uncovering network motifs in gene regulatory networks. The leading hypothesis is that evolutionary processes independently selected recurrent architectural relationships among regulators and target genes (motifs) to produce characteristic expression patterns of its members. However, even with the same architecture, the genes may still be differentially expressed. Therefore, to define fully the expression of a group of genes, the strength of the connections in a network motif must be specified, and the cis-promoter features that participate in the regulation must be determined. We have developed a model-based approach to analyze proteobacterial genomes for promoter features that is specifically designed to account for the variability in sequence, location and topology intrinsic to differential gene expression. We provide methods for annotating regulatory regions by detecting their subjacent cis-features. This includes identifying binding sites for a transcriptional regulator, distinguishing between activation and repression sites, direct and reverse orientation, and among sequences that weakly reflect a particular pattern; binding sites for the RNA polymerase, characterizing different classes, and locations relative to the transcription factor binding sites; the presence of riboswitches in the 5'UTR, and for other transcription factors. We applied our approach to characterize network motifs controlled by the PhoP/PhoQ regulatory system of Escherichia coli and Salmonella enterica serovar Typhimurium. We identified key features that enable the PhoP protein to control its target genes, and distinct features may produce different expression patterns even within the same network motif. Global transcriptional regulators control multiple promoters by a variety of network motifs. This is clearly the case for the regulatory protein PhoP. In this work, we studied this regulatory protein and demonstrated

  1. Quantitative Proteomics Targeting Classes of Motif-containing Peptides Using Immunoaffinity-based Mass Spectrometry

    PubMed Central

    Olsson, Niclas; James, Peter; Borrebaeck, Carl A. K.; Wingren, Christer

    2012-01-01

    The development of high-performance technology platforms for generating detailed protein expression profiles, or protein atlases, is essential. Recently, we presented a novel platform that we termed global proteome survey, where we combined the best features of affinity proteomics and mass spectrometry, to probe any proteome in a species-independent manner while still using a limited set of antibodies. We used so-called context-independent-motif-specific antibodies, directed against short amino acid motifs. This enabled enrichment of motif-containing peptides from a digested proteome, which then were detected and identified by mass spectrometry. In this study, we have demonstrated the quantitative capability, reproducibility, sensitivity, and coverage of the global proteome survey technology by targeting stable isotope labeling with amino acids in cell culture-labeled yeast cultures cultivated in glucose or ethanol. The data showed that a wide range of motif-containing peptides (proteins) could be detected, identified, and quantified in a highly reproducible manner. On average, each of six different motif-specific antibodies was found to target about 75 different motif-containing proteins. Furthermore, peptides originating from proteins spanning in abundance from over a million down to less than 50 copies per cell, could be targeted. It is worth noting that a significant set of peptides previously not reported in the PeptideAtlas database was among the profiled targets. The quantitative data corroborated well with the corresponding data generated after conventional strong cation exchange fractionation of the same samples. Finally, several differentially expressed proteins, with both known and unknown functions, many relevant for the central carbon metabolism, could be detected in the glucose- versus ethanol-cultivated yeast. Taken together, the study demonstrated the potential of our immunoaffinity-based mass spectrometry platform for reproducible quantitative

  2. Synchronization patterns: from network motifs to hierarchical networks

    NASA Astrophysics Data System (ADS)

    Krishnagopal, Sanjukta; Lehnert, Judith; Poel, Winnie; Zakharova, Anna; Schöll, Eckehard

    2017-03-01

    We investigate complex synchronization patterns such as cluster synchronization and partial amplitude death in networks of coupled Stuart-Landau oscillators with fractal connectivities. The study of fractal or self-similar topology is motivated by the network of neurons in the brain. This fractal property is well represented in hierarchical networks, for which we present three different models. In addition, we introduce an analytical eigensolution method and provide a comprehensive picture of the interplay of network topology and the corresponding network dynamics, thus allowing us to predict the dynamics of arbitrarily large hierarchical networks simply by analysing small network motifs. We also show that oscillation death can be induced in these networks, even if the coupling is symmetric, contrary to previous understanding of oscillation death. Our results show that there is a direct correlation between topology and dynamics: hierarchical networks exhibit the corresponding hierarchical dynamics. This helps bridge the gap between mesoscale motifs and macroscopic networks. This article is part of the themed issue 'Horizons of cybernetical physics'.

  3. Synchronization patterns: from network motifs to hierarchical networks.

    PubMed

    Krishnagopal, Sanjukta; Lehnert, Judith; Poel, Winnie; Zakharova, Anna; Schöll, Eckehard

    2017-03-06

    We investigate complex synchronization patterns such as cluster synchronization and partial amplitude death in networks of coupled Stuart-Landau oscillators with fractal connectivities. The study of fractal or self-similar topology is motivated by the network of neurons in the brain. This fractal property is well represented in hierarchical networks, for which we present three different models. In addition, we introduce an analytical eigensolution method and provide a comprehensive picture of the interplay of network topology and the corresponding network dynamics, thus allowing us to predict the dynamics of arbitrarily large hierarchical networks simply by analysing small network motifs. We also show that oscillation death can be induced in these networks, even if the coupling is symmetric, contrary to previous understanding of oscillation death. Our results show that there is a direct correlation between topology and dynamics: hierarchical networks exhibit the corresponding hierarchical dynamics. This helps bridge the gap between mesoscale motifs and macroscopic networks. This article is part of the themed issue 'Horizons of cybernetical physics'.

  4. An E-box/M-CAT hybrid motif and cognate binding protein(s) regulate the basal muscle-specific and cAMP-inducible expression of the rat cardiac alpha-myosin heavy chain gene.

    PubMed

    Gupta, M P; Gupta, M; Zak, R

    1994-11-25

    Expression of the cardiac myosin heavy chain (MHC) genes is regulated developmentally and by numerous epigenetic factors. Here we report the identification of a cis-regulatory element and cognate nuclear binding protein(s) responsible for cAMP-induced expression of the rat cardiac alpha-MHC gene. By Northern blot analysis, we found that, in primary cultures of fetal rat heart myocytes, the elevation of intracellular levels of cAMP results in up-regulation of alpha-MHC and down-regulation of beta-MHC mRNA expression. This effect of cAMP was dependent upon the basal level of expression of both MHC transcripts and was sensitive to cycloheximide. In transient expression analysis employing a series of alpha-MHC/CAT constructs, we identified a 31-base pair fragment located in the immediate upstream region (-71 to -40), which confers both muscle-specific and cAMP-inducible expression of the gene. Within this 31-base pair fragment there are two regions, an AT-rich portion and a hybrid motif which contains overlapping sequences of E-box and M-CAT binding sites (GGCACGTGGAATG). By substitution mutation analysis, both elements were found important for the basal muscle-specific expression; however, the cAMP-inducible expression of the gene is conferred only by the E-box/M-CAT hybrid motif (EM element). Using mobility gel shift competition assay, immunoblotting, and UV-cross-linking analyses, we found that a protein binding to the EM element is indistinguishable from the transcription enhancer factor-1 (TEF-1) in terms of sequence recognition, molecular mass, and immunoreactivity. Methylation interference and point mutation analyses indicate that, besides M-CAT sequences, center CG dinucleotides of the E-box motif CACGTG are essential for protein binding to the EM element and for its functional activity. Furthermore, our data also show that, in addition to TEF-1, another HF-1a-related factor may be recognized by the alpha-MHC gene EM element. These results are first to

  5. Occurrence probability of structured motifs in random sequences.

    PubMed

    Robin, S; Daudin, J-J; Richard, H; Sagot, M-F; Schbath, S

    2002-01-01

    The problem of extracting from a set of nucleic acid sequences motifs which may have biological function is more and more important. In this paper, we are interested in particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allowing for substitutions. In order to assess their statistical significance, we propose approximations of the probability of occurrences of such a structured motif in a given sequence. An application of our method to evaluate candidate promoters in E. coli and B. subtilis is presented. Simulations show the goodness of the approximations.

  6. Discovery of a Regulatory Motif for Human Satellite DNA Transcription in Response to BATF2 Overexpression.

    PubMed

    Bai, Xuejia; Huang, Wenqiu; Zhang, Chenguang; Niu, Jing; Ding, Wei

    2016-03-01

    One of the basic leucine zipper transcription factors, BATF2, has been found to suppress cancer growth and migration. However, little is known about the genes downstream of BATF2. HeLa cells were stably transfected with BATF2, then chromatin immunoprecipitation-sequencing was employed to identify the DNA motifs responsive to BATF2. Comprehensive bioinformatics analyses indicated that the most significant motif discovered as TTCCATT[CT]GATTCCATTC[AG]AT was primarily distributed among the chromosome centromere regions and mostly within human type II satellite DNA. Such motifs were able to prime the transcription of type II satellite DNA in a directional and asymmetrical manner. Consistently, satellite II transcription was up-regulated in BATF2-overexpressing cells. The present study provides insight into understanding the role of BATF2 in tumours and the importance of satellite DNA in the maintenance of genomic stability.
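
    The bracketed motif reported above is already valid regular-expression syntax, so locating its occurrences can be sketched in a few lines. This is an illustrative scan, not the authors' pipeline; the function names and the choice to report both strands are assumptions.

```python
import re

# Motif exactly as written in the abstract; [CT] and [AG] are
# positions where either base is accepted.
MOTIF = re.compile(r"TTCCATT[CT]GATTCCATTC[AG]AT")

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan(seq):
    """Return (start, strand) pairs for non-overlapping motif hits,
    with start given in forward-strand coordinates."""
    seq = seq.upper()
    hits = [(m.start(), "+") for m in MOTIF.finditer(seq)]
    # A hit at [s, e) on the reverse complement covers
    # [len(seq) - e, len(seq) - s) on the forward strand.
    hits += [(len(seq) - m.end(), "-") for m in MOTIF.finditer(revcomp(seq))]
    return sorted(hits)
```

    Keeping the strand in the output matters here because the abstract describes transcription from the motif as directional.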

  7. DNA motifs determining the efficiency of adaptation into the Escherichia coli CRISPR array.

    PubMed

    Yosef, Ido; Shitrit, Dror; Goren, Moran G; Burstein, David; Pupko, Tal; Qimron, Udi

    2013-08-27

    Clustered regularly interspaced short palindromic repeats (CRISPR) and their associated proteins constitute a recently identified prokaryotic defense system against invading nucleic acids. DNA segments, termed protospacers, are integrated into the CRISPR array in a process called adaptation. Here, we establish a PCR-based assay that enables evaluating the adaptation efficiency of specific spacers into the type I-E Escherichia coli CRISPR array. Using this assay, we provide direct evidence that the protospacer adjacent motif along with the first base of the protospacer (5'-AAG) partially affect the efficiency of spacer acquisition. Remarkably, we identified a unique dinucleotide, 5'-AA, positioned at the 3' end of the spacer, that enhances efficiency of the spacer's acquisition. Insertion of this dinucleotide increased acquisition efficiency of two different spacers. DNA sequencing of newly adapted CRISPR arrays revealed that the position of the newly identified motif with respect to the 5'-AAG is important for affecting acquisition efficiency. Analysis of approximately 1 million spacers showed that this motif is overrepresented in frequently acquired spacers compared with those acquired rarely. Our results represent an example of a short nonprotospacer adjacent motif sequence that affects acquisition efficiency and suggest that other as yet unknown motifs affect acquisition efficiency in other CRISPR systems as well.

  8. Alanine substitutions of noncysteine residues in the cysteine-stabilized αβ motif

    PubMed Central

    Yang, Ying-Fang; Cheng, Kuo-Chang; Tsai, Ping-Hsing; Liu, Chung-Cheng; Lee, Tian-Ren; Lyu, Ping-Chiang

    2009-01-01

    The protein scaffold is a peptide framework with a high tolerance of residue modifications. The cysteine-stabilized αβ motif (CSαβ) consists of an α-helix and an antiparallel triple-stranded β-sheet connected by two disulfide bridges. Proteins containing this motif share low sequence identity but high structural similarity, and the motif has been suggested as a good scaffold for protein engineering. The Vigna radiata defensin 1 (VrD1), a plant defensin, serves here as a model protein to probe the amino acid tolerance of the CSαβ motif. A systematic alanine substitution is performed on VrD1. The key residues governing the inhibitory function and structure stability are monitored. Thirty-two of 46 residue positions of VrD1 are altered by site-directed mutagenesis techniques. The circular dichroism spectrum, intrinsic fluorescence spectrum, and chemical denaturation are used to analyze the conformation and structural stability of proteins. The secondary structures were highly tolerant to the amino acid substitutions; however, the protein stabilities varied among the mutants. Many mutants, although they maintained their conformations, altered their inhibitory function significantly. In this study, we report the first alanine scan of a plant defensin containing the CSαβ motif. This information is valuable for using the CSαβ motif as a scaffold and for protein engineering. PMID:19533758

  9. Mechano-chemical selections of two competitive unfolding pathways of a single DNA i-motif

    NASA Astrophysics Data System (ADS)

    Xu, Yue; Chen, Hu; Qu, Yu-Jie; Efremov, Artem K.; Li, Ming; Ouyang, Zhong-Can; Liu, Dong-Sheng; Yan, Jie

    2014-06-01

    The DNA i-motif is a quadruplex structure formed in tandem cytosine-rich sequences in slightly acidic conditions. Besides being considered as a building block of DNA nano-devices, it may also play potential roles in regulating chromosome stability and gene transcriptions. The stability of i-motif is crucial for these functions. In this work, we investigated the mechanical stability of a single i-motif formed in the human telomeric sequence 5'-(CCCTAA)3CCC, which revealed a novel pH and loading rate-dependent bimodal unfolding force distribution. Although the cause of the bimodal unfolding force species is not clear, we proposed a phenomenological model involving a direct unfolding favored at lower loading rate or higher pH value, which is subject to competition with another unfolding pathway through a mechanically stable intermediate state whose nature is yet to be determined. Overall, the unique mechano-chemical responses of the i-motif provide a new perspective on its stability, which may be useful to guide the design of new i-motif-based DNA mechanical nano-devices.

  10. A novel RNA motif mediates the strict nuclear localization of a long noncoding RNA.

    PubMed

    Zhang, Bing; Gunawardane, Lalith; Niazi, Farshad; Jahanbani, Fereshteh; Chen, Xin; Valadkhan, Saba

    2014-06-01

    The ubiquitous presence of long noncoding RNAs (lncRNAs) in eukaryotes points to the importance of understanding how their sequences impact function. As many lncRNAs regulate nuclear events and thus must localize to nuclei, we analyzed the sequence requirements for nuclear localization in an intergenic lncRNA named BORG (BMP2-OP1-responsive gene), which is both spliced and polyadenylated but is strictly localized in nuclei. Subcellular localization of BORG was not dependent on the context or level of its expression or decay but rather depended on the sequence of the mature, spliced transcript. Mutational analyses indicated that nuclear localization of BORG was mediated through a novel RNA motif consisting of the pentamer sequence AGCCC with sequence restrictions at positions -8 (T or A) and -3 (G or C) relative to the first nucleotide of the pentamer. Mutation of the motif to a scrambled sequence resulted in complete loss of nuclear localization, while addition of even a single copy of the motif to a cytoplasmically localized RNA was sufficient to impart nuclear localization. Further, the presence of this motif in other cellular RNAs showed a direct correlation with nuclear localization, suggesting that the motif may act as a general nuclear localization signal for cellular RNAs.

  11. A Novel RNA Motif Mediates the Strict Nuclear Localization of a Long Noncoding RNA

    PubMed Central

    Zhang, Bing; Gunawardane, Lalith; Niazi, Farshad; Jahanbani, Fereshteh; Chen, Xin

    2014-01-01

    The ubiquitous presence of long noncoding RNAs (lncRNAs) in eukaryotes points to the importance of understanding how their sequences impact function. As many lncRNAs regulate nuclear events and thus must localize to nuclei, we analyzed the sequence requirements for nuclear localization in an intergenic lncRNA named BORG (BMP2-OP1-responsive gene), which is both spliced and polyadenylated but is strictly localized in nuclei. Subcellular localization of BORG was not dependent on the context or level of its expression or decay but rather depended on the sequence of the mature, spliced transcript. Mutational analyses indicated that nuclear localization of BORG was mediated through a novel RNA motif consisting of the pentamer sequence AGCCC with sequence restrictions at positions −8 (T or A) and −3 (G or C) relative to the first nucleotide of the pentamer. Mutation of the motif to a scrambled sequence resulted in complete loss of nuclear localization, while addition of even a single copy of the motif to a cytoplasmically localized RNA was sufficient to impart nuclear localization. Further, the presence of this motif in other cellular RNAs showed a direct correlation with nuclear localization, suggesting that the motif may act as a general nuclear localization signal for cellular RNAs. PMID:24732794
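
    The positional constraints described above can be written as a single pattern. The sketch below assumes that the first nucleotide of the pentamer is position +1 and that there is no position 0, so that -3 and -8 lie three and eight nucleotides upstream of the pentamer; that numbering convention, and the function name, are assumptions rather than details stated in the abstract.

```python
import re

# Layout under the assumed numbering:
#   -8: T or A | -7..-4: any | -3: G or C | -2..-1: any | AGCCC
# The lookahead lets overlapping occurrences be reported as well.
PATTERN = re.compile(r"(?=([TA][ACGT]{4}[GC][ACGT]{2}AGCCC))")

def find_localization_motif(seq):
    """Return 0-based start indices of the AGCCC pentamer for every
    occurrence that satisfies both upstream constraints."""
    dna = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    return [m.start() + 8 for m in PATTERN.finditer(dna)]
```

    A transcript such as TAAAAGAAAGCCC (made up for illustration) satisfies both constraints, whereas changing the G at -3 to A removes the match.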

  12. CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design

    PubMed Central

    Chen, Yong

    2016-01-01

    A set of conserved binding sites recognized by a transcription factor is called a motif, which can be found by many applications of comparative genomics for identifying over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph, where these motifs are connected to one another. However, an efficient clustering algorithm is desired for clustering the motifs that belong to the same groups, separating the motifs that belong to different groups, and even deleting spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed by using maximal cliques and sped up by parallelizing its program. When a synthetic motif dataset from the database JASPAR, a set of putative motifs from a phylogenetic foot-printing dataset, and a set of putative motifs from a ChIP dataset are used to compare the performances of CLIMP and two other high-performance algorithms, the results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so that it can be a useful complement of the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html. PMID:27487245
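
    The idea of grouping motifs through maximal cliques of a similarity graph can be sketched with the basic Bron-Kerbosch recursion. This is not the CLIMP implementation, which is parallelized and described only at a high level here; the adjacency-dictionary encoding and the function name are assumptions.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph given as
    {node: set_of_neighbors}, using basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend this clique
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques
```

    In a motif setting, nodes would be putative motifs and an edge would join two motifs whose similarity exceeds a chosen threshold; a motif that ends up only in a singleton clique is one possible candidate for a spurious prediction.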

  13. RNA structural motif recognition based on least-squares distance.

    PubMed

    Shen, Ying; Wong, Hau-San; Zhang, Shaohong; Zhang, Lin

    2013-09-01

    RNA structural motifs are recurrent structural elements occurring in RNA molecules. RNA structural motif recognition aims to find RNA substructures that are similar to a query motif, and it is important for RNA structure analysis and RNA function prediction. In view of this, we propose a new method known as RNA Structural Motif Recognition based on Least-Squares distance (LS-RSMR) to effectively recognize RNA structural motifs. We compiled a test set consisting of five types of RNA structural motifs occurring in Escherichia coli ribosomal RNA and conducted experiments on recognizing these five types of motifs. The experimental results fully reveal the superiority of the proposed LS-RSMR compared with four other state-of-the-art methods.

  14. MProfiler: A Profile-Based Method for DNA Motif Discovery

    NASA Astrophysics Data System (ADS)

    Altarawy, Doaa; Ismail, Mohamed A.; Ghanem, Sahar M.

    Motif finding is one of the most important tasks in the study of gene regulation, which is essential for understanding biological cell functions. Based on recent studies, the performance of current motif finders is not satisfactory. A number of ensemble methods have been proposed to enhance the accuracy of the results. The overall performance of existing ensemble methods is better than that of stand-alone motif finders. A recent ensemble method, MotifVoter, significantly outperforms all existing stand-alone and ensemble methods. In this paper, we propose a method, MProfiler, to increase the accuracy of MotifVoter without increasing the run time by introducing an idea called center profiling. Our experiments show improvement in the quality of generated clusters over MotifVoter in both accuracy and cluster compactness. Using 56 datasets, the accuracy of the final results using our method achieves 80% improvement in correlation coefficient nCC, and 93% improvement in performance coefficient nPC over MotifVoter.

  15. Chaotic motif sampler: detecting motifs from biological sequences by using chaotic neurodynamics

    NASA Astrophysics Data System (ADS)

    Matsuura, Takafumi; Ikeguchi, Tohru

    The motif extraction problem (MEP), the identification of a region in biological sequences, is an important problem in bioinformatics. However, the MEP is NP-hard, so it is almost impossible to obtain an optimal solution within a reasonable time frame. To find near-optimal solutions for NP-hard combinatorial optimization problems such as traveling salesman problems, quadratic assignment problems, and vehicle routing problems, chaotic search, a deterministic approach, has been proposed and exhibits better performance than stochastic approaches. In this paper, we propose a new alignment method, the Chaotic Motif Sampler, that employs chaotic dynamics to solve the MEP. We show that the performance of the Chaotic Motif Sampler is considerably better than that of conventional methods such as the Gibbs Site Sampler and the Neighborhood Optimization for Multiple Alignment Discovery.

  16. EAR motif-mediated transcriptional repression in plants: an underlying mechanism for epigenetic regulation of gene expression.

    PubMed

    Kagale, Sateesh; Rozwadowski, Kevin

    2011-02-01

    Ethylene-responsive element binding factor-associated Amphiphilic Repression (EAR) motif-mediated transcriptional repression is emerging as one of the principal mechanisms of plant gene regulation. The EAR motif, defined by the consensus sequence patterns of either LxLxL or DLNxxP, is the most predominant form of transcriptional repression motif so far identified in plants. Additionally, this active repression motif is highly conserved in transcriptional regulators known to function as negative regulators in a broad range of developmental and physiological processes across evolutionarily diverse plant species. Recent discoveries of co-repressors interacting with EAR motifs, such as TOPLESS (TPL) and AtSAP18, have begun to unravel the mechanisms of EAR motif-mediated repression. The demonstration of genetic interaction between mutants of TPL and AtHDA19, co-complex formation between TPL-related 1 (TPR1) and AtHDA19, as well as direct physical interaction between AtSAP18 and AtHDA19 support a model where EAR repressors, via recruitment of chromatin remodeling factors, facilitate epigenetic regulation of gene expression. Here, we discuss the biological significance of EAR-mediated gene regulation in the broader context of plant biology and present literature evidence in support of a model for EAR motif-mediated repression via the recruitment and action of chromatin modifiers. Additionally, we discuss the possible influences of phosphorylation and ubiquitination on the function and turnover of EAR repressors.
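
The two consensus patterns quoted above (LxLxL and DLNxxP, with x read as any amino acid) translate directly into regular expressions. The following is a minimal sketch of such a scan; the function name and the test sequence are hypothetical, and a pattern match alone does not establish that a protein acts as a repressor.

```python
import re

# Consensus patterns from the abstract; "x" is taken to mean any residue.
# Lookahead groups are used so that overlapping matches are all reported.
EAR_PATTERNS = {
    "LxLxL": re.compile(r"(?=(L[A-Z]L[A-Z]L))"),
    "DLNxxP": re.compile(r"(?=(DLN[A-Z][A-Z]P))"),
}

def find_ear_motifs(protein):
    """Return (pattern name, 0-based start, matched text) for every hit."""
    sequence = protein.upper()
    hits = []
    for name, pattern in EAR_PATTERNS.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Invented sequence for illustration only, not a real protein.
for hit in find_ear_motifs("MSDLNAAPKRLDLSLELAL"):
    print(hit)
```

Because LxLxL is short and leucine is common, a scan like this is expected to return many chance matches, so it is better suited to shortlisting candidates than to annotation.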

  17. The RNA 3D Motif Atlas: Computational methods for extraction, organization and evaluation of RNA motifs.

    PubMed

    Parlea, Lorena G; Sweeney, Blake A; Hosseini-Asanjan, Maryam; Zirbel, Craig L; Leontis, Neocles B

    2016-07-01

    RNA 3D motifs occupy places in structured RNA molecules that correspond to the hairpin, internal and multi-helix junction "loops" of their secondary structure representations. As many as 40% of the nucleotides of an RNA molecule can belong to these structural elements, which are distinct from the regular double helical regions formed by contiguous AU, GC, and GU Watson-Crick basepairs. With the large number of atomic- or near atomic-resolution 3D structures appearing in a steady stream in the PDB/NDB structure databases, the automated identification, extraction, comparison, clustering and visualization of these structural elements presents an opportunity to enhance RNA science. Three broad applications are: (1) identification of modular, autonomous structural units for RNA nanotechnology, nanobiology and synthetic biology applications; (2) bioinformatic analysis to improve RNA 3D structure prediction from sequence; and (3) creation of searchable databases for exploring the binding specificities, structural flexibility, and dynamics of these RNA elements. In this contribution, we review methods developed for computational extraction of hairpin and internal loop motifs from a non-redundant set of high-quality RNA 3D structures. We provide a statistical summary of the extracted hairpin and internal loop motifs in the most recent version of the RNA 3D Motif Atlas. We also explore the reliability and accuracy of the extraction process by examining its performance in clustering recurrent motifs from homologous ribosomal RNA (rRNA) structures. We conclude with a summary of remaining challenges, especially with regard to extraction of multi-helix junction motifs. Copyright © 2016 Elsevier Inc. All rights reserved.
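
To make the correspondence between motifs and secondary-structure "loops" concrete, the sketch below finds hairpin loops in a dot-bracket string, where "(" and ")" mark paired nucleotides and "." marks unpaired ones. This is only an illustration at the secondary-structure level; it is not the extraction procedure of the RNA 3D Motif Atlas, which works from annotated 3D structures. The function name and the example string are hypothetical.

```python
def hairpin_loops(dot_bracket):
    """Return (open_index, close_index) of base pairs that close a hairpin loop.

    A pair closes a hairpin when everything strictly between its two
    positions is unpaired. Assumes a balanced dot-bracket string without
    pseudoknots.
    """
    stack = []
    loops = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            interior = dot_bracket[j + 1:i]
            if interior and all(s == "." for s in interior):
                loops.append((j, i))
    return loops

# Two stem-loops separated by a two-nucleotide linker (made-up example).
print(hairpin_loops("((...))..((....))"))
```

Internal loops and multi-helix junctions need more bookkeeping, since they are bounded by two or more base pairs rather than one.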

  18. Bases of motifs for generating repeated patterns with wild cards.

    PubMed

    Pisanti, Nadia; Crochemore, Maxime; Grossi, Roberto; Sagot, Marie-France

    2005-01-01

    Motif inference represents one of the most important areas of research in computational biology, and one of its oldest. Despite this, the problem remains very much open in the sense that no existing definition is fully satisfying, either in formal terms, or in relation to the biological questions that involve finding such motifs. Two main types of motifs have been considered in the literature: matrices (of letter frequency per position in the motif) and patterns. There is no conclusive evidence in favor of either, and recent work has attempted to integrate the two types into a single model. In this paper, we address the formal issue in relation to motifs as patterns. This is essential to a better understanding of motifs in general. In particular, we consider a promising idea that was recently proposed, which attempted to avoid the combinatorial explosion in the number of motifs by means of a generator set for the motifs. Instead of exhibiting a complete list of motifs satisfying some input constraints, what is produced is a basis of such motifs from which all the other ones can be generated. We study the computational cost of determining such a basis of repeated motifs with wild cards in a sequence. We give new upper and lower bounds on this cost, introducing a notion of basis that is provably contained in (and, thus, smaller than) previously defined ones. Our basis can be computed in less time and space, and is still able to generate the same set of motifs. We also prove that the number of motifs in all bases defined so far grows exponentially with the quorum, that is, with the minimal number of times a motif must appear in a sequence, something unnoticed in previous work. We show that there is no hope of efficiently computing such bases unless the quorum is fixed.
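
The two notions the abstract relies on, a pattern with wild cards and a quorum, can be illustrated with a short sketch. It only checks whether one given pattern occurs often enough in a sequence; it does not compute a basis and is not the authors' algorithm. The wild-card symbol ".", the function names, and the example sequence are assumptions made for illustration.

```python
def occurrences(pattern, text, wildcard="."):
    """0-based start positions where pattern matches text.

    A wildcard position matches any symbol; every other position must
    match exactly. Overlapping occurrences are counted.
    """
    k = len(pattern)
    return [
        i
        for i in range(len(text) - k + 1)
        if all(p == wildcard or p == c for p, c in zip(pattern, text[i:i + k]))
    ]

def meets_quorum(pattern, text, quorum):
    """True if the pattern occurs at least `quorum` times in the text."""
    return len(occurrences(pattern, text)) >= quorum

sequence = "ACGTACCTACGA"   # made-up sequence
print(occurrences("AC.T", sequence))
print(meets_quorum("AC.T", sequence, 2), meets_quorum("AC.T", sequence, 3))
```

Enumerating every pattern that passes such a check is where the combinatorial explosion described above arises, which is the motivation for producing a basis instead of the full list.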

  19. Potential Direct Regulators of the Drosophila yellow Gene Identified by Yeast One-Hybrid and RNAi Screens

    PubMed Central

    Kalay, Gizem; Lusk, Richard; Dome, Mackenzie; Hens, Korneel; Deplancke, Bart; Wittkopp, Patricia J.

    2016-01-01

    The regulation of gene expression controls development, and changes in this regulation often contribute to phenotypic evolution. Drosophila pigmentation is a model system for studying evolutionary changes in gene regulation, with differences in expression of pigmentation genes such as yellow that correlate with divergent pigment patterns among species shown to be caused by changes in cis- and trans-regulation. Currently, much more is known about the cis-regulatory component of divergent yellow expression than the trans-regulatory component, in part because very few trans-acting regulators of yellow expression have been identified. This study aims to improve our understanding of the trans-acting control of yellow expression by combining yeast-one-hybrid and RNAi screens for transcription factors binding to yellow cis-regulatory sequences and affecting abdominal pigmentation in adults, respectively. Of the 670 transcription factors included in the yeast-one-hybrid screen, 45 showed evidence of binding to one or more sequence fragments tested from the 5′ intergenic and intronic yellow sequences from D. melanogaster, D. pseudoobscura, and D. willistoni, suggesting that they might be direct regulators of yellow expression. Of the 670 transcription factors included in the yeast-one-hybrid screen, plus another TF previously shown to be genetically upstream of yellow, 125 were also tested using RNAi, and 32 showed altered abdominal pigmentation. Nine transcription factors were identified in both screens, including four nuclear receptors related to ecdysone signaling (Hr78, Hr38, Hr46, and Eip78C). This finding suggests that yellow expression might be directly controlled by nuclear receptors influenced by ecdysone during early pupal development when adult pigmentation is forming. PMID:27527791

  20. Potential Direct Regulators of the Drosophila yellow Gene Identified by Yeast One-Hybrid and RNAi Screens.

    PubMed

    Kalay, Gizem; Lusk, Richard; Dome, Mackenzie; Hens, Korneel; Deplancke, Bart; Wittkopp, Patricia J

    2016-10-13

    The regulation of gene expression controls development, and changes in this regulation often contribute to phenotypic evolution. Drosophila pigmentation is a model system for studying evolutionary changes in gene regulation, with differences in expression of pigmentation genes such as yellow that correlate with divergent pigment patterns among species shown to be caused by changes in cis- and trans-regulation. Currently, much more is known about the cis-regulatory component of divergent yellow expression than the trans-regulatory component, in part because very few trans-acting regulators of yellow expression have been identified. This study aims to improve our understanding of the trans-acting control of yellow expression by combining yeast-one-hybrid and RNAi screens for transcription factors binding to yellow cis-regulatory sequences and affecting abdominal pigmentation in adults, respectively. Of the 670 transcription factors included in the yeast-one-hybrid screen, 45 showed evidence of binding to one or more sequence fragments tested from the 5' intergenic and intronic yellow sequences from D. melanogaster, D. pseudoobscura, and D. willistoni, suggesting that they might be direct regulators of yellow expression. Of the 670 transcription factors included in the yeast-one-hybrid screen, plus another TF previously shown to be genetically upstream of yellow, 125 were also tested using RNAi, and 32 showed altered abdominal pigmentation. Nine transcription factors were identified in both screens, including four nuclear receptors related to ecdysone signaling (Hr78, Hr38, Hr46, and Eip78C). This finding suggests that yellow expression might be directly controlled by nuclear receptors influenced by ecdysone during early pupal development when adult pigmentation is forming. Copyright © 2016 Kalay et al.