Science.gov

Sample records for cis-regulatory motif directs

  1. Computational discovery of cis-regulatory modules in Drosophila without prior knowledge of motifs

    PubMed Central

    Ivan, Andra; Halfon, Marc S; Sinha, Saurabh

    2008-01-01

    We consider the problem of predicting cis-regulatory modules without knowledge of motifs. We formulate this problem in a pragmatic setting, and create over 30 new data sets, using Drosophila modules, to use as a 'benchmark'. We propose two new methods for the problem, and evaluate these, as well as two existing methods, on our benchmark. We find that the challenge of predicting cis-regulatory modules ab initio, without any input of relevant motifs, is a realizable goal. PMID:18226245

  2. Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria

    PubMed Central

    2013-01-01

    Background: In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. Results: A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. Conclusions: The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and

  3. Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation

    PubMed Central

    Rouault, Hervé; Santolini, Marc; Schweisguth, François; Hakim, Vincent

    2014-01-01

    Cis-regulatory modules (CRMs) and motifs play a central role in tissue and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in publicly available software. Starting from a small training set of mammalian or fly CRMs that drive similar gene expression profiles, Imogene determines de novo cis-regulatory motifs that underlie this co-expression. It can then predict on a genome-wide scale other CRMs with a regulatory potential similar to the training set. Imogene bypasses the need for large datasets for statistical analyses by making central use of the information provided by the sequenced genomes of multiple species, based on the developed statistical tools and explicit models for transcription factor binding site evolution. We test Imogene on characterized tissue-specific mouse developmental CRMs. Its ability to identify CRMs with the same specificity based on its de novo created motifs is comparable to that of previously evaluated ‘motif-blind’ methods. We further show, both in flies and in mammals, that Imogene de novo generated motifs are sufficient to discriminate CRMs related to different developmental programs. Notably, purely relying on sequence data, Imogene performs as well in this discrimination task as a previously reported learning algorithm based on Chromatin Immunoprecipitation (ChIP) data for multiple transcription factors at multiple developmental stages. PMID:24682824

  4. On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

    NASA Astrophysics Data System (ADS)

    Tarpine, Ryan; Istrail, Sorin

    The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
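To make the idea of a cis-module "reading" a regulatory state and computing a logic function more concrete, here is a toy sketch. It is not taken from the CYRENE project: the transcription factor names (`TF_A`, `TF_B`, `TF_R`), the `make_module` helper, and the particular AND/NOT logic are all hypothetical and chosen only for illustration.

```python
from typing import Callable, FrozenSet, Iterable

# A regulatory state is modeled as the set of transcription factors that are
# expressed and active at the same time (names here are hypothetical).
RegulatoryState = FrozenSet[str]


def make_module(
    activators: Iterable[str], repressors: Iterable[str]
) -> Callable[[RegulatoryState], bool]:
    """Build a toy cis-module: ON when every activator is present in the
    regulatory state and no repressor is present (an AND / NOT function)."""
    required = frozenset(activators)
    blocking = frozenset(repressors)

    def output(state: RegulatoryState) -> bool:
        return required <= state and not (blocking & state)

    return output


# Hypothetical module: needs TF_A and TF_B, silenced by TF_R.
module = make_module({"TF_A", "TF_B"}, {"TF_R"})
print(module(frozenset({"TF_A", "TF_B"})))
print(module(frozenset({"TF_A", "TF_B", "TF_R"})))
```

Real cis-regulatory logic is generally richer than a single Boolean gate, so this should be read only as a minimal picture of "regulatory state in, expression decision out".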

  5. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses.

    PubMed

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-01-01

    Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria. PMID:26975728
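As a rough illustration of the kind of overlap test mentioned in this abstract, the sketch below computes a standard one-sided Fisher exact (hypergeometric tail) p-value for the overlap between a predicted regulon and a co-expression module. This is the textbook test, not the modified version the authors describe, whose details the abstract does not give; the function name `overlap_pvalue` and the numbers in the usage lines are assumptions.

```python
from math import comb


def overlap_pvalue(universe: int, module: int, regulon: int, shared: int) -> float:
    """Probability of seeing at least `shared` module genes in a regulon of
    size `regulon` drawn at random (without replacement) from `universe`
    genes, of which `module` belong to the co-expression module."""
    upper = min(module, regulon)
    total = comb(universe, regulon)
    tail = sum(
        comb(module, k) * comb(universe - module, regulon - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 4000 genes, a 40-gene module, a 12-gene predicted
# regulon, 6 genes in common.
p = overlap_pvalue(universe=4000, module=40, regulon=12, shared=6)
print(f"one-sided p-value: {p:.3g}")
```

A small p-value suggests the predicted regulon overlaps the module more than random gene sets of the same size would.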

  6. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    NASA Astrophysics Data System (ADS)

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-03-01

    Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria.

  7. Bacterial regulon modeling and prediction based on systematic cis regulatory motif analyses

    PubMed Central

    Liu, Bingqiang; Zhou, Chuan; Li, Guojun; Zhang, Hanyuan; Zeng, Erliang; Liu, Qi; Ma, Qin

    2016-01-01

    Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new computational framework and built a novel graph model for regulon prediction. This model integrates the motif comparison and clustering and makes the regulon prediction problem substantially more solvable and accurate. To evaluate our prediction, a regulon coverage score was designed based on the documented regulons and their overlap with our prediction; and a modified Fisher Exact test was implemented to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently performed better than others in terms of the prediction accuracy. This suggests that our algorithms substantially improve the state-of-the-art, leading to a computational capability to reliably predict regulons for any bacteria. PMID:26975728

  8. Mutagenesis of GATA motifs controlling the endoderm regulator elt-2 reveals distinct dominant and secondary cis-regulatory elements.

    PubMed

    Du, Lawrence; Tracy, Sharon; Rifkin, Scott A

    2016-04-01

    Cis-regulatory elements (CREs) are crucial links in developmental gene regulatory networks, but in many cases, it can be difficult to discern whether similar CREs are functionally equivalent. We found that despite similar conservation and binding capability to upstream activators, different GATA cis-regulatory motifs within the promoter of the C. elegans endoderm regulator elt-2 play distinctive roles in activating and modulating gene expression throughout development. We fused wild-type and mutant versions of the elt-2 promoter to a gfp reporter and inserted these constructs as single copies into the C. elegans genome. We then counted early embryonic gfp transcripts using single-molecule RNA FISH (smFISH) and quantified gut GFP fluorescence. We determined that a single primary dominant GATA motif located 527bp upstream of the elt-2 start codon was necessary for both embryonic activation and later maintenance of transcription, while nearby secondary GATA motifs played largely subtle roles in modulating postembryonic levels of elt-2. Mutation of the primary activating site increased low-level spatiotemporally ectopic stochastic transcription, indicating that this site acts repressively in non-endoderm cells. Our results reveal that CREs with similar GATA factor binding affinities in close proximity can play very divergent context-dependent roles in regulating the expression of a developmentally critical gene in vivo. PMID:26896592

  9. Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

    PubMed Central

    2012-01-01

    Background: Researchers seeking to unlock the genetic basis of human physiology and diseases have been studying gene transcription regulation. The temporal and spatial patterns of gene expression are controlled by mainly non-coding elements known as cis-regulatory modules (CRMs) and epigenetic factors. CRMs modulating related genes share the regulatory signature which consists of transcription factor (TF) binding sites (TFBSs). Identifying such CRMs is a challenging problem due to the prohibitive number of sequence sets that need to be analyzed. Results: We formulated the challenge as a supervised classification problem even though experimentally validated CRMs were not required. Our efforts resulted in a software system named CrmMiner. The system mines for CRMs in the vicinity of related genes. CrmMiner requires two sets of sequences: a mixed set and a control set. Sequences in the vicinity of the related genes comprise the mixed set, whereas the control set includes random genomic sequences. CrmMiner assumes that a large percentage of the mixed set is made of background sequences that do not include CRMs. The system identifies pairs of closely located motifs representing vertebrate TFBSs that are enriched in the training mixed set consisting of 50% of the gene loci. In addition, CrmMiner selects a group of the enriched pairs to represent the tissue-specific regulatory signature. The mixed and the control sets are searched for candidate sequences that include any of the selected pairs. Next, an optimal Bayesian classifier is used to distinguish candidates found in the mixed set from their control counterparts. Our study proposes 62 tissue-specific regulatory signatures and putative CRMs for different human tissues and cell types. These signatures consist of assortments of ubiquitously expressed TFs and tissue-specific TFs. Under controlled settings, CrmMiner identified known CRMs in noisy sets up to 1:25 signal-to-noise ratio. CrmMiner was 21-75% more precise than a

  10. Systems analysis of cis-regulatory motifs in C4 photosynthesis genes using maize and rice leaf transcriptomic data during a process of de-etiolation

    PubMed Central

    Xu, Jiajia; Bräutigam, Andrea; Weber, Andreas P. M.; Zhu, Xin-Guang

    2016-01-01

    Identification of potential cis-regulatory motifs controlling the development of C4 photosynthesis is a major focus of current research. In this study, we used time-series RNA-seq data collected from etiolated maize and rice leaf tissues sampled during a de-etiolation process to systematically characterize the expression patterns of C4-related genes and to further identify potential cis elements in five different genomic regions (i.e. promoter, 5′UTR, 3′UTR, intron, and coding sequence) of C4 orthologous genes. The results demonstrate that although most of the C4 genes show similar expression patterns, a number of them, including chloroplast dicarboxylate transporter 1, aspartate aminotransferase, and triose phosphate transporter, show shifted expression patterns compared with their C3 counterparts. A number of conserved short DNA motifs between maize C4 genes and their rice orthologous genes were identified not only in the promoter, 5′UTR, 3′UTR, and coding sequences, but also in the introns of core C4 genes. We also identified cis-regulatory motifs that exist in maize C4 genes and also in genes showing similar expression patterns as maize C4 genes but that do not exist in rice C3 orthologs, suggesting a possible recruitment of pre-existing cis-elements from genes unrelated to C4 photosynthesis into C4 photosynthesis genes during C4 evolution. PMID:27436282

  11. Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules

    PubMed Central

    Boeva, Valentina; Clément, Julien; Régnier, Mireille; Roytberg, Mikhail A; Makeev, Vsevolod J

    2007-01-01

    Background: cis-Regulatory modules (CRMs) of eukaryotic genes often contain multiple binding sites for transcription factors. The phenomenon that binding sites form clusters in CRMs is exploited in many algorithms to locate CRMs in a genome. This gives rise to the problem of calculating the statistical significance of the event that multiple sites, recognized by different factors, would be found simultaneously in a text of a fixed length. The main difficulty comes from overlapping occurrences of motifs. So far, no tools have been developed allowing the computation of p-values for simultaneous occurrences of different motifs which can overlap. Results: We developed and implemented an algorithm computing the p-value that s different motifs occur respectively k_1, ..., k_s or more times, possibly overlapping, in a random text. Motifs can be represented with a majority of popular motif models, but in all cases, without indels. Zero or first order Markov chains can be adopted as a model for the random text. The computational tool was tested on the set of cis-regulatory modules involved in D. melanogaster early development, for which there exists an annotation of binding sites for transcription factors. Our test allowed us to correctly identify transcription factors cooperatively/competitively binding to DNA. Method: The algorithm that precisely computes the probability of simultaneous motif occurrences is inspired by the Aho-Corasick automaton and employs a prefix tree together with a transition function. The algorithm runs with the O(n|Σ|(m|ℋ| + K|σ|^K) ∏_i k_i) time complexity, where n is the length of the text, |Σ| is the alphabet size, m is the maximal motif length, |
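The automaton idea in this abstract can be illustrated on a much smaller problem. The sketch below is a simplification and not the authors' algorithm: it handles a single word rather than several motif sets, assumes a zero-order (independent letters) text model, and uses a naively built KMP-style automaton. It computes the exact probability that the word occurs at least k times, overlaps included, in a random text of length n. The function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def build_automaton(word: str, alphabet: List[str]) -> List[Dict[str, int]]:
    """Transition table where a state is the length of the longest prefix of
    `word` that is a suffix of the text read so far."""
    m = len(word)
    delta: List[Dict[str, int]] = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            cand = word[:state] + ch
            k = min(m, len(cand))
            # Shrink until a prefix of `word` matches the end of `cand`.
            while k > 0 and word[:k] != cand[-k:]:
                k -= 1
            delta[state][ch] = k
    return delta


def p_at_least(word: str, n: int, k: int, probs: Dict[str, float]) -> float:
    """Exact P(word occurs >= k times, overlaps allowed) in an i.i.d. text of
    length n whose letter probabilities are given by `probs`."""
    delta = build_automaton(word, list(probs))
    m = len(word)
    # dist maps (automaton state, occurrence count capped at k) to probability.
    dist = {(0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (state, count), p in dist.items():
            for ch, q in probs.items():
                s2 = delta[state][ch]
                c2 = min(k, count + (1 if s2 == m else 0))
                nxt[(s2, c2)] += p * q
        dist = dict(nxt)
    return sum(p for (_, count), p in dist.items() if count == k)


uniform = {b: 0.25 for b in "ACGT"}
print(p_at_least("GATA", n=200, k=2, probs=uniform))
```

Capping the count at k keeps the state space at (m + 1)(k + 1) entries, which mirrors in miniature why the occurrence counts appear as a factor in the complexity bound quoted above.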

  12. Promoter analysis reveals cis-regulatory motifs associated with the expression of the WRKY transcription factor CrWRKY1 in Catharanthus roseus.

    PubMed

    Yang, Zhirong; Patra, Barunava; Li, Runzhi; Pattanaik, Sitakanta; Yuan, Ling

    2013-12-01

    WRKY transcription factors (TFs) are emerging as an important group of regulators of plant secondary metabolism. However, the cis-regulatory elements associated with their regulation have not been well characterized. We have previously demonstrated that CrWRKY1, a member of subgroup III of the WRKY TF family, regulates biosynthesis of terpenoid indole alkaloids in the ornamental and medicinal plant, Catharanthus roseus. Here, we report the isolation and functional characterization of the CrWRKY1 promoter. In silico analysis of the promoter sequence reveals the presence of several potential TF binding motifs, indicating the involvement of additional TFs in the regulation of the TIA pathway. The CrWRKY1 promoter can drive the expression of a β-glucuronidase (GUS) reporter gene in native (C. roseus protoplasts and transgenic hairy roots) and heterologous (transgenic tobacco seedlings) systems. Analysis of 5'- or 3'-end deletions indicates that the sequence located between positions -140 to -93 bp and -3 to +113 bp, relative to the transcription start site, is critical for promoter activity. Mutation analysis shows that two overlapping as-1 elements and a CT-rich motif contribute significantly to promoter activity. The CrWRKY1 promoter is induced in response to methyl jasmonate (MJ) treatment and the promoter region between -230 and -93 bp contains a putative MJ-responsive element. The CrWRKY1 promoter can potentially be used as a tool to isolate novel TFs involved in the regulation of the TIA pathway. PMID:23979312

  13. In planta analysis of a cis-regulatory cytokinin response motif in Arabidopsis and identification of a novel enhancer sequence.

    PubMed

    Ramireddy, Eswarayya; Brenner, Wolfram G; Pfeifer, Andreas; Heyl, Alexander; Schmülling, Thomas

    2013-07-01

    The phytohormone cytokinin plays a key role in regulating plant growth and development, and is involved in numerous physiological responses to environmental changes. The type-B response regulators, which regulate the transcription of cytokinin response genes, are a part of the cytokinin signaling system. Arabidopsis thaliana encodes 11 type-B response regulators (type-B ARRs), and some of them were shown to bind in vitro to the core cytokinin response motif (CRM) 5'-(A/G)GAT(T/C)-3' or, in the case of ARR1, to an extended motif (ECRM), 5'-AAGAT(T/C)TT-3'. Here we obtained in planta proof for the functionality of the latter motif. Promoter deletion analysis of the primary cytokinin response gene ARR6 showed that a combination of two extended motifs within the promoter is required to mediate the full transcriptional activation by ARR1 and other type-B ARRs. CRMs were found to be over-represented in the vicinity of ECRMs in the promoters of cytokinin-regulated genes, suggesting their functional relevance. Moreover, an evolutionarily conserved 27 bp long T-rich region between -220 and -193 bp was identified and shown to be required for the full activation by type-B ARRs and the response to cytokinin. This novel enhancer is not bound by the DNA-binding domain of ARR1, indicating that additional proteins might be involved in mediating the transcriptional cytokinin response. Furthermore, genome-wide expression profiling identified genes, among them ARR16, whose induction by cytokinin depends on both ARR1 and other specific type-B ARRs. This together with the ECRM/CRM sequence clustering indicates cooperative action of different type-B ARRs for the activation of particular target genes. PMID:23620480
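The two motifs quoted in this abstract translate directly into regular expressions. The following sketch scans a DNA sequence on both strands for the core motif 5'-(A/G)GAT(T/C)-3' and the extended motif 5'-AAGAT(T/C)TT-3'. The helper names and the example sequence are assumptions made for illustration; only the motif definitions come from the abstract.

```python
import re
from typing import List, Pattern, Tuple

# Lookahead groups let overlapping sites be reported individually.
CRM = re.compile(r"(?=([AG]GAT[TC]))")      # core cytokinin response motif
ECRM = re.compile(r"(?=(AAGAT[TC]TT))")     # extended motif bound by ARR1

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, pattern: Pattern[str]) -> List[Tuple[int, str, str]]:
    """Return sorted (start, strand, site) tuples; `start` is the 0-based
    forward-strand coordinate of the leftmost base of the site."""
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        site = m.group(1)
        hits.append((n - m.start() - len(site), "-", site))
    return sorted(hits)


promoter = "CCAAGATTTTCC"  # made-up sequence, not from ARR6
print(find_sites(promoter, ECRM))
print(find_sites(promoter, CRM))
```

Counting core-motif hits within a window around each extended-motif hit would be one simple way to look at the kind of clustering the abstract reports.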

  14. Modeling DNA sequence-based cis-regulatory gene networks.

    PubMed

    Bolouri, Hamid; Davidson, Eric H

    2002-06-01

    Gene network analysis requires computationally based models which represent the functional architecture of regulatory interactions, and which provide directly testable predictions. The type of model that is useful is constrained by the particular features of developmentally active cis-regulatory systems. These systems function by processing diverse regulatory inputs, generating novel regulatory outputs. A computational model which explicitly accommodates this basic concept was developed earlier for the cis-regulatory system of the endo16 gene of the sea urchin. This model represents the genetically mandated logic functions that the system executes, but also shows how time-varying kinetic inputs are processed in different circumstances into particular kinetic outputs. The same basic design features can be utilized to construct models that connect the large number of cis-regulatory elements constituting developmental gene networks. The ultimate aim of the network models discussed here is to represent the regulatory relationships among the genomic control systems of the genes in the network, and to state their functional meaning. The target site sequences of the cis-regulatory elements of these genes constitute the physical basis of the network architecture. Useful models for developmental regulatory networks must represent the genetic logic by which the system operates, but must also be capable of explaining the real time dynamics of cis-regulatory response as kinetic input and output data become available. Most importantly, however, such models must display in a direct and transparent manner fundamental network design features such as intra- and intercellular feedback circuitry; the sources of parallel inputs into each cis-regulatory element; gene battery organization; and use of repressive spatial inputs in specification and boundary formation. Successful network models lead to direct tests of key architectural features by targeted cis-regulatory analysis. PMID

  15. Experimental validation of predicted mammalian erythroid cis-regulatory modules

    PubMed Central

    Wang, Hao; Zhang, Ying; Cheng, Yong; Zhou, Yuepin; King, David C.; Taylor, James; Chiaromonte, Francesca; Kasturi, Jyotsna; Petrykowska, Hanna; Gibb, Brian; Dorman, Christine; Miller, Webb; Dore, Louis C.; Welch, John; Weiss, Mitchell J.; Hardison, Ross C.

    2006-01-01

    Multiple alignments of genome sequences are helpful guides to functional analysis, but predicting cis-regulatory modules (CRMs) accurately from such alignments remains an elusive goal. We predict CRMs for mammalian genes expressed in red blood cells by combining two properties gleaned from aligned, noncoding genome sequences: a positive regulatory potential (RP) score, which detects similarity to patterns in alignments distinctive for regulatory regions, and conservation of a binding site motif for the essential erythroid transcription factor GATA-1. Within eight target loci, we tested 75 noncoding segments by reporter gene assays in transiently transfected human K562 cells and/or after site-directed integration into murine erythroleukemia cells. Segments with a high RP score and a conserved exact match to the binding site consensus are validated at a good rate (50%–100%, with rates increasing at higher RP), whereas segments with lower RP scores or nonconsensus binding motifs tend to be inactive. Active DNA segments were shown to be occupied by GATA-1 protein by chromatin immunoprecipitation, whereas sites predicted to be inactive were not occupied. We verify four previously known erythroid CRMs and identify 28 novel ones. Thus, high RP in combination with another feature of a CRM, such as a conserved transcription factor binding site, is a good predictor of functional CRMs. Genome-wide predictions based on RP and a large set of well-defined transcription factor binding sites are available through servers at http://www.bx.psu.edu/. PMID:17038566

  16. A method for using direct injection of plasmid DNA to study cis-regulatory element activity in F0 Xenopus embryos and tadpoles.

    PubMed

    Wang, Chen; Szaro, Ben G

    2015-02-01

    The ability to express exogenous reporter genes in intact, externally developing embryos, such as Xenopus, is a powerful tool for characterizing the activity of cis-regulatory gene elements during development. Although methods exist for generating transgenic Xenopus lines, simpler methods for use with F0 animals would significantly speed the characterization of these elements. We discovered that injecting 2-cell stage embryos with a plasmid bearing a ϕC31 integrase-targeted attB element and two dual β-globin HS4 insulators flanking a reporter transgene in opposite orientations relative to each other yielded persistent expression with sufficiently high penetrance for characterizing the activity of the promoter without having to coinject integrase RNA. Expression began appropriately during development and persisted into swimming tadpole stages without perturbing the expression of the cognate endogenous gene. Coinjected plasmids having the same elements but expressing different reporter proteins were reliably coexpressed within the same cells, providing a useful control for variations in injections between animals. To overcome the high propensity of these plasmids to undergo recombination, we developed a method for generating them using conventional cloning methods and DH5α cells for propagation. We conclude that this method offers a convenient and reliable way to evaluate the activity of cis-regulatory gene elements in the intact F0 embryo. PMID:25448690

  17. The role of cis regulatory evolution in maize domestication.

    PubMed

    Lemmon, Zachary H; Bukowski, Robert; Sun, Qi; Doebley, John F

    2014-11-01

    Gene expression differences between divergent lineages caused by modification of cis regulatory elements are thought to be important in evolution. We assayed genome-wide cis and trans regulatory differences between maize and its wild progenitor, teosinte, using deep RNA sequencing in F1 hybrid and parent inbred lines for three tissue types (ear, leaf and stem). Pervasive regulatory variation was observed with approximately 70% of ∼17,000 genes showing evidence of regulatory divergence between maize and teosinte. However, many fewer genes (1,079 genes) show consistent cis differences with all sampled maize and teosinte lines. For ∼70% of these 1,079 genes, the cis differences are specific to a single tissue. The number of genes with cis regulatory differences is greatest for ear tissue, which underwent a drastic transformation in form during domestication. As expected from the domestication bottleneck, maize possesses less cis regulatory variation than teosinte with this deficit greatest for genes showing maize-teosinte cis regulatory divergence, suggesting selection on cis regulatory differences during domestication. Consistent with selection on cis regulatory elements, genes with cis effects correlated strongly with genes under positive selection during maize domestication and improvement, while genes with trans regulatory effects did not. We observed a directional bias such that genes with cis differences showed higher expression of the maize allele more often than the teosinte allele, suggesting domestication favored up-regulation of gene expression. Finally, this work documents the cis and trans regulatory changes between maize and teosinte in over 17,000 genes for three tissues. PMID:25375861
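The logic behind separating cis from trans effects with F1 hybrids can be sketched in a few lines. In the hybrid both alleles share the same trans environment, so allelic imbalance there is attributed to cis differences, and whatever parental difference remains is attributed to trans. The code below is a common point-estimate decomposition under that reasoning, not the statistical procedure used in the study (which the abstract does not describe); the function name, the pseudocount, and the read counts are assumptions.

```python
from math import log2
from typing import Dict


def cis_trans(
    parent_maize: float,
    parent_teosinte: float,
    f1_maize: float,
    f1_teosinte: float,
    pseudo: float = 1.0,
) -> Dict[str, float]:
    """Split the parental expression difference (log2 maize/teosinte) into a
    cis part (allelic ratio within the F1 hybrid) and a trans remainder."""
    total = log2((parent_maize + pseudo) / (parent_teosinte + pseudo))
    cis = log2((f1_maize + pseudo) / (f1_teosinte + pseudo))
    return {"total": total, "cis": cis, "trans": total - cis}


# Hypothetical normalized read counts for one gene in one tissue.
print(cis_trans(parent_maize=400, parent_teosinte=100,
                f1_maize=200, f1_teosinte=100, pseudo=0.0))
```

A positive `cis` value here means the maize allele is more highly expressed than the teosinte allele in the shared hybrid background, which is the direction of bias the abstract reports as more frequent.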

  18. Cis-regulatory mutations in human disease

    PubMed Central

    2009-01-01

    Cis-acting regulatory sequences are required for the proper temporal and spatial control of gene expression. Variation in gene expression is highly heritable and a significant determinant of human disease susceptibility. The diversity of human genetic diseases attributed, in whole or in part, to mutations in non-coding regulatory sequences is on the rise. Improvements in genome-wide methods of associating genetic variation with human disease and predicting DNA with cis-regulatory potential are two of the major reasons for these recent advances. This review will highlight select examples from the literature that have successfully integrated genetic and genomic approaches to uncover the molecular basis by which cis-regulatory mutations alter gene expression and contribute to human disease. The fine mapping of disease-causing variants has led to the discovery of novel cis-acting regulatory elements that, in some instances, are located as far away as 1.5 Mb from the target gene. In other cases, the prior knowledge of the regulatory landscape surrounding the gene of interest aided in the selection of enhancers for mutation screening. The success of these studies should provide a framework for following up on the large number of genome-wide association studies that have identified common variants in non-coding regions of the genome that associate with increased risk of human diseases, including diabetes, autism, Crohn's, colorectal cancer, and asthma, to name a few. PMID:19641089

  19. Characterization of Putative cis-Regulatory Elements in Genes Preferentially Expressed in Arabidopsis Male Meiocytes

    PubMed Central

    Li, Mingjun

    2014-01-01

    Meiosis is essential for plant reproduction because it is the process during which homologous chromosome pairing, synapsis, and meiotic recombination occur. The meiotic transcriptome is difficult to investigate because of the size of meiocytes and the confines of anther lobes. The recent development of isolation techniques has enabled the characterization of transcriptional profiles in male meiocytes of Arabidopsis. Gene expression in male meiocytes shows unique features. The direct interaction of transcription factors (TFs) with DNA regulatory sequences forms the basis for the specificity of transcriptional regulation. Here, we identified putative cis-regulatory elements (CREs) associated with male meiocyte-expressed genes using in silico tools. The upstream regions (1 kb) of the top 50 genes preferentially expressed in Arabidopsis meiocytes possessed conserved motifs. These motifs are putative binding sites of TFs, some of which share common functions, such as roles in cell division. In combination with cell-type-specific analysis, our findings could be a substantial aid for the identification and experimental verification of the protein-DNA interactions for the specific TFs that drive gene expression in meiocytes. PMID:25250331

  20. Discovering cis-regulatory RNAs in Shewanella genomes by Support Vector Machines.

    PubMed

    Xu, Xing; Ji, Yongmei; Stormo, Gary D

    2009-04-01

    An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5' untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5'-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs

  1. Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

    PubMed Central

    Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

    2011-01-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs was validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences indicates that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875

  2. Global identification of the genetic networks and cis-regulatory elements of the cold response in zebrafish

    PubMed Central

    Hu, Peng; Liu, Mingli; Zhang, Dong; Wang, Jinfeng; Niu, Hongbo; Liu, Yimeng; Wu, Zhichao; Han, Bingshe; Zhai, Wanying; Shen, Yu; Chen, Liangbiao

    2015-01-01

    The transcriptional programs of ectothermic teleosts are directly influenced by water temperature. However, the cis- and trans-factors governing cold responses are not well characterized. We profiled transcriptional changes in eight zebrafish tissues exposed to mildly and severely cold temperatures using RNA-Seq. A total of 1943 differentially expressed genes (DEGs) were identified, from which 34 clusters representing distinct tissue and temperature response expression patterns were derived using the k-means fuzzy clustering algorithm. The promoter regions of the clustered DEGs that demonstrated strong co-regulation were analysed for enriched cis-regulatory elements with a motif discovery program, DREME. Seventeen motifs, ten known and seven novel, were identified, which covered 23% of the DEGs. Two motifs predicted to be the binding sites for the transcription factors Bcl6 and Jun, respectively, were chosen for experimental verification, and they demonstrated the expected cold-induced and cold-repressed patterns of gene regulation. Protein interaction modeling of the network components followed by experimental validation suggested that Jun physically interacts with Bcl6 and might be a hub factor that orchestrates the cold response in zebrafish. Thus, the methodology used and the regulatory networks uncovered in this study provide a foundation for exploring the mechanisms of cold adaptation in teleosts. PMID:26227973

  3. A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks

    PubMed Central

    2013-01-01

    Background: Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results: Using promoter sequences and gene expression profiles as input, rather than clustering the genes by the expression data, our method utilizes co-expression neighborhood information for each individual gene, thereby overcoming the disadvantages of current clustering-based models, which may miss specific information for individual genes. In addition, rather than using a motif database as an input, it implements a simple motif count table for each enumerated k-mer for each gene promoter sequence. Thus, it can be used for species where prior knowledge of cis-regulatory motifs is unavailable and has the potential to discover new transcription factor binding sites. Applications to Saccharomyces cerevisiae and Arabidopsis have shown that our method has good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top-ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by the literature. Conclusions: Since this method is simple and gene-specific, it can be readily utilized for insufficiently studied species or flexibly used as an additional step or data source for previous transcription regulatory network discovery models. PMID:23368633
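    The "motif count table for each enumerated k-mer for each gene promoter sequence" can be pictured with a small sketch. It assumes plain forward-strand counting of overlapping k-mers. The function name, the choice of k = 3, and the toy sequences are hypothetical, and the published method may differ, for example in how it treats reverse complements.

```python
from collections import Counter
from itertools import product

def kmer_count_table(promoters, k=3):
    """Return {gene: {kmer: count}} over every enumerated DNA k-mer."""
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    table = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        # Count overlapping k-mers on the forward strand only (assumption).
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        table[gene] = {kmer: counts.get(kmer, 0) for kmer in all_kmers}
    return table

# Toy promoter sequences (hypothetical).
promoters = {"geneA": "ACGTACGT", "geneB": "TTTTAC"}
table = kmer_count_table(promoters, k=3)
```

    Each gene ends up with one fixed-length row of counts, so no motif database is needed as input.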

  4. A Cis-Regulatory Map of the Drosophila Genome

    PubMed Central

    Nègre, Nicolas; Brown, Christopher D.; Ma, Lijia; Bristow, Christopher Aaron; Miller, Steven W.; Wagner, Ulrich; Kheradpour, Pouya; Eaton, Matthew L.; Loriaux, Paul; Sealfon, Rachel; Li, Zirong; Ishii, Haruhiko; Spokony, Rebecca F.; Chen, Jia; Hwang, Lindsay; Cheng, Chao; Auburn, Richard P.; Davis, Melissa B.; Domanus, Marc; Shah, Parantu K.; Morrison, Carolyn A.; Zieba, Jennifer; Suchy, Sarah; Senderowicz, Lionel; Victorsen, Alec; Bild, Nicholas A.; Grundstad, A. Jason; Hanley, David; MacAlpine, David M.; Mannervik, Mattias; Venken, Koen; Bellen, Hugo; White, Robert; Russell, Steven; Grossman, Robert L.; Ren, Bing; Gerstein, Mark; Posakony, James W.; Kellis, Manolis; White, Kevin P.

    2011-01-01

    Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcription factor binding sites genome-wide [1,2] has successfully identified specific subtypes of regulatory elements [3]. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb-Response Elements [4], chromatin states [5], transcription factor binding sites (TFBS) [6–9], PolII regulation [8], and insulator elements [10]; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome based on more than 300 chromatin immunoprecipitation (ChIP) datasets for eight chromatin features, five histone deacetylases (HDACs) and thirty-eight site-specific transcription factors (TFs) at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and we validated a subset of predictions for promoters, enhancers, and insulators in vivo. We also identified nearly 2,000 genomic regions of dense TF binding associated with chromatin activity and accessibility. We discovered hundreds of new TF co-binding relationships and defined a TF network with over 800 potential regulatory relationships. PMID:21430782

  5. Abundant raw material for cis-regulatory evolution in humans

    NASA Technical Reports Server (NTRS)

    Rockman, Matthew V.; Wray, Gregory A.

    2002-01-01

    Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)n dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.

  6. A Computational Pipeline for High-Throughput Discovery of cis-Regulatory Noncoding RNA in Prokaryotes

    PubMed Central

    Yao, Zizhen; Barrick, Jeffrey; Weinberg, Zasha; Neph, Shane; Breaker, Ronald; Tompa, Martin; Ruzzo, Walter L

    2007-01-01

    Noncoding RNAs (ncRNAs) are important functional RNAs that do not code for proteins. We present a highly efficient computational pipeline for discovering cis-regulatory ncRNA motifs de novo. The pipeline differs from previous methods in that it is structure-oriented, does not require a multiple-sequence alignment as input, and is capable of detecting RNA motifs with low sequence conservation. We also integrate RNA motif prediction with RNA homolog search, which improves the quality of the RNA motifs significantly. Here, we report the results of applying this pipeline to Firmicute bacteria. Our top-ranking motifs include most known Firmicute elements found in the RNA family database (Rfam). Comparing our motif models with Rfam's hand-curated motif models, we achieve high accuracy in both membership prediction and base-pair–level secondary structure prediction (at least 75% average sensitivity and specificity on both tasks). Of the ncRNA candidates not in Rfam, we find compelling evidence that some of them are functional, and analyze several potential ribosomal protein leaders in depth. PMID:17616982

  7. Cis-regulatory architecture of a brain signaling center predates the origin of chordates.

    PubMed

    Yao, Yao; Minor, Paul J; Zhao, Ying-Tao; Jeong, Yongsu; Pani, Ariel M; King, Anna N; Symmons, Orsolya; Gan, Lin; Cardoso, Wellington V; Spitz, François; Lowe, Christopher J; Epstein, Douglas J

    2016-05-01

    Genomic approaches have predicted hundreds of thousands of tissue-specific cis-regulatory sequences, but the determinants critical to their function and evolutionary history are mostly unknown. Here we systematically decode a set of brain enhancers active in the zona limitans intrathalamica (zli), a signaling center essential for vertebrate forebrain development via the secreted morphogen Sonic hedgehog (Shh). We apply a de novo motif analysis tool to identify six position-independent sequence motifs together with their cognate transcription factors that are essential for zli enhancer activity and Shh expression in the mouse embryo. Using knowledge of this regulatory lexicon, we discover new Shh zli enhancers in mice and a functionally equivalent element in hemichordates, indicating an ancient origin of the Shh zli regulatory network that predates the chordate phylum. These findings support a strategy for delineating functionally conserved enhancers in the absence of overt sequence homologies and over extensive evolutionary distances. PMID:27064252

  8. Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes

    PubMed Central

    Zhang, Shaoqiang; Xu, Minli; Su, Zhengchang

    2009-01-01

    Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which has largely hindered our understanding of many important biological processes. In this article, we describe a novel algorithm for genome-wide de novo prediction of CRBSs with high accuracy. We designed our algorithm to circumvent three identified difficulties for CRBS prediction using comparative genomics principles based on a new method for the selection of reference genomes, a new metric for measuring the similarity of CRBSs, and a new graph clustering procedure. When operon structures are correctly predicted, our algorithm can predict 81% of known individual binding sites belonging to 94% of known cis-regulatory motifs in the Escherichia coli K12 genome, while achieving high prediction specificity. Our algorithm has also achieved similar prediction accuracy in the Bacillus subtilis genome, suggesting that it is very robust, and thus can be applied to any other sequenced prokaryotic genome. When compared with the prior state-of-the-art algorithms, our algorithm outperforms them in both prediction sensitivity and specificity. PMID:19383880

  9. CREME: Cis-Regulatory Module Explorer for the Human Genome

    SciTech Connect

    Loots, G G; Sharan, R; Ovcharenko, I; Ben-Hur, A

    2004-02-11

    The binding of transcription factors to specific regulatory sequence elements is a primary mechanism for controlling gene transcription. Eukaryotic genes are often regulated by several transcription factors, whose binding sites are tightly clustered and form cis-regulatory modules. In this paper we present a web-server, CREME, for identifying and visualizing cis-regulatory modules in the promoter regions of a given set of potentially co-regulated genes. CREME relies on a database of putative transcription factor binding sites that have been annotated across the human genome using a library of position weight matrices and evolutionary conservation with the mouse and rat genomes. A search algorithm is applied to this dataset to identify combinations of transcription factors whose binding sites tend to co-occur in close proximity in the promoter regions of the input gene set. The identified cis-regulatory modules are statistically scored and significant combinations are reported and graphically visualized. Our web-server is available at http://creme.dcode.org/.
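    The co-occurrence search described above can be illustrated with a rough sketch. It counts, for each pair of transcription factors, the number of promoters in which their annotated sites fall within a fixed distance of each other. This is not CREME's actual algorithm or scoring: the function name, the 100 bp window, and the toy site annotations are all assumptions, and the statistical scoring step is omitted.

```python
from collections import Counter
from itertools import combinations

def cooccurring_pairs(sites_by_gene, window=100):
    """Count promoters in which two different TFs have sites within `window` bp."""
    pair_counts = Counter()
    for gene, sites in sites_by_gene.items():
        found = set()
        for (tf1, pos1), (tf2, pos2) in combinations(sites, 2):
            if tf1 != tf2 and abs(pos1 - pos2) <= window:
                found.add(tuple(sorted((tf1, tf2))))
        # Each promoter contributes at most once per TF pair.
        pair_counts.update(found)
    return pair_counts

# Toy annotations: (TF name, site position in promoter); values are made up.
sites_by_gene = {
    "gene1": [("MYOD", 120), ("MEF2", 160), ("SRF", 900)],
    "gene2": [("MEF2", 40), ("MYOD", 95)],
    "gene3": [("SRF", 10), ("MYOD", 500)],
}
counts = cooccurring_pairs(sites_by_gene, window=100)
```

    A real analysis would then compare each pair's count against a background expectation before reporting it as a candidate module.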

  10. cis-Regulatory control of the initial neurogenic pattern of onecut gene expression in the sea urchin embryo.

    PubMed

    Barsi, Julius C; Davidson, Eric H

    2016-01-01

    Specification of the ciliated band (CB) of echinoid embryos executes three spatial functions essential for postgastrular organization. These are establishment of a band about 5 cells wide which delimits and bounds other embryonic territories; definition of a neurogenic domain within this band; and generation within it of arrays of ciliary cells that bear the special long cilia from which the structure derives its name. In Strongylocentrotus purpuratus the spatial coordinates of the future ciliated band are initially and exactly determined by the disposition of a ring of cells that transcriptionally activate the onecut homeodomain regulatory gene, beginning in blastula stage, long before the appearance of the CB per se. Thus the cis-regulatory apparatus that governs onecut expression in the blastula directly reveals the genomic sequence code by which these aspects of the spatial organization of the embryo are initially determined. We screened the entire onecut locus and its flanking region for transcriptionally active cis-regulatory elements, and by means of BAC recombineered deletions identified three separated and required cis-regulatory modules that execute different functions. The operating logic of the crucial spatial control module accounting for the spectacularly precise and beautiful early onecut expression domain depends on spatial repression. Previously predicted oral ectoderm and aboral ectoderm repressors were identified by cis-regulatory mutation as the products of goosecoid and irxa genes respectively, while the pan-ectodermal activator SoxB1 supplies a transcriptional driver function. PMID:26522848

  11. A primer on regression methods for decoding cis-regulatory logic

    SciTech Connect

    Das, Debopriya; Pellegrini, Matteo; Gray, Joe W.

    2009-03-03

    The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A).

    Learning cis-Regulatory Elements from Omics Data

    A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6]--in particular, by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together--whereas only the primary response genes are expected to contain the functional motifs [10]. A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain
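    As a rough illustration of the regression idea in this record, the simplest version models each gene's log expression as a linear function of how many times a motif occurs in its promoter. The sketch below fits that one-motif model by ordinary least squares. The function name and the toy numbers are hypothetical, and the published methods are considerably richer (multiple motifs, motif selection, nonlinear terms).

```python
def fit_motif_effect(counts, expression):
    """Ordinary least squares fit of expression = intercept + slope * motif_count."""
    n = len(counts)
    mean_c = sum(counts) / n
    mean_e = sum(expression) / n
    sxx = sum((c - mean_c) ** 2 for c in counts)
    sxy = sum((c - mean_c) * (e - mean_e) for c, e in zip(counts, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_c
    return intercept, slope

# Toy data: motif occurrences per promoter and log expression (made up).
motif_counts = [0, 1, 2, 3]
log_expression = [0.1, 0.6, 1.1, 1.6]
intercept, slope = fit_motif_effect(motif_counts, log_expression)
```

    A clearly non-zero slope would suggest that the motif contributes to the expression response across genes. Unlike cluster-based over-representation tests, this can be applied to a single pair of conditions.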

  12. Epistatic Interactions in the Arabinose Cis-Regulatory Element

    PubMed Central

    Lagator, Mato; Igler, Claudia; Moreno, Anaísa B.; Guet, Călin C.; Bollback, Jonathan P.

    2016-01-01

    Changes in gene expression are an important mode of evolution; however, the proximate mechanism of these changes is poorly understood. In particular, little is known about the effects of mutations within cis binding sites for transcription factors, or the nature of epistatic interactions between these mutations. Here, we tested the effects of single and double mutants in two cis binding sites involved in the transcriptional regulation of the Escherichia coli araBAD operon, a component of arabinose metabolism, using a synthetic system. This system decouples transcriptional control from any posttranslational effects on fitness, allowing a precise estimate of the effect of single and double mutations, and hence epistasis, on gene expression. We found that epistatic interactions between mutations in the araBAD cis-regulatory element are common, and that the predominant form of epistasis is negative. The magnitude of the interactions depended on whether the mutations are located in the same or in different operator sites. Importantly, these epistatic interactions were dependent on the presence of arabinose, a native inducer of the araBAD operon in vivo, with some interactions changing in sign (e.g., from negative to positive) in its presence. This study thus reveals that mutations in even relatively simple cis-regulatory elements interact in complex ways such that selection on the level of gene expression in one environment might perturb regulation in the other environment in an unpredictable and uncorrelated manner. PMID:26589997

  13. Epistatic Interactions in the Arabinose Cis-Regulatory Element.

    PubMed

    Lagator, Mato; Igler, Claudia; Moreno, Anaísa B; Guet, Călin C; Bollback, Jonathan P

    2016-03-01

    Changes in gene expression are an important mode of evolution; however, the proximate mechanism of these changes is poorly understood. In particular, little is known about the effects of mutations within cis binding sites for transcription factors, or the nature of epistatic interactions between these mutations. Here, we tested the effects of single and double mutants in two cis binding sites involved in the transcriptional regulation of the Escherichia coli araBAD operon, a component of arabinose metabolism, using a synthetic system. This system decouples transcriptional control from any posttranslational effects on fitness, allowing a precise estimate of the effect of single and double mutations, and hence epistasis, on gene expression. We found that epistatic interactions between mutations in the araBAD cis-regulatory element are common, and that the predominant form of epistasis is negative. The magnitude of the interactions depended on whether the mutations are located in the same or in different operator sites. Importantly, these epistatic interactions were dependent on the presence of arabinose, a native inducer of the araBAD operon in vivo, with some interactions changing in sign (e.g., from negative to positive) in its presence. This study thus reveals that mutations in even relatively simple cis-regulatory elements interact in complex ways such that selection on the level of gene expression in one environment might perturb regulation in the other environment in an unpredictable and uncorrelated manner. PMID:26589997

  14. Evolution of lineage-specific functions in ancient cis-regulatory modules.

    PubMed

    Pauls, Stefan; Goode, Debbie K; Petrone, Libero; Oliveri, Paola; Elgar, Greg

    2015-11-01

    Morphological evolution is driven both by coding sequence variation and by changes in regulatory sequences. However, how cis-regulatory modules (CRMs) evolve to generate entirely novel expression domains is largely unknown. Here, we reconstruct the evolutionary history of a lens enhancer located within a CRM that not only predates the lens, a vertebrate innovation, but bilaterian animals in general. Alignments of orthologous sequences from different deuterostomes sub-divide the CRM into a deeply conserved core and a more divergent flanking region. We demonstrate that all deuterostome flanking regions, including invertebrate sequences, activate gene expression in the zebrafish lens through the same ancient cluster of activator sites. However, levels of gene expression vary between species due to the presence of repressor motifs in flanking region and core. These repressor motifs are responsible for the relatively weak enhancer activity of tetrapod flanking regions. Ray-finned fish, however, have gained two additional lineage-specific activator motifs which in combination with the ancient cluster of activators and the core constitute a potent lens enhancer. The exploitation and modification of existing regulatory potential in flanking regions but not in the highly conserved core might represent a more general model for the emergence of novel regulatory functions in complex CRMs. PMID:26538567

  15. The Hematopoietic Stem and Progenitor Cell Cistrome: GATA Factor-Dependent cis-Regulatory Mechanisms.

    PubMed

    Hewitt, K J; Johnson, K D; Gao, X; Keles, S; Bresnick, E H

    2016-01-01

    Transcriptional regulators mediate the genesis and function of the hematopoietic system by binding complex ensembles of cis-regulatory elements to establish genetic networks. While thousands to millions of copies of any given cis-element reside in a genome, how transcriptional regulators select these sites and how site attributes dictate functional output is not well understood. An instructive system to address this problem involves the GATA family of transcription factors that control vital developmental and physiological processes and are linked to multiple human pathologies. Although GATA factors bind DNA motifs harboring the sequence GATA, only a very small subset of these abundant motifs are occupied in genomes. Mechanistic studies revealed a unique configuration of a GATA factor-regulated cis-element consisting of an E-box and a downstream GATA motif separated by a short DNA spacer. GATA-1- or GATA-2-containing multiprotein complexes at these composite elements control transcription of genes critical for hematopoietic stem cell emergence in the mammalian embryo, hematopoietic progenitor cell regulation, and erythroid cell maturation. Other constituents of the complex include the basic helix-loop-helix transcription factor Scl/TAL1, its heterodimeric partner E2A, and the Lim domain proteins LMO2 and LDB1. This chapter reviews the structure/function of E-box-GATA composite cis-elements, which collectively constitute an important sector of the hematopoietic stem and progenitor cell cistrome. PMID:27137654

  16. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like motifs (GLMs) and was shown to be functional, as the GLMs are able to bind the bZIP transcription factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of the most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcription factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters, but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  17. Identification of tissue-specific cis-regulatory modules based on interactions between transcription factors

    PubMed Central

    Yu, Xueping; Lin, Jimmy; Zack, Donald J; Qian, Jiang

    2007-01-01

    Background: Evolutionary conservation has been used successfully to help identify cis-acting DNA regions that are important in regulating tissue-specific gene expression. Motivated by increasing evidence that some DNA regulatory regions are not evolutionarily conserved, we have developed an approach for cis-regulatory region identification that does not rely upon evolutionary sequence conservation. Results: The conservation-independent approach is based on an empirical potential energy between interacting transcription factors (TFs). In this analysis, the potential energy is defined as a function of the number of TF interactions in a genomic region and the strength of the interactions. By identifying sets of interacting TFs, the analysis locates regions enriched with the binding sites of these interacting TFs. We applied this approach to 30 human tissues and identified 6232 putative cis-regulatory modules (CRMs) regulating 2130 tissue-specific genes. Interestingly, some genes appear to be regulated by different CRMs in different tissues. Known regulatory regions are highly enriched in our predicted CRMs. In addition, DNase I hypersensitive sites, which tend to be associated with active regulatory regions, significantly overlap with the predicted CRMs, but not with more conserved regions. We also find that conserved and non-conserved CRMs regulate distinct gene groups. Conserved CRMs control more essential genes and genes involved in fundamental cellular activities such as transcription. In contrast, non-conserved CRMs, in general, regulate more non-essential genes, such as genes related to neural activity. Conclusion: These results demonstrate that identifying relevant sets of binding motifs can help in the mapping of DNA regulatory regions, and suggest that non-conserved CRMs play an important role in gene regulation. PMID:17996093

  18. Computational discovery of soybean promoter cis-regulatory elements for the construction of soybean cyst nematode-inducible synthetic promoters.

    PubMed

    Liu, Wusheng; Mazarei, Mitra; Peng, Yanhui; Fethe, Michael H; Rudis, Mary R; Lin, Jingyu; Millwood, Reginald J; Arelli, Prakash R; Stewart, Charles Neal

    2014-10-01

    Computational methods offer great hope but limited accuracy in the prediction of functional cis-regulatory elements; improvements are needed to enable synthetic promoter design. We applied an ensemble strategy for de novo soybean cyst nematode (SCN)-inducible motif discovery among promoters of 18 co-expressed soybean genes that were selected from six reported microarray studies involving a compatible soybean-SCN interaction. A total of 116 overlapping motif regions (OMRs) were discovered bioinformatically that were identified by at least four out of seven bioinformatic tools. Using synthetic promoters, the inducibility of each OMR or motif itself was evaluated by co-localization of gain of function of an orange fluorescent protein reporter and the presence of SCN in transgenic soybean hairy roots. Among 16 OMRs detected from two experimentally confirmed SCN-inducible promoters, 11 OMRs (i.e. 68.75%) were experimentally confirmed to be SCN-inducible, leading to the discovery of 23 core motifs of 5- to 7-bp length, of which 14 are novel in plants. We found that a combination of the three best tools (i.e. SCOPE, W-AlignACE and Weeder) could detect all 23 core motifs. Thus, this strategy is a high-throughput approach for de novo motif discovery in soybean and offers great potential for novel motif discovery and synthetic promoter engineering for any plant and trait in crop biotechnology. PMID:24893752
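
    The ensemble step described above (keeping only regions reported by at least four of seven tools) can be illustrated with a small voting sketch. This is not the authors' pipeline: the function name, the half-open coordinate convention, and the toy tool names are assumptions made purely for illustration.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) promoter coordinates (assumed convention)


def overlapping_motif_regions(
    calls: Dict[str, List[Interval]], min_tools: int = 4
) -> List[Interval]:
    """Return merged regions covered by motif calls from at least `min_tools` tools."""
    # For every position, record which tools report a motif covering it.
    support: Dict[int, Set[str]] = {}
    for tool, intervals in calls.items():
        for start, end in intervals:
            for pos in range(start, end):
                support.setdefault(pos, set()).add(tool)

    # Keep positions with enough independent support, then merge adjacent runs.
    kept = sorted(p for p, tools in support.items() if len(tools) >= min_tools)
    regions: List[Interval] = []
    for pos in kept:
        if regions and regions[-1][1] == pos:
            regions[-1] = (regions[-1][0], pos + 1)
        else:
            regions.append((pos, pos + 1))
    return regions


if __name__ == "__main__":
    # Hypothetical motif calls on one promoter; tool names are placeholders.
    toy_calls = {
        "tool_a": [(10, 18)],
        "tool_b": [(12, 20)],
        "tool_c": [(11, 17)],
        "tool_d": [(13, 19)],
        "tool_e": [(40, 46)],
        "tool_f": [(41, 47)],
        "tool_g": [(100, 106)],
    }
    print(overlapping_motif_regions(toy_calls, min_tools=4))
```

    Voting per position rather than per motif keeps the sketch independent of how each tool reports motif boundaries; a real analysis would probably also track strand and sequence identity.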

  19. Deciphering cis-regulatory control in inflammatory cells.

    PubMed

    Ghisletti, Serena; Natoli, Gioacchino

    2013-01-01

    In innate immune system cells, such as macrophages and dendritic cells, deployment of inducible gene expression programmes in response to microbes and danger signals requires highly precise regulatory mechanisms. The inflammatory response has to be tailored based on both the triggering stimulus and its dose, and it has to be unfolded in a kinetically complex manner that suits the different phases of the inflammatory process. Genomic characterization of regulatory elements in this context indicated that transcriptional regulators involved in macrophage specification act as pioneer transcription factors (TFs) that generate regions of open chromatin that enable the recruitment of TFs activated in response to external inputs. Therefore, competence for responses to a specific stimulus is programmed at an early stage of differentiation by factors involved in lineage commitment and maintenance of cell identity, which are responsible for the organization of a cell-type-specific cis-regulatory repertoire. The basic functional and organizational principles that regulate inflammatory gene expression in professional cells of the innate immune system provide general paradigms on the interplay between differentiation and environmental responses. PMID:23650641

  20. Detailed map of a cis-regulatory input function

    NASA Astrophysics Data System (ADS)

    Setty, Y.; Mayo, A. E.; Surette, M. G.; Alon, U.

    2003-06-01

    Most genes are regulated by multiple transcription factors that bind specific sites in DNA regulatory regions. These cis-regulatory regions perform a computation: the rate of transcription is a function of the active concentrations of each of the input transcription factors. Here, we used accurate gene expression measurements from living cell cultures, bearing GFP reporters, to map in detail the input function of the classic lacZYA operon of Escherichia coli, as a function of about a hundred combinations of its two inducers, cAMP and isopropyl β-D-thiogalactoside (IPTG). We found an unexpectedly intricate function with four plateau levels and four thresholds. This result compares well with a mathematical model of the binding of the regulatory proteins cAMP receptor protein (CRP) and LacI to the lac regulatory region. The model is also used to demonstrate that with few mutations, the same region could encode much purer AND-like or even OR-like functions. This possibility means that the wild-type region is selected to perform an elaborate computation in setting the transcription rate. The present approach can be generally used to map the input functions of other genes.
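
    To make the terms "AND-like" and "OR-like" input function concrete, the sketch below builds two idealized two-input functions from Hill curves. It is not the CRP/LacI binding model referred to in the abstract: the Hill form, the thresholds, and the exponent are assumptions chosen only for illustration, and the outputs are in arbitrary units between 0 and 1.

```python
def hill(x: float, k: float, n: float = 2.0) -> float:
    """Hill activation: fraction of maximal response at input level x (assumed form)."""
    if x <= 0:
        return 0.0
    return x**n / (k**n + x**n)


def and_like(camp: float, iptg: float, k_camp: float = 1.0, k_iptg: float = 1.0) -> float:
    """Idealized AND-like input function: high only when both inducers are high."""
    return hill(camp, k_camp) * hill(iptg, k_iptg)


def or_like(camp: float, iptg: float, k_camp: float = 1.0, k_iptg: float = 1.0) -> float:
    """Idealized OR-like input function: high when either inducer is high."""
    a, b = hill(camp, k_camp), hill(iptg, k_iptg)
    return a + b - a * b


if __name__ == "__main__":
    # Evaluate both functions on a small grid of hypothetical inducer levels.
    for camp in (0.0, 1.0, 10.0):
        for iptg in (0.0, 1.0, 10.0):
            print(camp, iptg, round(and_like(camp, iptg), 3), round(or_like(camp, iptg), 3))
```

    A measured input function with several plateaus and thresholds, as reported above, would not reduce to either idealized form; the two functions only mark the ends of the spectrum the abstract mentions.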

  1. CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

    PubMed Central

    Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

    2014-01-01

    Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) a scope limited to promoter or conserved regions of the genome; 2) an inability to identify combinations involving more than two motifs; 3) a requirement for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner performs a blind search of CRMs without any prior information about target CRMs or any limitation on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by

  2. Overview Article: Identifying transcriptional cis-regulatory modules in animal genomes

    PubMed Central

    Suryamohan, Kushal; Halfon, Marc S.

    2014-01-01

    Gene expression is regulated through the activity of transcription factors and chromatin modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily-identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods has led to an explosion of both computational and empirical methods for CRM discovery in model and non-model organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against transcription factors or histone post-translational modifications, identification of nucleosome-depleted “open” chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted transcription factor binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. PMID:25704908

  3. Directed network motifs in Alzheimer's disease and mild cognitive impairment.

    PubMed

    Friedman, Eric J; Young, Karl; Tremper, Graham; Liang, Jason; Landsberg, Adam S; Schuff, Norbert

    2015-01-01

    Directed network motifs are the building blocks of complex networks, such as human brain networks, and capture deep connectivity information that is not contained in standard network measures. In this paper we present the first application of directed network motifs in vivo to human brain networks, utilizing recently developed directed progression networks which are built upon rates of cortical thickness changes between brain regions. This is in contrast to previous studies which have relied on simulations and in vitro analysis of non-human brains. We show that frequencies of specific directed network motifs can be used to distinguish between patients with Alzheimer's disease (AD) and normal control (NC) subjects. Especially interesting from a clinical standpoint, these motif frequencies can also distinguish between subjects with mild cognitive impairment who remained stable over three years (MCI) and those who converted to AD (CONV). Furthermore, we find that the entropy of the distribution of directed network motifs increased from MCI to CONV to AD, implying that the distribution of pathology is more structured in MCI but becomes less so as it progresses to CONV and further to AD. Thus, the frequencies and distributional properties of directed network motifs provide new insights into the progression of Alzheimer's disease as well as new imaging markers for distinguishing between normal controls, stable mild cognitive impairment, MCI converters and Alzheimer's disease. PMID:25879535
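
    The entropy comparison mentioned above can be illustrated with a short calculation. The abstract does not state which entropy definition was used, so the sketch assumes Shannon entropy (in bits) over motif counts; the function name and the toy counts are hypothetical.

```python
import math
from typing import Sequence


def motif_entropy(counts: Sequence[int]) -> float:
    """Shannon entropy (bits) of a motif frequency distribution given raw counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive entry")
    # Zero-count motifs contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Toy counts only: a concentrated distribution versus a more even one.
    print(motif_entropy([90, 5, 3, 2]))
    print(motif_entropy([25, 25, 25, 25]))
```

    Under this definition a distribution dominated by a few motif types has low entropy, and a more even spread has higher entropy, which matches the "more structured" versus "less structured" reading in the abstract.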

  4. Putative cis-Regulatory Elements Associated with Heat Shock Genes Activated During Excystation of Cryptosporidium parvum

    PubMed Central

    Lara, Ana M.; Serrano, Myrna; Sheth, Nihar; Buck, Gregory

    2010-01-01

    Background Cryptosporidiosis is a ubiquitous infectious disease, caused by the protozoan parasites Cryptosporidium hominis and C. parvum, leading to acute, persistent and chronic diarrhea worldwide. Although the complications of this disease can be serious, even fatal, in immunocompromised patients of any age, they have also been found to lead to long term effects, including growth inhibition and impaired cognitive development, in infected immunocompetent children. The Cryptosporidium life cycle alternates between a dormant stage, the oocyst, and a highly replicative phase that includes both asexual vegetative stages as well as sexual stages, implying fine genetic regulatory mechanisms. The parasite is extremely difficult to study because it cannot be cultured in vitro and animal models are equally challenging. The recent publication of the genome sequence of C. hominis and C. parvum has, however, significantly advanced our understanding of the biology and pathogenesis of this parasite. Methodology/Principal Findings Herein, our goal was to identify cis-regulatory elements associated with heat shock response in Cryptosporidium using a combination of in silico and real time RT-PCR strategies. Analysis with Gibbs-Sampling algorithms of upstream non-translated regions of twelve genes annotated as heat shock proteins in the Cryptosporidium genome identified a highly conserved over-represented sequence motif in eleven of them. RT-PCR analyses, described herein and also by others, show that these eleven genes bearing the putative element are induced concurrent with excystation of parasite oocysts via heat shock. Conclusions/Significance Our analyses suggest that occurrences of a motif identified in the upstream regions of the Cryptosporidium heat shock genes represent parts of the transcriptional apparatus and function as stress response elements that activate expression of these genes during excystation, and possibly at other stages in the life cycle of the parasite

  5. Direct vs 2-stage approaches to structured motif finding

    PubMed Central

    2012-01-01

    Background The notion of DNA motif is a mathematical abstraction used to model regions of the DNA (known as Transcription Factor Binding Sites, or TFBSs) that are bound by a given Transcription Factor to regulate gene expression or repression. In turn, DNA structured motifs are a mathematical counterpart that models sets of TFBSs that work in concert in the gene regulation processes of higher eukaryotic organisms. Typically, a structured motif is composed of an ordered set of isolated (or simple) motifs, separated by a variable, but somewhat constrained number of “irrelevant” base-pairs. Discovering structured motifs in a set of DNA sequences is a computationally hard problem that has been addressed by a number of authors using either a direct approach, or via the preliminary identification and successive combination of simple motifs. Results We describe a computational tool, named SISMA, for the de-novo discovery of structured motifs in a set of DNA sequences. SISMA is an exact, enumerative algorithm, meaning that it finds all the motifs conforming to the specifications. It does so in two stages: first it discovers all the possible component simple motifs, then combines them in a way that respects the given constraints. We developed SISMA mainly with the aim of understanding the potential benefits of such a 2-stage approach w.r.t. direct methods. In fact, no 2-stage software was available for the general problem of structured motif discovery, but only a few tools that solved restricted versions of the problem. We evaluated SISMA against other published tools on a comprehensive benchmark made of both synthetic and real biological datasets. In a significant number of cases, SISMA outperformed the competitors, exhibiting a good performance also in most of the cases in which it was inferior. Conclusions A reflection on the results obtained leads us to conclude that a 2-stage approach can be implemented with many advantages over direct approaches. Some of these

  6. Characterization of a putative cis-regulatory element that controls transcriptional activity of the pig uroplakin II gene promoter

    SciTech Connect

    Kwon, Deug-Nam; Park, Mi-Ryung; Park, Jong-Yi; Cho, Ssang-Goo; Park, Chankyu; Oh, Jae-Wook; Song, Hyuk; Kim, Jae-Hwan; Kim, Jin-Hoi

    2011-07-01

    Highlights: • The sequences of -604 to -84 bp of the pUPII promoter contained the region of a putative negative cis-regulatory element. • The core promoter was located in the 5F-1. • Transcription factor HNF4 can directly bind in the pUPII core promoter region, which plays a critical role in controlling promoter activity. • These features of the pUPII promoter are fundamental to development of a target-specific vector. Abstract: Uroplakin II (UPII) is one of the integral membrane proteins synthesized as a major differentiation product of mammalian urothelium. UPII gene expression is bladder specific and differentiation dependent, but little is known about its transcription response elements and molecular mechanism. To identify the cis-regulatory elements in the pig UPII (pUPII) gene promoter region, we constructed pUPII 5' upstream region deletion mutants and demonstrated that each of the deletion mutants participates in controlling the expression of the pUPII gene in human bladder carcinoma RT4 cells. We also identified a new core promoter region and putative negative cis-regulatory element within a minimal promoter region. In addition, we showed that hepatocyte nuclear factor 4 (HNF4) can directly bind in the pUPII core promoter (5F-1) region, which plays a critical role in controlling promoter activity. Transient cotransfection experiments showed that HNF4 positively regulates pUPII gene promoter activity. Thus, the binding element and its binding protein, HNF4 transcription factor, may be involved in the mechanism that specifically regulates pUPII gene transcription.

  7. Characterization and identification of cis-regulatory elements in Arabidopsis based on single-nucleotide polymorphism information.

    PubMed

    Korkuc, Paula; Schippers, Jos H M; Walther, Dirk

    2014-01-01

    Identifying regulatory elements and revealing their role in gene expression regulation remains a central goal of plant genome research. We exploited the detailed genomic sequencing information of a large number of Arabidopsis (Arabidopsis thaliana) accessions to characterize known and to identify novel cis-regulatory elements in gene promoter regions of Arabidopsis by relying on conservation as the hallmark signal of functional relevance. Based on the genomic layout and the obtained density profiles of single-nucleotide polymorphisms (SNPs) in sequence regions upstream of transcription start sites, the average length of promoter regions in Arabidopsis could be established at 500 bp. Genes associated with high degrees of variability of their respective upstream regions are preferentially involved in environmental response and signaling processes, while low levels of promoter SNP density are common among housekeeping genes. Known cis-elements were found to exhibit a lower SNP density than sequence regions not associated with known motifs. For 15 known cis-element motifs, strong positional preferences relative to the transcription start site were detected based on their promoter SNP density profiles. Five novel candidate cis-element motifs were identified as consensus motifs of 17 sequence hexamers exhibiting increased sequence conservation combined with evidence of positional preferences, annotation information, and functional relevance for inducing correlated gene expression. Our study demonstrates that the currently available resolution of SNP data offers novel ways for the identification of functional genomic elements and the characterization of gene promoter sequences. PMID:24204023

  8. Identification and Functional Characterization of Cis-Regulatory Elements Controlling Expression of the Porcine ADRB2 Gene

    PubMed Central

    Jaeger, Alexandra; Fritschka, Stephan; Ponsuksili, Siriluck; Wimmers, Klaus; Muráni, Eduard

    2015-01-01

    The beta-2 adrenergic receptor (beta-2 AR) modulates metabolic processes in skeletal muscle, liver, and adipose tissue in response to catecholamine stimulation. We showed previously that expression of the porcine beta-2 AR gene (ADRB2) is affected by cis-regulatory polymorphisms. These are most likely responsible for the association of ADRB2 with economically relevant muscle-related traits in pigs. The present study focused on characterization of promoter elements involved in basal transcriptional regulation of the porcine ADRB2 in different cell types to aid identification of its cis-regulatory polymorphisms. Based on in silico analysis, luciferase reporter gene assays and gel shift assays were performed using COS-7, HepG2, C2C12, and 3T3-L1 cells. Deletion mapping of the 5´ flanking region (-1324 to +33) of ADRB2 revealed the region between -307 and -269 to be the minimal promoter, including regulatory elements essential for the basal transcriptional activity in all four tested cell types. Directly upstream (-400 to -323) we identified an important enhancer element required for maximal promoter activity. In silico analysis and gel shift assays revealed that this GC-rich element harbors two evolutionarily conserved binding sites of Sp1, a constitutive transcriptional activator. Significant transcriptional activation of the porcine ADRB2 promoter was demonstrated by overexpression of Sp1. Our results demonstrate, for the first time, an important role of Sp1 and of the responsive enhancer element in the regulation of ADRB2 expression. Polymorphisms located in this domain of the porcine ADRB2 promoter represent candidate causal cis-regulatory variants. PMID:26221068

  9. BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements

    PubMed Central

    De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan

    2015-01-01

    Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
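
    The idea of treating motifs as words over the IUPAC alphabet and checking them against orthologous promoters can be sketched as follows. This is not BLSSpeller's algorithm: it only shows degenerate word matching, and it reports which species contain a match instead of computing the branch length score, which would require a phylogenetic tree. All function names, species labels, and sequences are hypothetical.

```python
from typing import Dict, List

# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(word: str, site: str) -> bool:
    """True if the concrete DNA `site` is an instance of the IUPAC `word`."""
    return len(word) == len(site) and all(b in IUPAC[w] for w, b in zip(word, site))


def find_occurrences(word: str, sequence: str) -> List[int]:
    """0-based start positions where `word` matches `sequence` (forward strand only)."""
    k = len(word)
    return [i for i in range(len(sequence) - k + 1) if matches(word, sequence[i:i + k])]


def species_with_word(word: str, promoters: Dict[str, str]) -> List[str]:
    """Species whose promoter contains at least one instance of `word`."""
    return [sp for sp, seq in promoters.items() if find_occurrences(word, seq)]


if __name__ == "__main__":
    # Toy orthologous promoters (uppercase A/C/G/T assumed); no alignment is needed.
    toy = {"species_1": "ATGACG", "species_2": "CCCCCC", "species_3": "TTGATT"}
    print(find_occurrences("TGAY", "ATGACGTGATTT"))
    print(species_with_word("TGAY", toy))
```

    Because matching is done per sequence, a site that has shifted position between species is still found, which is the practical appeal of the alignment-free mode described above.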

  10. The search for cis-regulatory driver mutations in cancer genomes.

    PubMed

    Poulos, Rebecca C; Sloane, Mathew A; Hesson, Luke B; Wong, Jason W H

    2015-10-20

    With the advent of high-throughput and relatively inexpensive whole-genome sequencing technology, the focus of cancer research has begun to shift toward analyses of somatic mutations in non-coding cis-regulatory elements of the cancer genome. Cis-regulatory elements play an important role in gene regulation, with mutations in these elements potentially resulting in changes to the expression of linked genes. The recent discoveries of recurrent TERT promoter mutations in melanoma, and recurrent mutations that create a super-enhancer regulating TAL1 expression in T-cell acute lymphoblastic leukaemia (T-ALL), have sparked significant interest in the search for other somatic cis-regulatory mutations driving cancer development. In this review, we look more closely at the TERT promoter and TAL1 enhancer alterations and use these examples to ask whether other cis-regulatory mutations may play a role in cancer susceptibility. In doing so, we make observations from the data emerging from recent research in this field, and describe the experimental and analytical approaches which could be adopted in the hope of better uncovering the true functional significance of somatic cis-regulatory mutations in cancer. PMID:26356674

  11. Study of Cis-regulatory Elements in the Ascidian Ciona intestinalis.

    PubMed

    Irvine, Steven Q

    2013-03-01

    The ascidian (sea squirt) C. intestinalis has become an important model organism for the study of cis-regulation. This is largely due to the technology that has been developed for assessing cis-regulatory activity through the use of transient reporter transgenes introduced into fertilized eggs. This technique allows the rapid and inexpensive testing of endogenous or altered DNA for regulatory activity in vivo. This review examines evidence that C. intestinalis cis-regulatory elements are located more closely to coding regions than in other model organisms. I go on to compare the organization of cis-regulatory elements and conserved non-coding sequences in Ciona, mammals, and other deuterostomes for three representative C. intestinalis genes, Pax6, FoxAa, and the DlxA-B cluster, along with homologs in the other species. These comparisons point out some of the similarities and differences between cis-regulatory elements and their study in the various model organisms. Finally, I provide illustrations of how C. intestinalis lends itself to detailed study of the structure of cis-regulatory elements, which have led, and promise to continue to lead, to important insights into the fundamentals of transcriptional regulation. PMID:23997651

  12. Complex interactions between cis-regulatory modules in native conformation are critical for Drosophila snail expression.

    PubMed

    Dunipace, Leslie; Ozdemir, Anil; Stathopoulos, Angelike

    2011-09-01

    It has been shown in several organisms that multiple cis-regulatory modules (CRMs) of a gene locus can be active concurrently to support similar spatiotemporal expression. To understand the functional importance of such seemingly redundant CRMs, we examined two CRMs from the Drosophila snail gene locus, which are both active in the ventral region of pre-gastrulation embryos. By performing a deletion series in a ∼25 kb DNA rescue construct using BAC recombineering and site-directed transgenesis, we demonstrate that the two CRMs are not redundant. The distal CRM is absolutely required for viability, whereas the proximal CRM is required only under extreme conditions such as high temperature. Consistent with their distinct requirements, the CRMs support distinct expression patterns: the proximal CRM exhibits an expanded expression domain relative to endogenous snail, whereas the distal CRM exhibits almost complete overlap with snail except at the anterior-most pole. We further show that the distal CRM normally limits the increased expression domain of the proximal CRM and that the proximal CRM serves as a 'damper' for the expression levels driven by the distal CRM. Thus, the two CRMs interact in cis in a non-additive fashion and these interactions may be important for fine-tuning the domains and levels of gene expression. PMID:21813571

  13. Analysis of opo cis-regulatory landscape uncovers Vsx2 requirement in early eye morphogenesis.

    PubMed

    Gago-Rodrigues, Ines; Fernández-Miñán, Ana; Letelier, Joaquin; Naranjo, Silvia; Tena, Juan J; Gómez-Skarmeta, José L; Martinez-Morales, Juan R

    2015-01-01

    The self-organized morphogenesis of the vertebrate optic cup entails coupling the activation of the retinal gene regulatory network to the constriction-driven infolding of the retinal epithelium. Yet the genetic mechanisms underlying this coordination remain largely unexplored. Through phylogenetic footprinting and transgenesis in zebrafish, here we examine the cis-regulatory landscape of opo, an endocytosis regulator essential for eye morphogenesis. Among the different conserved enhancers identified, we isolate a single retina-specific element (H6_10137) and show that its activity depends on binding sites for the retinal determinant Vsx2. Gain- and loss-of-function experiments and ChIP analyses reveal that Vsx2 regulates opo expression through direct binding to this retinal enhancer. Furthermore, we show that vsx2 knockdown impairs the primary optic cup folding. These data support a model by which vsx2, operating through the effector gene opo, acts as a central transcriptional node that coordinates neural retina patterning and optic cup invagination in zebrafish. PMID:25963169

  14. Distinct Functional Constraints Partition Sequence Conservation in a cis-Regulatory Element

    PubMed Central

    Ruvinsky, Ilya

    2011-01-01

    Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. PMID:21655084

  15. Developmental cis-regulatory analysis of the cyclin D gene in the sea urchin Strongylocentrotus purpuratus

    PubMed Central

    McCarty, Christopher M.

    2013-01-01

    Cyclin D genes regulate the cell cycle, growth and differentiation in response to intercellular signaling. While the promoters of vertebrate cyclin D genes have been analyzed, the cis-regulatory sequences across an entire cyclin D locus have not. Doing so would increase understanding of how cyclin D genes respond to the regulatory states established by developmental gene regulatory networks, linking cell cycle and growth control to the ontogenetic program. Therefore, we conducted a cis-regulatory analysis on the cyclin D gene, SpcycD, of the sea urchin, Strongylocentrotus purpuratus, during embryogenesis, identifying upstream and intronic sequences, located within six defined regions bearing one or more cis-regulatory modules each. PMID:24090975

  16. cis-Regulatory Mutations Are a Genetic Cause of Human Limb Malformations

    PubMed Central

    VanderMeer, Julia E.; Ahituv, Nadav

    2011-01-01

    The underlying mutations that cause human limb malformations are often difficult to determine, particularly for limb malformations that occur as isolated traits. Evidence from a variety of studies shows that cis-regulatory mutations, specifically in enhancers, can lead to some of these isolated limb malformations. Here, we provide a review of human limb malformations that have been shown to be caused by enhancer mutations and propose that cis-regulatory mutations will continue to be identified as the cause of additional human malformations as our understanding of regulatory sequences improves. PMID:21509892

  17. Motif-Role-Fingerprints: The Building-Blocks of Motifs, Clustering-Coefficients and Transitivities in Directed Networks

    PubMed Central

    McDonnell, Mark D.; Yaveroğlu, Ömer Nebil; Schmerl, Brett A.; Iannella, Nicolangelo; Ward, Lawrence M.

    2014-01-01

    Complex networks are frequently characterized by metrics for which particular subgraphs are counted. One statistic from this category, which we refer to as motif-role fingerprints, differs from global subgraph counts in that the number of subgraphs in which each node participates is counted. As with global subgraph counts, it can be important to distinguish between motif-role fingerprints that are ‘structural’ (induced subgraphs) and ‘functional’ (partial subgraphs). Here we show mathematically that a vector of all functional motif-role fingerprints can readily be obtained from an arbitrary directed adjacency matrix, and then converted to structural motif-role fingerprints by multiplying that vector by a specific invertible conversion matrix. This result demonstrates that a unique structural motif-role fingerprint exists for any given functional motif-role fingerprint. We demonstrate a similar result for the cases of functional and structural motif-fingerprints without node roles, and global subgraph counts that form the basis of standard motif analysis. We also explicitly highlight that motif-role fingerprints are elemental to several popular metrics for quantifying the subgraph structure of directed complex networks, including motif distributions, directed clustering coefficient, and transitivity. The relationships between each of these metrics and motif-role fingerprints also suggest new subtypes of directed clustering coefficients and transitivities. Our results have potential utility in analyzing directed synaptic networks constructed from neuronal connectome data, such as in terms of centrality. Other potential applications include anomaly detection in networks, identification of similar networks and identification of similar nodes within networks. Matlab code for calculating all stated metrics following calculation of functional motif-role fingerprints is provided as S1 Matlab File. PMID:25486535
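
    As a rough illustration of per-node subgraph counts computed directly from a directed adjacency matrix, the sketch below derives two simple "functional" (partial-subgraph) counts for each node. It is not the paper's full set of motif-role fingerprints, nor its conversion matrix, and it is unrelated to the Matlab file mentioned above; the function names and the convention that adj[i][j] = 1 means an edge from i to j are assumptions. The matrix is assumed to be 0/1 with no self-loops.

```python
from typing import List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of two square integer matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def three_cycle_counts(adj: Matrix) -> List[int]:
    """Directed 3-cycles through each node, read off the diagonal of A^3."""
    a3 = matmul(matmul(adj, adj), adj)
    return [a3[i][i] for i in range(len(adj))]


def two_path_middle_counts(adj: Matrix) -> List[int]:
    """Ordered pairs (j, k), j != k, with edges j -> i -> k, counted for each middle node i."""
    n = len(adj)
    counts = []
    for i in range(n):
        in_deg = sum(adj[j][i] for j in range(n))
        out_deg = sum(adj[i][k] for k in range(n))
        # Remove the j == k cases, which arise only from reciprocal edges.
        reciprocal = sum(adj[j][i] * adj[i][j] for j in range(n))
        counts.append(in_deg * out_deg - reciprocal)
    return counts


if __name__ == "__main__":
    # Toy network: a directed triangle 0 -> 1 -> 2 -> 0 plus an extra edge 3 -> 0.
    toy = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    print(three_cycle_counts(toy))
    print(two_path_middle_counts(toy))
```

    These counts ignore any additional edges among the nodes involved, which is what makes them functional counts in the abstract's sense; structural (induced) counts would need the extra conversion step the authors describe.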

  18. Cis-regulatory mechanisms governing stem and progenitor cell transitions

    PubMed Central

    Johnson, Kirby D.; Kong, Guangyao; Gao, Xin; Chang, Yuan-I; Hewitt, Kyle J.; Sanalkumar, Rajendran; Prathibha, Rajalekshmi; Ranheim, Erik A.; Dewey, Colin N.; Zhang, Jing; Bresnick, Emery H.

    2015-01-01

    Cis-element encyclopedias provide information on phenotypic diversity and disease mechanisms. Although cis-element polymorphisms and mutations are instructive, deciphering function remains challenging. Mutation of an intronic GATA motif (+9.5) in GATA2, encoding a master regulator of hematopoiesis, underlies an immunodeficiency associated with myelodysplastic syndrome (MDS) and acute myeloid leukemia (AML). Whereas an inversion relocalizes another GATA2 cis-element (−77) to the proto-oncogene EVI1, inducing EVI1 expression and AML, whether this reflects ectopic or physiological activity is unknown. We describe a mouse strain that decouples −77 function from proto-oncogene deregulation. The −77−/− mice exhibited a novel phenotypic constellation including late embryonic lethality and anemia. The −77 established a vital sector of the myeloid progenitor transcriptome, conferring multipotentiality. Unlike the +9.5−/− embryos, hematopoietic stem cell genesis was unaffected in −77−/− embryos. These results illustrate a paradigm in which cis-elements in a locus differentially control stem and progenitor cell transitions, and therefore the individual cis-element alterations cause unique and overlapping disease phenotypes. PMID:26601269

  19. Evolution of Cis-Regulatory Elements and Regulatory Networks in Duplicated Genes of Arabidopsis

    PubMed Central

    Guo, Xu Qiu; Adams, Keith L.

    2015-01-01

    Plant genomes contain large numbers of duplicated genes that contribute to the evolution of new functions. Following duplication, genes can exhibit divergence in their coding sequence and their expression patterns. Changes in the cis-regulatory element landscape can result in changes in gene expression patterns. High-throughput methods developed recently can identify potential cis-regulatory elements on a genome-wide scale. Here, we use a recent comprehensive data set of DNase I sequencing-identified cis-regulatory binding sites (footprints) at single-base-pair resolution to compare binding sites and network connectivity in duplicated gene pairs in Arabidopsis (Arabidopsis thaliana). We found that duplicated gene pairs vary greatly in their cis-regulatory element architecture, resulting in changes in regulatory network connectivity. Whole-genome duplicates (WGDs) have approximately twice as many footprints in their promoters left by potential regulatory proteins as do tandem duplicates (TDs). The WGDs have a greater average number of footprint differences between paralogs than TDs. The footprints, in turn, result in more regulatory network connections between WGDs and other genes, forming denser, more complex regulatory networks than shown by TDs. When comparing regulatory connections between duplicates, WGDs had more pairs in which the two genes are either partially or fully diverged in their network connections, but fewer genes with no network connections than the TDs. There is evidence of younger TDs and WGDs having fewer unique connections compared with older duplicates. This study provides insights into cis-regulatory element evolution and network divergence in duplicated genes. PMID:26474639

  20. Functional Evolution of cis-Regulatory Modules at a Homeotic Gene in Drosophila

    PubMed Central

    Schiller, Benjamin J.; Bae, Esther; Tran, Diana A.; Shur, Andrey S.; Allen, John M.; Rau, Christoph; Bender, Welcome; Fisher, William W.; Celniker, Susan E.; Drewell, Robert A.

    2009-01-01

    It is a long-held belief in evolutionary biology that the rate of molecular evolution for a given DNA sequence is inversely related to the level of functional constraint. This belief holds true for the protein-coding homeotic (Hox) genes originally discovered in Drosophila melanogaster. Expression of the Hox genes in Drosophila embryos is essential for body patterning and is controlled by an extensive array of cis-regulatory modules (CRMs). How the regulatory modules functionally evolve in different species is not clear. A comparison of the CRMs for the Abdominal-B gene from different Drosophila species reveals relatively low levels of overall sequence conservation. However, embryonic enhancer CRMs from other Drosophila species direct transgenic reporter gene expression in the same spatial and temporal patterns during development as their D. melanogaster orthologs. Bioinformatic analysis reveals the presence of short conserved sequences within defined CRMs, representing gap and pair-rule transcription factor binding sites. One predicted binding site for the gap transcription factor KRUPPEL in the IAB5 CRM was found to be altered in Superabdominal (Sab) mutations. In Sab mutant flies, the third abdominal segment is transformed into a copy of the fifth abdominal segment. A model for KRUPPEL-mediated repression at this binding site is presented. These findings challenge our current understanding of the relationship between sequence evolution at the molecular level and functional activity of a CRM. While the overall sequence conservation at Drosophila CRMs is not distinctive from neighboring genomic regions, functionally critical transcription factor binding sites within embryonic enhancer CRMs are highly conserved. These results have implications for understanding mechanisms of gene expression during embryonic development, enhancer function, and the molecular evolution of eukaryotic regulatory modules. PMID:19893611

  1. Recurrent Modification of a Conserved Cis-Regulatory Element Underlies Fruit Fly Pigmentation Diversity

    PubMed Central

    Rogers, William A.; Salomone, Joseph R.; Tacy, David J.; Camino, Eric M.; Davis, Kristen A.; Rebeiz, Mark; Williams, Thomas M.

    2013-01-01

    The development of morphological traits occurs through the collective action of networks of genes connected at the level of gene expression. As any node in a network may be a target of evolutionary change, the recurrent targeting of the same node would indicate that the path of evolution is biased for the relevant trait and network. Although examples of parallel evolution have implicated recurrent modification of the same gene and cis-regulatory element (CRE), little is known about the mutational and molecular paths of parallel CRE evolution. In Drosophila melanogaster fruit flies, the Bric-à-brac (Bab) transcription factors control the development of a suite of sexually dimorphic traits on the posterior abdomen. Female-specific Bab expression is regulated by the dimorphic element, a CRE that possesses direct inputs from body plan (ABD-B) and sex-determination (DSX) transcription factors. Here, we find that the recurrent evolutionary modification of this CRE underlies both intraspecific and interspecific variation in female pigmentation in the melanogaster species group. By reconstructing the sequence and regulatory activity of the ancestral Drosophila melanogaster dimorphic element, we demonstrate that a handful of mutations were sufficient to create independent CRE alleles with differing activities. Moreover, intraspecific and interspecific dimorphic element evolution proceeded with little to no alterations to the known body plan and sex-determination regulatory linkages. Collectively, our findings represent an example where the paths of evolution appear biased to a specific CRE, and drastic changes in function were accompanied by deep conservation of key regulatory linkages. PMID:24009528

  2. Cis-Regulatory Changes Associated with a Recent Mating System Shift and Floral Adaptation in Capsella.

    PubMed

    Steige, Kim A; Reimegård, Johan; Koenig, Daniel; Scofield, Douglas G; Slotte, Tanja

    2015-10-01

    The selfing syndrome constitutes a suite of floral and reproductive trait changes that have evolved repeatedly across many evolutionary lineages in response to the shift to selfing. Convergent evolution of the selfing syndrome suggests that these changes are adaptive, yet our understanding of the detailed molecular genetic basis of the selfing syndrome remains limited. Here, we investigate the role of cis-regulatory changes during the recent evolution of the selfing syndrome in Capsella rubella, which split from the outcrosser Capsella grandiflora less than 200 ka. We assess allele-specific expression (ASE) in leaves and flower buds at a total of 18,452 genes in three interspecific F1 C. grandiflora x C. rubella hybrids. Using a hierarchical Bayesian approach that accounts for technical variation using genomic reads, we find evidence for extensive cis-regulatory changes. On average, 44% of the assayed genes show evidence of ASE; however, only 6% show strong allelic expression biases. Flower buds, but not leaves, show an enrichment of cis-regulatory changes in genomic regions responsible for floral and reproductive trait divergence between C. rubella and C. grandiflora. We further detected an excess of heterozygous transposable element (TE) insertions near genes with ASE, and TE insertions targeted by uniquely mapping 24-nt small RNAs were associated with reduced expression of nearby genes. Our results suggest that cis-regulatory changes have been important during the recent adaptive floral evolution in Capsella and that differences in TE dynamics between selfing and outcrossing species could be important for rapid regulatory divergence in association with mating system shifts. PMID:26318184

  3. Evolved tooth gain in sticklebacks is associated with a cis-regulatory allele of Bmp6

    PubMed Central

    Cleves, Phillip A.; Ellis, Nicholas A.; Jimenez, Monica T.; Nunez, Stephanie M.; Schluter, Dolph; Kingsley, David M.; Miller, Craig T.

    2014-01-01

    Developmental genetic studies of evolved differences in morphology have led to the hypothesis that cis-regulatory changes often underlie morphological evolution. However, because most of these studies focus on evolved loss of traits, the genetic architecture and possible association with cis-regulatory changes of gain traits are less understood. Here we show that a derived benthic freshwater stickleback population has evolved an approximate twofold gain in ventral pharyngeal tooth number compared with their ancestral marine counterparts. Comparing laboratory-reared developmental time courses of a low-toothed marine population and this high-toothed benthic population reveals that increases in tooth number and tooth plate area and decreases in tooth spacing arise at late juvenile stages. Genome-wide linkage mapping identifies largely separate sets of quantitative trait loci affecting different aspects of dental patterning. One large-effect quantitative trait locus controlling tooth number fine-maps to a genomic region containing an excellent candidate gene, Bone morphogenetic protein 6 (Bmp6). Stickleback Bmp6 is expressed in developing teeth, and no coding changes are found between the high- and low-toothed populations. However, quantitative allele-specific expression assays of Bmp6 in developing teeth in F1 hybrids show that cis-regulatory changes have elevated the relative expression level of the freshwater benthic Bmp6 allele at late, but not early, stages of stickleback development. Collectively, our data support a model where a late-acting cis-regulatory up-regulation of Bmp6 expression underlies a significant increase in tooth number in derived benthic sticklebacks. PMID:25205810

  4. The identification of cis-regulatory elements: A review from a machine learning perspective.

    PubMed

    Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W

    2015-12-01

    The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. PMID:26499213

  5. Dynamic SPR monitoring of yeast nuclear protein binding to a cis-regulatory element

    SciTech Connect

    Mao, Grace; Brody, James P.

    2007-11-09

    Gene expression is controlled by protein complexes binding to short specific sequences of DNA, called cis-regulatory elements. Expression of most eukaryotic genes is controlled by dozens of these elements. Comprehensive identification and monitoring of these elements is a major goal of genomics. In pursuit of this goal, we are developing a surface plasmon resonance (SPR) based assay to identify and monitor cis-regulatory elements. To test whether we could reliably monitor protein binding to a regulatory element, we immobilized a 16 bp region of Saccharomyces cerevisiae chromosome 5 onto a gold surface. This 16 bp region of DNA is known to bind several proteins and thought to control expression of the gene RNR1, which varies through the cell cycle. We synchronized yeast cell cultures, and then sampled these cultures at a regular interval. These samples were processed to purify nuclear lysate, which was then exposed to the sensor. We found that nuclear protein binds this particular element of DNA at a significantly higher rate (as compared to unsynchronized cells) during G1 phase. Other time points show levels of DNA-nuclear protein binding similar to the unsynchronized control. We also measured the apparent association complex of the binding to be 0.014 s⁻¹. We conclude that (1) SPR-based assays can monitor DNA-nuclear protein binding and that (2) for this particular cis-regulatory element, maximum DNA-nuclear protein binding occurs during G1 phase.

  6. Predominant contribution of cis-regulatory divergence in the evolution of mouse alternative splicing

    PubMed Central

    Gao, Qingsong; Sun, Wei; Ballegeer, Marlies; Libert, Claude; Chen, Wei

    2015-01-01

    Divergence of alternative splicing represents one of the major driving forces to shape phenotypic diversity during evolution. However, the extent to which these divergences could be explained by the evolving cis-regulatory versus trans-acting factors remains unresolved. To globally investigate the relative contributions of the two factors for the first time in mammals, we measured splicing difference between C57BL/6J and SPRET/EiJ mouse strains and allele-specific splicing pattern in their F1 hybrid. Out of 11,818 alternative splicing events expressed in the cultured fibroblast cells, we identified 796 with significant difference between the parental strains. After integrating allele-specific data from F1 hybrid, we demonstrated that these events could be predominately attributed to cis-regulatory variants, including those residing at and beyond canonical splicing sites. Contrary to previous observations in Drosophila, such predominant contribution was consistently observed across different types of alternative splicing. Further analysis of liver tissues from the same mouse strains and reanalysis of published datasets on other strains showed similar trends, implying in general the predominant contribution of cis-regulatory changes in the evolution of mouse alternative splicing. PMID:26134616
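
    The reasoning behind the F1 hybrid design can be sketched as a simple decomposition. This is a common textbook formulation, not necessarily the authors' statistical procedure: because both alleles in a hybrid share the same trans-acting environment, the allelic difference in the hybrid is attributed to cis-regulatory variants, and whatever remains of the parental difference is attributed to trans. The function name, the use of percent-spliced-in (PSI) values, and the numbers are hypothetical.

```python
def decompose_splicing_divergence(psi_parent_a, psi_parent_b,
                                  psi_f1_allele_a, psi_f1_allele_b):
    """Split a parental splicing difference into cis and trans components.

    All inputs are exon inclusion levels (PSI, between 0 and 1).
    Returns (total, cis, trans) with total == cis + trans.
    """
    total = psi_parent_a - psi_parent_b        # divergence between the strains
    cis = psi_f1_allele_a - psi_f1_allele_b    # persists in a shared trans background
    trans = total - cis                        # remainder not explained by cis
    return total, cis, trans


# Hypothetical event: most of the parental difference survives in the hybrid,
# so it would be classified as predominantly cis.
total, cis, trans = decompose_splicing_divergence(0.80, 0.40, 0.75, 0.45)
print(round(total, 2), round(cis, 2), round(trans, 2))
```

    A real analysis would add replicate-aware significance testing before assigning an event to either class.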

  7. Role of conserved cis-regulatory elements in the post-transcriptional regulation of the human MECP2 gene involved in autism

    PubMed Central

    2013-01-01

    Background The MECP2 gene codes for methyl CpG binding protein 2 which regulates activities of other genes in the early development of the brain. Mutations in this gene have been associated with Rett syndrome, a form of autism. The purpose of this study was to investigate the role of evolutionarily conserved cis-elements in regulating the post-transcriptional expression of the MECP2 gene and to explore their possible correlations with a mutation that is known to cause mental retardation. Results A bioinformatics approach was used to map evolutionarily conserved cis-regulatory elements in the transcribed regions of the human MECP2 gene and its mammalian orthologs. Cis-regulatory motifs including G-quadruplexes, microRNA target sites, and AU-rich elements have gained significant importance because of their role in key biological processes and as therapeutic targets. We discovered in the 5′-UTR (untranslated region) of MECP2 mRNA a highly conserved G-quadruplex which overlapped a known deletion in Rett syndrome patients with decreased levels of MeCP2 protein. We believe that this 5′-UTR G-quadruplex could be involved in regulating MECP2 translation. We mapped additional evolutionarily conserved G-quadruplexes, microRNA target sites, and AU-rich elements in the key sections of both untranslated regions. Our studies suggest the regulation of translation, mRNA turnover, and development-related alternative MECP2 polyadenylation, putatively involving interactions of conserved cis-regulatory elements with their respective trans factors and complex interactions among the trans factors themselves. We discovered highly conserved G-quadruplex motifs that were more prevalent near alternative splice sites as compared to the constitutive sites of the MECP2 gene. We also identified a pair of overlapping G-quadruplexes at an alternative 5′ splice site that could potentially regulate alternative splicing in a negative as well as a positive way in the MECP2 pre

  8. Toward a Genome-Wide Reconstruction of Cis-Regulatory Networks in the Human Genome

    PubMed Central

    Cecchini, Katharine R.; Banerjee, A. Raja; Kim, Tae Hoon

    2009-01-01

    The vast amount of recent progress made on the sequence of the human genome has allowed an unprecedented examination of cis-regulatory networks. These networks consist of functional elements such as promoters, enhancers, silencers, and insulators, and their coordinated activity is responsible for regulation of gene expression. Recent studies surveyed the entire genome, identifying novel elements and evaluating functional differences in respect to development. These investigations present the first steps towards a global regulatory map for expression in the human genome. PMID:19560550

  9. Rapid evolution of cis-regulatory sequences via local point mutations

    NASA Technical Reports Server (NTRS)

    Stone, J. R.; Wray, G. A.

    2001-01-01

    Although the evolution of protein-coding sequences within genomes is well understood, the same cannot be said of the cis-regulatory regions that control transcription. Yet, changes in gene expression are likely to constitute an important component of phenotypic evolution. We simulated the evolution of new transcription factor binding sites via local point mutations. The results indicate that new binding sites appear and become fixed within populations on microevolutionary timescales under an assumption of neutral evolution. Even combinations of two new binding sites evolve very quickly. We predict that local point mutations continually generate considerable genetic variation that is capable of altering gene expression.
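
    A toy version of such a simulation is sketched below. It is not the authors' model: it follows a single sequence lineage rather than a population, so it shows only how often a given motif arises by point mutation, not how it becomes fixed. The sequence length, motif, and mutation rate in the usage lines are hypothetical.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Replace each base independently with probability `rate` by a different base."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def generations_until_site(length, motif, rate, rng, max_generations=100000):
    """Return the first generation at which `motif` occurs, or None if it never does.

    Starts from a random sequence and applies neutral point mutations each generation.
    """
    seq = "".join(rng.choice(BASES) for _ in range(length))
    for generation in range(max_generations + 1):
        if motif in seq:
            return generation
        seq = mutate(seq, rate, rng)
    return None


# Hypothetical parameters: a 200 bp regulatory region and a 6 bp motif.
rng = random.Random(0)
print(generations_until_site(200, "TGACTC", 1e-3, rng, max_generations=20000))
```

    Extending the sketch to a population with drift would be needed to address the fixation times discussed in the abstract.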

  10. Variation in vertebrate cis-regulatory elements in evolution and disease.

    PubMed

    Douglas, Adam Thomas; Hill, Robert E

    2014-01-01

    Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements, and more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease. PMID:25764334

  11. Variation in Vertebrate Cis-Regulatory Elements in Evolution and Disease.

    PubMed

    Douglas, Adam T; Hill, Robert E

    2014-05-01

    Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements, and more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease. PMID:24802895

  12. Variation in Vertebrate Cis-Regulatory Elements in Evolution and Disease

    PubMed Central

    Douglas, Adam Thomas; Hill, Robert E

    2014-01-01

    Much of the genetic information that drives animal diversity lies within the vast non-coding regions of the genome. Multi-species sequence conservation in non-coding regions of the genome flags important regulatory elements, and more recently, techniques that look for functional signatures predicted for regulatory sequences have added to the identification of thousands more. For some time, biologists have argued that changes in cis-regulatory sequences create the basic genetic framework for evolutionary change. Recent advances support this notion and show that there is extensive genomic variability in non-coding regulatory elements associated with trait variation, speciation and disease. PMID:25764334

  13. Multiple Dileucine-like Motifs Direct VGLUT1 Trafficking

    PubMed Central

    Foss, Sarah M.; Li, Haiyan; Santos, Magda S.; Edwards, Robert H.

    2013-01-01

    The vesicular glutamate transporters (VGLUTs) package glutamate into synaptic vesicles, and the two principal isoforms VGLUT1 and VGLUT2 have been suggested to influence the properties of release. To understand how a VGLUT isoform might influence transmitter release, we have studied their trafficking and previously identified a dileucine-like endocytic motif in the C terminus of VGLUT1. Disruption of this motif impairs the activity-dependent recycling of VGLUT1, but does not eliminate its endocytosis. We now report the identification of two additional dileucine-like motifs in the N terminus of VGLUT1 that are not well conserved in the other isoforms. In the absence of all three motifs, rat VGLUT1 shows limited accumulation at synaptic sites and no longer responds to stimulation. In addition, shRNA-mediated knockdown of clathrin adaptor proteins AP-1 and AP-2 shows that the C-terminal motif acts largely via AP-2, whereas the N-terminal motifs use AP-1. Without the C-terminal motif, knockdown of AP-1 reduces the proportion of VGLUT1 that responds to stimulation. VGLUT1 thus contains multiple sorting signals that engage distinct trafficking mechanisms. In contrast to VGLUT1, the trafficking of VGLUT2 depends almost entirely on the conserved C-terminal dileucine-like motif: without this motif, a substantial fraction of VGLUT2 redistributes to the plasma membrane and the transporter's synaptic localization is disrupted. Consistent with these differences in trafficking signals, wild-type VGLUT1 and VGLUT2 differ in their response to stimulation. PMID:23804088

  14. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. PMID:19895802

  15. Distance and Helical Phase Dependence of Synergistic Transcription Activation in cis-Regulatory Module

    PubMed Central

    Huang, Qilai; Gong, Chenguang; Li, Jiahuang; Zhuo, Zhu; Chen, Yuan; Wang, Jin; Hua, Zi-Chun

    2012-01-01

    Deciphering of the spatial and stereospecific constraints on synergistic transcription activation mediated between activators bound to cis-regulatory elements is important for understanding gene regulation and remains largely unknown. It has been commonly believed that two activators will activate transcription most effectively when they are bound on the same face of DNA double helix and within a boundary distance from the transcription initiation complex attached to the TATA box. In this work, we studied the spatial and stereospecific constraints on activation by multiple copies of bound model activators using a series of engineered relative distances and stereospecific orientations. We observed that multiple copies of the activators GAL4-VP16 and ZEBRA bound to engineered promoters activated transcription more effectively when bound on opposite faces of the DNA double helix. This phenomenon was not affected by the spatial relationship between the proximal activator and initiation complex. To explain these results, we proposed the novel concentration field model, which posits the effective concentration of bound activators, and therefore the transcription activation potential, is affected by their stereospecific positioning. These results could be used to understand synergistic transcription activation anew and to aid the development of predictive models for the identification of cis-regulatory elements. PMID:22299056
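
    The notion of activators sitting on the same or opposite faces of the double helix can be made concrete with a small calculation. The sketch assumes a B-form helical repeat of about 10.5 bp per turn and center-to-center spacing between binding sites; the tolerance used to label a face is an arbitrary, hypothetical cutoff and is not taken from the study.

```python
BP_PER_TURN = 10.5  # approximate helical repeat of B-form DNA


def helical_phase_degrees(spacing_bp, bp_per_turn=BP_PER_TURN):
    """Rotation (0 to 360 degrees) around the helix between two sites."""
    return (spacing_bp % bp_per_turn) / bp_per_turn * 360.0


def relative_face(spacing_bp, tolerance_deg=45.0):
    """Label two sites as on the 'same', 'opposite', or 'intermediate' face."""
    phase = helical_phase_degrees(spacing_bp)
    offset = min(phase, 360.0 - phase)  # 0 = same face, 180 = opposite face
    if offset <= tolerance_deg:
        return "same"
    if offset >= 180.0 - tolerance_deg:
        return "opposite"
    return "intermediate"


for spacing in (5, 8, 10, 21, 26):
    print(spacing, round(helical_phase_degrees(spacing), 1), relative_face(spacing))
```

    Spacings of 21 bp (two full turns) fall on the same face, while 5 bp or 26 bp fall close to the opposite face, which is the arrangement the abstract reports as more effective for the activators tested.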

  16. The structure and evolution of cis-regulatory regions: the shavenbaby story

    PubMed Central

    Stern, David L.; Frankel, Nicolás

    2013-01-01

    In this paper, we provide a historical account of the contribution of a single line of research to our current understanding of the structure of cis-regulatory regions and the genetic basis for morphological evolution. We revisit the experiments that shed light on the evolution of larval cuticular patterns within the genus Drosophila and the evolution and structure of the shavenbaby gene. We describe the experiments that led to the discovery that multiple genetic changes in the cis-regulatory region of shavenbaby caused the loss of dorsal cuticular hairs (quaternary trichomes) in first instar larvae of Drosophila sechellia. We also discuss the experiments that showed that the convergent loss of quaternary trichomes in D. sechellia and Drosophila ezoana was generated by parallel genetic changes in orthologous enhancers of shavenbaby. We discuss the observation that multiple shavenbaby enhancers drive overlapping patterns of expression in the embryo and that these apparently redundant enhancers ensure robust shavenbaby expression and trichome morphogenesis under stressful conditions. All together, these data, collected over 13 years, provide a fundamental case study in the fields of gene regulation and morphological evolution, and highlight the importance of prolonged, detailed studies of single genes. PMID:24218640

  17. Exonic remnants of whole-genome duplication reveal cis-regulatory function of coding exons

    PubMed Central

    Dong, Xianjun; Navratilova, Pavla; Fredman, David; Drivenes, Øyvind; Becker, Thomas S.; Lenhard, Boris

    2010-01-01

    Using a comparative genomics approach to reconstruct the fate of genomic regulatory blocks (GRBs) and identify exonic remnants that have survived the disappearance of their host genes after whole-genome duplication (WGD) in teleosts, we discover a set of 38 candidate cis-regulatory coding exons (RCEs) with predicted target genes. These elements demonstrate evolutionary separation of overlapping protein-coding and regulatory information after WGD in teleosts. We present evidence that the corresponding mammalian exons are still under both coding and non-coding selection pressure, are more conserved than other protein coding exons in the host gene and several control sets, and share key characteristics with highly conserved non-coding elements in the same regions. Their dual function is corroborated by existing experimental data. Additionally, we show examples of human exon remnants stemming from the vertebrate 2R WGD. Our findings suggest that long-range cis-regulatory inputs for developmental genes are not limited to non-coding regions, but can also overlap the coding sequence of unrelated genes. Thus, exonic regulatory elements in GRBs might be functionally equivalent to those in non-coding regions, calling for a re-evaluation of the sequence space in which to look for long-range regulatory elements and experimentally test their activity. PMID:19969543

  18. Engineering Synthetic cis-Regulatory Elements for Simultaneous Recognition of Three Transcriptional Factors in Bacteria.

    PubMed

    Amores, Gerardo Ruiz; Guazzaroni, María-Eugenia; Silva-Rocha, Rafael

    2015-12-18

    Recognition of cis-regulatory elements by transcription factors (TF) at target promoters is crucial to gene regulation in bacteria. In this process, binding of TFs to their cognate sequences depends on a set of physical interactions between these proteins and specific nucleotides in the operator region. Previously, we showed that in silico optimization algorithms are able to generate short sequences that are recognized by two different TFs of Escherichia coli, namely, CRP and IHF, thus generating an AND logic gate. Here, we expanded this approach in order to engineer DNA sequences that can be simultaneously recognized by three unrelated TFs (CRP, IHF, and Fis). Using in silico optimization and experimental validation strategies, we were able to obtain a candidate promoter (Plac-CFI1) regulated by only two TFs with an AND logic, thus demonstrating a limitation in the design. Subsequently, we modified the algorithm to allow the optimization of extended sequences, and were able to design two synthetic promoters (PCFI20-1 and PCFI22-5) that were functional in vivo. Expression assays in E. coli mutant strains for each TF revealed that while CRP positively regulates the promoter activities, IHF and Fis are strong repressors of both the promoter variants. Taken together, our results demonstrate the potential of in silico strategies in bacterial synthetic promoter engineering. Furthermore, the study also shows how small modifications in cis-regulatory elements can drastically affect the final logic of the resulting promoter. PMID:26305598

  19. Conservation and Evolution of Cis-Regulatory Systems in Ascomycete Fungi

    PubMed Central

    2004-01-01

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa. PMID:15534694

  20. Conservation and evolution of cis-regulatory systems in ascomycete fungi

    SciTech Connect

    Gasch, Audrey P.; Moses, Alan M.; Chiang, Derek Y.; Fraser, Hunter B.; Berardini, Mark; Eisen, Michael B.

    2004-03-15

    Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.

  1. Transcription of Mammalian cis-Regulatory Elements Is Restrained by Actively Enforced Early Termination.

    PubMed

    Austenaa, Liv M I; Barozzi, Iros; Simonatto, Marta; Masella, Silvia; Della Chiara, Giulia; Ghisletti, Serena; Curina, Alessia; de Wit, Elzo; Bouwman, Britta A M; de Pretis, Stefano; Piccolo, Viviana; Termanini, Alberto; Prosperini, Elena; Pelizzola, Mattia; de Laat, Wouter; Natoli, Gioacchino

    2015-11-01

    Upon recruitment to active enhancers and promoters, RNA polymerase II (Pol II) generates short non-coding transcripts of unclear function. The mechanisms that control the length and the amount of ncRNAs generated by cis-regulatory elements are largely unknown. Here, we show that the adaptor protein WDR82 and its associated complexes actively limit such non-coding transcription. WDR82 targets the SET1 H3K4 methyltransferases and the nuclear protein phosphatase 1 (PP1) complexes to the initiating Pol II. WDR82 and PP1 also interact with components of the transcriptional termination and RNA processing machineries. Depletion of WDR82, SET1, or the PP1 subunit required for its nuclear import caused distinct but overlapping transcription termination defects at highly expressed genes and active enhancers and promoters, thus enabling the increased synthesis of unusually long ncRNAs. These data indicate that transcription initiated from cis-regulatory elements is tightly coordinated with termination mechanisms that impose the synthesis of short RNAs. PMID:26593720

  2. The evolution of cichlid fish egg-spots is linked with a cis-regulatory change

    PubMed Central

    Santos, M. Emília; Braasch, Ingo; Boileau, Nicolas; Meyer, Britta S.; Sauteur, Loïc; Böhne, Astrid; Belting, Heinz-Georg; Affolter, Markus; Salzburger, Walter

    2014-01-01

    The origin of novel phenotypic characters is a key component in organismal diversification; yet, the mechanisms underlying the emergence of such evolutionary novelties are largely unknown. Here we examine the origin of egg-spots, an evolutionary innovation of the most species-rich group of cichlids, the haplochromines, where these conspicuous male fin colour markings are involved in mating. Applying a combination of RNAseq, comparative genomics and functional experiments, we identify two novel pigmentation genes, fhl2a and fhl2b, and show that especially the more rapidly evolving b-paralog is associated with egg-spot formation. We further find that egg-spot bearing haplochromines, but not other cichlids, feature a transposable element in the cis-regulatory region of fhl2b. Using transgenic zebrafish, we finally demonstrate that this region shows specific enhancer activities in iridophores, a type of pigment cells found in egg-spots, suggesting that a cis-regulatory change is causally linked to the gain of expression in egg-spot bearing haplochromines. PMID:25296686

  3. Recent mating-system evolution in Eichhornia is accompanied by cis-regulatory divergence.

    PubMed

    Arunkumar, Ramesh; Maddison, Teresa I; Barrett, Spencer C H; Wright, Stephen I

    2016-07-01

    The evolution of predominant self-fertilization from cross-fertilization in plants is accompanied by diverse changes to morphology, ecology and genetics, some of which likely result from regulatory changes in gene expression. We examined changes in gene expression during early stages in the transition to selfing in populations of animal-pollinated Eichhornia paniculata with contrasting mating patterns. We crossed plants from outcrossing and selfing populations and tested for the presence of allele-specific expression (ASE) in floral buds and leaf tissue of F1 offspring, indicative of cis-regulatory changes. We identified 1365 genes exhibiting ASE in floral buds and leaf tissue. These genes preferentially expressed alleles from outcrossing parents. Moreover, we found evidence that genes exhibiting ASE had a greater nonsynonymous diversity compared to synonymous diversity in the selfing parents. Our results suggest that the transition from outcrossing to high rates of self-fertilization may have the potential to shape the cis-regulatory genomic landscape of angiosperm species, but that the changes in ASE may be moderate, particularly during the early stages of this transition. PMID:26990568
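
    Allele-specific expression in an F1 hybrid is commonly assessed by asking whether reads from the two parental alleles depart from a 1:1 ratio. The abstract does not state which test the authors used, so the following is only a generic sketch using an exact two-sided binomial test; the function names, the 0.05 threshold, and the example counts are assumptions for illustration.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k.
    Intended for modest read depths (a few hundred reads at most)."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def shows_ase(reads_allele_a: int, reads_allele_b: int, alpha: float = 0.05) -> bool:
    """Flag a gene when allelic read counts deviate from the 1:1 expectation.
    A genome-wide screen would additionally need multiple-testing correction."""
    n = reads_allele_a + reads_allele_b
    if n == 0:
        return False
    return binom_two_sided_p(reads_allele_a, n) < alpha
```

    For example, `shows_ase(40, 10)` flags a strongly skewed gene, while `shows_ase(26, 24)` does not.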

  4. Profiling of conserved non-coding elements upstream of SHOX and functional characterisation of the SHOX cis-regulatory landscape

    PubMed Central

    Verdin, Hannah; Fernández-Miñán, Ana; Benito-Sanz, Sara; Janssens, Sandra; Callewaert, Bert; Waele, Kathleen De; Schepper, Jean De; François, Inge; Menten, Björn; Heath, Karen E.; Gómez-Skarmeta, José Luis; Baere, Elfride De

    2015-01-01

    Genetic defects such as copy number variations (CNVs) in non-coding regions containing conserved non-coding elements (CNEs) outside the transcription unit of their target gene can underlie genetic disease. An example of this is the short stature homeobox (SHOX) gene, regulated by seven CNEs located downstream and upstream of SHOX, with proven enhancer capacity in chicken limbs. CNVs of the downstream CNEs have been reported in many idiopathic short stature (ISS) cases; however, only recently have a few CNVs of the upstream enhancers been identified. Here, we set out to provide insight into: (i) the cis-regulatory role of these upstream CNEs in human cells, (ii) the prevalence of upstream CNVs in ISS, and (iii) the chromatin architecture of the SHOX cis-regulatory landscape in chicken and human cells. Firstly, luciferase assays in human U2OS cells, and 4C-seq both in chicken limb buds and human U2OS cells, demonstrated cis-regulatory enhancer capacities of the upstream CNEs. Secondly, CNVs of these upstream CNEs were found in three of 501 ISS patients. Finally, our 4C-seq interaction map of the SHOX region reveals a cis-regulatory domain spanning more than 1 Mb and harbouring putative new cis-regulatory elements. PMID:26631348

  5. The cis-regulatory system of the tbrain gene: alternative use of multiple modules to promote skeletogenic expression in the sea urchin embryo

    PubMed Central

    Wahl, Mary E.; Hahn, Julie; Gora, Kasia; Davidson, Eric H.; Oliveri, Paola

    2009-01-01

    The genomic cis-regulatory systems controlling regulatory gene expression usually include multiple modules. The regulatory output of such systems at any given time depends on which module is directing the function of the basal transcription apparatus, and ultimately on the transcription factor inputs into that module. Here we examine regulation of the S. purpuratus tbrain gene, a required activator of the skeletogenic specification state in the lineage descendant from the embryo micromeres. Alternate cis-regulatory modules were found to convey skeletogenic expression in reporter constructs. To determine their relative developmental functions in context, we made use of recombineered BAC constructs containing a GFP reporter, and of derivatives from which specific modules had been deleted. The outputs of the various constructs were observed spatially by GFP fluorescence and quantitatively over time by QPCR. In the context of the complete genomic locus, early skeletogenic expression is controlled by an intron enhancer plus a proximal region containing a HesC site as predicted from network analysis. From ingression onward, however, a dedicated distal module utilizing positive Ets1/2 inputs contributes to definitive expression in the skeletogenic mesenchyme. This module also mediates a newly-discovered negative Erg input which excludes non-skeletogenic mesodermal expression. PMID:19679118

  6. Evolutionarily Assembled cis-Regulatory Module at a Human Ciliopathy Locus

    PubMed Central

    Lee, Jeong Ho; Silhavy, Jennifer L.; Lee, Ji Eun; Al-Gazali, Lihadh; Thomas, Sophie; Davis, Erica E.; Bielas, Stephanie L.; Hill, Kiley J.; Iannicelli, Miriam; Brancati, Francesco; Gabriel, Stacey B.; Russ, Carsten; Logan, Clare V.; Sharif, Saghira Malik; Bennett, Christopher P.; Abe, Masumi; Hildebrandt, Friedhelm; Diplas, Bill H.; Attié-Bitach, Tania; Katsanis, Nicholas; Rajab, Anna; Koul, Roshan; Sztriha, Laszlo; Waters, Elizabeth R.; Ferro-Novick, Susan; Woods, C. Geoffrey; Johnson, Colin A.; Valente, Enza Maria; Zaki, Maha S.; Gleeson, Joseph G.

    2013-01-01

    Neighboring genes are often coordinately expressed within cis-regulatory modules, but evidence that nonparalogous genes share functions in mammals is lacking. Here, we report that mutation of either TMEM138 or TMEM216 causes a phenotypically indistinguishable human ciliopathy, Joubert syndrome. Despite a lack of sequence homology, the genes are aligned in a head-to-tail configuration and joined by chromosomal rearrangement at the amphibian-to-reptile evolutionary transition. Expression of the two genes is mediated by a conserved regulatory element in the noncoding intergenic region. Coordinated expression is important for their interdependent cellular role in vesicular transport to primary cilia. Hence, during vertebrate evolution of genes involved in ciliogenesis, nonparalogous genes were arranged to a functional gene cluster with shared regulatory elements. PMID:22282472

  7. Quantitative Analysis of Cis-Regulatory Element Activity Using Synthetic Promoters in Transgenic Plants.

    PubMed

    Benn, Geoffrey; Dehesh, Katayoon

    2016-01-01

    Synthetic promoters, introduced stably or transiently into plants, are an invaluable tool for the identification of functional regulatory elements and the corresponding transcription factor(s) that regulate the amplitude, spatial distribution, and temporal patterns of gene expression. Here, we present a protocol describing the steps required to identify and characterize putative cis-regulatory elements. These steps include application of computational tools to identify putative elements, construction of a synthetic promoter upstream of luciferase, identification of transcription factors that regulate the element, testing the functionality of the element introduced transiently and/or stably into the species of interest followed by high-throughput luciferase screening assays, and subsequent data processing and statistical analysis. PMID:27557758

  8. BET bromodomain inhibition releases the Mediator complex from select cis-regulatory elements

    PubMed Central

    Bhagwat, Anand S.; Roe, Jae-Seok; Mok, Beverly A.; Hohmann, Anja F.; Shi, Junwei; Vakoc, Christopher R.

    2016-01-01

    The bromodomain and extraterminal (BET) protein BRD4 can physically interact with the Mediator complex, but the relevance of this association to the therapeutic effects of BET inhibitors in cancer is unclear. Here, we show that BET inhibition causes a rapid release of Mediator from a subset of cis-regulatory elements in the genome of acute myeloid leukemia (AML) cells. These sites of Mediator eviction were highly correlated with transcriptional suppression of neighboring genes, which are enriched for targets of the transcription factor MYB and for functions related to leukemogenesis. An shRNA screen of Mediator in AML cells identified the MED12, MED13, MED23, and MED24 subunits as performing a similar regulatory function to BRD4 in this context, including a shared role in sustaining a block in myeloid maturation. These findings suggest that the interaction between BRD4 and Mediator has functional importance for gene-specific transcriptional activation and for AML maintenance. PMID:27068464

  9. Establishment of a Developmental Compartment Requires Interactions between Three Synergistic Cis-regulatory Modules

    PubMed Central

    Bieli, Dimitri; Kanca, Oguz; Requena, David; Hamaratoglu, Fisun; Gohl, Daryl; Schedl, Paul; Affolter, Markus; Slattery, Matthew; Müller, Martin; Estella, Carlos

    2015-01-01

    The subdivision of cell populations into compartments is a key event during animal development. In Drosophila, the gene apterous (ap) divides the wing imaginal disc into dorsal and ventral cell lineages and is required for wing formation. The function of ap as a dorsal selector gene has been extensively studied. However, the regulation of its expression during wing development is poorly understood. In this study, we analyzed ap transcriptional regulation at the endogenous locus and identified three cis-regulatory modules (CRMs) essential for wing development. Only when the three CRMs are combined is robust ap expression obtained. In addition, we genetically and molecularly analyzed the trans-factors that regulate these CRMs. Our results suggest a three-step mechanism for the cell lineage compartment expression of ap that includes initial activation, positive autoregulation and Trithorax-mediated maintenance through separable CRMs. PMID:26468882

  10. [Identification and mapping of cis-regulatory elements within long genomic sequences].

    PubMed

    Akopov, S B; Chernov, I P; Vetchinova, A S; Bulanenkova, S S; Nikolaev, L G

    2007-01-01

    The publication of the human and other metazoan genome sequences opened up the possibility of mapping and analyzing genomic regulatory elements. Unfortunately, experimental data on the genomic positions of such sequences as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole-genome level. As most genomic regulatory elements (e.g., enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements in silico is often ambiguous. Therefore, the development of high-throughput experimental approaches for identification and mapping of genomic functional elements is highly desirable. In this review we discuss novel approaches to high-throughput experimental identification of cis-regulatory elements in mammalian genomes, which is a necessary step toward complete genome annotation. PMID:18240562

  11. Lessons from Domestication: Targeting Cis-Regulatory Elements for Crop Improvement.

    PubMed

    Swinnen, Gwen; Goossens, Alain; Pauwels, Laurens

    2016-06-01

    Domestication of wild plant species has provided us with crops that serve our human nutritional needs. Advanced DNA sequencing has propelled the unveiling of underlying genetic changes associated with domestication. Interestingly, many changes reside in cis-regulatory elements (CREs) that control the expression of an unmodified coding sequence. Sequence variation in CREs can impact not only gene expression levels but also the developmental timing and tissue specificity of expression. When genes are involved in multiple pathways or active in several organs and developmental stages, CRE modifications are favored over mutations in coding regions because they lack detrimental pleiotropic effects. Therefore, learning from domestication, we propose that CREs are interesting targets for genome editing to create new alleles for plant breeding. PMID:26876195

  12. Dissecting the Genetic Basis of a Complex cis-Regulatory Adaptation

    PubMed Central

    Artieri, Carlo G.; Zhang, Mian; Zhou, Yiqi; Palmer, Michael E.; Fraser, Hunter B.

    2015-01-01

    Although single genes underlying several evolutionary adaptations have been identified, the genetic basis of complex, polygenic adaptations has been far more challenging to pinpoint. Here we report that the budding yeast Saccharomyces paradoxus has recently evolved resistance to citrinin, a naturally occurring mycotoxin. Applying a genome-wide test for selection on cis-regulation, we identified five genes involved in the citrinin response that are constitutively up-regulated in S. paradoxus. Four of these genes are necessary for resistance, and are also sufficient to increase the resistance of a sensitive strain when over-expressed. Moreover, cis-regulatory divergence in the promoters of these genes contributes to resistance, while exacting a cost in the absence of citrinin. Our results demonstrate how the subtle effects of individual regulatory elements can be combined, via natural selection, into a complex adaptation. Our approach can be applied to dissect the genetic basis of polygenic adaptations in a wide range of species. PMID:26713447

  13. Massively parallel cis-regulatory analysis in the mammalian central nervous system

    PubMed Central

    Shen, Susan Q.; Myers, Connie A.; Hughes, Andrew E.O.; Byrne, Leah C.; Flannery, John G.; Corbo, Joseph C.

    2016-01-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell–derived organoids. PMID:26576614
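
    The MPRA readout described above rests on comparing how often each barcode appears in transcripts versus in the delivered library. The sketch below shows one common way to turn such counts into a per-CRE activity score, a log2 RNA/DNA ratio; it is not the authors' pipeline, and the function name, the pseudocount, and the barcode-to-CRE mapping are assumptions for illustration.

```python
from collections import defaultdict
from math import log2


def cre_activity(rna_counts, dna_counts, barcode_to_cre, pseudocount=1.0):
    """Sum barcode counts per candidate CRE, then score each CRE as
    log2((RNA + pseudocount) / (DNA + pseudocount)).
    Library-depth normalization is deliberately omitted to keep the sketch short."""
    rna_sum = defaultdict(float)
    dna_sum = defaultdict(float)
    for barcode, cre in barcode_to_cre.items():
        rna_sum[cre] += rna_counts.get(barcode, 0)
        dna_sum[cre] += dna_counts.get(barcode, 0)
    return {
        cre: log2((rna_sum[cre] + pseudocount) / (dna_sum[cre] + pseudocount))
        for cre in dna_sum
    }
```

    With a hypothetical mapping of two barcodes to `cre1` and one to `cre2`, a CRE whose barcodes are four times more abundant in RNA than in DNA (after the pseudocount) scores 2.0, and one with equal abundance scores 0.0.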

  14. Identification of three new cis-regulatory IRF5 polymorphisms: in vitro studies

    PubMed Central

    2013-01-01

    Background Polymorphisms in the interferon regulatory factor 5 (IRF5) gene are associated with susceptibility to systemic lupus erythematosus, rheumatoid arthritis and other diseases through independent risk and protective haplotypes. Several functional polymorphisms are already known, but they do not account for the protective haplotypes that are tagged by the minor allele of rs729302. Methods Polymorphisms in linkage disequilibrium (LD) with rs729302 or particularly associated with IRF5 expression were selected for functional screening, which involved electrophoretic mobility shift assays (EMSAs) and reporter gene assays. Results A total of 54 single-nucleotide polymorphisms in the 5' region of IRF5 were genotyped. Twenty-four of them were selected for functional screening because of their high LD with rs729302 or protective haplotypes. In addition, two polymorphisms were selected for their prominent association with IRF5 expression. Seven of these twenty-six polymorphisms showed reproducible allele differences in EMSA. The seven were subsequently analyzed in gene reporter assays, and three of them showed significant differences between their two alleles: rs729302, rs13245639 and rs11269962. Haplotypes including the cis-regulatory polymorphisms correlated very well with IRF5 mRNA expression in an analysis based on previous data. Conclusion We have found that three polymorphisms in LD with the protective haplotypes of IRF5 have differential allele effects in EMSA and in reporter gene assays. Identification of these cis-regulatory polymorphisms will allow more accurate analysis of transcriptional regulation of IRF5 expression, more powerful genetic association studies and deeper insight into the role of IRF5 in disease susceptibility. PMID:23941291

  15. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord.

    PubMed

    José-Edwards, Diana S; Oda-Ishii, Izumi; Kugler, Jamie E; Passamaneck, Yale J; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

    2015-12-01

    A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs. PMID:26684323

  16. Brachyury, Foxa2 and the cis-Regulatory Origins of the Notochord

    PubMed Central

    José-Edwards, Diana S.; Oda-Ishii, Izumi; Kugler, Jamie E.; Passamaneck, Yale J.; Katikala, Lavanya; Nibu, Yutaka; Di Gregorio, Anna

    2015-01-01

    A main challenge of modern biology is to understand how specific constellations of genes are activated to differentiate cells and give rise to distinct tissues. This study focuses on elucidating how gene expression is initiated in the notochord, an axial structure that provides support and patterning signals to embryos of humans and all other chordates. Although numerous notochord genes have been identified, the regulatory DNAs that orchestrate development and propel evolution of this structure by eliciting notochord gene expression remain mostly uncharted, and the information on their configuration and recurrence is still quite fragmentary. Here we used the simple chordate Ciona for a systematic analysis of notochord cis-regulatory modules (CRMs), and investigated their composition, architectural constraints, predictive ability and evolutionary conservation. We found that most Ciona notochord CRMs relied upon variable combinations of binding sites for the transcription factors Brachyury and/or Foxa2, which can act either synergistically or independently from one another. Notably, one of these CRMs contains a Brachyury binding site juxtaposed to an (AC) microsatellite, an unusual arrangement also found in Brachyury-bound regulatory regions in mouse. In contrast, different subsets of CRMs relied upon binding sites for transcription factors of widely diverse families. Surprisingly, we found that neither intra-genomic nor interspecific conservation of binding sites were reliably predictive hallmarks of notochord CRMs. We propose that rather than obeying a rigid sequence-based cis-regulatory code, most notochord CRMs are rather unique. Yet, this study uncovered essential elements recurrently used by divergent chordates as basic building blocks for notochord CRMs. PMID:26684323

  17. Quantitative comparison of cis-regulatory element (CRE) activities in transgenic Drosophila melanogaster.

    PubMed

    Rogers, William A; Williams, Thomas M

    2011-01-01

    Gene expression patterns are specified by cis-regulatory element (CRE) sequences, which are also called enhancers or cis-regulatory modules. A typical CRE possesses an arrangement of binding sites for several transcription factor proteins that confer a regulatory logic specifying when, where, and at what level the regulated gene(s) is expressed. The full set of CREs within an animal genome encodes the organism's program for development, and empirical as well as theoretical studies indicate that mutations in CREs played a prominent role in morphological evolution. Moreover, human genome-wide association studies indicate that genetic variation in CREs contributes substantially to phenotypic variation. Thus, understanding regulatory logic and how mutations affect such logic is a central goal of genetics. Reporter transgenes provide a powerful method to study the in vivo function of CREs. Here a known or suspected CRE sequence is coupled to heterologous promoter and coding sequences for a reporter gene encoding an easily observable protein product. When a reporter transgene is inserted into a host organism, the CRE's activity becomes visible in the form of the encoded reporter protein. P-element mediated transgenesis in the fruit fly species Drosophila (D.) melanogaster has been used for decades to introduce reporter transgenes into this model organism, though the genomic placement of transgenes is random. Hence, reporter gene activity is strongly influenced by the local chromatin and gene environment, limiting CRE comparisons to being qualitative. In recent years, the phiC31 based integration system was adapted for use in D. melanogaster to insert transgenes into specific genome landing sites. This capability has made the quantitative measurement of gene and, relevant here, CRE activity feasible. The production of transgenic fruit flies can be outsourced, including phiC31-based integration, eliminating the need to purchase expensive equipment and/or have proficiency at

  18. Close sequence comparisons are sufficient to identify human cis-regulatory elements.

    PubMed

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M; Couronne, Olivier; Pennacchio, Len A

    2006-07-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points. PMID:16769978
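
    Benchmarking predictions against known elements, as described above, comes down to counting overlaps between two sets of genomic intervals. The abstract reports a "true-positive rate" separately from sensitivity without defining it; the sketch below assumes it means the fraction of predictions that overlap a known element (that is, precision), which may not match the authors' definition. The function names and half-open coordinate convention are likewise assumptions.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b = (start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]


def benchmark(predicted, known):
    """Return (sensitivity, precision) for predicted versus known elements.
    sensitivity = fraction of known elements hit by at least one prediction
    precision   = fraction of predictions hitting at least one known element"""
    known_hit = sum(any(overlaps(k, p) for p in predicted) for k in known)
    pred_hit = sum(any(overlaps(p, k) for k in known) for p in predicted)
    sensitivity = known_hit / len(known) if known else 0.0
    precision = pred_hit / len(predicted) if predicted else 0.0
    return sensitivity, precision
```

    For instance, if two of three known elements are recovered and two of four predictions overlap a known element, the function returns a sensitivity of about 0.67 and a precision of 0.5.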

  19. Subfunctionalization of Duplicated Zebrafish pax6 Genes by cis-Regulatory Divergence

    PubMed Central

    Gautier, Philippe; Dahm, Ralf; Schonthaler, Helia B; Damante, Giuseppe; Seawright, Anne; Hever, Ann M; Yeyati, Patricia L; van Heyningen, Veronica; Coutinho, Pedro

    2008-01-01

    Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3′) control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization. PMID:18282108

  20. Identification of a novel cis-regulatory element essential for immune tolerance.

    PubMed

    LaFlam, Taylor N; Seumois, Grégory; Miller, Corey N; Lwin, Wint; Fasano, Kayla J; Waterfield, Michael; Proekt, Irina; Vijayanand, Pandurangan; Anderson, Mark S

    2015-11-16

    Thymic central tolerance is essential to preventing autoimmunity. In medullary thymic epithelial cells (mTECs), the Autoimmune regulator (Aire) gene plays an essential role in this process by driving the expression of a diverse set of tissue-specific antigens (TSAs), which are presented and help tolerize self-reactive thymocytes. Interestingly, Aire has a highly tissue-restricted pattern of expression, with only mTECs and peripheral extrathymic Aire-expressing cells (eTACs) known to express detectable levels in adults. Despite this high level of tissue specificity, the cis-regulatory elements that control Aire expression have remained obscure. Here, we identify a highly conserved noncoding DNA element that is essential for Aire expression. This element shows enrichment of enhancer-associated histone marks in mTECs and also has characteristics of being an NF-κB-responsive element. Finally, we find that this element is essential for Aire expression in vivo and necessary to prevent spontaneous autoimmunity, reflecting the importance of this regulatory DNA element in promoting immune tolerance. PMID:26527800

  1. Distal cis-regulatory elements are required for tissue-specific expression of enamelin (Enam)

    PubMed Central

    Hu, Yuanyuan; Papagerakis, Petros; Ye, Ling; Feng, Jerry Q.; Simmer, James P.; Hu, Jan C-C.

    2009-01-01

    Enamel formation is orchestrated by the sequential expression of genes encoding enamel matrix proteins; however, the mechanisms sustaining the spatio–temporal order of gene transcription during amelogenesis are poorly understood. The aim of this study was to characterize the cis-regulatory sequences necessary for normal expression of enamelin (Enam). Several enamelin transcription regulatory regions, showing high sequence homology among species, were identified. DNA constructs containing 5.2 or 3.9 kb regions upstream of the enamelin translation initiation site were linked to a LacZ reporter and used to generate transgenic mice. Only the 5.2-Enam–LacZ construct was sufficient to recapitulate the endogenous pattern of enamelin tooth-specific expression. The 3.9-Enam–LacZ transgenic lines showed no expression in dental cells, but ectopic β-galactosidase activity was detected in osteoblasts. Potential transcription factor-binding sites were identified that may be important in controlling enamelin basal promoter activity and in conferring enamelin tissue-specific expression. Our study provides new insights into regulatory mechanisms governing enamelin expression. PMID:18353004

  2. Subfunctionalization of duplicated zebrafish pax6 genes by cis-regulatory divergence.

    PubMed

    Kleinjan, Dirk A; Bancewicz, Ruth M; Gautier, Philippe; Dahm, Ralf; Schonthaler, Helia B; Damante, Giuseppe; Seawright, Anne; Hever, Ann M; Yeyati, Patricia L; van Heyningen, Veronica; Coutinho, Pedro

    2008-02-01

    Gene duplication is a major driver of evolutionary divergence. In most vertebrates a single PAX6 gene encodes a transcription factor required for eye, brain, olfactory system, and pancreas development. In zebrafish, following a postulated whole-genome duplication event in an ancestral teleost, duplicates pax6a and pax6b jointly fulfill these roles. Mapping of the homozygously viable eye mutant sunrise identified a homeodomain missense change in pax6b, leading to loss of target binding. The mild phenotype emphasizes role-sharing between the co-orthologues. Meticulous mapping of isolated BACs identified perturbed synteny relationships around the duplicates. This highlights the functional conservation of pax6 downstream (3') control sequences, which in most vertebrates reside within the introns of a ubiquitously expressed neighbour gene, ELP4, whose pax6a-linked exons have been lost in zebrafish. Reporter transgenic studies in both mouse and zebrafish, combined with analysis of vertebrate sequence conservation, reveal loss and retention of specific cis-regulatory elements, correlating strongly with the diverged expression of co-orthologues, and providing clear evidence for evolution by subfunctionalization. PMID:18282108

  3. Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints

    PubMed Central

    Irimia, Manuel; Tena, Juan J.; Alexis, Maria S.; Fernandez-Miñan, Ana; Maeso, Ignacio; Bogdanović, Ozren; de la Calle-Mustienes, Elisa; Roy, Scott W.; Gómez-Skarmeta, José L.; Fraser, Hunter B.

    2012-01-01

    The order of genes in eukaryotic genomes has generally been assumed to be neutral, since gene order is largely scrambled over evolutionary time. Only a handful of exceptional examples are known, typically involving deeply conserved clusters of tandemly duplicated genes (e.g., Hox genes and histones). Here we report the first systematic survey of microsynteny conservation across metazoans, utilizing 17 genome sequences. We identified nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages across over 600 million years of evolution. Integrating sequence conservation, gene expression data, gene function, epigenetic marks, and other genomic features, we provide extensive evidence that many conserved ancient linkages involve (1) the coordinated transcription of neighboring genes, or (2) genomic regulatory blocks (GRBs) in which transcriptional enhancers controlling developmental genes are contained within nearby bystander genes. In addition, we generated ChIP-seq data for key histone modifications in zebrafish embryos, which provided further evidence of putative GRBs in embryonic development. Finally, using chromosome conformation capture (3C) assays and stable transgenic experiments, we demonstrate that enhancers within bystander genes drive the expression of genes such as Otx and Islet, critical regulators of central nervous system development across bilaterians. These results suggest that ancient genomic functional associations are far more common than previously thought—involving ∼12% of the ancestral bilaterian genome—and that cis-regulatory constraints are crucial in determining metazoan genome architecture. PMID:22722344

  4. Genetic Analysis of Transvection Effects Involving Cis-Regulatory Elements of the Drosophila Ultrabithorax Gene

    PubMed Central

    Micol, J. L.; Castelli-Gair, J. E.; Garcia-Bellido, A.

    1990-01-01

    The Ultrabithorax (Ubx) gene of Drosophila melanogaster contains two functionally distinguishable regions: the protein-coding Ubx transcription unit and, upstream of it, the transcribed but non-protein-coding bxd region. Numerous recessive, partial loss-of-function mutations which appear to be regulatory mutations map within the bxd region and within the introns of the Ubx transcription unit. In addition, mutations within the Ubx unit exons are known and most of these behave as null alleles. Ubx(1) is one such allele. We have confirmed that, although the Ubx(1) allele does not produce detectable Ubx proteins (UBX), it does retain other genetic functions detectable by their effects on the expression of a paired, homologous Ubx allele, i.e., by transvection. We have extended previous analyses made by E. B. Lewis by mapping the critical elements of the Ubx gene which participate in transvection effects. Our results show that the Ubx(1) allele retains wild-type functions whose effectiveness can be reduced (1) by additional cis mutations in the bxd region or in introns of the Ubx transcription unit, as well as (2) by rearrangements disturbing pairing between homologous Ubx genes. Our results suggest that those remnant functions in Ubx(1) are able to modulate the activity of the allele located in the homologous chromosome. We discuss the normal cis regulatory role of these functions involved in trans interactions between homologous Ubx genes, as well as the implications of our results for the current models on transvection. PMID:2123161

  5. Motif-directed flexible backbone design of functional interactions

    PubMed Central

    Havranek, James J; Baker, David

    2009-01-01

    Computational protein design relies on a number of approximations to efficiently search the huge sequence space available to proteins. The fixed backbone and rotamer approximations in particular are important for formulating protein design as a discrete combinatorial optimization problem. However, the resulting coarse-grained sampling of possible side-chain terminal positions is problematic for the design of protein function, which depends on precise positioning of side-chain atoms. Although backbone flexibility can greatly increase the conformation freedom of side-chain functional groups, it is not obvious which backbone movements will generate the critical constellation of atoms responsible for protein function. Here, we report an automated method for identifying protein backbone movements that can give rise to any specified set of desired side-chain atomic placements and interactions, using protein–DNA interfaces as a model system. We use a library of previously observed protein–DNA interactions (motifs) and a rotamer-based description of side-chain conformation freedom to identify placements for the protein backbone that can give rise to a favorable side-chain interaction with DNA. We describe a tree-search algorithm for identifying those combinations of interactions from the library that can be realized with minimal perturbation of the protein backbone. We compare the efficiency of this method with the alternative approach of building and screening alternate backbone conformations. PMID:19472357

  6. Single nucleotide polymorphisms with cis-regulatory effects on long non-coding transcripts in human primary monocytes.

    PubMed

    Almlöf, Jonas Carlsson; Lundmark, Per; Lundmark, Anders; Ge, Bing; Pastinen, Tomi; Goodall, Alison H; Cambien, François; Deloukas, Panos; Ouwehand, Willem H; Syvänen, Ann-Christine

    2014-01-01

    We applied genome-wide allele-specific expression analysis of monocytes from 188 samples. Monocytes were purified from white blood cells of healthy blood donors to detect cis-acting genetic variation that regulates the expression of long non-coding RNAs. We analysed 8929 regions harboring genes for potential long non-coding RNA that were retrieved from data from the ENCODE project. Of these regions, 60% were annotated as intergenic, which implies that they do not overlap with protein-coding genes. Focusing on the intergenic regions, and using stringent analysis of the allele-specific expression data, we detected robust cis-regulatory SNPs in 258 out of 489 informative intergenic regions included in the analysis. The cis-regulatory SNPs that were significantly associated with allele-specific expression of long non-coding RNAs were enriched to enhancer regions marked for active or bivalent, poised chromatin by histone modifications. Out of the lncRNA regions regulated by cis-acting regulatory SNPs, 20% (n = 52) were co-regulated with the closest protein coding gene. We compared the identified cis-regulatory SNPs with those in the catalog of SNPs identified by genome-wide association studies of human diseases and traits. This comparison identified 32 SNPs in loci from genome-wide association studies that displayed a strong association signal with allele-specific expression of non-coding RNAs in monocytes, with p-values ranging from 6.7×10^-7 to 9.5×10^-89. The identified cis-regulatory SNPs are associated with diseases of the immune system, like multiple sclerosis and rheumatoid arthritis. PMID:25025429

  7. Mapping Association between Long-Range Cis-Regulatory Regions and Their Target Genes Using Comparative Genomics

    NASA Astrophysics Data System (ADS)

    Mongin, Emmanuel; Dewar, Ken; Blanchette, Mathieu

    In chordates, long-range cis-regulatory regions are involved in the control of transcription initiation (either as repressors or enhancers). They can be located as far as 1 Mb from the transcription start site of the target gene and can regulate more than one gene. Therefore, proper characterization of functional interactions between long-range cis-regulatory regions and their target genes remains problematic. We present a novel method to predict such interactions based on the analysis of rearrangements between the human and 16 other vertebrate genomes. Our method is based on the assumption that genome rearrangements that would disrupt the functional interaction between a cis-regulatory region and its target gene are likely to be deleterious. Therefore, conservation of synteny through evolution would be an indication of a functional interaction. We use our algorithm to classify a set of 1,406,084 putative associations from the human genome. This genome-wide map of interactions has many potential applications, including the selection of candidate regions prior to in vivo experimental characterization, a better characterization of regulatory regions involved in position effect diseases, and an improved understanding of the mechanisms and importance of long-range regulation.

  8. Balanced polymorphism in bottlenecked populations: the case of the CCR5 5' cis-regulatory region in Amazonian Amerindians.

    PubMed

    Ramalho, Rodrigo F; Santos, Eduardo J M; Guerreiro, João F; Meyer, Diogo

    2010-09-01

    The 5' cis-regulatory region of the CCR5 gene exhibits a strong signature of balancing selection in several human populations. Here we analyze the polymorphism of this region in Amerindians from Amazonia, who have a complex demographic history, including recent bottlenecks that are known to reduce genetic variability. Amerindians show high nucleotide diversity (pi = 0.27%) and significantly positive Tajima's D, and carry haplotypes associated with weak and strong gene expression. To evaluate whether these signatures of balancing selection could be explained by demography, we perform neutrality tests based on empiric and simulated data. The observed Tajima's D was higher than that of other world populations; higher than that found for 18 noncoding regions of South Amerindians, and higher than 99.6% of simulated genealogies, which assume nonequilibrium conditions. Moreover, comparing Amerindians and Asians, the Fst for CCR5 cis-regulatory region was unusually low, in relation to neutral markers. These findings indicate that, despite their complex demographic history, South Amerindians carry a detectable signature of selection on the CCR5 cis-regulatory region. PMID:20538030
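
    The Tajima's D statistic central to this abstract can be sketched as follows, using the standard formula from Tajima (1989). The sketch assumes a gap-free alignment of equal-length sequences and at least four sequences, because the variance term is zero for smaller samples. The function name and toy sequences are hypothetical, and pi here is the mean number of pairwise differences over the whole locus, not a per-site percentage as quoted in the abstract.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences.

    Returns None when there are no segregating sites (D is undefined).
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns with more than one allele.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy data: intermediate-frequency variants versus singleton variants.
balanced = ["AAAA", "AAAA", "TTTT", "TTTT"]
singletons = ["AAAA", "AAAA", "AAAA", "TTTT"]
print(tajimas_d(balanced), tajimas_d(singletons))
```

    An excess of intermediate-frequency variants pushes D above zero, which is the pattern the abstract reports for the CCR5 region; an excess of rare variants pushes it below zero.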

  9. Changes in cis-regulatory elements of a key floral regulator are associated with divergence of inflorescence architectures.

    PubMed

    Kusters, Elske; Della Pina, Serena; Castel, Rob; Souer, Erik; Koes, Ronald

    2015-08-15

    Higher plant species diverged extensively with regard to the moment (flowering time) and position (inflorescence architecture) at which flowers are formed. This seems largely caused by variation in the expression patterns of conserved genes that specify floral meristem identity (FMI), rather than changes in the encoded proteins. Here, we report a functional comparison of the promoters of homologous FMI genes from Arabidopsis, petunia, tomato and Antirrhinum. Analysis of promoter-reporter constructs in petunia and Arabidopsis, as well as complementation experiments, showed that the divergent expression of leafy (LFY) and the petunia homolog aberrant leaf and flower (ALF) results from alterations in the upstream regulatory network rather than cis-regulatory changes. The divergent expression of unusual floral organs (UFO) from Arabidopsis, and the petunia homolog double top (DOT), however, is caused by the loss or gain of cis-regulatory promoter elements, which respond to trans-acting factors that are expressed in similar patterns in both species. Introduction of pUFO:UFO causes no obvious defects in Arabidopsis, but in petunia it causes the precocious and ectopic formation of flowers. This provides an example of how a change in a cis-regulatory region can account for a change in the plant body plan. PMID:26220938

  10. Shuffling of cis-regulatory elements is a pervasive feature of the vertebrate lineage

    PubMed Central

    Sanges, Remo; Kalmar, Eva; Claudiani, Pamela; D'Amato, Maria; Muller, Ferenc; Stupka, Elia

    2006-01-01

    Background All vertebrates share a remarkable degree of similarity in their development as well as in the basic functions of their cells. Despite this, attempts at unearthing genome-wide regulatory elements conserved throughout the vertebrate lineage using BLAST-like approaches have thus far detected noncoding conservation in only a few hundred genes, mostly associated with regulation of transcription and development. Results We used a unique combination of tools to obtain regional global-local alignments of orthologous loci. This approach takes into account shuffling of regulatory regions that are likely to occur over evolutionary distances greater than those separating mammalian genomes. This approach revealed one order of magnitude more vertebrate conserved elements than was previously reported in over 2,000 genes, including a high number of genes found in the membrane and extracellular regions. Our analysis revealed that 72% of the elements identified have undergone shuffling. We tested the ability of the elements identified to enhance transcription in zebrafish embryos and compared their activity with a set of control fragments. We found that more than 80% of the elements tested were able to enhance transcription significantly, prevalently in a tissue-restricted manner corresponding to the expression domain of the neighboring gene. Conclusion Our work elucidates the importance of shuffling in the detection of cis-regulatory elements. It also elucidates how similarities across the vertebrate lineage, which go well beyond development, can be explained not only within the realm of coding genes but also in that of the sequences that ultimately govern their expression. PMID:16859531

  11. Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition.

    PubMed

    Deb, Arindam; Kundu, Sudip

    2015-01-01

    Combinations of cis-regulatory elements (CREs) present at the promoters facilitate the binding of several transcription factors (TFs), thereby altering the consequent gene expressions. Due to the eminent complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher order of regulatory combinatorics. We have further applied this methodology on the differentially up-regulated gene-sets of rice tissues under fungal (Magnaporthe) infected conditions to demonstrate how it helps to understand the CRE-mediated combinatorial gene regulation. Our analysis includes a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results of plant gene regulation and defense mechanisms and find evidence of auto and cross regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses have pointed to a highly distributed nature of the combinatorial gene regulation facilitating an efficient alteration in response to fungal attack. Altogether, our proposed methodology could be an important approach in understanding the combinatorial gene regulation. It can be further applied to unravel the tissue and/or condition specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic sequences and suitable

  12. Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models

    PubMed Central

    Svetlichnyy, Dmitry; Imrichova, Hana; Fiers, Mark; Kalender Atak, Zeynep; Aerts, Stein

    2015-01-01

    Cancer genomes contain vast amounts of somatic mutations, many of which are passenger mutations not involved in oncogenesis. Whereas driver mutations in protein-coding genes can be distinguished from passenger mutations based on their recurrence, non-coding mutations are usually not recurrent at the same position. Therefore, it is still unclear how to identify cis-regulatory driver mutations, particularly when chromatin data from the same patient is not available, thus relying only on sequence and expression information. Here we use machine-learning methods to predict functional regulatory regions using sequence information alone, and compare the predicted activity of the mutated region with the reference sequence. This way we define the Predicted Regulatory Impact of a Mutation in an Enhancer (PRIME). We find that the recently identified driver mutation in the TAL1 enhancer has a high PRIME score, representing a “gain-of-target” for MYB, whereas the highly recurrent TERT promoter mutation has a surprisingly low PRIME score. We trained Random Forest models for 45 cancer-related transcription factors, and used these to score variations in the HeLa genome and somatic mutations across more than five hundred cancer genomes. Each model predicts only a small fraction of non-coding mutations with a potential impact on the function of the encompassing regulatory region. Nevertheless, as these few candidate driver mutations are often linked to gains in chromatin activity and gene expression, they may contribute to the oncogenic program by altering the expression levels of specific oncogenes and tumor suppressor genes. PMID:26562774
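
    The idea of comparing the predicted activity of a mutated sequence with that of the reference can be sketched as below. This is only an illustration of the "mutant minus reference" score difference: it swaps in a toy match/mismatch scoring matrix for an invented motif in place of the Random Forest models described in the abstract, and it is not the authors' PRIME implementation. The motif, sequences, and function names are all hypothetical.

```python
def make_toy_matrix(motif, match=1.0, mismatch=-1.0):
    """Build a per-position score table for an invented motif (assumption)."""
    return [{base: (match if base == m else mismatch) for base in "ACGT"}
            for m in motif]


def best_match_score(seq, matrix):
    """Highest motif score over all windows of the sequence."""
    width = len(matrix)
    if len(seq) < width:
        raise ValueError("sequence shorter than motif")
    return max(
        sum(matrix[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def mutation_impact(ref_seq, pos, alt, matrix):
    """Score difference (mutant minus reference) for a single substitution.

    pos is a 0-based index into ref_seq; alt is the substituted base.
    A positive value suggests a gained or strengthened site under this model.
    """
    mut_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return best_match_score(mut_seq, matrix) - best_match_score(ref_seq, matrix)


toy = make_toy_matrix("ACGTA")   # invented motif, not a real binding site
reference = "TTACGGATT"          # invented sequence
print(mutation_impact(reference, 5, "T", toy))  # substitution completing the motif
print(mutation_impact(reference, 0, "A", toy))  # substitution outside the best site
```

    Under this toy model the first substitution turns a partial match into a full one and receives a positive score, while the second leaves the best site unchanged and scores zero.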

  13. Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models.

    PubMed

    Svetlichnyy, Dmitry; Imrichova, Hana; Fiers, Mark; Kalender Atak, Zeynep; Aerts, Stein

    2015-11-01

    Cancer genomes contain vast amounts of somatic mutations, many of which are passenger mutations not involved in oncogenesis. Whereas driver mutations in protein-coding genes can be distinguished from passenger mutations based on their recurrence, non-coding mutations are usually not recurrent at the same position. Therefore, it is still unclear how to identify cis-regulatory driver mutations, particularly when chromatin data from the same patient is not available, thus relying only on sequence and expression information. Here we use machine-learning methods to predict functional regulatory regions using sequence information alone, and compare the predicted activity of the mutated region with the reference sequence. This way we define the Predicted Regulatory Impact of a Mutation in an Enhancer (PRIME). We find that the recently identified driver mutation in the TAL1 enhancer has a high PRIME score, representing a "gain-of-target" for MYB, whereas the highly recurrent TERT promoter mutation has a surprisingly low PRIME score. We trained Random Forest models for 45 cancer-related transcription factors, and used these to score variations in the HeLa genome and somatic mutations across more than five hundred cancer genomes. Each model predicts only a small fraction of non-coding mutations with a potential impact on the function of the encompassing regulatory region. Nevertheless, as these few candidate driver mutations are often linked to gains in chromatin activity and gene expression, they may contribute to the oncogenic program by altering the expression levels of specific oncogenes and tumor suppressor genes. PMID:26562774

  14. Conserved Cis-Regulatory Modules Control Robustness in Msx1 Expression at Single-Cell Resolution

    PubMed Central

    Vance, Keith W.; Woodcock, Dan J.; Reid, John E.; Bretschneider, Till; Ott, Sascha; Koentges, Georgy

    2015-01-01

    The process of transcription is highly stochastic leading to cell-to-cell variations and noise in gene expression levels. However, key essential genes have to be precisely expressed at the correct amount and time to ensure proper cellular development and function. Studies in yeast and bacterial systems have shown that gene expression noise decreases as mean expression levels increase, a relationship that is controlled by promoter DNA sequence. However, the function of distal cis-regulatory modules (CRMs), an evolutionary novelty of metazoans, in controlling transcriptional robustness and variability is poorly understood. In this study, we used live cell imaging of transfected reporters combined with a mathematical modelling and statistical inference scheme to quantify the function of conserved Msx1 CRMs and promoters in modulating single-cell real-time transcription rates in C2C12 mouse myoblasts. The results show that the mean expression–noise relationship is solely promoter controlled for this key pluripotency regulator. In addition, we demonstrate that CRMs modulate single-cell basal promoter rate distributions in a graded manner across a population of cells. This extends the rheostatic model of CRM action to provide a more detailed understanding of CRM function at single-cell resolution. We also identify a novel CRM transcriptional filter function that acts to reduce intracellular variability in transcription rates and show that this can be phylogenetically separable from rate modulating CRM activities. These results are important for understanding how the expression of key vertebrate developmental transcription factors is precisely controlled both within and between individual cells. PMID:26342140

  15. Conserved Cis-Regulatory Modules Control Robustness in Msx1 Expression at Single-Cell Resolution.

    PubMed

    Vance, Keith W; Woodcock, Dan J; Reid, John E; Bretschneider, Till; Ott, Sascha; Koentges, Georgy

    2015-09-01

    The process of transcription is highly stochastic leading to cell-to-cell variations and noise in gene expression levels. However, key essential genes have to be precisely expressed at the correct amount and time to ensure proper cellular development and function. Studies in yeast and bacterial systems have shown that gene expression noise decreases as mean expression levels increase, a relationship that is controlled by promoter DNA sequence. However, the function of distal cis-regulatory modules (CRMs), an evolutionary novelty of metazoans, in controlling transcriptional robustness and variability is poorly understood. In this study, we used live cell imaging of transfected reporters combined with a mathematical modelling and statistical inference scheme to quantify the function of conserved Msx1 CRMs and promoters in modulating single-cell real-time transcription rates in C2C12 mouse myoblasts. The results show that the mean expression-noise relationship is solely promoter controlled for this key pluripotency regulator. In addition, we demonstrate that CRMs modulate single-cell basal promoter rate distributions in a graded manner across a population of cells. This extends the rheostatic model of CRM action to provide a more detailed understanding of CRM function at single-cell resolution. We also identify a novel CRM transcriptional filter function that acts to reduce intracellular variability in transcription rates and show that this can be phylogenetically separable from rate modulating CRM activities. These results are important for understanding how the expression of key vertebrate developmental transcription factors is precisely controlled both within and between individual cells. PMID:26342140

  16. Regulation of human PTCH1b expression by different 5' untranslated region cis-regulatory elements

    PubMed Central

    Ozretić, Petar; Bisio, Alessandra; Musani, Vesna; Trnski, Diana; Sabol, Maja; Levanat, Sonja; Inga, Alberto

    2015-01-01

    The PTCH1 gene codes for a 12-pass transmembrane receptor with a negative regulatory role in the Hedgehog-Gli signaling pathway. PTCH1 germline mutations cause Gorlin syndrome, a disorder characterized by developmental abnormalities and tumor susceptibility. The autosomal dominant inheritance, and the evidence for PTCH1 haploinsufficiency, suggest that fine-tuning systems of protein patched homolog 1 (PTC1) levels exist to properly regulate the pathway. Given the role of 5' untranslated region (5'UTR) in protein expression, our aim was to thoroughly explore cis-regulatory elements in the 5'UTR of PTCH1 transcript 1b. The (CGG)n polymorphism was the main potential regulatory element studied so far but with inconsistent results and no clear association between repeat number and disease risk. Using luciferase reporter constructs in human cell lines, here we show that the number of CGG repeats has no strong impact on gene expression, both at mRNA and protein levels. We observed variability in the length of 5'UTR and changes in abundance of the associated transcripts after pathway activation. We show that upstream AUG codons (uAUGs) present only in longer 5'UTRs could negatively regulate the amount of PTC1 isoform L (PTC1-L). The existence of an internal ribosome entry site (IRES) observed using different approaches and mapped in the region comprising the CGG repeats, would counteract the effect of the uAUGs and enable synthesis of PTC1-L under stressful conditions, such as during hypoxia. Higher relative translation efficiency of PTCH1b mRNA in HEK 293T cells cultured under hypoxia was observed by polysomal profiling and Western blot analyses. All our results point to an exceptionally complex and so far unexplored role of 5'UTR PTCH1b cis-element features in the regulation of the Hedgehog-Gli signaling pathway. PMID:25826662

  17. Deciphering Cis-Regulatory Element Mediated Combinatorial Regulation in Rice under Blast Infected Condition

    PubMed Central

    Deb, Arindam; Kundu, Sudip

    2015-01-01

    Combinations of cis-regulatory elements (CREs) present at the promoters facilitate the binding of several transcription factors (TFs), thereby altering the consequent gene expressions. Due to the eminent complexity of the regulatory mechanism, the combinatorics of CRE-mediated transcriptional regulation has been elusive. In this work, we have developed a new methodology that quantifies the co-occurrence tendencies of CREs present in a set of promoter sequences; these co-occurrence scores are filtered in three consecutive steps to test their statistical significance; and the significantly co-occurring CRE pairs are presented as networks. These networks of co-occurring CREs are further transformed to derive higher order of regulatory combinatorics. We have further applied this methodology on the differentially up-regulated gene-sets of rice tissues under fungal (Magnaporthe) infected conditions to demonstrate how it helps to understand the CRE-mediated combinatorial gene regulation. Our analysis includes a wide spectrum of biologically important results. The CRE pairs having a strong tendency to co-occur often exhibit very similar joint distribution patterns at the promoters of rice. We couple the network approach with experimental results of plant gene regulation and defense mechanisms and find evidence of auto and cross regulation among TF families, cross-talk among multiple hormone signaling pathways, similarities and dissimilarities in regulatory combinatorics between different tissues, etc. Our analyses have pointed to a highly distributed nature of the combinatorial gene regulation facilitating an efficient alteration in response to fungal attack. Altogether, our proposed methodology could be an important approach in understanding the combinatorial gene regulation. It can be further applied to unravel the tissue and/or condition specific combinatorial gene regulation in other eukaryotic systems with the availability of annotated genomic sequences and suitable

  18. Identification and Characterization of a cis-Regulatory Element for Zygotic Gene Expression in Chlamydomonas reinhardtii

    PubMed Central

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-01-01

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. We predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes. PMID:27172209

  19. Identification and characterization of a cis-regulatory element for zygotic gene expression in Chlamydomonas reinhardtii

    DOE PAGES Beta

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-03-26

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. Furthermore, we predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes.

  20. Identification and Characterization of a cis-Regulatory Element for Zygotic Gene Expression in Chlamydomonas reinhardtii.

    PubMed

    Hamaji, Takashi; Lopez, David; Pellegrini, Matteo; Umen, James

    2016-01-01

    Upon fertilization Chlamydomonas reinhardtii zygotes undergo a program of differentiation into a diploid zygospore that is accompanied by transcription of hundreds of zygote-specific genes. We identified a distinct sequence motif we term a zygotic response element (ZYRE) that is highly enriched in promoter regions of C. reinhardtii early zygotic genes. A luciferase reporter assay was used to show that native ZYRE motifs within the promoter of zygotic gene ZYS3 or intron of zygotic gene DMT4 are necessary for zygotic induction. A synthetic luciferase reporter with a minimal promoter was used to show that ZYRE motifs introduced upstream are sufficient to confer zygotic upregulation, and that ZYRE-controlled zygotic transcription is dependent on the homeodomain transcription factor GSP1. We predict that ZYRE motifs will correspond to binding sites for the homeodomain proteins GSP1-GSM1 that heterodimerize and activate zygotic gene expression in early zygotes. PMID:27172209

  1. Numb directs the subcellular localization of EAAT3 through binding the YxNxxF motif.

    PubMed

    Su, Jin-Feng; Wei, Jian; Li, Pei-Shan; Miao, Hong-Hua; Ma, Yong-Chao; Qu, Yu-Xiu; Xu, Jie; Qin, Jie; Li, Bo-Liang; Song, Bao-Liang; Xu, Zheng-Ping; Luo, Jie

    2016-08-15

    Excitatory amino acid transporter type 3 (EAAT3, also known as SLC1A1) is a high-affinity, Na(+)-dependent glutamate carrier that localizes primarily within the cell and at the apical plasma membrane. Although previous studies have reported proteins and sequence regions involved in EAAT3 trafficking, the detailed molecular mechanism by which EAAT3 is distributed to the correct location still remains elusive. Here, we identify that the YVNGGF sequence in the C-terminus of EAAT3 is responsible for its intracellular localization and apical sorting in rat hepatoma cells CRL1601 and Madin-Darby canine kidney (MDCK) cells, respectively. We further demonstrate that Numb, a clathrin adaptor protein, directly binds the YVNGGF motif and regulates the localization of EAAT3. Mutation of Y503, N505 and F508 within the YVNGGF motif to alanine residues or silencing Numb by use of small interfering RNA (siRNA) results in the aberrant localization of EAAT3. Moreover, both Numb and the YVNGGF motif mediate EAAT3 endocytosis in CRL1601 cells. In summary, our study suggests that Numb is a pivotal adaptor protein that mediates the subcellular localization of EAAT3 through binding the YxNxxF (where x stands for any amino acid) motif. PMID:27358480
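
    The YxNxxF consensus described above maps directly onto a regular expression. The Python sketch below is an illustration only: the helper name `find_yxnxxf` and the flanking residues are invented, and just the YVNGGF hexapeptide and the residue numbers Y503, N505 and F508 are taken from the abstract.

```python
import re

def find_yxnxxf(protein_seq, first_residue_number=1):
    """Return (residue_number, matched_text) for every YxNxxF occurrence.

    x is any amino acid, written here as any upper-case letter. A lookahead is
    used so that overlapping occurrences would also be reported.
    """
    pattern = re.compile(r"(?=(Y[A-Z]N[A-Z]{2}F))")
    return [(m.start() + first_residue_number, m.group(1))
            for m in pattern.finditer(protein_seq.upper())]

# Hypothetical fragments: only YVNGGF comes from the abstract; the flanking
# residues are invented, and numbering is chosen so that Y falls at 503.
wild_type = "AAAA" + "YVNGGF" + "KK"
triple_alanine = "AAAA" + "AVAGGA" + "KK"   # Y503A, N505A, F508A

print(find_yxnxxf(wild_type, first_residue_number=499))
print(find_yxnxxf(triple_alanine, first_residue_number=499))
```

    A pattern match of this kind only shows that a sequence fits the consensus; it says nothing about whether Numb actually binds that site.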

  2. cis regulatory requirements for hypodermal cell-specific expression of the Caenorhabditis elegans cuticle collagen gene dpy-7.

    PubMed Central

    Gilleard, J S; Barry, J D; Johnstone, I L

    1997-01-01

    The Caenorhabditis elegans cuticle collagens are encoded by a multigene family of between 50 and 100 members and are the major component of the nematode cuticular exoskeleton. They are synthesized in the hypodermis prior to secretion and incorporation into the cuticle and exhibit complex patterns of spatial and temporal expression. We have investigated the cis regulatory requirements for tissue- and stage-specific expression of the cuticle collagen gene dpy-7 and have identified a compact regulatory element which is sufficient to specify hypodermal cell reporter gene expression. This element appears to be a true tissue-specific promoter element, since it encompasses the dpy-7 transcription initiation sites and functions in an orientation-dependent manner. We have also shown, by interspecies transformation experiments, that the dpy-7 cis regulatory elements are functionally conserved between C. elegans and C. briggsae, and comparative sequence analysis supports the importance of the regulatory sequence that we have identified by reporter gene analysis. All of our data suggest that the spatial expression of the dpy-7 cuticle collagen gene is established essentially by a small tissue-specific promoter element and does not require upstream activator or repressor elements. In addition, we have found the DPY-7 polypeptide is very highly conserved between the two species and that the C. briggsae polypeptide can function appropriately within the C. elegans cuticle. This finding suggests a remarkably high level of conservation of individual cuticle components, and their interactions, between these two nematode species. PMID:9121480

  3. ChIP-Seq-Annotated Heliconius erato Genome Highlights Patterns of cis-Regulatory Evolution in Lepidoptera.

    PubMed

    Lewis, James J; van der Burg, Karin R L; Mazo-Vargas, Anyi; Reed, Robert D

    2016-09-13

    Uncovering phylogenetic patterns of cis-regulatory evolution remains a fundamental goal for evolutionary and developmental biology. Here, we characterize the evolution of regulatory loci in butterflies and moths using chromatin immunoprecipitation sequencing (ChIP-seq) annotation of regulatory elements across three stages of head development. In the process we provide a high-quality, functionally annotated genome assembly for the butterfly, Heliconius erato. Comparing cis-regulatory element conservation across six lepidopteran genomes, we find that regulatory sequences evolve at a pace similar to that of protein-coding regions. We also observe that elements active at multiple developmental stages are markedly more conserved than elements with stage-specific activity. Surprisingly, we also find that stage-specific proximal and distal regulatory elements evolve at nearly identical rates. Our study provides a benchmark for genome-wide patterns of regulatory element evolution in insects, and it shows that developmental timing of activity strongly predicts patterns of regulatory sequence evolution. PMID:27626657

  4. An ancient yet flexible cis-regulatory architecture allows localized Hedgehog tuning by patched/Ptch1

    PubMed Central

    Lorberbaum, David S; Ramos, Andrea I; Peterson, Kevin A; Carpenter, Brandon S; Parker, David S; De, Sandip; Hillers, Lauren E; Blake, Victoria M; Nishi, Yuichi; McFarlane, Matthew R; Chiang, Ason CY; Kassis, Judith A; Allen, Benjamin L; McMahon, Andrew P; Barolo, Scott

    2016-01-01

    The Hedgehog signaling pathway is part of the ancient developmental-evolutionary animal toolkit. Frequently co-opted to pattern new structures, the pathway is conserved among eumetazoans yet flexible and pleiotropic in its effects. The Hedgehog receptor, Patched, is transcriptionally activated by Hedgehog, providing essential negative feedback in all tissues. Our locus-wide dissections of the cis-regulatory landscapes of fly patched and mouse Ptch1 reveal abundant, diverse enhancers with stage- and tissue-specific expression patterns. The seemingly simple, constitutive Hedgehog response of patched/Ptch1 is driven by a complex regulatory architecture, with batteries of context-specific enhancers engaged in promoter-specific interactions to tune signaling individually in each tissue, without disturbing patterning elsewhere. This structure—one of the oldest cis-regulatory features discovered in animal genomes—explains how patched/Ptch1 can drive dramatic adaptations in animal morphology while maintaining its essential core function. It may also suggest a general model for the evolutionary flexibility of conserved regulators and pathways. DOI: http://dx.doi.org/10.7554/eLife.13550.001 PMID:27146892

  5. Differential contribution of cis-regulatory elements to higher order chromatin structure and expression of the CFTR locus.

    PubMed

    Yang, Rui; Kerschner, Jenny L; Gosalia, Nehal; Neems, Daniel; Gorsic, Lidija K; Safi, Alexias; Crawford, Gregory E; Kosak, Steven T; Leir, Shih-Hsing; Harris, Ann

    2016-04-20

    Higher order chromatin structure establishes domains that organize the genome and coordinate gene expression. However, the molecular mechanisms controlling transcription of individual loci within a topological domain (TAD) are not fully understood. The cystic fibrosis transmembrane conductance regulator (CFTR) gene provides a paradigm for investigating these mechanisms. CFTR occupies a TAD bordered by CTCF/cohesin binding sites within which are cell-type-selective cis-regulatory elements for the locus. We showed previously that intronic and extragenic enhancers, when occupied by specific transcription factors, are recruited to the CFTR promoter by a looping mechanism to drive gene expression. Here we use a combination of CRISPR/Cas9 editing of cis-regulatory elements and siRNA-mediated depletion of architectural proteins to determine the relative contribution of structural elements and enhancers to the higher order structure and expression of the CFTR locus. We found the boundaries of the CFTR TAD are conserved among diverse cell types and are dependent on CTCF and cohesin complex. Removal of an upstream CTCF-binding insulator alters the interaction profile, but has little effect on CFTR expression. Within the TAD, intronic enhancers recruit cell-type selective transcription factors and deletion of a pivotal enhancer element dramatically decreases CFTR expression, but has minor effect on its 3D structure. PMID:26673704

  6. FootprintDB: Analysis of Plant Cis-Regulatory Elements, Transcription Factors, and Binding Interfaces.

    PubMed

    Contreras-Moreira, Bruno; Sebastian, Alvaro

    2016-01-01

    FootprintDB is a database and search engine that compiles regulatory sequences from open access libraries of curated DNA cis-elements and motifs, and their associated transcription factors (TFs). It systematically annotates the binding interfaces of the TFs by exploiting protein-DNA complexes deposited in the Protein Data Bank. Each entry in footprintDB is thus a DNA motif linked to the protein sequence of the TF(s) known to recognize it, and in most cases, the set of predicted interface residues involved in specific recognition. This chapter explains step-by-step how to search for DNA motifs and protein sequences in footprintDB and how to focus the search to a particular organism. Two real-world examples are shown where this software was used to analyze transcriptional regulation in plants. Results are described with the aim of guiding users on their interpretation, and special attention is given to the choices users might face when performing similar analyses. PMID:27557773

  7. Combinatorial regulation modules on GmSBP2 promoter: a distal cis-regulatory domain confines the SBP2 promoter activity to the vascular tissue in vegetative organs.

    PubMed

    Waclawovsky, Alessandro J; Freitas, Rejane L; Rocha, Carolina S; Contim, Luis Antônio S; Fontes, Elizabeth P B

    2006-01-01

    The Glycine max sucrose binding protein (GmSBP2) promoter directs phloem-specific expression of reporter genes in transgenic tobacco. Here, we identified cis-regulatory domains (CRD) that contribute positive and negative regulation to the tissue-specific pattern of the GmSBP2 promoter. Negative regulatory elements in the distal CRD-A (-2000 to -700) sequences suppressed expression from the GmSBP2 promoter in tissues other than seed tissues and vascular tissues of vegetative organs. Deletion of this region relieved repression resulting in a constitutive promoter highly active in all tissues analyzed. Further deletions from the strong constitutive -700GmSBP2 promoter delimited several intercalating enhancer-like and repressing domains that function in a context-dependent manner. Histochemical examination revealed that the CRD-C (-445 to -367) harbors both negative and positive elements. This region abolished promoter expression in roots and in all tissues of stems except for the inner phloem. In contrast, it restores root meristem expression when fused to the -132pSBP2-GUS construct, which contains root meristem expression-repressing determinants mapped to the 44-bp CRD-G (-136 to -92). Thus, the GmSBP2 promoter is functionally organized into a proximal region with the combinatorial modular configuration of plant promoters and a distal domain, which restricts gene expression to the vascular tissues in vegetative organs. PMID:16574256

  8. A phylogenetic Gibbs sampler that yields centroid solutions of cis-regulatory sites

    SciTech Connect

    Newberg, Lee A.; Thompson, William A.; Conlan, Sean; Smith, Thomas M.; McCue, Lee Ann; Lawrence, Charles E.

    2007-07-15

    Identification of functionally conserved regulatory elements in sequence data from closely related organisms is becoming feasible, due to the rapid growth of public sequence databases. Closely related organisms are most likely to have common regulatory motifs; however, the recent speciation of such organisms results in a high degree of correlation in their genome sequences, confounding the detection of functional elements. Additionally, alignment algorithms that use optimization techniques are limited to the detection of a single alignment that may not be representative. Comparative-genomics studies must be able to address the phylogenetic correlation in the data and efficiently explore the alignment space, in order to make specific and biologically relevant predictions. Results: We describe here a Gibbs sampler that employs a full phylogenetic model and reports an ensemble centroid solution. We describe regulatory motif detection using both simulated and real data, and demonstrate that this approach achieves improved specificity, sensitivity, and positive predictive value over non-phylogenetic algorithms, and over phylogenetic algorithms that report a maximum likelihood solution.

  9. Correlating Gene Expression Variation with cis-Regulatory Polymorphism in Saccharomyces cerevisiae

    PubMed Central

    Chen, Kevin; van Nimwegen, Erik; Rajewsky, Nikolaus; Siegal, Mark L.

    2010-01-01

    Identifying the nucleotides that cause gene expression variation is a critical step in dissecting the genetic basis of complex traits. Here, we focus on polymorphisms that are predicted to alter transcription factor binding sites (TFBSs) in the yeast, Saccharomyces cerevisiae. We assembled a confident set of transcription factor motifs using recent protein binding microarray and ChIP-chip data and used our collection of motifs to predict a comprehensive set of TFBSs across the S. cerevisiae genome. We used a population genomics analysis to show that our predictions are accurate and significantly improve on our previous annotation. Although predicting gene expression from sequence is thought to be difficult in general, we identified a subset of genes for which changes in predicted TFBSs correlate well with expression divergence between yeast strains. Our analysis thus demonstrates both the accuracy of our new TFBS predictions and the feasibility of using simple models of gene regulation to causally link differences in gene expression to variation at individual nucleotides. PMID:20829281
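
    One common way to predict binding sites from a motif, and to ask whether a polymorphism changes a predicted site, is to score sequence windows against a log-odds position weight matrix. The Python sketch below illustrates that general idea only; it is not the authors' pipeline. The function names, the 4-bp motif counts, the uniform background, the pseudocount and the two "strain" sequences are all assumptions made for the example, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def best_site(seq, matrix):
    """Return (score, offset) of the highest-scoring forward-strand window."""
    width = len(matrix)
    best = None
    for i in range(len(seq) - width + 1):
        score = sum(matrix[j][seq[i + j]] for j in range(width))
        if best is None or score > best[0]:
            best = (score, i)
    return best

# Hypothetical counts for an invented 4-bp motif (consensus TATA).
counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
pwm = log_odds_matrix(counts)

# Two invented promoter alleles that differ by one nucleotide inside the site.
strain_a = "GGCTATACC"
strain_b = "GGCTGTACC"

score_a, offset_a = best_site(strain_a, pwm)
score_b, offset_b = best_site(strain_b, pwm)
print(offset_a, round(score_a, 2))
print(offset_b, round(score_b, 2))
print("score change caused by the polymorphism:", round(score_b - score_a, 2))
```

    A drop in the best score between alleles marks a candidate regulatory polymorphism, which could then be compared with measured expression differences between strains.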

  10. Two RNA-binding motifs in eIF3 direct HCV IRES-dependent translation

    PubMed Central

    Sun, Chaomin; Querol-Audí, Jordi; Mortimer, Stefanie A.; Arias-Palomo, Ernesto; Doudna, Jennifer A.; Nogales, Eva; Cate, Jamie H. D.

    2013-01-01

    The initiation of protein synthesis plays an essential regulatory role in human biology. At the center of the initiation pathway, the 13-subunit eukaryotic translation initiation factor 3 (eIF3) controls access of other initiation factors and mRNA to the ribosome by unknown mechanisms. Using electron microscopy (EM), bioinformatics and biochemical experiments, we identify two highly conserved RNA-binding motifs in eIF3 that direct translation initiation from the hepatitis C virus internal ribosome entry site (HCV IRES) RNA. Mutations in the RNA-binding motif of subunit eIF3a weaken eIF3 binding to the HCV IRES and the 40S ribosomal subunit, thereby suppressing eIF2-dependent recognition of the start codon. Mutations in the eIF3c RNA-binding motif also reduce 40S ribosomal subunit binding to eIF3, and inhibit eIF5B-dependent steps downstream of start codon recognition. These results provide the first connection between the structure of the central translation initiation factor eIF3 and recognition of the HCV genomic RNA start codon, molecular interactions that likely extend to the human transcriptome. PMID:23766293

  11. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  12. Multiple cis Regulatory Elements Control RANTES Promoter Activity in Alveolar Epithelial Cells Infected with Respiratory Syncytial Virus

    PubMed Central

    Casola, Antonella; Garofalo, Roberto P.; Haeberle, Helene; Elliott, Todd F.; Lin, Rongtuan; Jamaluddin, Mohammad; Brasier, Allan R.

    2001-01-01

    Respiratory syncytial virus (RSV) produces intense pulmonary inflammation, in part through its ability to induce chemokine synthesis in infected airway epithelial cells. RANTES (regulated upon activation, normally T-cell expressed and presumably secreted) is a CC chemokine which recruits and activates monocytes, lymphocytes, and eosinophils, all cell types present in the lung inflammatory infiltrate induced by RSV infection. In this study, we analyzed the mechanism of RSV-induced RANTES promoter activation in human type II alveolar epithelial cells (A549 cells). Promoter deletion and mutagenesis experiments indicate that RSV requires the presence of five different cis regulatory elements, located in the promoter fragment spanning from −220 to +55 nucleotides, corresponding to NF-κB, C/EBP, Jun/CREB/ATF, and interferon regulatory factor (IRF) binding sites. Although site mutations of the NF-κB, C/EBP, and CREB/AP-1 like sites reduce RSV-induced RANTES gene transcription to 50% or less, only mutations affecting IRF binding completely abolish RANTES inducibility. Supershift and microaffinity isolation assays were used to identify the different transcription factor family members whose DNA binding activity was RSV inducible. Expression of dominant negative mutants of these transcription factors further established their central role in virus-induced RANTES promoter activation. Our finding that the presence of multiple cis regulatory elements is required for full activation of the RANTES promoter in RSV-infected alveolar epithelial cells supports the enhanceosome model for RANTES gene transcription, which is absolutely dependent on binding of IRF transcription factors. The identification of regulatory mechanisms of RANTES gene expression is fundamental for rational design of inhibitors of RSV-induced lung inflammation. PMID:11413310

  13. Using machine learning to predict gene expression and discover sequence motifs

    NASA Astrophysics Data System (ADS)

    Li, Xuejing

    Recently, large amounts of experimental data for complex biological systems have become available. We use tools and algorithms from machine learning to build data-driven predictive models. We first present a novel algorithm to discover gene sequence motifs associated with temporal expression patterns of genes. Our algorithm, which is based on partial least squares (PLS) regression, is able to directly model the flow of information, from gene sequence to gene expression, to learn cis regulatory motifs and characterize associated gene expression patterns. Our algorithm outperforms traditional computational methods e.g. clustering in motif discovery. We then present a study of extending a machine learning model for transcriptional regulation predictive of genetic regulatory response to Caenorhabditis elegans. We show meaningful results both in terms of prediction accuracy on the test experiments and biological information extracted from the regulatory program. The model discovers DNA binding sites ab initio. We also present a case study where we detect a signal of lineage-specific regulation. Finally we present a comparative study on learning predictive models for motif discovery, based on different boosting algorithms: Adaptive Boosting (AdaBoost), Linear Programming Boosting (LPBoost) and Totally Corrective Boosting (TotalBoost). We evaluate and compare the performance of the three boosting algorithms via both statistical and biological validation, for hypoxia response in Saccharomyces cerevisiae.

  14. Modular cis-regulatory organization of developmentally expressed genes: two genes transcribed territorially in the sea urchin embryo, and additional examples.

    PubMed Central

    Kirchhamer, C V; Yuh, C H; Davidson, E H

    1996-01-01

    The cis-regulatory systems that control developmental expression of two sea urchin genes have been subjected to detailed functional analysis. Both systems are modular in organization: specific, separable fragments of the cis-regulatory DNA each containing multiple transcription factor target sites execute particular regulatory subfunctions when associated with reporter genes and introduced into the embryo. The studies summarized here were carried out on the CyIIIa gene, expressed in the embryonic aboral ectoderm and on the Endo16 gene, expressed in the embryonic vegetal plate, archenteron, and then midgut. The regulatory systems of both genes include modules that control particular aspects of temporal and spatial expression, and in both the territorial boundaries of expression depend on a combination of negative and positive functions. In both genes different regulatory modules control early and late embryonic expression. Modular cis-regulatory organization is widespread in developmentally regulated genes, and we present a tabular summary that includes many examples from mouse and Drosophila. We regard cis-regulatory modules as units of developmental transcription control, and also of evolution, in the assembly of transcription control systems. PMID:8790328

  15. Separate elements of the TERMINAL FLOWER 1 cis-regulatory region integrate pathways to control flowering time and shoot meristem identity.

    PubMed

    Serrano-Mislata, Antonio; Fernández-Nohales, Pedro; Doménech, María J; Hanzawa, Yoshie; Bradley, Desmond; Madueño, Francisco

    2016-09-15

    TERMINAL FLOWER 1 (TFL1) is a key regulator of Arabidopsis plant architecture that responds to developmental and environmental signals to control flowering time and the fate of shoot meristems. TFL1 expression is dynamic, being found in all shoot meristems, but not in floral meristems, with the level and distribution changing throughout development. Using a variety of experimental approaches we have analysed the TFL1 promoter to elucidate its functional structure. TFL1 expression is based on distinct cis-regulatory regions, the most important being located 3' of the coding sequence. Our results indicate that TFL1 expression in the shoot apical versus lateral inflorescence meristems is controlled through distinct cis-regulatory elements, suggesting that different signals control expression in these meristem types. Moreover, we identified a cis-regulatory region necessary for TFL1 expression in the vegetative shoot and required for a wild-type flowering time, supporting that TFL1 expression in the vegetative meristem controls flowering time. Our study provides a model for the functional organisation of TFL1 cis-regulatory regions, contributing to our understanding of how developmental pathways are integrated at the genomic level of a key regulator to control plant architecture. PMID:27385013

  16. RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements

    PubMed Central

    Chatagnon, Amandine; Veber, Philippe; Morin, Valérie; Bedo, Justin; Triqueneaux, Gérard; Sémon, Marie; Laudet, Vincent; d'Alché-Buc, Florence; Benoit, Gérard

    2015-01-01

    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding site occupancy dynamics and composition shows that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR bound regions are enriched in functional Sox17 binding sites and are characterized by a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view on the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status. PMID:25897113

  17. RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements.

    PubMed

    Chatagnon, Amandine; Veber, Philippe; Morin, Valérie; Bedo, Justin; Triqueneaux, Gérard; Sémon, Marie; Laudet, Vincent; d'Alché-Buc, Florence; Benoit, Gérard

    2015-05-26

    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding site occupancy dynamics and composition shows that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR bound regions are enriched in functional Sox17 binding sites and are characterized by a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view on the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status. PMID:25897113

  18. Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes

    PubMed Central

    Rombauts, Stephane; Florquin, Kobe; Lescot, Magali; Marchal, Kathleen; Rouzé, Pierre; Van de Peer, Yves

    2003-01-01

    The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called “search by signal” methods) and the delineation of promoters by considering both sequence content and structural features (“search by content” methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5′-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be

  19. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

    PubMed

    Karnik, Rahul; Beer, Michael A

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884
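
    The abstract above contrasts a full position weight matrix (PWM) with a learned threshold against k-mer words. The sketch below illustrates only that general idea and is not the MotifSpec algorithm: the function names, the pseudocount, and the uniform background of 0.25 are assumptions made for this example.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites (assumed input)."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases stay finite.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm

def best_window(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        best = max(best, (score, start))
    return best

def classify(pwm, sequence, threshold):
    """Call a sequence bound if its best window reaches the threshold."""
    score, _ = best_window(pwm, sequence)
    return score >= threshold

# Hypothetical aligned sites, used only to exercise the functions.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
score, start = best_window(pwm, "CCCTGACTCAGGG")
```

    In a discriminative setting the threshold would be tuned on labelled bound and unbound sequences instead of being fixed by hand; that tuning step is not shown here.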

  20. Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space

    PubMed Central

    Karnik, Rahul; Beer, Michael A.

    2015-01-01

    The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs. PMID:26465884

  1. Cis-regulatory Changes at FLOWERING LOCUS T Mediate Natural Variation in Flowering Responses of Arabidopsis thaliana

    PubMed Central

    Schwartz, Christopher; Balasubramanian, Sureshkumar; Warthmann, Norman; Michael, Todd P.; Lempe, Janne; Sureshkumar, Sridevi; Kobayashi, Yasushi; Maloof, Julin N.; Borevitz, Justin O.; Chory, Joanne; Weigel, Detlef

    2009-01-01

    Flowering time, a critical adaptive trait, is modulated by several environmental cues. These external signals converge on a small set of genes that in turn mediate the flowering response. Mutant analysis and subsequent molecular studies have revealed that one of these integrator genes, FLOWERING LOCUS T (FT), responds to photoperiod and temperature cues, two environmental parameters that greatly influence flowering time. As the central player in the transition to flowering, the protein coding sequence of FT and its function are highly conserved across species. Using QTL mapping with a new advanced intercross-recombinant inbred line (AI-RIL) population, we show that a QTL tightly linked to FT contributes to natural variation in the flowering response to the combined effects of photoperiod and ambient temperature. Using heterogeneous inbred families (HIF) and introgression lines, we fine map the QTL to a 6.7 kb fragment in the FT promoter. We confirm by quantitative complementation that FT has differential activity in the two parental strains. Further support for FT underlying the QTL comes from a new approach, quantitative knockdown with artificial microRNAs (amiRNAs). Consistent with the causal sequence polymorphism being in the promoter, we find that the QTL affects FT expression. Taken together, these results indicate that allelic variation at pathway integrator genes such as FT can underlie phenotypic variability and that this may be achieved through cis-regulatory changes. PMID:19652183

  2. Comparative epigenomics in distantly related teleost species identifies conserved cis-regulatory nodes active during the vertebrate phylotypic period

    PubMed Central

    Tena, Juan J.; González-Aguilera, Cristina; Fernández-Miñán, Ana; Vázquez-Marín, Javier; Parra-Acero, Helena; Cross, Joe W.; Rigby, Peter W.J.; Carvajal, Jaime J.; Wittbrodt, Joachim; Gómez-Skarmeta, José L.; Martínez-Morales, Juan R.

    2014-01-01

    The complex relationship between ontogeny and phylogeny has been the subject of attention and controversy since von Baer’s formulations in the 19th century. The classic concept that embryogenesis progresses from clade general features to species-specific characters has often been revisited. It has become accepted that embryos from a clade show maximum morphological similarity at the so-called phylotypic period (i.e., during mid-embryogenesis). According to the hourglass model, body plan conservation would depend on constrained molecular mechanisms operating at this period. More recently, comparative transcriptomic analyses have provided conclusive evidence that such molecular constraints exist. Examining cis-regulatory architecture during the phylotypic period is essential to understand the evolutionary source of body plan stability. Here we compare transcriptomes and key epigenetic marks (H3K4me3 and H3K27ac) from medaka (Oryzias latipes) and zebrafish (Danio rerio), two distantly related teleosts separated by an evolutionary distance of 115–200 Myr. We show that comparison of transcriptome profiles correlates with anatomical similarities and heterochronies observed at the phylotypic stage. Through comparative epigenomics, we uncover a pool of conserved regulatory regions (≈700), which are active during the vertebrate phylotypic period in both species. Moreover, we show that their neighboring genes encode mainly transcription factors with fundamental roles in tissue specification. We postulate that these regulatory regions, active in both teleost genomes, represent key constrained nodes of the gene networks that sustain the vertebrate body plan. PMID:24709821

  3. Novel green tissue-specific synthetic promoters and cis-regulatory elements in rice

    PubMed Central

    Wang, Rui; Zhu, Menglin; Ye, Rongjian; Liu, Zuoxiong; Zhou, Fei; Chen, Hao; Lin, Yongjun

    2015-01-01

    As an important part of synthetic biology, synthetic promoters have gradually become a hotspot in current biology. The purposes of the present study were to synthesize green tissue-specific promoters and to discover green tissue-specific cis-elements. We first assembled several regulatory sequences related to tissue-specific expression in different combinations, aiming to obtain novel green tissue-specific synthetic promoters. GUS assays of the transgenic plants indicated that 5 synthetic promoters showed green tissue-specific expression patterns and different expression efficiencies in various tissues. Subsequently, we scanned and counted the cis-elements in different tissue-specific promoters based on the plant cis-elements database PLACE and the rice cDNA microarray database CREP for green tissue-specific cis-element discovery, resulting in 10 potential cis-elements. The flanking sequence of one potential core element (GEAT) was predicted by bioinformatics. Then, the combination of GEAT and its flanking sequence was functionally identified with a synthetic promoter. GUS assays of the transgenic plants proved its green tissue-specificity. Furthermore, the function of the GEAT flanking sequence was analyzed in detail with site-directed mutagenesis. Our study provides an example for the synthesis of rice tissue-specific promoters and develops a feasible method for screening and functional identification of tissue-specific cis-elements with their flanking sequences at the genome-wide level in rice. PMID:26655679

  4. Extensive cis-Regulatory Variation Robust to Environmental Perturbation in Arabidopsis

    PubMed Central

    Cubillos, Francisco A.; Stegle, Oliver; Grondin, Cécile; Canut, Matthieu; Tisné, Sébastien; Gy, Isabelle

    2014-01-01

    cis- and trans-acting factors affect gene expression and responses to environmental conditions. However, for most plant systems, we lack a comprehensive map of these factors and their interaction with environmental variation. Here, we examined allele-specific expression (ASE) in an F1 hybrid to study how alleles from two Arabidopsis thaliana accessions affect gene expression. To investigate the effect of the environment, we used drought stress and developed a variance component model to estimate the combined genetic contributions of cis- and trans-regulatory polymorphisms, environmental factors, and their interactions. We quantified ASE for 11,003 genes, identifying 3318 genes with consistent ASE in control and stress conditions, demonstrating that cis-acting genetic effects are essentially robust to changes in the environment. Moreover, we found 1618 genes with genotype x environment (GxE) interactions, mostly cis x E interactions with magnitude changes in ASE. We found fewer trans x E interactions, but these effects were relatively less robust across conditions, showing more changes in the direction of the effect between environments; this confirms that trans-regulation plays an important role in the response to environmental conditions. Our data provide a detailed map of cis- and trans-regulation and GxE interactions in A. thaliana, laying the ground for mechanistic investigations and studies in other plants and environments. PMID:25428981
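
    As a rough illustration of what quantifying allele-specific expression involves, the sketch below tests whether the reads from the two parental alleles in an F1 hybrid depart from a 50:50 split. It uses a plain exact binomial test, which is a deliberately simpler stand-in for the variance component model described in the abstract; the function name and the read counts are hypothetical.

```python
from math import comb

def ase_pvalue(ref_reads, alt_reads):
    """Two-sided exact binomial p-value against an expected 50:50 allele ratio."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    k = min(ref_reads, alt_reads)
    # With p = 0.5 the distribution is symmetric, so double the lower tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

balanced = ase_pvalue(5, 5)    # equal counts from both alleles
skewed = ase_pvalue(10, 0)     # all reads from one allele
```

    A real analysis would also correct for multiple testing across genes and for mapping bias toward the reference allele, neither of which is handled here.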

  5. i-cisTarget 2015 update: generalized cis-regulatory enrichment analysis in human, mouse and fly.

    PubMed

    Imrichová, Hana; Hulselmans, Gert; Atak, Zeynep Kalender; Potier, Delphine; Aerts, Stein

    2015-07-01

    i-cisTarget is a web tool to predict regulators of a set of genomic regions, such as ChIP-seq peaks or co-regulated/similar enhancers. i-cisTarget can also be used to identify upstream regulators and their target enhancers starting from a set of co-expressed genes. Whereas the original version of i-cisTarget was focused on Drosophila data, the 2015 update also provides support for human and mouse data. i-cisTarget detects transcription factor motifs (position weight matrices) and experimental data tracks (e.g. from ENCODE, Roadmap Epigenomics) that are enriched in the input set of regions. As experimental data tracks we include transcription factor ChIP-seq data, histone modification ChIP-seq data and open chromatin data. The underlying processing method is based on a ranking-and-recovery procedure, allowing accurate determination of enrichment across heterogeneous datasets, while also discriminating direct from indirect target regions through a 'leading edge' analysis. We illustrate i-cisTarget on various Ewing sarcoma datasets to identify EWS-FLI1 targets starting from ChIP-seq, differential ATAC-seq, differential H3K27ac and differential gene expression data. Use of i-cisTarget is free and open to all, and there is no login requirement. Address: http://gbiomed.kuleuven.be/apps/lcb/i-cisTarget. PMID:25925574

  6. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus.

    PubMed

    Damle, Sagar; Davidson, Eric H

    2011-09-15

    Deployment of the gene-regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an auto-repressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring. PMID:21723273

  7. Precise cis-regulatory control of spatial and temporal expression of the alx-1 gene in the skeletogenic lineage of S. purpuratus

    PubMed Central

    Damle, Sagar; Davidson, Eric H.

    2011-01-01

    Deployment of the gene regulatory network (GRN) responsible for skeletogenesis in the embryo of the sea urchin Strongylocentrotus purpuratus is restricted to the large micromere lineage by a double negative regulatory gate. The gate consists of a GRN subcircuit composed of the pmar1 and hesC genes, which encode repressors and are wired in tandem, plus a set of target regulatory genes under hesC control. The skeletogenic cell state is specified initially by micromere-specific expression of these regulatory genes, viz. alx1, ets1, tbrain and tel, plus the gene encoding the Notch ligand Delta. Here we use a recently developed high throughput methodology for experimental cis-regulatory analysis to elucidate the genomic regulatory system controlling alx1 expression in time and embryonic space. The results entirely confirm the double negative gate control system at the cis-regulatory level, including definition of the functional HesC target sites, and add the crucial new information that the drivers of alx1 expression are initially Ets1, and then Alx1 itself plus Ets1. Cis-regulatory analysis demonstrates that these inputs quantitatively account for the magnitude of alx1 expression. Furthermore, the Alx1 gene product not only performs an auto-regulatory role, promoting a fast rise in alx1 expression, but also, when at high levels, it behaves as an autorepressor. A synthetic experiment indicates that this behavior is probably due to dimerization. In summary, the results we report provide the sequence level basis for control of alx1 spatial expression by the double negative gate GRN architecture, and explain the rising, then falling temporal expression profile of the alx1 gene in terms of its auto-regulatory genetic wiring. PMID:21723273

  8. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159
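
    The ranking-and-recovery idea mentioned above can be sketched as follows: each motif ranks all genes by how well their regulatory sequence matches it, and a motif is considered enriched when the input gene set is recovered early in that ranking. This is a simplified sketch under stated assumptions, not iRegulon's implementation; the function names, the normalisation, and the toy rankings are invented for the example.

```python
def recovery_auc(ranking, gene_set, top_n):
    """Normalized area under the recovery curve within the top_n ranked genes."""
    gene_set = set(gene_set)
    n = min(top_n, len(ranking))
    recovered = 0
    area = 0
    for gene in ranking[:n]:
        if gene in gene_set:
            recovered += 1
        area += recovered  # height of the recovery curve at this rank
    # Best case: every gene of the set sits at the very top of the ranking.
    best = sum(min(i, len(gene_set)) for i in range(1, n + 1))
    return area / best if best else 0.0

def rank_motifs(rankings_by_motif, gene_set, top_n):
    """Order motifs by how early their ranking recovers the gene set."""
    scores = {m: recovery_auc(r, gene_set, top_n) for m, r in rankings_by_motif.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical whole-genome rankings for two motifs.
rankings = {
    "motifA": ["g1", "g2", "g3", "g4", "g5", "g6"],
    "motifB": ["g5", "g6", "g3", "g4", "g1", "g2"],
}
ordered = rank_motifs(rankings, {"g1", "g2"}, top_n=4)
```

    Restricting the area to the top of the ranking is what lets the score reward motifs whose best-matching genes overlap the input set, while ignoring the long tail of weak matches.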

  9. Cis-Regulatory Elements Determine Germline Specificity and Expression Level of an Isopentenyltransferase Gene in Sperm Cells of Arabidopsis.

    PubMed

    Zhang, Jinghua; Yuan, Tong; Duan, Xiaomeng; Wei, Xiaoping; Shi, Tao; Li, Jia; Russell, Scott D; Gou, Xiaoping

    2016-03-01

    Flowering plant sperm cells transcribe a divergent and complex complement of genes. To examine promoter function, we chose an isopentenyltransferase gene known as PzIPT1. This gene is highly selectively transcribed in one sperm cell morphotype of Plumbago zeylanica, which preferentially fuses with the central cell during fertilization and is thus a founding cell of the primary endosperm. In transgenic Arabidopsis (Arabidopsis thaliana), the PzIPT1 promoter displays activity in both sperm cells, and progressive promoter truncation from the 5'-end results in a progressive decrease in reporter production, consistent with the occurrence of multiple enhancer sites. Cytokinin-dependent protein binding motifs are identified in the promoter sequence, which respond with stimulation by cytokinin. Expression of the PzIPT1 promoter in sperm cells confers specificity independently of the previously reported Germline Restrictive Silencer Factor binding sequence. Instead, a cis-acting regulatory region consisting of two duplicated 6-bp Male Gamete Selective Activation (MGSA) motifs occurs near the site of transcription initiation. Disruption of this sequence-specific site inactivates expression of a GFP reporter gene in sperm cells. Multiple copies of the MGSA motif fused with the minimal CaMV35S promoter elements confer reporter gene expression in sperm cells. Similar duplicated MGSA motifs are also identified from promoter sequences of sperm cell-expressed genes in Arabidopsis, suggesting selective activation is possibly a common mechanism for regulation of gene expression in sperm cells of flowering plants. PMID:26739233

  10. RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay

    PubMed Central

    Dean, Kimberly M.; Grayhack, Elizabeth J.

    2012-01-01

    We have developed a robust and sensitive method, called RNA-ID, to screen for cis-regulatory sequences in RNA using fluorescence-activated cell sorting (FACS) of yeast cells bearing a reporter in which expression of both superfolder green fluorescent protein (GFP) and yeast codon-optimized mCherry red fluorescent protein (RFP) is driven by the bidirectional GAL1,10 promoter. This method recapitulates previously reported progressive inhibition of translation mediated by increasing numbers of CGA codon pairs, and restoration of expression by introduction of a tRNA with an anticodon that base pairs exactly with the CGA codon. This method also reproduces effects of paromomycin and context on stop codon read-through. Five key features of this method contribute to its effectiveness as a selection for regulatory sequences: The system exhibits greater than a 250-fold dynamic range, a quantitative and dose-dependent response to known inhibitory sequences, exquisite resolution that allows nearly complete physical separation of distinct populations, and a reproducible signal between different cells transformed with the identical reporter, all of which are coupled with simple methods involving ligation-independent cloning, to create large libraries. Moreover, we provide evidence that there are sequences within a 9-nt library that cause reduced GFP fluorescence, suggesting that there are novel cis-regulatory sequences to be found even in this short sequence space. This method is widely applicable to the study of both RNA-mediated and codon-mediated effects on expression. PMID:23097427

  11. A cis-regulatory sequence from a short intergenic region gives rise to a strong microbe-associated molecular pattern-responsive synthetic promoter.

    PubMed

    Lehmeyer, Mona; Hanko, Erik K R; Roling, Lena; Gonzalez, Lilian; Wehrs, Maren; Hehl, Reinhard

    2016-06-01

    The high gene density in Arabidopsis thaliana leaves only relatively short intergenic regions for potential cis-regulatory sequences. To learn more about the regulation of genes harbouring only very short upstream intergenic regions, this study investigates a recently identified novel microbe-associated molecular pattern (MAMP)-responsive cis-sequence located within the 101 bp long intergenic region upstream of the At1g13990 gene. It is shown that the cis-regulatory sequence is sufficient for MAMP-responsive reporter gene activity in the context of its native promoter. The 3' UTR of the upstream gene has a quantitative effect on gene expression. In the context of a synthetic promoter, the cis-sequence is shown to achieve a strong increase in reporter gene activity as a monomer, dimer and tetramer. Mutation analysis of the cis-sequence determined the specific nucleotides required for gene expression activation. In transgenic A. thaliana the synthetic promoter harbouring a tetramer of the cis-sequence not only drives strong pathogen-responsive reporter gene expression but also shows a high background activity. The results of this study contribute to our understanding of how genes with very short upstream intergenic regions are regulated and how these regions can serve as a source for MAMP-responsive cis-sequences for synthetic promoter design. PMID:26833485

  12. 'In silico expression analysis', a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences.

    PubMed

    Bolívar, Julio C; Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated 'in silico expression analysis' was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the 'in silico expression analysis' resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the 'in silico expression analysis' predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. DATABASE URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366
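
    The first step described above, finding the genes that harbour a submitted cis-sequence in their promoter, can be sketched as a plain string search. This is an illustration only and does not reproduce the PathoPlant web tool: the gene names and promoter sequences are hypothetical, and searching both strands is an assumption of this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(promoter, motif):
    """Return sorted (position, strand) hits of motif in promoter (0-based)."""
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)

def genes_with_motif(promoters, motif):
    """Names of genes whose promoter contains the motif on either strand."""
    return sorted(g for g, seq in promoters.items() if find_motif(seq, motif))

# Hypothetical promoter fragments; the WT-box sequence is taken from the abstract.
promoters = {
    "geneA": "ttCGACTTTTaa",
    "geneB": "GGAAAAGTCGCC",
    "geneC": "ACGTACGTACGT",
}
wt_box_genes = genes_with_motif(promoters, "CGACTTTT")
```

    The resulting gene list is what would then be compared against expression data to rank stress conditions; that comparison is not sketched here.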

  13. ‘In silico expression analysis’, a novel PathoPlant web tool to identify abiotic and biotic stress conditions associated with specific cis-regulatory sequences

    PubMed Central

    Machens, Fabian; Brill, Yuri; Romanov, Artyom; Bülow, Lorenz; Hehl, Reinhard

    2014-01-01

    Using bioinformatics, putative cis-regulatory sequences can be easily identified using pattern recognition programs on promoters of specific gene sets. The abundance of predicted cis-sequences is a major challenge to associate these sequences with a possible function in gene expression regulation. To identify a possible function of the predicted cis-sequences, a novel web tool designated ‘in silico expression analysis’ was developed that correlates submitted cis-sequences with gene expression data from Arabidopsis thaliana. The web tool identifies the A. thaliana genes harbouring the sequence in a defined promoter region and compares the expression of these genes with microarray data. The result is a hierarchy of abiotic and biotic stress conditions to which these genes are most likely responsive. When testing the performance of the web tool, known cis-regulatory sequences were submitted to the ‘in silico expression analysis’ resulting in the correct identification of the associated stress conditions. When using a recently identified novel elicitor-responsive sequence, a WT-box (CGACTTTT), the ‘in silico expression analysis’ predicts that genes harbouring this sequence in their promoter are most likely Botrytis cinerea induced. Consistent with this prediction, the strongest induction of a reporter gene harbouring this sequence in the promoter is observed with B. cinerea in transgenic A. thaliana. Database URL: http://www.pathoplant.de/expression_analysis.php. PMID:24727366

  14. Mapping gene regulatory networks in Drosophila eye development by large-scale transcriptome perturbations and motif inference.

    PubMed

    Potier, Delphine; Davie, Kristofer; Hulselmans, Gert; Naval Sanchez, Marina; Haagen, Lotte; Huynh-Thu, Vân Anh; Koldere, Duygu; Celik, Arzu; Geurts, Pierre; Christiaens, Valerie; Aerts, Stein

    2014-12-24

    Genome control is operated by transcription factors (TFs) controlling their target genes by binding to promoters and enhancers. Conceptually, the interactions between TFs, their binding sites, and their functional targets are represented by gene regulatory networks (GRNs). Deciphering in vivo GRNs underlying organ development in an unbiased genome-wide setting involves identifying both functional TF-gene interactions and physical TF-DNA interactions. To reverse engineer the GRNs of eye development in Drosophila, we performed RNA-seq across 72 genetic perturbations and sorted cell types and inferred a coexpression network. Next, we derived direct TF-DNA interactions using computational motif inference, ultimately connecting 241 TFs to 5,632 direct target genes through 24,926 enhancers. Using this network, we found network motifs, cis-regulatory codes, and regulators of eye development. We validate the predicted target regions of Grainyhead by ChIP-seq and identify this factor as a general cofactor in the eye network, being bound to thousands of nucleosome-free regions. PMID:25533349

  15. Directional Phosphorylation and Nuclear Transport of the Splicing Factor SRSF1 Is Regulated by an RNA Recognition Motif.

    PubMed

    Serrano, Pedro; Aubol, Brandon E; Keshwani, Malik M; Forli, Stefano; Ma, Chen-Ting; Dutta, Samit K; Geralt, Michael; Wüthrich, Kurt; Adams, Joseph A

    2016-06-01

    Multisite phosphorylation is required for the biological function of serine-arginine (SR) proteins, a family of essential regulators of mRNA splicing. These modifications are catalyzed by serine-arginine protein kinases (SRPKs) that phosphorylate numerous serines in arginine-serine-rich (RS) domains of SR proteins using a directional, C-to-N-terminal mechanism. The present studies explore how SRPKs govern this highly biased phosphorylation reaction and investigate biological roles of the observed directional phosphorylation mechanism. Using NMR spectroscopy with two separately expressed domains of SRSF1, we showed that several residues in the RNA-binding motif 2 interact with the N-terminal region of the RS domain (RS1). These contacts provide a structural framework that balances the activities of SRPK1 and the protein phosphatase PP1, thereby regulating the phosphoryl content of the RS domain. Disruption of the implicated intramolecular RNA-binding motif 2-RS domain interaction impairs both the directional phosphorylation mechanism and the nuclear translocation of SRSF1 demonstrating that the intrinsic phosphorylation bias is obligatory for SR protein biological function. PMID:27091468

  16. A cis-regulatory site downregulates PTHLH in translocation t(8;12)(q13;p11.2) and leads to Brachydactyly Type E

    PubMed Central

    Maass, Philipp G.; Wirth, Jutta; Aydin, Atakan; Rump, Andreas; Stricker, Sigmar; Tinschert, Sigrid; Otero, Miguel; Tsuchimochi, Kaneyuki; Goldring, Mary B.; Luft, Friedrich C.; Bähring, Sylvia

    2010-01-01

    Parathyroid hormone-like hormone (PTHLH) is an important chondrogenic regulator; however, the gene has not been directly linked to human disease. We studied a family with autosomal-dominant Brachydactyly Type E (BDE) and identified a t(8;12)(q13;p11.2) translocation with breakpoints (BPs) upstream of PTHLH on chromosome 12p11.2 and a disrupted KCNB2 on 8q13. We sequenced the BPs and identified a highly conserved Activator protein 1 (AP-1) motif on 12p11.2, together with a C-ets-1 motif translocated from 8q13. AP-1 and C-ets-1 bound in vitro and in vivo at the derivative chromosome 8 breakpoint [der(8) BP], but were differently enriched between the wild-type and BP allele. We differentiated fibroblasts from BDE patients into chondrogenic cells and found that PTHLH and its targets, ADAMTS-7 and ADAMTS-12 were downregulated along with impaired chondrogenic differentiation. We next used human and murine chondrocytes and observed that the AP-1 motif stimulated, whereas der(8) BP or C-ets-1 decreased, PTHLH promoter activity. These results are the first to identify a cis-directed PTHLH downregulation as primary cause of human chondrodysplasia. PMID:20015959

  17. The Significance of Multivalent Bonding Motifs and "Bond Order" in DNA-Directed Nanoparticle Crystallization.

    PubMed

    Thaner, Ryan V; Eryazici, Ibrahim; Macfarlane, Robert J; Brown, Keith A; Lee, Byeongdu; Nguyen, SonBinh T; Mirkin, Chad A

    2016-05-18

    Multivalent oligonucleotide-based bonding elements have been synthesized and studied for the assembly and crystallization of gold nanoparticles. Through the use of organic branching points, divalent and trivalent DNA linkers were readily incorporated into the oligonucleotide shells that define DNA-nanoparticles and compared to monovalent linker systems. These multivalent bonding motifs enable the change of "bond strength" between particles and therefore modulate the effective "bond order." In addition, the improved accessibility of strands between neighboring particles, either due to multivalency or modifications to increase strand flexibility, gives rise to superlattices with less strain in the crystallites compared to traditional designs. Furthermore, the increased availability and number of binding modes also provide a new variable that allows previously unobserved crystal structures to be synthesized, as evidenced by the formation of a thorium phosphide superlattice. PMID:27148838

  18. Microevolution of cis-regulatory elements: an example from the pair-rule segmentation gene fushi tarazu in the Drosophila melanogaster subgroup.

    PubMed

    Bakkali, Mohammed

    2011-01-01

    The importance of non-coding DNAs that control transcription is ever more apparent, but the characterization and analysis of the evolution of such DNAs present challenges not found in the analysis of coding sequences. In this study of the cis-regulatory elements of the pair-rule segmentation gene fushi tarazu (ftz), I report the DNA sequences of ftz's zebra element (promoter) and a region containing the proximal enhancer from a total of 45 fly lines belonging to several populations of the species Drosophila melanogaster, D. simulans, D. sechellia, D. mauritiana, D. yakuba, D. teissieri, D. orena and D. erecta. Both elements evolve at a slower rate than ftz synonymous sites, thus reflecting their functional importance. The promoter evolves more slowly than the average for ftz's coding sequence while, on average, the enhancer evolves more rapidly, suggesting more functional constraint and effective purifying selection on the former. Comparative analysis of the number and nature of base substitutions failed to detect significant evidence for positive/adaptive selection in transcription-factor-binding sites. These seem to evolve at similar rates to regions not known to bind transcription factors. Although this result reflects the evolutionary flexibility of the transcription factor binding sites, it also suggests a complex and still not completely understood nature of even the characterized cis-regulatory sequences. The latter seem to contain more functional parts than those currently identified, some of which probably bind transcription factors. This study illustrates ways in which functional assignments of sequences within cis-acting sequences can be used in the search for adaptive evolution, but also highlights difficulties in how such functional assignment and analysis can be carried out. PMID:22073317
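
    The rate comparisons in this abstract (promoter versus enhancer versus synonymous sites) rest on counting substitutions between aligned sequences. The Python sketch below shows one simple way to do that, the uncorrected proportion of differing sites (p-distance). This choice of estimator is an assumption; the abstract does not say which measure the study used, and the two alignments are invented for illustration.

```python
# Toy sketch: uncorrected p-distance between two aligned sequences.
# The estimator and the sequences are assumptions, not taken from the study.

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Alignment columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0

# Hypothetical pairwise alignments for two regulatory regions
promoter_pair = ("ACGTACGTAC", "ACGTACGTAT")   # 1 difference in 10 sites
enhancer_pair = ("ACGTACGTAC", "ACCTAC-TAT")   # 2 differences in 9 ungapped sites

promoter_d = p_distance(*promoter_pair)
enhancer_d = p_distance(*enhancer_pair)
print(f"promoter p-distance: {promoter_d:.3f}")
print(f"enhancer p-distance: {enhancer_d:.3f}")
```

    In this toy case the enhancer pair shows the larger distance, the same direction of difference the abstract describes; a real analysis would normally also correct for multiple substitutions at the same site.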

  19. A point mutation to Galphai selectively blocks GoLoco motif binding: direct evidence for Galpha.GoLoco complexes in mitotic spindle dynamics.

    PubMed

    Willard, Francis S; Zheng, Zhen; Guo, Juan; Digby, Gregory J; Kimple, Adam J; Conley, Jason M; Johnston, Christopher A; Bosch, Dustin; Willard, Melinda D; Watts, Val J; Lambert, Nevin A; Ikeda, Stephen R; Du, Quansheng; Siderovski, David P

    2008-12-26

    Heterotrimeric G-protein Galpha subunits and GoLoco motif proteins are key members of a conserved set of regulatory proteins that influence invertebrate asymmetric cell division and vertebrate neuroepithelium and epithelial progenitor differentiation. GoLoco motif proteins bind selectively to the inhibitory subclass (Galphai) of Galpha subunits, and thus it is assumed that a Galphai.GoLoco motif protein complex plays a direct functional role in microtubule dynamics underlying spindle orientation and metaphase chromosomal segregation during cell division. To address this hypothesis directly, we rationally identified a point mutation to Galphai subunits that renders a selective loss-of-function for GoLoco motif binding, namely an asparagine-to-isoleucine substitution in the alphaD-alphaE loop of the Galpha helical domain. This GoLoco-insensitivity ("GLi") mutation prevented Galphai1 association with all human GoLoco motif proteins and abrogated interaction between the Caenorhabditis elegans Galpha subunit GOA-1 and the GPR-1 GoLoco motif. In contrast, the GLi mutation did not perturb any other biochemical or signaling properties of Galphai subunits, including nucleotide binding, intrinsic and RGS protein-accelerated GTP hydrolysis, and interactions with Gbetagamma dimers, adenylyl cyclase, and seven transmembrane-domain receptors. GoLoco insensitivity rendered Galphai subunits unable to recruit GoLoco motif proteins such as GPSM2/LGN and GPSM3 to the plasma membrane, and abrogated the exaggerated mitotic spindle rocking normally seen upon ectopic expression of wild type Galphai subunits in kidney epithelial cells. This GLi mutation should prove valuable in establishing the physiological roles of Galphai.GoLoco motif protein complexes in microtubule dynamics and spindle function during cell division as well as to delineate potential roles for GoLoco motifs in receptor-mediated signal transduction. PMID:18984596

  20. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline.

    PubMed

    Landeen, Emily L; Muirhead, Christina A; Wright, Lori; Meiklejohn, Colin D; Presgraves, Daven C

    2016-07-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower (approximately 3-fold or more) for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution. PMID:27404402
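
    The abstract reports that the promoter motif is overrepresented on the X chromosome. One common way to quantify such overrepresentation is a one-sided hypergeometric test, sketched below in Python. The test is shown as an assumption about how this could be done, not as the authors' method, and every count in the example is hypothetical.

```python
# Toy sketch: one-sided hypergeometric test for motif overrepresentation.
# The choice of test and all counts are assumptions, not data from the study.
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the motif."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

N = 1000  # hypothetical number of testis-expressed genes
K = 200   # hypothetical number of those genes carrying the motif
n = 150   # hypothetical number of X-linked genes among them
k = 50    # hypothetical number of X-linked genes carrying the motif

expected = n * K / N                      # motif genes expected on the X by chance
p_value = hypergeom_upper_tail(k, K, n, N)
print(f"expected {expected:.1f}, observed {k}, p = {p_value:.2e}")
```

    A small p-value here would indicate more motif-bearing genes on the X than chance alone predicts under these invented counts.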

  1. Sex Chromosome-wide Transcriptional Suppression and Compensatory Cis-Regulatory Evolution Mediate Gene Expression in the Drosophila Male Germline

    PubMed Central

    Landeen, Emily L.; Muirhead, Christina A.; Meiklejohn, Colin D.; Presgraves, Daven C.

    2016-01-01

    The evolution of heteromorphic sex chromosomes has repeatedly resulted in the evolution of sex chromosome-specific forms of regulation, including sex chromosome dosage compensation in the soma and meiotic sex chromosome inactivation in the germline. In the male germline of Drosophila melanogaster, a novel but poorly understood form of sex chromosome-specific transcriptional regulation occurs that is distinct from canonical sex chromosome dosage compensation or meiotic inactivation. Previous work shows that expression of reporter genes driven by testis-specific promoters is considerably lower—approximately 3-fold or more—for transgenes inserted into X chromosome versus autosome locations. Here we characterize this transcriptional suppression of X-linked genes in the male germline and its evolutionary consequences. Using transgenes and transpositions, we show that most endogenous X-linked genes, not just testis-specific ones, are transcriptionally suppressed several-fold specifically in the Drosophila male germline. In wild-type testes, this sex chromosome-wide transcriptional suppression is generally undetectable, being effectively compensated by the gene-by-gene evolutionary recruitment of strong promoters on the X chromosome. We identify and experimentally validate a promoter element sequence motif that is enriched upstream of the transcription start sites of hundreds of testis-expressed genes; evolutionarily conserved across species; associated with strong gene expression levels in testes; and overrepresented on the X chromosome. These findings show that the expression of X-linked genes in the Drosophila testes reflects a balance between chromosome-wide epigenetic transcriptional suppression and long-term compensatory adaptation by sex-linked genes. Our results have broad implications for the evolution of gene expression in the Drosophila male germline and for genome evolution. PMID:27404402

  2. Integrative Modeling of eQTLs and Cis-Regulatory Elements Suggests Mechanisms Underlying Cell Type Specificity of eQTLs

    PubMed Central

    Brown, Christopher D.; Mangravite, Lara M.; Engelhardt, Barbara E.

    2013-01-01

    Genetic variants in cis-regulatory elements or trans-acting regulators frequently influence the quantity and spatiotemporal distribution of gene transcription. Recent interest in expression quantitative trait locus (eQTL) mapping has paralleled the adoption of genome-wide association studies (GWAS) for the analysis of complex traits and disease in humans. Under the hypothesis that many GWAS associations tag non-coding SNPs with small effects, and that these SNPs exert phenotypic control by modifying gene expression, it has become common to interpret GWAS associations using eQTL data. To fully exploit the mechanistic interpretability of eQTL-GWAS comparisons, an improved understanding of the genetic architecture and causal mechanisms of cell type specificity of eQTLs is required. We address this need by performing an eQTL analysis in three parts: first we identified eQTLs from eleven studies on seven cell types; then we integrated eQTL data with cis-regulatory element (CRE) data from the ENCODE project; finally we built a set of classifiers to predict the cell type specificity of eQTLs. The cell type specificity of eQTLs is associated with eQTL SNP overlap with hundreds of cell type specific CRE classes, including enhancer, promoter, and repressive chromatin marks, regions of open chromatin, and many classes of DNA binding proteins. These associations provide insight into the molecular mechanisms generating the cell type specificity of eQTLs and the mode of regulation of corresponding eQTLs. Using a random forest classifier with cell specific CRE-SNP overlap as features, we demonstrate the feasibility of predicting the cell type specificity of eQTLs. We then demonstrate that CREs from a trait-associated cell type can be used to annotate GWAS associations in the absence of eQTL data for that cell type. 
    We anticipate that such integrative, predictive modeling of cell specificity will improve our ability to understand the mechanistic basis of human complex phenotypic
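
    The classifier described in this abstract uses the overlap between eQTL SNPs and cell type specific CREs as features. The Python sketch below shows how such binary overlap features could be built from genomic intervals. The track names, coordinates and SNP positions are invented, the single-chromosome 1-based closed-interval convention is an assumption, and the random forest itself is not shown.

```python
# Toy sketch: binary SNP-by-CRE overlap features. All coordinates are invented
# and a single chromosome with 1-based closed intervals is assumed.

def overlaps(pos, intervals):
    """True if a SNP position falls inside any (start, end) interval."""
    return any(start <= pos <= end for start, end in intervals)

def overlap_features(snp_pos, cre_tracks, track_order):
    """Return one 0/1 feature per CRE track, in the order given."""
    return [int(overlaps(snp_pos, cre_tracks[name])) for name in track_order]

# Hypothetical CRE tracks keyed by (cell type, element class)
cre_tracks = {
    ("LCL", "enhancer"): [(100, 200), (900, 950)],
    ("LCL", "open_chromatin"): [(150, 400)],
    ("liver", "enhancer"): [(500, 650)],
}
track_order = sorted(cre_tracks)  # fixed column order for the feature matrix

# Hypothetical eQTL SNP positions
snps = {"snp1": 180, "snp2": 600, "snp3": 700}

features = {
    name: overlap_features(pos, cre_tracks, track_order)
    for name, pos in snps.items()
}
for name, row in features.items():
    print(name, row)
```

    Each row could then serve as one input vector to a classifier that predicts the cell type specificity of the corresponding eQTL.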

  3. Autosomal recessive retinitis pigmentosa with homozygous rhodopsin mutation E150K and non-coding cis-regulatory variants in CRX-binding regions of SAMD7

    PubMed Central

    Van Schil, Kristof; Karlstetter, Marcus; Aslanidis, Alexander; Dannhausen, Katharina; Azam, Maleeha; Qamar, Raheel; Leroy, Bart P.; Depasse, Fanny; Langmann, Thomas; De Baere, Elfride

    2016-01-01

    The aim of this study was to unravel the molecular pathogenesis of an unusual retinitis pigmentosa (RP) phenotype observed in a Turkish consanguineous family. Homozygosity mapping revealed two candidate genes, SAMD7 and RHO. A homozygous RHO mutation c.448G > A, p.E150K was found in two affected siblings, while no coding SAMD7 mutations were identified. Interestingly, four non-coding homozygous variants were found in two SAMD7 genomic regions relevant for binding of the retinal transcription factor CRX (CRX-bound regions, CBRs) in these affected siblings. Three variants are located in a promoter CBR termed CBR1, while the fourth is located more downstream in CBR2. Transcriptional activity of these variants was assessed by luciferase assays and electroporation of mouse retinal explants with reporter constructs of wild-type and variant SAMD7 CBRs. The combined CBR2/CBR1 variant construct showed significantly decreased SAMD7 reporter activity compared to the wild-type sequence, suggesting a cis-regulatory effect on SAMD7 expression. As Samd7 is a recently identified Crx-regulated transcriptional repressor in retina, we hypothesize that these SAMD7 variants might contribute to the retinal phenotype observed here, characterized by unusual, recognizable pigment deposits, differing from the classic spicular intraretinal pigmentation observed in other individuals homozygous for p.E150K, and typically associated with RP in general. PMID:26887858

  4. The lncRNA Malat1 is dispensable for mouse development but its transcription plays a cis-regulatory role in the adult

    PubMed Central

    Zhang, Bin; Arun, Gayatri; Mao, Yuntao S.; Lazar, Zsolt; Hung, Gene; Bhattacharjee, Gourab; Xiao, Xiaokun; Booth, Carmen J.; Wu, Jie; Zhang, Chaolin; Spector, David L.

    2012-01-01

    SUMMARY Genome-wide studies have identified thousands of long noncoding RNAs (lncRNAs) lacking protein coding capacity. However, most lncRNAs are expressed at a very low level, and in most cases there is no genetic evidence to support their in vivo function. Malat1 (metastasis associated lung adenocarcinoma transcript 1) is among the most abundant and highly conserved lncRNAs, and it exhibits an uncommon 3′-end processing mechanism. In addition, its specific nuclear localization, developmental regulation, and dysregulation in cancer are suggestive of it having a critical biological function. We have characterized a Malat1 loss-of-function genetic model that indicates Malat1 is not essential for mouse pre- and post-natal development. Furthermore, depletion of Malat1 does not impact global gene expression, splicing factor level and phosphorylation status, or alternative pre-mRNA splicing. However, among a small number of genes that were dysregulated in adult Malat1 knockout mice, many were Malat1 neighboring genes, thus indicating a potential cis regulatory role of Malat1 gene transcription. PMID:22840402

  5. A cis-Regulatory Mutation in Troponin-I of Drosophila Reveals the Importance of Proper Stoichiometry of Structural Proteins During Muscle Assembly

    PubMed Central

    Firdaus, Hena; Mohan, Jayaram; Naz, Sarwat; Arathi, Prabhashankar; Ramesh, Saraf R.; Nongthomba, Upendra

    2015-01-01

    Rapid and high wing-beat frequencies achieved during insect flight are powered by the indirect flight muscles, the largest group of muscles present in the thorax. Any anomaly during the assembly and/or structural impairment of the indirect flight muscles gives rise to a flightless phenotype. Multiple mutagenesis screens in Drosophila melanogaster for defective flight behavior have led to the isolation and characterization of mutations that have been instrumental in the identification of many proteins and residues that are important for muscle assembly, function, and disease. In this article, we present a molecular-genetic characterization of a flightless mutation, flightless-H (fliH), originally designated as heldup-a (hdp-a). We show that fliH is a cis-regulatory mutation of the wings up A (wupA) gene, which codes for the troponin-I protein, one of the troponin complex proteins, involved in regulation of muscle contraction. The mutation leads to reduced levels of troponin-I transcript and protein. In addition to this, there is also coordinated reduction in transcript and protein levels of other structural protein isoforms that are part of the troponin complex. The altered transcript and protein stoichiometry ultimately culminates in unregulated acto-myosin interactions and a hypercontraction muscle phenotype. Our results shed new insights into the importance of maintaining the stoichiometry of structural proteins during muscle assembly for proper function with implications for the identification of mutations and disease phenotypes in other species, including humans. PMID:25747460

  6. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes

    PubMed Central

    Uhl, Juli D.; Zandvakili, Arya; Gebelein, Brian

    2016-01-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints. PMID:27058369

  7. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping

    PubMed Central

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-01-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should also include the positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of a whole-genome map of hundreds of thousands of regulatory elements in several hundred tissue/cell types is presently far beyond our capabilities. A possible alternative to the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we review experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  8. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping.

    PubMed

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-04-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should also include the positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of a whole-genome map of hundreds of thousands of regulatory elements in several hundred tissue/cell types is presently far beyond our capabilities. A possible alternative to the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we review experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  9. Distinct cis-Regulatory Elements from the Dlx1/Dlx2 Locus Mark Different Progenitor Cell Populations in the Ganglionic Eminences and Different Subtypes of Adult Cortical Interneurons

    PubMed Central

    Ghanem, Noël; Yu, Man; Long, Jason; Hatch, Gary; Rubenstein, John L. R.; Ekker, Marc

    2016-01-01

    Distinct subtypes of cortical GABAergic interneurons provide inhibitory signals that are indispensable for neural network function. The Dlx homeobox genes have a central role in regulating their development and function. We have characterized the activity of three cis-regulatory sequences involved in forebrain expression of vertebrate Dlx genes: upstream regulatory element 2 (URE2), I12b, and I56i. The three regulatory elements display regional and temporal differences in their activities within the lateral ganglionic eminence (LGE), medial ganglionic eminence (MGE), and caudal ganglionic eminence (CGE) and label distinct populations of tangentially migrating neurons at embryonic day 12.5 (E12.5) and E13.5. We provide evidence that the dorsomedial and ventral MGE are distinct sources of tangentially migrating neurons during midgestation. In the adult cortex, URE2 and I12b/I56i are differentially expressed in parvalbumin-, calretinin-, neuropeptide Y-, and neuronal nitric oxide synthase-positive interneurons; I12b and I56i were specifically active in somatostatin-, vasoactive intestinal peptide-, and calbindin-positive interneurons. These data suggest that interneuron subtypes use distinct combinations of Dlx1/Dlx2 enhancers from the time they are specified through adulthood. PMID:17494687

  10. A Hox Transcription Factor Collective Binds a Highly Conserved Distal-less cis-Regulatory Module to Generate Robust Transcriptional Outcomes.

    PubMed

    Uhl, Juli D; Zandvakili, Arya; Gebelein, Brian

    2016-04-01

    cis-regulatory modules (CRMs) generate precise expression patterns by integrating numerous transcription factors (TFs). Surprisingly, CRMs that control essential gene patterns can differ greatly in conservation, suggesting distinct constraints on TF binding sites. Here, we show that a highly conserved Distal-less regulatory element (DCRE) that controls gene expression in leg precursor cells recruits multiple Hox, Extradenticle (Exd) and Homothorax (Hth) complexes to mediate dual outputs: thoracic activation and abdominal repression. Using reporter assays, we found that abdominal repression is particularly robust, as neither individual binding site mutations nor a DNA binding deficient Hth protein abolished cooperative DNA binding and in vivo repression. Moreover, a re-engineered DCRE containing a distinct configuration of Hox, Exd, and Hth sites also mediated abdominal Hox repression. However, the re-engineered DCRE failed to perform additional segment-specific functions such as thoracic activation. These findings are consistent with two emerging concepts in gene regulation: First, the abdominal Hox/Exd/Hth factors utilize protein-protein and protein-DNA interactions to form repression complexes on flexible combinations of sites, consistent with the TF collective model of CRM organization. Second, the conserved DCRE mediates multiple cell-type specific outputs, consistent with recent findings that pleiotropic CRMs are associated with conserved TF binding and added evolutionary constraints. PMID:27058369

  11. Germ line and embryonic expression of Fex, a member of the Drosophila F-element retrotransposon family, is mediated by an internal cis-regulatory control region.

    PubMed Central

    Kerber, B; Fellert, S; Taubert, H; Hoch, M

    1996-01-01

    The F elements of Drosophila melanogaster belong to the superfamily of long interspersed nucleotide element retrotransposons. To date, F-element transcription has not been detected in flies. Here we describe the isolation of a member of the F-element family, termed Fex, which is transcribed in specific cells of the female and male germ lines and in various tissues during embryogenesis of D. melanogaster. Sequence analysis revealed that this element contains two complete open reading frames coding for a putative nucleic acid-binding protein and a putative reverse transcriptase. Functional analysis of the 5' region, using germ line transformation of Fex-lacZ reporter gene constructs, demonstrates that major aspects of tissue-specific Fex expression are controlled by internal cis-acting elements that lie in the putative coding region of open reading frame 1. These sequences mediate dynamic gene expression in eight expression domains during embryonic and germ line development. The capacity of the cis-regulatory region of the Fex element to mediate such complex expression patterns is unique among members of the long interspersed nucleotide element superfamily of retrotransposons and is reminiscent of regulatory regions of developmental control genes. PMID:8649411

  12. Direct Imaging of Hippocampal Epileptiform Calcium Motifs Following Kainic Acid Administration in Freely Behaving Mice

    PubMed Central

    Berdyyeva, Tamara K.; Frady, E. Paxon; Nassi, Jonathan J.; Aluisio, Leah; Cherkas, Yauheniya; Otte, Stephani; Wyatt, Ryan M.; Dugovic, Christine; Ghosh, Kunal K.; Schnitzer, Mark J.; Lovenberg, Timothy; Bonaventure, Pascal

    2016-01-01

    Prolonged exposure to abnormally high calcium concentrations is thought to be a core mechanism underlying hippocampal damage in epileptic patients; however, no prior study has characterized calcium activity during seizures in the live, intact hippocampus. We have directly investigated this possibility by combining whole-brain electroencephalographic (EEG) measurements with microendoscopic calcium imaging of pyramidal cells in the CA1 hippocampal region of freely behaving mice treated with the pro-convulsant kainic acid (KA). We observed that KA administration led to systematic patterns of epileptiform calcium activity: a series of large-scale, intensifying flashes of increased calcium fluorescence concurrent with a cluster of low-amplitude EEG waveforms. This was accompanied by a steady increase in cellular calcium levels (>5-fold increase relative to the baseline), followed by an intense spreading calcium wave characterized by a 218% increase in global mean intensity of calcium fluorescence (n = 8, range [114–349%], p < 10^-4; t-test). The wave had no consistent EEG phenotype and occurred before the onset of motor convulsions. Similar changes in calcium activity were also observed in animals treated with 2 different proconvulsant agents, N-methyl-D-aspartate (NMDA) and pentylenetetrazol (PTZ), suggesting the measured changes in calcium dynamics are a signature of seizure activity rather than a KA-specific pathology. Additionally, despite reducing the behavioral severity of KA-induced seizures, the anticonvulsant drug valproate (VA, 300 mg/kg) did not modify the observed abnormalities in calcium dynamics. These results confirm the presence of pathological calcium activity preceding convulsive motor seizures and support calcium as a candidate signaling molecule in a pathway connecting seizures to subsequent cellular damage. Integrating in vivo calcium imaging with traditional assessment of seizures could potentially increase translatability of pharmacological

  13. Novel applications of motif-directed profiling to identify disease resistance genes in plants

    PubMed Central

    2013-01-01

    Background Molecular profiling of gene families is a versatile tool to study diversity between individual genomes in sexual crosses and germplasm. Nucleotide binding site (NBS) profiling, in particular, targets conserved nucleotide binding site-encoding sequences of resistance gene analogs (RGAs), and is widely used to identify molecular markers for disease resistance (R) genes. Results In this study, we used NBS profiling to identify genome-wide locations of RGA clusters in the genome of potato clone RH. Positions of RGAs in the potato RH and DM genomes that were generated using profiling and genome sequencing, respectively, were compared. Largely overlapping results, but also interesting discrepancies, were found. Due to the clustering of RGAs, several parts of the genome are overexposed while others remain underexposed using NBS profiling. It is shown how the profiling of other gene families, i.e. protein kinases and different protein domain-coding sequences (i.e., TIR), can be used to achieve a better marker distribution. The power of profiling techniques is further illustrated using RGA cluster-directed profiling in a population of Solanum berthaultii. Multiple different paralogous RGAs within the Rpi-ber cluster could be genetically distinguished. Finally, an adaptation of the profiling protocol was made that allowed the parallel sequencing of profiling fragments using next generation sequencing. The types of RGAs that were tagged in this next-generation profiling approach largely overlapped with classical gel-based profiling. As a potential application of next-generation profiling, we showed how the R gene family associated with late blight resistance in the SH*RH population could be identified using a bulked segregant approach. Conclusions In this study, we provide a comprehensive overview of previously described and novel profiling primers and their genomic targets in potato through genetic mapping and comparative genomics. Furthermore, it is shown how

  14. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site.

    PubMed

    Wongsantichon, Jantana; Robinson, Robert C; Ketterman, Albert J

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  15. Epsilon glutathione transferases possess a unique class-conserved subunit interface motif that directly interacts with glutathione in the active site

    PubMed Central

    Wongsantichon, Jantana; Robinson, Robert C.; Ketterman, Albert J.

    2015-01-01

    Epsilon class glutathione transferases (GSTs) have been shown to contribute significantly to insecticide resistance. We report a new Epsilon class protein crystal structure from Drosophila melanogaster for the glutathione transferase DmGSTE6. The structure reveals a novel Epsilon clasp motif that is conserved across hundreds of millions of years of evolution of the insect Diptera order. This histidine-serine motif lies in the subunit interface and appears to contribute to quaternary stability as well as directly connecting the two glutathiones in the active sites of this dimeric enzyme. PMID:26487708

  16. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases

    PubMed Central

    Takács, Enikő; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G.

    2010-01-01

    dUTPases are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg2+-coordination. Our results on transient/steady-state kinetics, ligand-binding and a 1.80 Å-resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg2+ accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  17. Direct contacts between conserved motifs of different subunits provide major contribution to active site organization in human and mycobacterial dUTPases.

    PubMed

    Takács, Eniko; Nagy, Gergely; Leveles, Ibolya; Harmat, Veronika; Lopata, Anna; Tóth, Judit; Vértessy, Beáta G

    2010-07-16

    dUTP pyrophosphatases (dUTPases) are essential for genome integrity. Recent results allowed characterization of the role of conserved residues. Here we analyzed the Asp/Asn mutation within conserved Motif I of human and mycobacterial dUTPases, wherein the Asp residue was previously implicated in Mg2+-coordination. Our results on transient/steady-state kinetics, ligand binding and a 1.80 Å resolution structure of the mutant mycobacterial enzyme, in comparison with wild type and C-terminally truncated structures, argue that this residue has a major role in providing intra- and intersubunit contacts, but is not essential for Mg2+ accommodation. We conclude that in addition to the role of conserved motifs in substrate accommodation, direct subunit interactions between protein atoms of active site residues from different conserved motifs are crucial for enzyme function. PMID:20493855

  18. Identification of cis regulatory features in the embryonic zebrafish genome through large-scale profiling of H3K4me1 and H3K4me3 binding sites

    PubMed Central

    Aday, Aaron W.; Zhu, Lihua Julie; Lakshmanan, Abirami; Wang, Jie; Lawson, Nathan D.

    2011-01-01

    An organism’s genome sequence serves as a blueprint for the proteins and regulatory RNAs essential for cellular function. The genome also harbors cis-acting non-coding sequences that control gene expression and are essential to coordinate regulatory programs during embryonic development. However, the genome sequence is largely identical between cell types within a multi-cellular organism indicating that factors such as DNA accessibility and chromatin structure play a crucial role in governing cell-specific gene expression. Recent studies have identified particular chromatin modifications that define functionally distinct cis regulatory elements. Among these are forms of histone 3 that are mono- or tri-methylated at lysine 4 (H3K4me1 or H3K4me3, respectively), which bind preferentially to promoter and enhancer elements in the mammalian genome. In this work, we investigated whether these modified histones could similarly identify cis regulatory elements within the zebrafish genome. By applying chromatin immunoprecipitation followed by deep sequencing, we find that H3K4me1 and H3K4me3 are enriched at transcriptional start sites in the genome of the developing zebrafish embryo and that this association correlates with gene expression. We further find that these modifications associate with distal non-coding conserved elements, including known active enhancers. Finally, we demonstrate that it is possible to utilize H3K4me1 and H3K4me3 binding profiles in combination with available expression data to computationally identify relevant cis regulatory sequences flanking syn-expressed genes in the developing embryo. Taken together, our results indicate that H3K4me1 and H3K4me3 generally mark cis regulatory elements within the zebrafish genome and indicate that further characterization of the zebrafish using this approach will prove valuable in defining transcriptional networks in this model system. PMID:21435340

  19. Computation-Based Discovery of Related Transcriptional Regulatory Modules and Motifs Using an Experimentally Validated Combinatorial Model

    PubMed Central

    Halfon, Marc S.; Grad, Yonatan; Church, George M.; Michelson, Alan M.

    2002-01-01

    Gene expression is regulated by transcription factors that interact with cis-regulatory elements. Predicting these elements from sequence data has proven difficult. We describe here a successful computational search for elements that direct expression in a particular temporal-spatial pattern in the Drosophila embryo, based on a single well characterized enhancer model. The fly genome was searched to identify sequence elements containing the same combination of transcription factors as those found in the model. Experimental evaluation of the search results demonstrates that our method can correctly predict regulatory elements and highlights the importance of functional testing as a means of identifying false-positive results. We also show that the search results enable the identification of additional relevant sequence motifs whose functions can be empirically validated. This approach, combined with gene expression and phylogenetic sequence data, allows for genome-wide identification of related regulatory elements, an important step toward understanding the genetic regulatory networks involved in development. [Sequence data reported in this paper have been deposited in GenBank with accession nos. AF513981 (Eve MHE) and AF513982 (Hbr DME). Supplementary material is available online at http://www.genome.org. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: R. Blackman] PMID:12097338
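    The search described above looks for genomic regions that contain the same combination of binding sites as a model enhancer. A minimal sketch of that idea follows, assuming exact-string motifs and a fixed window size. Both are simplifications introduced here for illustration, and the function name and example motif strings are hypothetical, not taken from the paper.

```python
def windows_with_all_motifs(sequence, motifs, window):
    """Return start positions of windows that contain every motif at least once.

    Exact substring matching is an illustrative simplification; real binding
    sites are degenerate and are usually scored with weight matrices.
    """
    sequence = sequence.upper()
    hits = []
    # Slide a fixed-size window one base at a time along the sequence.
    for start in range(0, max(len(sequence) - window, 0) + 1):
        chunk = sequence[start:start + window]
        if all(motif.upper() in chunk for motif in motifs):
            hits.append(start)
    return hits


# Placeholder motifs, not real transcription factor binding sites.
example = "AAACCGGTTTTTGATAAGAAA"
print(windows_with_all_motifs(example, ["CCGG", "GATAAG"], window=16))
```

    Overlapping hit windows would normally be merged into a single candidate region before any experimental follow-up.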

  20. Subtle Changes in Motif Positioning Cause Tissue-Specific Effects on Robustness of an Enhancer's Activity

    PubMed Central

    Erceg, Jelena; Saunders, Timothy E.; Girardot, Charles; Devos, Damien P.; Hufnagel, Lars; Furlong, Eileen E. M.

    2014-01-01

    Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood. PMID:24391522

  1. Identification of potential regulatory motifs in odorant receptor genes by analysis of promoter sequences

    PubMed Central

    Michaloski, Jussara S.; Galante, Pedro A.F.

    2006-01-01

    Mouse odorant receptors (ORs) are encoded by >1000 genes dispersed throughout the genome. Each olfactory neuron expresses one single OR gene, while the rest of the genes remain silent. The mechanisms underlying OR gene expression are poorly understood. Here, we investigated if OR genes share common cis-regulatory sequences in their promoter regions. We carried out a comprehensive analysis in which the upstream regions of a large number of OR genes were compared. First, using RLM-RACE, we generated cDNAs containing the complete 5′-untranslated regions (5′-UTRs) for a total number of 198 mouse OR genes. Then, we aligned these cDNA sequences to the mouse genome so that the 5′ structure and transcription start sites (TSSs) of the OR genes could be precisely determined. Sequences upstream of the TSSs were retrieved and browsed for common elements. We found DNA sequence motifs that are overrepresented in the promoter regions of the OR genes. Most motifs resemble O/E-like sites and are preferentially localized within 200 bp upstream of the TSSs. Finally, we show that these motifs specifically interact with proteins extracted from nuclei prepared from the olfactory epithelium, but not from brain or liver. Our results show that the OR genes share common promoter elements. The present strategy should provide information on the role played by cis-regulatory sequences in OR gene regulation. PMID:16902085
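    Retrieving the sequence upstream of a mapped TSS, as described above, depends on the strand of the gene. The sketch below shows one way to do this, assuming the chromosome is held as a plain string and the TSS is a 0-based index of the first transcribed base. The function names and the default of 200 bp (chosen to mirror the window mentioned in the abstract) are assumptions made here.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chromosome, tss, strand, length=200):
    """Return up to `length` bases immediately upstream of a TSS.

    `tss` is the 0-based index of the first transcribed base. The result is
    always written 5'->3' on the strand of the gene.
    """
    if strand == "+":
        return chromosome[max(tss - length, 0):tss]
    if strand == "-":
        # Upstream of a minus-strand gene lies at higher coordinates.
        return reverse_complement(chromosome[tss + 1:tss + 1 + length])
    raise ValueError("strand must be '+' or '-'")


toy_chromosome = "AAAACCCCGGGGTTTT"
print(upstream_region(toy_chromosome, tss=8, strand="+", length=4))
print(upstream_region(toy_chromosome, tss=11, strand="-", length=4))
```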

  2. WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar

    PubMed Central

    Wang, Guandong; Yu, Taotao; Zhang, Weixiong

    2005-01-01

    Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we also use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at . PMID:15980501
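    The word-counting component mentioned above can be illustrated with a simple k-mer enrichment ranking. This is a generic sketch, not WordSpy's dictionary and grammar model: the function names, the pseudocount, and the log-ratio score are all assumptions introduced here.

```python
from collections import Counter
from math import log2


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_words(foreground, background, k=6, pseudocount=1.0):
    """Rank k-mers by log2 ratio of foreground to background frequency."""
    fg = count_kmers(foreground, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_words = 4 ** k  # number of possible DNA words of length k
    scores = {}
    for word, n in fg.items():
        fg_freq = (n + pseudocount) / (fg_total + pseudocount * n_words)
        bg_freq = (bg[word] + pseudocount) / (bg_total + pseudocount * n_words)
        scores[word] = log2(fg_freq / bg_freq)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


promoters = ["ACGTGACCCGGGTT", "TTACGTGAAACCC", "GGACGTGATTTAA"]
controls = ["TTTTTTTTTTTT", "CCCCCCAAAAAA"]
print(enriched_words(promoters, controls, k=6)[0][0])
```

    Using randomly selected promoters as the background set, as the abstract describes for significance evaluation, would replace the toy `controls` list here.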

  3. A novel pairwise comparison method for in silico discovery of statistically significant cis-regulatory elements in eukaryotic promoter regions: application to Arabidopsis.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Aliakbari, Massumeh; Lindlöf, Angelica; Ebrahimi, Mahdi; Ebrahimie, Esmaeil

    2015-01-01

    Cis regulatory elements (CREs), located within promoter regions, play a significant role in the blueprint for transcriptional regulation of genes. There is a growing interest to study the combinatorial nature of CREs including presence or absence of CREs, the number of occurrences of each CRE, as well as of their order and location relative to their target genes. Comparative promoter analysis has been shown to be a reliable strategy to test the significance of each component of promoter architecture. However, it remains unclear what level of difference in the number of occurrences of each CRE is of statistical significance in order to explain different expression patterns of two genes. In this study, we present a novel statistical approach for pairwise comparison of promoters of Arabidopsis genes in the context of number of occurrences of each CRE within the promoters. First, using the sample of 1000 Arabidopsis promoters, the results of the goodness of fit test and non-parametric analysis revealed that the number of occurrences of CREs in a promoter sequence is Poisson distributed. As a promoter sequence contained functional and non-functional CREs, we addressed the issue of the statistical distribution of functional CREs by analyzing the ChIP-seq datasets. The results showed that the number of occurrences of functional CREs over the genomic regions was determined as being Poisson distributed. In accordance with the obtained distribution of CREs occurrences, we suggested the Audic and Claverie (AC) test to compare two promoters based on the number of occurrences for the CREs. Superiority of the AC test over Chi-square (2×2) and Fisher's exact tests was also shown, as the AC test was able to detect a higher number of significant CREs. The two case studies on the Arabidopsis genes were performed in order to biologically verify the pairwise test for promoter comparison. Consequently, a number of CREs with significantly different occurrences was identified between
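    The Audic and Claverie test named above compares two Poisson-distributed counts x and y through the conditional probability p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. A minimal sketch follows, assuming N1 and N2 are the sizes of the two samples being compared (for promoters, possibly their lengths; that reading is an assumption made here) and using a one-sided tail sum as the p-value. The function names are hypothetical.

```python
from math import exp, lgamma, log


def ac_prob(y, x, ratio=1.0):
    """P(y | x) under the Audic-Claverie model, with ratio = N2 / N1."""
    log_p = (y * log(ratio)
             + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
             - (x + y + 1) * log(1.0 + ratio))
    return exp(log_p)


def ac_pvalue(x, y, ratio=1.0):
    """Smaller of the two one-sided tail probabilities for observing y given x."""
    lower = sum(ac_prob(k, x, ratio) for k in range(0, y + 1))  # P(Y <= y)
    upper = 1.0 - lower + ac_prob(y, x, ratio)                  # P(Y >= y)
    return min(lower, upper)


# One occurrence of a CRE in promoter 1 versus ten in promoter 2.
print(ac_pvalue(1, 10))
```

    Working in log space with `lgamma` avoids overflow in the factorials when counts are large.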

  4. Conformational Flexibility and Dynamics of the Internal Loop and Helical Regions of the Kink-Turn Motif in the Glycine Riboswitch by Site-Directed Spin-Labeling.

    PubMed

    Esquiaqui, Jackie M; Sherman, Eileen M; Ye, Jing-Dong; Fanucci, Gail E

    2016-08-01

    Site-directed spin-labeling (SDSL) electron paramagnetic resonance (EPR) spectroscopy provides a means for a solution state description of site-specific dynamics and flexibility of large RNAs, facilitating our understanding of the effects of environmental conditions such as ligands and ions on RNA structure and dynamics. Here, the utility and capability of EPR line shape analysis and distance measurements to monitor and describe site-specific changes in the conformational dynamics of internal loop nucleobases as well as helix-helix interactions of the kink-turn motif in the Vibrio cholerae (VC) glycine riboswitch that occur upon sequential K(+)-, Mg(2+)-, and glycine-induced folding were explored. Spin-labels were incorporated into the 232-nucleotide sequence via splinted ligation strategies. Thiouridine nucleobase labeling within the internal loop reveals unambiguous differential dynamics for two successive sites labeled, with varied rates of motion reflective of base flipping and base stacking. EPR-based distance measurements for nitroxide spin-labels incorporated within the RNA backbone in the helical regions of the kink-turn motif are reflective of helical formation and tertiary interaction induced by ion stabilization. In both instances, results indicate that the structural formation of the kink-turn motif in the VC glycine riboswitch can be stabilized by 100 mM K(+) where the conformational flexibility of the kink-turn motif is not further tightened by subsequent addition of divalent ions. Although glycine binding is likely to induce structural and dynamic changes in other regions, SDSL indicates no impact of glycine binding on the local dynamics or structure of the kink-turn motif as investigated here. Overall, these results demonstrate the ability of SDSL to interrogate site-specific base dynamics and packing of helices in large RNAs and demonstrate ion-induced stability of the kink-turn fold of the VC riboswitch. PMID:27427937

  5. Integration of bioinformatics and synthetic promoters leads to the discovery of novel elicitor-responsive cis-regulatory sequences in Arabidopsis.

    PubMed

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J; Hehl, Reinhard

    2012-09-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  6. Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis

    PubMed Central

    Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

    2012-01-01

    A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985

  7. Cis-Regulatory Elements Determine Germline Specificity and Expression Level of an Isopentenyltransferase Gene in Sperm Cells of Arabidopsis

    PubMed Central

    Yuan, Tong; Duan, Xiaomeng; Wei, Xiaoping; Li, Jia

    2016-01-01

    Flowering plant sperm cells transcribe a divergent and complex complement of genes. To examine promoter function, we chose an isopentenyltransferase gene known as PzIPT1. This gene is highly selectively transcribed in one sperm cell morphotype of Plumbago zeylanica, which preferentially fuses with the central cell during fertilization and is thus a founding cell of the primary endosperm. In transgenic Arabidopsis (Arabidopsis thaliana), the PzIPT1 promoter displays activity in both sperm cells, and progressive promoter truncation from the 5′-end results in a progressive decrease in reporter production, consistent with the occurrence of multiple enhancer sites. Cytokinin-dependent protein binding motifs are identified in the promoter sequence, which respond with stimulation by cytokinin. Expression of the PzIPT1 promoter in sperm cells confers specificity independently of the previously reported Germline Restrictive Silencer Factor binding sequence. Instead, a cis-acting regulatory region consisting of two duplicated 6-bp Male Gamete Selective Activation (MGSA) motifs occurs near the site of transcription initiation. Disruption of this sequence-specific site inactivates expression of a GFP reporter gene in sperm cells. Multiple copies of the MGSA motif fused with the minimal CaMV35S promoter elements confer reporter gene expression in sperm cells. Similar duplicated MGSA motifs are also identified from promoter sequences of sperm cell-expressed genes in Arabidopsis, suggesting selective activation is possibly a common mechanism for regulation of gene expression in sperm cells of flowering plants. PMID:26739233

  8. The strain-specific cis-acting element of beet curly top geminivirus DNA replication maps to the directly repeated motif of the ori.

    PubMed

    Choi, I R; Stenger, D C

    1996-12-01

    Strains of beet curly top geminivirus (BCTV) possess distinct cis- and trans-acting replication specificity elements which are not separately interchangeable among strains. Analysis of the replication competency of chimeric BCTV genomes, in which portions of the origin of DNA replication (ori) were derived from heterologous BCTV strains, have permitted identification of an essential cis-acting element governing strain-specific replication in a subgroup II geminivirus. Our studies indicate that the cis-acting element responsible for strain-specific replication properties resides within the directly repeated motif of the BCTV ori. Transient replication assays conducted in leaf disks and complementation experiments conducted in whole plants indicated that the trans-acting replication specificity element, residing within the amino-terminal region of the C1 Rep protein, may recognize and replicate a chimeric BCTV genome containing a heterologous ori so long as all or portions of the core element of the directly repeated motif are derived from the same strain as the Rep protein. As Rep protein binding to the core element of the directly repeated motif has been demonstrated by others to be essential for replication of subgroup III geminiviruses, our results support the hypothesis that replication specificity of subgroup II viruses is governed by processes similar to that of subgroup III viruses. However, a second cis-acting element of the ori, which appears to contribute to subgroup III virus replication specificity, does not seem to be required for replication specificity among the subgroup II viruses examined. Nonetheless, a potential role for a second cis-acting element in the BCTV ori contributing to maximal replication cannot be excluded. PMID:8941329

  9. Defining a Conformational Consensus Motif in Cotransin-Sensitive Signal Sequences: A Proteomic and Site-Directed Mutagenesis Study

    PubMed Central

    Klein, Wolfgang; Westendorf, Carolin; Schmidt, Antje; Conill-Cortés, Mercè; Rutz, Claudia; Blohs, Marcus; Beyermann, Michael; Protze, Jonas; Krause, Gerd; Krause, Eberhard; Schülein, Ralf

    2015-01-01

    The cyclodepsipeptide cotransin was described to inhibit the biosynthesis of a small subset of proteins by a signal sequence-discriminatory mechanism at the Sec61 protein-conducting channel. However, it was not clear how selective cotransin is, i.e. how many proteins are sensitive. Moreover, a consensus motif in signal sequences mediating cotransin sensitivity has not yet been described. To address these questions, we performed a proteomic study using cotransin-treated human hepatocellular carcinoma cells and the stable isotope labelling by amino acids in cell culture technique in combination with quantitative mass spectrometry. We used a saturating concentration of cotransin (30 micromolar) to also identify less-sensitive proteins and to discriminate the latter from completely resistant proteins. We found that the biosynthesis of almost all secreted proteins was cotransin-sensitive under these conditions. In contrast, biosynthesis of the majority of the integral membrane proteins was cotransin-resistant. Cotransin sensitivity of signal sequences was neither related to their length nor to their hydrophobicity. Instead, in the case of signal anchor sequences, we identified for the first time a conformational consensus motif mediating cotransin sensitivity. PMID:25806945

  10. AthaMap web tools for database-assisted identification of combinatorial cis-regulatory elements and the display of highly conserved transcription factor binding sites in Arabidopsis thaliana.

    PubMed

    Steffens, Nils Ole; Galuschka, Claudia; Schindler, Martin; Bülow, Lorenz; Hehl, Reinhard

    2005-07-01

    The AthaMap database generates a map of cis-regulatory elements for the Arabidopsis thaliana genome. AthaMap contains more than 7.4 × 10^6 putative binding sites for 36 transcription factors (TFs) from 16 different TF families. A newly implemented functionality allows the display of subsets of higher conserved transcription factor binding sites (TFBSs). Furthermore, a web tool was developed that permits a user-defined search for co-localizing cis-regulatory elements. The user can specify individually the level of conservation for each TFBS and a spacer range between them. This web tool was employed for the identification of co-localizing sites of known interacting TFs and TFs containing two DNA-binding domains. More than 1.8 × 10^5 combinatorial elements were annotated in the AthaMap database. These elements can also be used to identify more complex co-localizing elements consisting of up to four TFBSs. The AthaMap database and the connected web tools are a valuable resource for the analysis and the prediction of gene expression regulation at http://www.athamap.de. PMID:15980498
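    A search for co-localizing sites with a user-defined spacer range, as described above, can be sketched as a pairwise distance filter. This is an illustration under stated assumptions and not AthaMap's implementation: the `Site` record, the 0-based half-open coordinates, and the function name are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Site(NamedTuple):
    factor: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def colocalizing_pairs(sites_a: List[Site], sites_b: List[Site],
                       min_spacer: int, max_spacer: int) -> List[Tuple[Site, Site]]:
    """Return (a, b) pairs whose gap in bases lies within the spacer range."""
    pairs = []
    for a in sites_a:
        for b in sites_b:
            # Order the two sites along the sequence before measuring the gap.
            first, second = (a, b) if a.start <= b.start else (b, a)
            spacer = second.start - first.end  # negative if the sites overlap
            if min_spacer <= spacer <= max_spacer:
                pairs.append((a, b))
    return pairs


sites_a = [Site("A", 10, 16)]
sites_b = [Site("B", 20, 26), Site("B", 100, 106)]
print(colocalizing_pairs(sites_a, sites_b, min_spacer=0, max_spacer=10))
```

    The nested loop is quadratic in the number of sites. For genome-scale site lists, sorting by position and sweeping would be the more practical choice.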

  11. A systematic approach to identify functional motifs within vertebrate developmental enhancers

    PubMed Central

    Li, Qiang; Ritter, Deborah; Yang, Nan; Dong, Zhiqiang; Li, Hao; Chuang, Jeffrey H.; Guo, Su

    2012-01-01

    Uncovering the cis-regulatory logic of developmental enhancers is critical to understanding the role of non-coding DNA in development. However, it is cumbersome to identify functional motifs within enhancers, and thus few vertebrate enhancers have their core functional motifs revealed. Here we report a combined experimental and computational approach for discovering regulatory motifs in developmental enhancers. Making use of the zebrafish gene expression database, we computationally identified conserved non-coding elements (CNEs) likely to have a desired tissue-specificity based on the expression of nearby genes. Through a high throughput and robust enhancer assay, we tested the activity of ~100 such CNEs and efficiently uncovered developmental enhancers with desired spatial and temporal expression patterns in the zebrafish brain. Application of de novo motif prediction algorithms on a group of forebrain enhancers identified five top-ranked motifs, all of which were experimentally validated as critical for forebrain enhancer activity. These results demonstrate a systematic approach to discover important regulatory motifs in vertebrate developmental enhancers. Moreover, this dataset provides a useful resource for further dissection of vertebrate brain development and function. PMID:19850031

  12. Functional characterization of transcription factor motifs using cross-species comparison across large evolutionary distances.

    PubMed

    Kim, Jaebum; Cunningham, Ryan; James, Brian; Wyder, Stefan; Gibson, Joshua D; Niehuis, Oliver; Zdobnov, Evgeny M; Robertson, Hugh M; Robinson, Gene E; Werren, John H; Sinha, Saurabh

    2010-01-01

    We address the problem of finding statistically significant associations between cis-regulatory motifs and functional gene sets, in order to understand the biological roles of transcription factors. We develop a computational framework for this task, whose features include a new statistical score for motif scanning, the use of different scores for predicting targets of different motifs, and new ways to deal with redundancies among significant motif-function associations. This framework is applied to the recently sequenced genome of the jewel wasp, Nasonia vitripennis, making use of the existing knowledge of motifs and gene annotations in another insect genome, that of the fruitfly. The framework uses cross-species comparison to improve the specificity of its predictions, and does so without relying upon non-coding sequence alignment. It is therefore well suited for comparative genomics across large evolutionary divergences, where existing alignment-based methods are not applicable. We also apply the framework to find motifs associated with socially regulated gene sets in the honeybee, Apis mellifera, using comparisons with Nasonia, a solitary species, to identify honeybee-specific associations. PMID:20126523

  13. Aminopeptidase B, a glucagon-processing enzyme: site directed mutagenesis of the Zn2+-binding motif and molecular modelling

    PubMed Central

    Pham, Viet-Laï; Cadel, Marie-Sandrine; Gouzy-Darmon, Cécile; Hanquez, Chantal; Beinfeld, Margery C; Nicolas, Pierre; Etchebest, Catherine; Foulon, Thierry

    2007-01-01

    Background Aminopeptidase B (Ap-B; EC 3.4.11.6) catalyzes the cleavage of basic residues at the N-terminus of peptides and processes glucagon into miniglucagon. The enzyme exhibits, in vitro, a residual ability to hydrolyze leukotriene A4 into the pro-inflammatory lipid mediator leukotriene B4. The potential bi-functional nature of Ap-B is supported by close structural relationships with LTA4 hydrolase (LTA4H ; EC 3.3.2.6). A structure-function analysis is necessary for the detailed understanding of the enzymatic mechanisms of Ap-B and to design inhibitors, which could be used to determine the complete in vivo functions of the enzyme. Results The rat Ap-B cDNA was expressed in E. coli and the purified recombinant enzyme was characterized. 18 mutants of the H325EXXHX18E348 Zn2+-binding motif were constructed and expressed. All mutations were found to abolish the aminopeptidase activity. A multiple alignment of 500 sequences of the M1 family of aminopeptidases was performed to identify 3 sub-families of exopeptidases and to build a structural model of Ap-B using the x-ray structure of LTA4H as a template. Although the 3D structures of the two enzymes resemble each other, they differ in certain details. The role that a loop, delimiting the active center of Ap-B, plays in discriminating basic substrates, as well as the function of consensus motifs, such as RNP1 and Armadillo domain are discussed. Examination of electrostatic potentials and hydrophobic patches revealed important differences between Ap-B and LTA4H and suggests that Ap-B is involved in protein-protein interactions. Conclusion Alignment of the primary structures of the M1 family members clearly demonstrates the existence of different sub-families and highlights crucial residues in the enzymatic activity of the whole family. E. coli recombinant enzyme and Ap-B structural model constitute powerful tools for investigating the importance and possible roles of these conserved residues in Ap-B, LTA4H and M1

  14. O-xylosylation in a recombinant protein is directed at a common motif on glycine-serine linkers.

    PubMed

    Spencer, David; Novarra, Shabazz; Zhu, Liang; Mugabe, Sheila; Thisted, Thomas; Baca, Manuel; Depaz, Roberto; Barton, Christopher

    2013-11-01

    Glycine-serine (GS) linkers are commonly used in recombinant proteins to connect domains. Here, we report the posttranslational O-glycosylation of a GS linker in a novel fusion protein. The structure of the O-glycan moiety is a xylose-based core substituted with hexose and sulfated hexuronic acid residues. The total level of O-xylosylation was approximately 30% in the material expressed in HEK-293 cell lines. There was an approximate 10-fold reduction in O-xylosylation levels when the material was expressed in Chinese hamster ovary cell lines. Similar O-glycan structures have been reported for human urinary thrombomodulin and represent the initial building block for proteoglycans such as chondroitin sulfate and heparin. The sites of attachment, determined by electron transfer dissociation mass spectrometry, were localized to serine in the linker regions of the recombinant fusion protein. This attachment could be attributed, in part, to the inherent xylosyltransferase motif present in GS linkers. Elimination of the O-glycan moiety was achieved with modified linkers containing only glycine residues. The aggregation and fragmentation behavior of the GGG construct were comparable to the GSG-linked material during thermal stress. The O-xylosylation reported has implications for the manufacturing consistency of recombinant proteins containing GS linkers. PMID:24105735

  15. [Personal motif in art].

    PubMed

    Gerevich, József

    2015-01-01

    One of the basic questions of art psychology is whether a personal motif is to be found behind works of art and, if so, how openly or indirectly it appears in the work itself. Analysis of examples and documents from the fine arts and literature allows us to conclude that the personal motif, which the viewer can identify through symbols, at times easily and at others with more difficulty, gives an emotional plus to the artistic product. The personal motif may be found in traumatic experiences, in communication with the model or with other emotionally important persons (mourning, disappointment, revenge, hatred, rivalry, revolt etc.), in self-searching, or in self-analysis. The emotions are expressed in artistic activity either directly or indirectly. The intention nourished by the artist's identity (Kunstwollen) may stand in the way of spontaneous self-expression, channelling it into hidden paths. Under the influence of certain circumstances, the artist may arouse in the viewer, consciously or unconsciously, an illusory, misleading image of himself. An examination of the personal motif is one of the important research areas of art therapy. PMID:26202617

  16. A G-string positive cis-regulatory element in the LpS1 promoter binds two distinct nuclear factors distributed non-uniformly in Lytechinus pictus embryos.

    PubMed

    Xiang, M; Lu, S Y; Musso, M; Karsenty, G; Klein, W H

    1991-12-01

    The LpS1 alpha and beta genes of Lytechinus pictus are activated at the late cleavage stage of embryogenesis, with LpS1 mRNAs accumulating only in lineages contributing to aboral ectoderm. We had shown previously that 762 bp of 5' flanking DNA from the LpS1 beta gene was sufficient for proper temporal and aboral ectoderm specific expression. In the present study, we identified a strong positive cis-regulatory element at -70 bp to -75 bp in the LpS1 beta promoter with the sequence (G)6 and a similar, more distal cis-element at -721 bp to -726 bp. The proximal 'G-string' element interacted with two nuclear factors, one specific to ectoderm and one to endoderm/mesoderm nuclear extracts, whereas the distal G-string element interacted only with the ectoderm factor. The ectoderm and endoderm/mesoderm G-string factors were distinct based on their migratory behavior in electrophoretic mobility shift assays, binding site specificities, salt optima and EDTA sensitivity. The proximal G-string element shared homology with a binding site for the mammalian transcription factor IF1, a protein that binds to negative cis-regulatory elements in the mouse alpha 1(I) and alpha 2(I) collagen gene promoters. Competition experiments using wild-type and mutant oligonucleotides indicated that the ectoderm G-string factor and IF1 have similar recognition sites. Partially purified IF1 specifically bound to an oligonucleotide containing the proximal G-string of LpS1 beta. From our results, we suggest that the ectoderm G-string factor, a member of the G-rich DNA-binding protein family, activates the LpS1 gene in aboral ectoderm cells by binding to the LpS1 promoter at the proximal G-string site. PMID:1811948

  17. Fast and Efficient Cloning of Cis-Regulatory Sequences for High-Throughput Yeast One-Hybrid Analyses of Transcription Factors.

    PubMed

    Kelemen, Zsolt; Przybyla-Toscano, Jonathan; Tissot, Nicolas; Lepiniec, Loïc; Dubos, Christian

    2016-01-01

    Yeast one-hybrid (Y1H) assay has been proven to be a powerful technique to characterize in vivo the interaction between a given transcription factor (TF), or its DNA-binding domain (DBD), and target DNA sequences. Comprehensive characterization of TF/DBD and DNA interactions should allow designing synthetic promoters that would undoubtedly be valuable for biotechnological approaches. Here, we use the ligation-independent cloning system (LIC) in order to enhance the cloning efficiency of DNA motifs into the pHISi Y1H vector. LIC overcomes important limitations of traditional cloning technologies, since any DNA fragment can be cloned into LIC compatible vectors without using restriction endonucleases, ligation, or in vitro recombination. PMID:27557765

  18. Arabidopsis Flower and Embryo Developmental Genes are Repressed in Seedlings by Different Combinations of Polycomb Group Proteins in Association with Distinct Sets of Cis-regulatory Elements

    PubMed Central

    Liu, Jian; Zhang, Lei; He, Chongsheng; Shen, Wen-Hui; Jin, Hong; Xu, Lin; Zhang, Yijing

    2016-01-01

    Polycomb repressive complexes (PRCs) play crucial roles in transcriptional repression and developmental regulation in both plants and animals. In plants, depletion of different members of PRCs causes both overlapping and unique phenotypic defects. However, the underlying molecular mechanism determining the target specificity and functional diversity is not sufficiently characterized. Here, we quantitatively compared changes of tri-methylation at H3K27 in Arabidopsis mutants deprived of various key PRC components. We show that CURLY LEAF (CLF), a major catalytic subunit of PRC2, coordinates with different members of PRC1 in suppression of distinct plant developmental programs. We found that expression of flower development genes is repressed in seedlings preferentially via non-redundant role of CLF, which specifically associated with LIKE HETEROCHROMATIN PROTEIN1 (LHP1). In contrast, expression of embryo development genes is repressed by PRC1-catalytic core subunits AtBMI1 and AtRING1 in common with PRC2-catalytic enzymes CLF or SWINGER (SWN). This context-dependent role of CLF corresponds well with the change in H3K27me3 profiles, and is remarkably associated with differential co-occupancy of binding motifs of transcription factors (TFs), including MADS box and ABA-related factors. We propose that different combinations of PRC members distinctively regulate different developmental programs, and their target specificity is modulated by specific TFs. PMID:26760036

  19. Scavenger Chemokine (CXC Motif) Receptor 7 (CXCR7) Is a Direct Target Gene of HIC1 (Hypermethylated in Cancer 1)

    PubMed Central

    Van Rechem, Capucine; Rood, Brian R.; Touka, Majid; Pinte, Sébastien; Jenal, Mathias; Guérardel, Cateline; Ramsey, Keri; Monté, Didier; Bégue, Agnès; Tschan, Mario P.; Stephan, Dietrich A.; Leprince, Dominique

    2009-01-01

    The tumor suppressor gene HIC1 (Hypermethylated in Cancer 1) that is epigenetically silenced in many human tumors and is essential for mammalian development encodes a sequence-specific transcriptional repressor. The few genes that have been reported to be directly regulated by HIC1 include ATOH1, FGFBP1, SIRT1, and E2F1. HIC1 is thus involved in the complex regulatory loops modulating p53-dependent and E2F1-dependent cell survival and stress responses. We performed genome-wide expression profiling analyses to identify new HIC1 target genes, using HIC1-deficient U2OS human osteosarcoma cells infected with adenoviruses expressing either HIC1 or GFP as a negative control. These studies identified several putative direct target genes, including CXCR7, a G-protein-coupled receptor recently identified as a scavenger receptor for the chemokine SDF-1/CXCL12. CXCR7 is highly expressed in human breast, lung, and prostate cancers. Using quantitative reverse transcription-PCR analyses, we demonstrated that CXCR7 was repressed in U2OS cells overexpressing HIC1. Inversely, inactivation of endogenous HIC1 by RNA interference in normal human WI38 fibroblasts results in up-regulation of CXCR7 and SIRT1. In silico analyses followed by deletion studies and luciferase reporter assays identified a functional and phylogenetically conserved HIC1-responsive element in the human CXCR7 promoter. Moreover, chromatin immunoprecipitation (ChIP) and ChIP upon ChIP experiments demonstrated that endogenous HIC1 proteins are bound together with the C-terminal binding protein corepressor to the CXCR7 and SIRT1 promoters in WI38 cells. Taken together, our results implicate the tumor suppressor HIC1 in the transcriptional regulation of the chemokine receptor CXCR7, a key player in the promotion of tumorigenesis in a wide variety of cell types. PMID:19525223

  20. Are mutagenic non D-loop direct repeat motifs in mitochondrial DNA under a negative selection pressure?

    PubMed Central

    Lakshmanan, Lakshmi Narayanan; Gruber, Jan; Halliwell, Barry; Gunawan, Rudiyanto

    2015-01-01

    Non D-loop direct repeats (DRs) in mitochondrial DNA (mtDNA) have been commonly implicated in the mutagenesis of mtDNA deletions associated with neuromuscular disease and ageing. Further, these DRs have been hypothesized to put a constraint on the lifespan of mammals and are under a negative selection pressure. Using a compendium of 294 mammalian mtDNA, we re-examined the relationship between species lifespan and the mutagenicity of such DRs. Contradicting the prevailing hypotheses, we found no significant evidence that long-lived mammals possess fewer mutagenic DRs than short-lived mammals. By comparing DR counts in human mtDNA with those in selectively randomized sequences, we also showed that the number of DRs in human mtDNA is primarily determined by global mtDNA properties, such as the bias in synonymous codon usage (SCU) and nucleotide composition. We found that SCU bias in mtDNA positively correlates with DR counts, where repeated usage of a subset of codons leads to more frequent DR occurrences. While bias in SCU and nucleotide composition has been attributed to nucleotide mutational bias, mammalian mtDNA still exhibit higher SCU bias and DR counts than expected from such mutational bias, suggesting a lack of negative selection against non D-loop DRs. PMID:25855815
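
    Counting direct repeats, as this abstract does for mtDNA, can be sketched as a simple k-mer lookup. The abstract does not state the repeat length or mismatch tolerance used by the authors, so the fixed length k, the exact-match requirement, and the non-overlap rule below are illustrative assumptions rather than the paper's definition.

```python
from collections import defaultdict


def direct_repeats(sequence, k):
    """Map each k-mer to its start positions, keeping only k-mers that occur
    at least twice without overlapping (an assumed, simplified DR definition)."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    repeats = {}
    for kmer, starts in positions.items():
        # First and last occurrence must be at least k apart to be non-overlapping.
        if len(starts) >= 2 and starts[-1] - starts[0] >= k:
            repeats[kmer] = starts
    return repeats


if __name__ == "__main__":
    # Toy sequence, not real mtDNA.
    print(direct_repeats("ACGTTTACGT", 4))
```

    A comparison against randomized sequences, as described in the abstract, would call the same function on shuffled inputs and compare the resulting counts.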

  1. Overlapping CRE and E Box Motifs in the Enhancer Sequences of the Bovine Leukemia Virus 5′ Long Terminal Repeat Are Critical for Basal and Acetylation-Dependent Transcriptional Activity of the Viral Promoter: Implications for Viral Latency

    PubMed Central

    Calomme, Claire; Dekoninck, Ann; Nizet, Séverine; Adam, Emmanuelle; Nguyên, Thi Liên-Anh; Van Den Broeke, Anne; Willems, Luc; Kettmann, Richard; Burny, Arsène; Lint, Carine Van

    2004-01-01

    Bovine leukemia virus (BLV) infection is characterized by viral latency in a large proportion of cells containing an integrated provirus. In this study, we postulated that mechanisms directing the recruitment of deacetylases to the BLV 5′ long terminal repeat (LTR) could explain the transcriptional repression of viral expression in vivo. Accordingly, we showed that BLV promoter activity was induced by several deacetylase inhibitors (such as trichostatin A [TSA]) in the context of episomal LTR constructs and in the context of an integrated BLV provirus. Moreover, treatment of BLV-infected cells with TSA increased H4 acetylation at the viral promoter, showing a close correlation between the level of histone acetylation and transcriptional activation of the BLV LTR. Among the known cis-regulatory DNA elements located in the 5′ LTR, three E box motifs overlapping cyclic AMP responsive elements (CREs) in U3 were shown to be involved in transcriptional repression of BLV basal gene expression. Importantly, the combined mutations of these three E box motifs markedly reduced the inducibility of the BLV promoter by TSA. E boxes are susceptible to recognition by transcriptional repressors such as Max-Mad-mSin3 complexes that repress transcription by recruiting deacetylases. However, our in vitro binding studies failed to reveal the presence of Mad-Max proteins in the BLV LTR E box-specific complexes. Remarkably, TSA increased the occupancy of the CREs by CREB/ATF. Therefore, we postulated that the E box-specific complexes exerted their negative cooperative effect on BLV transcription by steric hindrance with the activators CREB/ATF and/or their transcriptional coactivators possessing acetyltransferase activities. Our results thus suggest that the overlapping CRE and E box elements in the BLV LTR were selected during evolution as a novel strategy for BLV to allow better silencing of viral transcription and to escape from the host immune response. PMID:15564493

  2. Directed Evolution Reveals the Binding Motif Preference of the LC8/DYNLL Hub Protein and Predicts Large Numbers of Novel Binders in the Human Proteome

    PubMed Central

    Rapali, Péter; Radnai, László; Süveges, Dániel; Harmat, Veronika; Tölgyesi, Ferenc; Wahlgren, Weixiao Y.; Katona, Gergely; Nyitray, László; Pál, Gábor

    2011-01-01

    LC8 dynein light chain (DYNLL) is a eukaryotic hub protein that is thought to function as a dimerization engine. Its interacting partners are involved in a wide range of cellular functions. In its dozens of hitherto identified binding partners DYNLL binds to a linear peptide segment. The known segments define a loosely characterized binding motif: [D/S]-4K-3X-2[T/V/I]-1Q0[T/V]1[D/E]2. The motifs are localized in disordered segments of the DYNLL-binding proteins and are often flanked by coiled coil or other potential dimerization domains. Based on a directed evolution approach, here we provide the first quantitative characterization of the binding preference of the DYNLL binding site. We displayed on M13 phage a naïve peptide library with seven fully randomized positions around a fixed, naturally conserved glutamine. The peptides were presented in a bivalent manner fused to a leucine zipper mimicking the natural dimer to dimer binding stoichiometry of DYNLL-partner complexes. The phage-selected consensus sequence V-5S-4R-3G-2T-1Q0T1E2 resembles the natural one, but is extended by an additional N-terminal valine, which increases the affinity of the monomeric peptide twentyfold. Leu-zipper dimerization increases the affinity into the subnanomolar range. By comparing crystal structures of an SRGTQTE-DYNLL and a dimeric VSRGTQTE-DYNLL complex we find that the affinity enhancing valine is accommodated in a binding pocket on DYNLL. Based on the in vitro evolved sequence pattern we predict a large number of novel DYNLL binding partners in the human proteome. Among these EML3, a microtubule-binding protein involved in mitosis contains an exact match of the phage-evolved consensus and binds to DYNLL with nanomolar affinity. These results significantly widen the scope of the human interactome around DYNLL and will certainly shed more light on the biological functions and organizing role of DYNLL in the human and other eukaryotic interactomes. PMID:21533121
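
    The loosely characterized natural motif quoted in this abstract, [D/S]-4 K-3 X-2 [T/V/I]-1 Q0 [T/V]1 [D/E]2, translates directly into a regular expression. The sketch below scans a protein sequence for literal matches; the function name and the example sequences are hypothetical, and a strict pattern match is only a rough stand-in for the quantitative binding preference the authors derive.

```python
import re

# Positions -4..+2 of the motif as written in the abstract; X is any residue.
DYNLL_MOTIF = re.compile(r"(?=([DS]K.[TVI]Q[TV][DE]))")


def find_motif(protein):
    """Return (start_index, matched_heptapeptide) for every match,
    including overlapping ones (hence the lookahead)."""
    return [(m.start(), m.group(1)) for m in DYNLL_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Made-up sequences used only to exercise the pattern.
    print(find_motif("AAASKGTQTEAAA"))
    print(find_motif("AAAVSRGTQTEAAA"))
```

    Note that the phage-selected consensus carries R rather than K at position -3, so the second toy sequence does not satisfy the strict pattern; a position weight matrix would capture such graded preferences better than a regular expression.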

  3. Directed evolution reveals the binding motif preference of the LC8/DYNLL hub protein and predicts large numbers of novel binders in the human proteome.

    PubMed

    Rapali, Péter; Radnai, László; Süveges, Dániel; Harmat, Veronika; Tölgyesi, Ferenc; Wahlgren, Weixiao Y; Katona, Gergely; Nyitray, László; Pál, Gábor

    2011-01-01

    LC8 dynein light chain (DYNLL) is a eukaryotic hub protein that is thought to function as a dimerization engine. Its interacting partners are involved in a wide range of cellular functions. In its dozens of hitherto identified binding partners DYNLL binds to a linear peptide segment. The known segments define a loosely characterized binding motif: [D/S](-4)K(-3)X(-2)[T/V/I](-1)Q(0)[T/V](1)[D/E](2). The motifs are localized in disordered segments of the DYNLL-binding proteins and are often flanked by coiled coil or other potential dimerization domains. Based on a directed evolution approach, here we provide the first quantitative characterization of the binding preference of the DYNLL binding site. We displayed on M13 phage a naïve peptide library with seven fully randomized positions around a fixed, naturally conserved glutamine. The peptides were presented in a bivalent manner fused to a leucine zipper mimicking the natural dimer to dimer binding stoichiometry of DYNLL-partner complexes. The phage-selected consensus sequence V(-5)S(-4)R(-3)G(-2)T(-1)Q(0)T(1)E(2) resembles the natural one, but is extended by an additional N-terminal valine, which increases the affinity of the monomeric peptide twentyfold. Leu-zipper dimerization increases the affinity into the subnanomolar range. By comparing crystal structures of an SRGTQTE-DYNLL and a dimeric VSRGTQTE-DYNLL complex we find that the affinity enhancing valine is accommodated in a binding pocket on DYNLL. Based on the in vitro evolved sequence pattern we predict a large number of novel DYNLL binding partners in the human proteome. Among these EML3, a microtubule-binding protein involved in mitosis contains an exact match of the phage-evolved consensus and binds to DYNLL with nanomolar affinity. These results significantly widen the scope of the human interactome around DYNLL and will certainly shed more light on the biological functions and organizing role of DYNLL in the human and other eukaryotic interactomes

  4. Mining Conditional Phosphorylation Motifs.

    PubMed

    Liu, Xiaoqing; Wu, Jun; Gong, Haipeng; Deng, Shengchun; He, Zengyou

    2014-01-01

    Phosphorylation motifs represent position-specific amino acid patterns around the phosphorylation sites in the set of phosphopeptides. Several algorithms have been proposed to uncover phosphorylation motifs, whereas the problem of efficiently discovering a set of significant motifs with sufficiently high coverage and non-redundancy still remains unsolved. Here we present a novel notion called conditional phosphorylation motifs. Through this new concept, the motifs whose over-expressiveness mainly benefits from its constituting parts can be filtered out effectively. To discover conditional phosphorylation motifs, we propose an algorithm called C-Motif for a non-redundant identification of significant phosphorylation motifs. C-Motif is implemented under the Apriori framework, and it tests the statistical significance together with the frequency of candidate motifs in a single stage. Experiments demonstrate that C-Motif outperforms some current algorithms such as MMFPh and Motif-All in terms of coverage and non-redundancy of the results and efficiency of the execution. The source code of C-Motif is available at: https://sourceforge.net/projects/cmotif/. PMID:26356863

  5. Analyses of fugu hoxa2 genes provide evidence for subfunctionalization of neural crest cell and rhombomere cis-regulatory modules during vertebrate evolution.

    PubMed

    McEllin, Jennifer A; Alexander, Tara B; Tümpel, Stefan; Wiedemann, Leanne M; Krumlauf, Robb

    2016-01-15

    Hoxa2 gene is a primary player in regulation of craniofacial programs of head development in vertebrates. Here we investigate the evolution of a Hoxa2 neural crest enhancer identified originally in mouse by comparing and contrasting the fugu hoxa2a and hoxa2b genes with their orthologous teleost and mammalian sequences. Using sequence analyses in combination with transgenic regulatory assays in zebrafish and mouse embryos we demonstrate subfunctionalization of regulatory activity for expression in hindbrain segments and neural crest cells between these two fugu co-orthologs. hoxa2a regulatory sequences have retained the ability to mediate expression in neural crest cells while those of hoxa2b include cis-elements that direct expression in rhombomeres. Functional dissection of the neural crest regulatory potential of the fugu hoxa2a and hoxa2b genes identify the previously unknown cis-element NC5, which is implicated in generating the differential activity of the enhancers from these genes. The NC5 region plays a similar role in the ability of this enhancer to mediate reporter expression in mice, suggesting it is a conserved component involved in control of neural crest expression of Hoxa2 in vertebrate craniofacial development. PMID:26632170

  6. Distance preferences in the arrangement of binding motifs and hierarchical levels in organization of transcription regulatory information

    PubMed Central

    Makeev, Vsevolod J.; Lifanov, Alexander P.; Nazina, Anna G.; Papatsenko, Dmitri A.

    2003-01-01

    We explored distance preferences in the arrangement of binding motifs for five transcription factors (Bicoid, Krüppel, Hunchback, Knirps and Caudal) in a large set of Drosophila cis-regulatory modules (CRMs). Analysis of non-overlapping binding motifs revealed the presence of periodic signals specific to particular combinations of binding motifs. The most striking periodic signals (10 bp for Bicoid and 11 bp for Hunchback) suggest preferential positioning of some binding site combinations on the same side of the DNA helix. We also analyzed distance preferences in arrangements of highly correlated overlapping binding motifs, such as Bicoid and Krüppel. Based on the distance analysis, we extracted preferential binding site arrangements and proposed models for potential composite elements (CEs) and antagonistic motif pairs involved in the function of developmental CRMs. Our results suggest that there are distinct hierarchical levels in the organization of transcription regulatory information. We discuss the role of the hierarchy in understanding transcriptional regulation and in detection of transcription regulatory regions in genomes. PMID:14530449
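
    The kind of distance analysis described in this abstract can be sketched in two steps: build a histogram of pairwise distances between binding-site positions, then fold the distances modulo a candidate period to see whether one phase is over-represented. The function names, the use of integer start coordinates, and the integer period are assumptions for illustration; the authors' actual statistics are not given in the abstract.

```python
from collections import Counter


def distance_histogram(sites_a, sites_b, max_dist):
    """Count pairwise distances (1..max_dist bp) between two sets of site positions."""
    counts = Counter()
    for a in sites_a:
        for b in sites_b:
            d = abs(b - a)
            if 0 < d <= max_dist:
                counts[d] += 1
    return counts


def phase_counts(histogram, period):
    """Fold a distance histogram modulo a candidate period (in bp)."""
    phases = Counter()
    for distance, n in histogram.items():
        phases[distance % period] += n
    return phases


if __name__ == "__main__":
    # Toy coordinates, not real Drosophila CRM data.
    hist = distance_histogram([0], [10, 20, 30, 35], max_dist=40)
    print(phase_counts(hist, period=10))
```

    A strong peak at one phase, relative to a shuffled-position control, would be consistent with sites preferring the same side of the DNA helix.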

  7. The ESEV PDZ-Binding Motif of the Avian Influenza A Virus NS1 Protein Protects Infected Cells from Apoptosis by Directly Targeting Scribble

    PubMed Central

    Liu, Hongbing; Golebiewski, Lisa; Dow, Eugene C.; Krug, Robert M.; Javier, Ronald T.; Rice, Andrew P.

    2010-01-01

    The NS1 protein from influenza A viruses contains a four-amino-acid sequence at its carboxyl terminus that is termed the PDZ-binding motif (PBM). The NS1 PBM is predicted to bind to cellular PDZ proteins and functions as a virulence determinant in infected mice. ESEV is the consensus PBM sequence of avian influenza viruses, while RSKV is the consensus sequence of human viruses. Currently circulating highly pathogenic H5N1 influenza viruses encode an NS1 protein with the ESEV PBM. We identified cellular targets of the avian ESEV PBM and identified molecular mechanisms involved in its function. Using glutathione S-transferase (GST) pull-down assays, we found that the ESEV PBM enables NS1 to associate with the PDZ proteins Scribble, Dlg1, MAGI-1, MAGI-2, and MAGI-3. Because Scribble possesses a proapoptotic activity, we investigated the interaction between NS1 and Scribble. The association between NS1 and Scribble is direct and requires the ESEV PBM and two Scribble PDZ domains. We constructed recombinant H3N2 viruses that encode an H6N6 avian virus NS1 protein with either an ESEV or mutant ESEA PBM, allowing an analysis of the ESEV PBM in infections in mammalian cells. The ESEV PBM enhanced viral replication up to 4-fold. In infected cells, NS1 with the ESEV PBM relocalized Scribble into cytoplasmic puncta concentrated in perinuclear regions and also protected cells from apoptosis. In addition, the latter effect was eliminated by small interfering RNA (siRNA)-mediated Scribble depletion. This study shows that one function of the avian ESEV PBM is to reduce apoptosis during infection through disruption of Scribble's proapoptotic function. PMID:20702615

  8. Tissue- and stage-specific Wnt target gene expression is controlled subsequent to β-catenin recruitment to cis-regulatory modules.

    PubMed

    Nakamura, Yukio; de Paiva Alves, Eduardo; Veenstra, Gert Jan C; Hoppler, Stefan

    2016-06-01

    Key signalling pathways, such as canonical Wnt/β-catenin signalling, operate repeatedly to regulate tissue- and stage-specific transcriptional responses during development. Although recruitment of nuclear β-catenin to target genomic loci serves as the hallmark of canonical Wnt signalling, mechanisms controlling stage- or tissue-specific transcriptional responses remain elusive. Here, a direct comparison of genome-wide occupancy of β-catenin with a stage-matched Wnt-regulated transcriptome reveals that only a subset of β-catenin-bound genomic loci are transcriptionally regulated by Wnt signalling. We demonstrate that Wnt signalling regulates β-catenin binding to Wnt target genes not only when they are transcriptionally regulated, but also in contexts in which their transcription remains unaffected. The transcriptional response to Wnt signalling depends on additional mechanisms, such as BMP or FGF signalling for the particular genes we investigated, which do not influence β-catenin recruitment. Our findings suggest a more general paradigm for Wnt-regulated transcriptional mechanisms, which is relevant for tissue-specific functions of Wnt/β-catenin signalling in embryonic development but also for stem cell-mediated homeostasis and cancer. Chromatin association of β-catenin, even to functional Wnt-response elements, can no longer be considered a proxy for identifying transcriptionally Wnt-regulated genes. Context-dependent mechanisms are crucial for transcriptional activation of Wnt/β-catenin target genes subsequent to β-catenin recruitment. Our conclusions therefore also imply that Wnt-regulated β-catenin binding in one context can mark Wnt-regulated transcriptional target genes for different contexts. PMID:27068107

  9. Tissue- and stage-specific Wnt target gene expression is controlled subsequent to β-catenin recruitment to cis-regulatory modules

    PubMed Central

    Nakamura, Yukio; de Paiva Alves, Eduardo; Veenstra, Gert Jan C.; Hoppler, Stefan

    2016-01-01

    Key signalling pathways, such as canonical Wnt/β-catenin signalling, operate repeatedly to regulate tissue- and stage-specific transcriptional responses during development. Although recruitment of nuclear β-catenin to target genomic loci serves as the hallmark of canonical Wnt signalling, mechanisms controlling stage- or tissue-specific transcriptional responses remain elusive. Here, a direct comparison of genome-wide occupancy of β-catenin with a stage-matched Wnt-regulated transcriptome reveals that only a subset of β-catenin-bound genomic loci are transcriptionally regulated by Wnt signalling. We demonstrate that Wnt signalling regulates β-catenin binding to Wnt target genes not only when they are transcriptionally regulated, but also in contexts in which their transcription remains unaffected. The transcriptional response to Wnt signalling depends on additional mechanisms, such as BMP or FGF signalling for the particular genes we investigated, which do not influence β-catenin recruitment. Our findings suggest a more general paradigm for Wnt-regulated transcriptional mechanisms, which is relevant for tissue-specific functions of Wnt/β-catenin signalling in embryonic development but also for stem cell-mediated homeostasis and cancer. Chromatin association of β-catenin, even to functional Wnt-response elements, can no longer be considered a proxy for identifying transcriptionally Wnt-regulated genes. Context-dependent mechanisms are crucial for transcriptional activation of Wnt/β-catenin target genes subsequent to β-catenin recruitment. Our conclusions therefore also imply that Wnt-regulated β-catenin binding in one context can mark Wnt-regulated transcriptional target genes for different contexts. PMID:27068107

  10. Redox active motifs in selenoproteins.

    PubMed

    Li, Fei; Lutz, Patricia B; Pepelyayeva, Yuliya; Arnér, Elias S J; Bayse, Craig A; Rozovsky, Sharon

    2014-05-13

    Selenoproteins use the rare amino acid selenocysteine (Sec) to act as the first line of defense against oxidants, which are linked to aging, cancer, and neurodegenerative diseases. Many selenoproteins are oxidoreductases in which the reactive Sec is connected to a neighboring Cys and able to form a ring. These Sec-containing redox motifs govern much of the reactivity of selenoproteins. To study their fundamental properties, we have used (77)Se NMR spectroscopy in concert with theoretical calculations to determine the conformational preferences and mobility of representative motifs. This use of (77)Se as a probe enables the direct recording of the properties of Sec as its environment is systematically changed. We find that all motifs have several ring conformations in their oxidized state. These ring structures are most likely stabilized by weak, nonbonding interactions between the selenium and the amide carbon. To examine how the presence of selenium and ring geometric strain governs the motifs' reactivity, we measured the redox potentials of Sec-containing motifs and their corresponding Cys-only variants. The comparisons reveal that for C-terminal motifs the redox potentials increased between 20-25 mV when the selenenylsulfide bond was changed to a disulfide bond. Changes of similar magnitude arose when we varied ring size or the motifs' flanking residues. This suggests that the presence of Sec is not tied to unusually low redox potentials. The unique roles of selenoproteins in human health and their chemical reactivities may therefore not necessarily be explained by lower redox potentials, as has often been claimed. PMID:24769567
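
    To put the reported 20-25 mV shift in energetic terms, the standard relation between a change in redox potential and a change in free energy can be applied. This is a back-of-the-envelope conversion, not a calculation from the paper, and it assumes a two-electron couple (n = 2), as is usual for disulfide and selenenylsulfide reduction, under standard conditions.

```latex
\begin{align}
\Delta\Delta G^{\circ} &= -\,n F\,\Delta E^{\circ},
  \qquad n = 2,\quad F \approx 96\,485\ \mathrm{C\,mol^{-1}} \\
|\Delta\Delta G^{\circ}| &\approx 2 \times 96\,485\ \mathrm{C\,mol^{-1}} \times (0.020\ \text{to}\ 0.025)\ \mathrm{V}
  \approx 3.9\ \text{to}\ 4.8\ \mathrm{kJ\,mol^{-1}}
\end{align}
```

    Under these assumptions the difference is on the order of 1 kcal/mol, which is consistent with the abstract's point that Sec does not impose an unusually low redox potential.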

  11. Motif module map reveals enforcement of aging by continual NF-κB activity

    PubMed Central

    Adler, Adam S.; Sinha, Saurabh; Kawahara, Tiara L.A.; Zhang, Jennifer Y.; Segal, Eran; Chang, Howard Y.

    2007-01-01

    Aging is characterized by specific alterations in gene expression, but their underlying mechanisms and functional consequences are not well understood. Here we develop a systematic approach to identify combinatorial cis-regulatory motifs that drive age-dependent gene expression across different tissues and organisms. Integrated analysis of 365 microarrays spanning nine tissue types predicted fourteen motifs as major regulators of age-dependent gene expression in human and mouse. The motif most strongly associated with aging was that of the transcription factor NF-κB. Inducible genetic blockade of NF-κB for 2 wk in the epidermis of chronologically aged mice reverted the tissue characteristics and global gene expression programs to those of young mice. Age-specific NF-κB blockade and orthogonal cell cycle interventions revealed that NF-κB controls cell cycle exit and gene expression signature of aging in parallel but not sequential pathways. These results identify a conserved network of regulatory pathways underlying mammalian aging and show that NF-κB is continually required to enforce many features of aging in a tissue-specific manner. PMID:18055696

  12. Characterization of the human lipoprotein lipase (LPL) promoter: Evidence of two cis-regulatory regions, LP-α and LP-β, of importance for the differentiation-linked induction of the LPL gene during adipogenesis

    SciTech Connect

    Enerbaeck, S.; Ohlsson, B.G.; Samuelsson, L.; Bjursell, G.

    1992-10-01

    When preadipocytes differentiate into adipocytes, several differentiation-linked genes are activated. Lipoprotein lipase (LPL) is one of the first genes induced during this process. To investigate early events in adipocyte development, we have focused on the transcriptional activation of the LPL gene. For this purpose, we have cloned and fused different parts of intragenic and flanking sequences with a chloramphenicol acetyltransferase reporter gene. Transient transfection experiments and DNase I hypersensitivity assays indicate that several positive as well as negative elements contribute to transcriptional regulation of the LPL gene. When reporter gene constructs were stably introduced into preadipocytes, we were able to monitor and compare the activation patterns of different promoter deletion mutants at selected time points representing the process of adipocyte development. We could delimit two cis-regulatory elements important for gradual activation of the LPL gene during adipocyte development in vitro. These elements, LP-α (-702 to -666) and LP-β (-468 to -430), contain a striking similarity to a consensus sequence known to bind the transcription factors HNF-3 and fork head. Results of gel mobility shift assays and DNase I and exonuclease III in vitro protection assays indicate that factors with DNA-binding properties similar to those of the HNF-3/fork head family of transcription factors are present in adipocytes and interact with LP-α and LP-β. We also demonstrate that LP-α and LP-β were both capable of conferring a differentiation-linked expression pattern to a heterologous promoter, thus mimicking the expression of the endogenous LPL gene during adipocyte differentiation. These findings indicate that interactions with LP-α and LP-β could be a part of a differentiation switch governing induction of the LPL gene during adipocyte differentiation. 48 refs., 11 figs.

  13. Temporal motifs in time-dependent networks

    NASA Astrophysics Data System (ADS)

    Kovanen, Lauri; Karsai, Márton; Kaski, Kimmo; Kertész, János; Saramäki, Jari

    2011-11-01

    Temporal networks are commonly used to represent systems where connections between elements are active only for restricted periods of time, such as telecommunication, neural signal processing, biochemical reaction and human social interaction networks. We introduce the framework of temporal motifs to study the mesoscale topological-temporal structure of temporal networks in which the events of nodes do not overlap in time. Temporal motifs are classes of similar event sequences, where the similarity refers not only to topology but also to the temporal order of the events. We provide a mapping from event sequences to coloured directed graphs that enables an efficient algorithm for identifying temporal motifs. We discuss some aspects of temporal motifs, including causality and null models, and present basic statistics of temporal motifs in a large mobile call network.

  14. Fast approximate motif statistics.

    PubMed

    Nicodème, P

    2001-01-01

    We present in this article a fast approximate method for computing the statistics of a number of non-self-overlapping matches of motifs in a random text in the nonuniform Bernoulli model. This method is well suited for protein motifs where the probability of self-overlap of motifs is small. For 96% of the PROSITE motifs, the expectations of occurrences of the motifs in a 7-million-amino-acids random database are computed by the approximate method with less than 1% error when compared with the exact method. Processing of the whole PROSITE takes about 30 seconds with the approximate method. We apply this new method to a comparison of the C. elegans and S. cerevisiae proteomes. PMID:11535175
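
    As a point of reference for the quantity being approximated, the expected number of match positions of a motif in a random Bernoulli text follows from linearity of expectation. The sketch below is not the paper's approximate method for non-self-overlapping matches; it only computes the plain expectation over all start positions. The function name, the motif encoding (one string of allowed letters per position) and the letter frequencies are illustrative assumptions.

```python
from math import prod


def expected_occurrences(motif, freqs, text_length):
    """Expected number of start positions at which `motif` matches a random
    text of `text_length` letters drawn independently from `freqs`.

    `motif` is a sequence of positions; each position is a string (or set)
    of the letters allowed there, in the spirit of a PROSITE-style pattern.
    Overlapping matches are all counted (hypothetical helper, not the
    paper's non-overlapping statistic).
    """
    m = len(motif)
    if text_length < m:
        return 0.0
    # Probability that one fixed window matches: product over positions of
    # the total frequency of the letters allowed at that position.
    p_match = prod(sum(freqs[letter] for letter in allowed) for allowed in motif)
    # Linearity of expectation over the (text_length - m + 1) windows.
    return (text_length - m + 1) * p_match


# Toy three-letter alphabet with nonuniform frequencies (made-up numbers).
toy_freqs = {"A": 0.5, "C": 0.25, "G": 0.25}
toy_expectation = expected_occurrences(["A", "CG"], toy_freqs, 10)
```

    When self-overlap of the motif is unlikely, as the abstract notes for protein motifs, this plain expectation is already close to the expectation of non-overlapping matches.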

  15. Redox active motifs in selenoproteins

    PubMed Central

    Li, Fei; Lutz, Patricia B.; Pepelyayeva, Yuliya; Arnér, Elias S. J.; Bayse, Craig A.; Rozovsky, Sharon

    2014-01-01

    Selenoproteins use the rare amino acid selenocysteine (Sec) to act as the first line of defense against oxidants, which are linked to aging, cancer, and neurodegenerative diseases. Many selenoproteins are oxidoreductases in which the reactive Sec is connected to a neighboring Cys and able to form a ring. These Sec-containing redox motifs govern much of the reactivity of selenoproteins. To study their fundamental properties, we have used 77Se NMR spectroscopy in concert with theoretical calculations to determine the conformational preferences and mobility of representative motifs. This use of 77Se as a probe enables the direct recording of the properties of Sec as its environment is systematically changed. We find that all motifs have several ring conformations in their oxidized state. These ring structures are most likely stabilized by weak, nonbonding interactions between the selenium and the amide carbon. To examine how the presence of selenium and ring geometric strain governs the motifs’ reactivity, we measured the redox potentials of Sec-containing motifs and their corresponding Cys-only variants. The comparisons reveal that for C-terminal motifs the redox potentials increased between 20–25 mV when the selenenylsulfide bond was changed to a disulfide bond. Changes of similar magnitude arose when we varied ring size or the motifs’ flanking residues. This suggests that the presence of Sec is not tied to unusually low redox potentials. The unique roles of selenoproteins in human health and their chemical reactivities may therefore not necessarily be explained by lower redox potentials, as has often been claimed. PMID:24769567

  16. A short conserved motif in ALYREF directs cap- and EJC-dependent assembly of export complexes on spliced mRNAs

    PubMed Central

    Gromadzka, Agnieszka M.; Steckelberg, Anna-Lena; Singh, Kusum K.; Hofmann, Kay; Gehring, Niels H.

    2016-01-01

    The export of messenger RNAs (mRNAs) is the final of several nuclear posttranscriptional steps of gene expression. The formation of export-competent mRNPs involves the recruitment of export factors that are assumed to facilitate transport of the mature mRNAs. Using in vitro splicing assays, we show that a core set of export factors, including ALYREF, UAP56 and DDX39, readily associate with the spliced RNAs in an EJC (exon junction complex)- and cap-dependent manner. In order to elucidate how ALYREF and other export adaptors mediate mRNA export, we conducted a computational analysis and discovered four short, conserved, linear motifs present in RNA-binding proteins. We show that mutation in one of the new motifs (WxHD) in an unstructured region of ALYREF reduced RNA binding and abolished the interaction with eIF4A3 and CBP80. Additionally, the mutation impaired proper localization to nuclear speckles and export of a spliced reporter mRNA. Our results reveal important details of the orchestrated recruitment of export factors during the formation of export competent mRNPs. PMID:26773052

  17. Stochastic motif extraction using hidden Markov model

    SciTech Connect

    Fujiwara, Yukiko; Asogawa, Minoru; Konagaya, Akihiko

    1994-12-31

    In this paper, we study the application of an HMM (hidden Markov model) to the problem of representing protein sequences by a stochastic motif. A stochastic protein motif represents the small segments of protein sequences that have a certain function or structure. The stochastic motif, represented by an HMM, has conditional probabilities to deal with the stochastic nature of the motif. This HMM directly reflects the characteristics of the motif, such as a protein periodical structure or grouping. In order to obtain the optimal HMM, we developed the "iterative duplication method" for HMM topology learning. It starts from a small fully-connected network and iterates the network generation and parameter optimization until it achieves sufficient discrimination accuracy. Using this method, we obtained an HMM for a leucine zipper motif. Whereas a symbolic pattern representation reached a prediction accuracy of 14.8 percent, the HMM achieved 79.3 percent. Additionally, the method can obtain an HMM for various types of zinc finger motifs, and it might separate the mixed data. We demonstrated that this approach is applicable to the validation of the protein databases; a constructed HMM has indicated that one protein sequence annotated as "leucine-zipper like sequence" in the database is quite different from other leucine-zipper sequences in terms of likelihood, and we found this discrimination is plausible.
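
    The likelihood comparison mentioned at the end of the abstract rests on scoring a sequence under an HMM. A minimal sketch of that scoring step, using the standard forward algorithm, is given below. It does not reproduce the iterative duplication method or the trained leucine zipper model; the function name and the toy two-state parameters are assumptions made for illustration.

```python
def forward_likelihood(seq, states, start_p, trans_p, emit_p):
    """Probability of observing `seq` under a discrete HMM (forward algorithm).

    start_p[s]     : probability of starting in state s
    trans_p[r][s]  : probability of moving from state r to state s
    emit_p[s][x]   : probability that state s emits symbol x
    """
    if not seq:
        return 1.0
    # alpha[s] = P(symbols seen so far, current state = s)
    alpha = {s: start_p[s] * emit_p[s].get(seq[0], 0.0) for s in states}
    for symbol in seq[1:]:
        alpha = {
            s: emit_p[s].get(symbol, 0.0)
            * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical toy model: state "a" always emits L, state "b" always emits X,
# and the two states strictly alternate, so only L X L X ... is possible.
toy_states = ["a", "b"]
toy_start = {"a": 1.0, "b": 0.0}
toy_trans = {"a": {"a": 0.0, "b": 1.0}, "b": {"a": 1.0, "b": 0.0}}
toy_emit = {"a": {"L": 1.0}, "b": {"X": 1.0}}
```

    A sequence whose likelihood under the motif HMM is far below that of other annotated members would be flagged in the way the abstract describes; for long sequences a log-space or scaled variant would be needed to avoid underflow.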

  18. Lysine residues direct the chlorination of tyrosines in YXXK motifs of apolipoprotein A-I when hypochlorous acid oxidizes high density lipoprotein.

    PubMed

    Bergt, Constanze; Fu, Xiaoyun; Huq, Nabiha P; Kao, Jeff; Heinecke, Jay W

    2004-02-27

    Oxidized lipoproteins may play an important role in the pathogenesis of atherosclerosis. Elevated levels of 3-chlorotyrosine, a specific end product of the reaction between hypochlorous acid (HOCl) and tyrosine residues of proteins, have been detected in atherosclerotic tissue. Thus, HOCl generated by the phagocyte enzyme myeloperoxidase represents one pathway for protein oxidation in humans. One important target of the myeloperoxidase pathway may be high density lipoprotein (HDL), which mobilizes cholesterol from artery wall cells. To determine whether activated phagocytes preferentially chlorinate specific sites in HDL, we used tandem mass spectrometry (MS/MS) to analyze apolipoprotein A-I that had been oxidized by HOCl. The major site of chlorination was a single tyrosine residue located in one of the protein's YXXK motifs (where X represents a nonreactive amino acid). To investigate the mechanism of chlorination, we exposed synthetic peptides to HOCl. The peptides encompassed the amino acid sequences YKXXY, YXXKY, or YXXXY. MS/MS analysis demonstrated that chlorination of tyrosine in the peptides that contained lysine was regioselective and occurred in high yield if the substrate was KXXY or YXXK. NMR and MS analyses revealed that the N(epsilon) amino group of lysine was initially chlorinated, which suggests that chloramine formation is the first step in tyrosine chlorination. Molecular modeling of the YXXK motif in apolipoprotein A-I demonstrated that these tyrosine and lysine residues are adjacent on the same face of an amphipathic alpha-helix. Our observations suggest that HOCl selectively targets tyrosine residues that are suitably juxtaposed to primary amino groups in proteins. This mechanism might enable phagocytes to efficiently damage proteins when they destroy microbial proteins during infection or damage host tissue during inflammation. PMID:14660678

  19. Efficient exact motif discovery

    PubMed Central

    Marschall, Tobias; Rahmann, Sven

    2009-01-01

    Motivation: The motif discovery problem consists of finding over-represented patterns in a collection of biosequences. It is one of the classical sequence analysis problems, but still has not been satisfactorily solved in an exact and efficient manner. This is partly due to the large number of possibilities of defining the motif search space and the notion of over-representation. Even for well-defined formalizations, the problem is frequently solved in an ad hoc manner with heuristics that do not guarantee to find the best motif. Results: We show how to solve the motif discovery problem (almost) exactly on a practically relevant space of IUPAC generalized string patterns, using the p-value with respect to an i.i.d. model or a Markov model as the measure of over-representation. In particular, (i) we use a highly accurate compound Poisson approximation for the null distribution of the number of motif occurrences. We show how to compute the exact clump size distribution using a recently introduced device called probabilistic arithmetic automaton (PAA). (ii) We define two p-value scores for over-representation, the first one based on the total number of motif occurrences, the second one based on the number of sequences in a collection with at least one occurrence. (iii) We describe an algorithm to discover the optimal pattern with respect to either of the scores. The method exploits monotonicity properties of the compound Poisson approximation and is by orders of magnitude faster than exhaustive enumeration of IUPAC strings (11.8 h compared with an extrapolated runtime of 4.8 years). (iv) We justify the use of the proposed scores for motif discovery by showing our method to outperform other motif discovery algorithms (e.g. MEME, Weeder) on benchmark datasets. We also propose new motifs on Mycobacterium tuberculosis. Availability and Implementation: The method has been implemented in Java. It can be obtained from http://ls11-www
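
    The two scores in point (ii) are built on two raw counts: the total number of occurrences of an IUPAC pattern, and the number of sequences containing at least one occurrence. The sketch below computes only these counts by direct scanning; it does not implement the compound Poisson p-values, the probabilistic arithmetic automaton, or the search over patterns. The function names are illustrative, and the code table is the standard IUPAC nucleotide ambiguity alphabet.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_at(pattern, seq, start):
    """True if the IUPAC `pattern` matches `seq` beginning at index `start`."""
    return all(seq[start + j] in IUPAC[code] for j, code in enumerate(pattern))


def occurrence_counts(pattern, sequences):
    """Return (total occurrences, number of sequences with >= 1 occurrence).

    Overlapping occurrences are counted separately.
    """
    total = 0
    sequences_hit = 0
    for seq in sequences:
        # The range is empty when the sequence is shorter than the pattern.
        n = sum(
            matches_at(pattern, seq, i)
            for i in range(len(seq) - len(pattern) + 1)
        )
        total += n
        sequences_hit += 1 if n > 0 else 0
    return total, sequences_hit
```

    For example, `occurrence_counts("ANT", ["AATACT", "GGGG", "ATT"])` scans three toy sequences; turning such counts into p-values requires a null model, which is where the paper's approximation comes in.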

  20. Cross-disciplinary detection and analysis of network motifs.

    PubMed

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three and four nodes, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. Pearson correlation coefficient showed strong positive relationship between some undirected networks but inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein-protein interaction network, primary school contact network, Zachary's karate club network, and co-purchase of political books network can be classified into a superfamily. PMID:25983553
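
    For undirected networks there are only two connected three-node subgraphs, the triangle and the open triad (a path of two edges), so counting them is a compact illustration of the first step of motif detection. The sketch below only counts subgraphs; deciding whether a subgraph is a motif also requires comparison against randomized networks, which is not shown. The function name and the edge-list input format are assumptions.

```python
from itertools import combinations


def count_three_node_subgraphs(edges):
    """Count triangles and open triads in an undirected simple graph.

    `edges` is an iterable of (u, v) pairs; self-loops are ignored and
    duplicate edges are merged. Returns (triangles, open_triads).
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    triangle_corners = 0
    open_triads = 0
    for center, neighbours in adjacency.items():
        # Every pair of neighbours of `center` forms a connected triple.
        for a, b in combinations(neighbours, 2):
            if b in adjacency[a]:
                triangle_corners += 1  # closed: seen once from each corner
            else:
                open_triads += 1       # open: seen only from its center
    return triangle_corners // 3, open_triads
```

    On the toy edge list `[(1, 2), (2, 3), (1, 3), (3, 4)]` this finds one triangle (1, 2, 3) and two open triads centred on node 3.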

  1. Cross-Disciplinary Detection and Analysis of Network Motifs

    PubMed Central

    Tran, Ngoc Tam L; DeLuccia, Luke; McDonald, Aidan F; Huang, Chun-Hsi

    2015-01-01

    The detection of network motifs has recently become an important part of network analysis across all disciplines. In this work, we detected and analyzed network motifs from undirected and directed networks of several different disciplines, including biological network, social network, ecological network, as well as other networks such as airlines, power grid, and co-purchase of political books networks. Our analysis revealed that undirected networks are similar at the basic three and four nodes, while the analysis of directed networks revealed the distinction between networks of different disciplines. The study showed that larger motifs contained the three-node motif as a subgraph. Topological analysis revealed that similar networks have similar small motifs, but as the motif size increases, differences arise. Pearson correlation coefficient showed strong positive relationship between some undirected networks but inverse relationship between some directed networks. The study suggests that the three-node motif is a building block of larger motifs. It also suggests that undirected networks share similar low-level structures. Moreover, similar networks share similar small motifs, but larger motifs define the unique structure of individuals. Pearson correlation coefficient suggests that protein structure networks, dolphin social network, and co-authorships in network science belong to a superfamily. In addition, yeast protein–protein interaction network, primary school contact network, Zachary’s karate club network, and co-purchase of political books network can be classified into a superfamily. PMID:25983553

  2. No tradeoff between versatility and robustness in gene circuit motifs

    NASA Astrophysics Data System (ADS)

    Payne, Joshua L.

    2016-05-01

    Circuit motifs are small directed subgraphs that appear in real-world networks significantly more often than in randomized networks. In the Boolean model of gene circuits, most motifs are realized by multiple circuit genotypes. Each of a motif's constituent circuit genotypes may have one or more functions, which are embodied in the expression patterns the circuit forms in response to specific initial conditions. Recent enumeration of a space of nearly 17 million three-gene circuit genotypes revealed that all circuit motifs have more than one function, with the number of functions per motif ranging from 12 to nearly 30,000. This indicates that some motifs are more functionally versatile than others. However, the individual circuit genotypes that constitute each motif are less robust to mutation if they have many functions, hinting that functionally versatile motifs may be less robust to mutation than motifs with few functions. Here, I explore the relationship between versatility and robustness in circuit motifs, demonstrating that functionally versatile motifs are robust to mutation despite the inherent tradeoff between versatility and robustness at the level of an individual circuit genotype.

  3. Mining protein sequences for motifs.

    PubMed

    Narasimhan, Giri; Bu, Changsong; Gao, Yuan; Wang, Xuning; Xu, Ning; Mathee, Kalai

    2002-01-01

    We use methods from Data Mining and Knowledge Discovery to design an algorithm for detecting motifs in protein sequences. The algorithm assumes that a motif is constituted by the presence of a "good" combination of residues in appropriate locations of the motif. The algorithm attempts to compile such good combinations into a "pattern dictionary" by processing an aligned training set of protein sequences. The dictionary is subsequently used to detect motifs in new protein sequences. Statistical significance of the detection results is ensured by statistically determining the various parameters of the algorithm. Based on this approach, we have implemented a program called GYM. The Helix-Turn-Helix motif was used as a model system on which to test our program. The program was also extended to detect Homeodomain motifs. The detection results for the two motifs compare favorably with existing programs. In addition, the GYM program provides a lot of useful information about a given protein sequence. PMID:12487759

  4. The Transcriptional Complex Between the BCL2 i-Motif and hnRNP LL Is a Molecular Switch for Control of Gene Expression That Can Be Modulated by Small Molecules

    PubMed Central

    2015-01-01

    In a companion paper (DOI: 10.1021/ja410934b) we demonstrate that the C-rich strand of the cis-regulatory element in the BCL2 promoter element is highly dynamic in nature and can form either an i-motif or a flexible hairpin. Under physiological conditions these two secondary DNA structures are found in an equilibrium mixture, which can be shifted by the addition of small molecules that trap out either the i-motif (IMC-48) or the flexible hairpin (IMC-76). In cellular experiments we demonstrate that the addition of these molecules has opposite effects on BCL2 gene expression and furthermore that these effects are antagonistic. In this contribution we have identified a transcriptional factor that recognizes and binds to the BCL2 i-motif to activate transcription. The molecular basis for the recognition of the i-motif by hnRNP LL is determined, and we demonstrate that the protein unfolds the i-motif structure to form a stable single-stranded complex. In subsequent experiments we show that IMC-48 and IMC-76 have opposite, antagonistic effects on the formation of the hnRNP LL–i-motif complex as well as on the transcription factor occupancy at the BCL2 promoter. For the first time we propose that the i-motif acts as a molecular switch that controls gene expression and that small molecules that target the dynamic equilibrium of the i-motif and the flexible hairpin can differentially modulate gene expression. PMID:24559432

  5. Structural alphabet motif discovery and a structural motif database.

    PubMed

    Ku, Shih-Yen; Hu, Yuh-Jyh

    2012-01-01

    This study proposes a general framework for structural motif discovery. The framework is based on a modular design in which the system components can be modified or replaced independently to increase its applicability to various studies. It is a two-stage approach that first converts protein 3D structures into structural alphabet sequences, and then applies a sequence motif-finding tool to these sequences to detect conserved motifs. We named the structural motif database we built the SA-Motifbase, which provides the structural information conserved at different hierarchical levels in SCOP. For each motif, SA-Motifbase presents its 3D view; alphabet letter preference; alphabet letter frequency distribution; and the significance. SA-Motifbase is available at http://bioinfo.cis.nctu.edu.tw/samotifbase/. PMID:22099701

  6. Structural and Mechanistic Analysis of Trichodiene Synthase Using Site-Directed Mutagenesis: Probing the Catalytic Function of Tyrosine-295 and the Asparagine-225/Serine-229/Glutamate-233-Mg2+ B Motif

    SciTech Connect

    Vedula, L.; Jiang, J.; Zakharian, T.; Cane, D.; Christianson, D.

    2008-01-01

    Trichodiene synthase from Fusarium sporotrichioides contains two metal ion-binding motifs required for the cyclization of farnesyl diphosphate: the 'aspartate-rich' motif D100DXX(D/E) that coordinates to Mg2+A and Mg2+C, and the 'NSE/DTE' motif N225DXXSXXXE that chelates Mg2+B (boldface indicates metal ion ligands). Here, we report steady-state kinetic parameters, product array analyses, and X-ray crystal structures of trichodiene synthase mutants in which the fungal NSE motif is progressively converted into a plant-like DDXXTXXXE motif, resulting in a degradation in both steady-state kinetic parameters and product specificity. Each catalytically active mutant generates a different distribution of sesquiterpene products, and three newly detected sesquiterpenes are identified. In addition, the kinetic and structural properties of the Y295F mutant of trichodiene synthase were found to be similar to those of the wild-type enzyme, thereby ruling out a proposed role for Y295 in catalysis.

  7. A G-Box-Like Motif Is Necessary for Transcriptional Regulation by Circadian Pseudo-Response Regulators in Arabidopsis

    PubMed Central

    Newton, Linsey; Liu, Ming-Jung

    2016-01-01

    PSEUDO-RESPONSE REGULATORs (PRRs) play overlapping and distinct roles in maintaining circadian rhythms and regulating diverse biological processes, including the photoperiodic control of flowering, growth, and abiotic stress responses. PRRs act as transcriptional repressors and associate with chromatin via their conserved C-terminal CCT (CONSTANS, CONSTANS-like, and TIMING OF CAB EXPRESSION 1 [TOC1/PRR1]) domains by a still-poorly understood mechanism. Here, we identified genome-wide targets of PRR9 using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) and compared them with PRR7, PRR5, and TOC1/PRR1 ChIP-seq data. We found that PRR binding sites are located within genomic regions of low nucleosome occupancy and high DNase I hypersensitivity. Moreover, conserved noncoding regions among Brassicaceae species are enriched around PRR binding sites, indicating that PRRs associate with functionally relevant cis-regulatory regions. The PRRs shared a significant number of binding regions, and our results indicate that they coordinately restrict the expression of target genes to around dawn. A G-box-like motif was overrepresented at PRR binding regions, and we showed that this motif is necessary for mediating transcriptional regulation of CIRCADIAN CLOCK ASSOCIATED 1 and PRR9 by the PRRs. Our results further our understanding of how PRRs target specific promoters and provide an extensive resource for studying circadian regulatory networks in plants. PMID:26586835

  8. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755
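
    The unidirectional loop that the abstract labels M9 is a directed three-cycle A -> B -> C -> A with no reciprocated edge among the three nodes. A minimal sketch of counting that one pattern in a directed edge list follows. It covers neither the statistical filtering used to build the dependence networks nor the comparison with a null model needed to call a motif under- or overrepresented; the function name and input format are assumptions.

```python
def count_unidirectional_loops(edges):
    """Count directed 3-cycles u -> v -> w -> u that contain no mutual edge.

    `edges` is an iterable of (source, target) pairs; self-loops are ignored.
    """
    edge_set = {(u, v) for u, v in edges if u != v}
    successors = {}
    for u, v in edge_set:
        successors.setdefault(u, set()).add(v)

    rotations = 0
    for u, v in edge_set:
        if (v, u) in edge_set:
            continue  # u <-> v is reciprocated, so not a pure loop
        for w in successors.get(v, ()):
            if (w, v) in edge_set:
                continue  # v <-> w is reciprocated
            if (w, u) in edge_set and (u, w) not in edge_set:
                rotations += 1
    # Each loop is found once from each of its three edges.
    return rotations // 3
```

    On the toy input `[(1, 2), (2, 3), (3, 1)]` the count is one; adding the reverse edge `(2, 1)` turns the triad into a different motif class and the count drops to zero.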

  9. Network motifs emerge from interconnections that favour stability

    NASA Astrophysics Data System (ADS)

    Angulo, Marco Tulio; Liu, Yang-Yu; Slotine, Jean-Jacques

    2015-10-01

    The microscopic principles organizing dynamic units in complex networks, from proteins to power generators, can be understood in terms of network 'motifs': small interconnection patterns that appear much more frequently in real networks than expected in random networks. When considered as small subgraphs isolated from a large network, these motifs are more robust to parameter variations, easier to synchronize than other possible subgraphs, and can provide specific functionalities. But one can isolate these subgraphs only by assuming, for example, a significant separation of timescales, and the origin of network motifs and their functionalities when embedded in larger networks remain unclear. Here we show that most motifs emerge from interconnection patterns that best exploit the intrinsic stability characteristics at different scales of interconnection, from simple nodes to whole modules. This functionality suggests an efficient mechanism to stably build complex systems by recursively interconnecting nodes and modules as motifs. We present direct evidence of this mechanism in several biological networks.

  10. The Annotation of RNA Motifs

    PubMed Central

    2002-01-01

    The recent deluge of new RNA structures, including complete atomic-resolution views of both subunits of the ribosome, has on the one hand literally overwhelmed our individual abilities to comprehend the diversity of RNA structure, and on the other hand presented us with new opportunities for comprehensive use of RNA sequences for comparative genetic, evolutionary and phylogenetic studies. Two concepts are key to understanding RNA structure: hierarchical organization of global structure and isostericity of local interactions. Global structure changes extremely slowly, as it relies on conserved long-range tertiary interactions. Tertiary RNA–RNA and quaternary RNA–protein interactions are mediated by RNA motifs, defined as recurrent and ordered arrays of non-Watson–Crick base-pairs. A single RNA motif comprises a family of sequences, all of which can fold into the same three-dimensional structure and can mediate the same interaction(s). The chemistry and geometry of base pairing constrain the evolution of motifs in such a way that random mutations that occur within motifs are accepted or rejected insofar as they can mediate a similar ordered array of interactions. The steps involved in the analysis and annotation of RNA motifs in 3D structures are: (a) decomposition of each motif into non-Watson–Crick base-pairs; (b) geometric classification of each basepair; (c) identification of isosteric substitutions for each basepair by comparison to isostericity matrices; (d) alignment of homologous sequences using the isostericity matrices to identify corresponding positions in the crystal structure; (e) acceptance or rejection of the null hypothesis that the motif is conserved. PMID:18629252

  11. [Prediction of Promoter Motifs in Virophages].

    PubMed

    Gong, Chaowen; Zhou, Xuewen; Pan, Yingjie; Wang, Yongjie

    2015-07-01

    Virophages have crucial roles in ecosystems and are the transport vectors of genetic materials. To shed light on regulation and control mechanisms in virophage-host systems as well as evolution between virophages and their hosts, the promoter motifs of virophages were predicted on the upstream regions of start codons using an analytical tool for prediction of promoter motifs: Multiple EM for Motif Elicitation. Seventeen potential promoter motifs were identified based on the E-value, location, number and length of promoters in genomes. Sputnik and zamilon motif 2 with AT-rich regions were distributed widely on genomes, suggesting that these motifs may be associated with regulation of the expression of various genes. Motifs containing the TCTA box were predicted to be late promoter motifs in mavirus; motifs containing the ATCT box were the potential late promoter motifs in the Ace Lake mavirus. AT-rich regions were identified on motif 2 in the Organic Lake virophage, motif 3 in Yellowstone Lake virophage (YSLV)1 and 2, motif 1 in YSLV3, and motifs 1 and 2 in YSLV4, respectively. AT-rich regions were distributed widely on the genomes of virophages. All of these motifs may be promoter motifs of virophages. Our results provide insights into further exploration of temporal expression of genes in virophages as well as associations between virophages and giant viruses. PMID:26524912

  12. Sequential visibility-graph motifs

    NASA Astrophysics Data System (ADS)

    Iacovacci, Jacopo; Lacasa, Lucas

    2016-04-01

    Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
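
    As a rough illustration of the idea in this abstract, the sketch below builds the natural visibility relation for a time series and tallies the subgraph induced by each window of n consecutive points. The function names (visible, motif_profile) and the edge-set labelling of motifs are assumptions made for this example, not the authors' notation, and the brute-force approach stands in for the exact theory the paper develops.

```python
from collections import Counter
from itertools import combinations


def visible(series, i, j):
    """Natural visibility criterion: points i < j see each other if every
    intermediate point lies strictly below the straight line joining them."""
    xi, xj = series[i], series[j]
    return all(
        series[k] < xi + (xj - xi) * (k - i) / (j - i)
        for k in range(i + 1, j)
    )


def motif_profile(series, n=4):
    """Relative frequency of each sequential n-node motif.

    A motif is labelled by the set of visibility edges among n consecutive
    points, with node indices taken relative to the window start."""
    counts = Counter()
    windows = len(series) - n + 1
    for start in range(windows):
        edges = frozenset(
            (a, b)
            for a, b in combinations(range(n), 2)
            if visible(series, start + a, start + b)
        )
        counts[edges] += 1
    return {motif: c / windows for motif, c in counts.items()}


if __name__ == "__main__":
    # Illustrative series only; not data from the paper.
    for motif, freq in motif_profile([3, 1, 4, 1, 5, 9, 2, 6], n=4).items():
        print(sorted(motif), round(freq, 3))
```

    Visibility between two points depends only on the points between them, so each window's subgraph can be computed from the window alone.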

  13. From Cis-Regulatory Elements to Complex RNPs and Back

    PubMed Central

    Gebauer, Fátima; Preiss, Thomas; Hentze, Matthias W.

    2012-01-01

    Messenger RNAs (mRNAs), the templates for translation, have evolved to harbor abundant cis-acting sequences that affect their posttranscriptional fates. These elements are frequently located in the untranslated regions and serve as binding sites for trans-acting factors, RNA-binding proteins, and/or small non-coding RNAs. This article provides a systematic synopsis of cis-acting elements, trans-acting factors, and the mechanisms by which they affect translation. It also highlights recent technical advances that have ushered in the era of transcriptome-wide studies of the ribonucleoprotein complexes formed by mRNAs and their trans-acting factors. PMID:22751153

  14. Combinatorial Information Theoretical Measurement of the Semantic Significance of Semantic Graph Motifs

    SciTech Connect

    Joslyn, Cliff A.; al-Saffar, Sinan; Haglin, David J.; Holder, Larry

    2011-06-14

    Given an arbitrary semantic graph data set, perhaps one lacking in explicit ontological information, we wish to first identify its significant semantic structures, and then measure the extent of their significance. Casting a semantic graph dataset as an edge-labeled, directed graph, this task can be built on the ability to mine frequent labeled subgraphs in edge-labeled, directed graphs. We begin by considering the fundamentals of the enumerative combinatorics of subgraph motif structures in edge-labeled directed graphs. We identify its frequent labeled, directed subgraph motif patterns, and measure the significance of the resulting motifs by the information gain relative to the expected value of the motif based on the empirical frequency distribution of the link types which compose them, assuming independence. We illustrate the method on a small test graph, and discuss results obtained for small linear motifs (link type bigrams and trigrams) in a larger graph structure.
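
    One plausible reading of the significance measure for link type bigrams can be sketched as follows: compare the observed share of each two-edge path with the share expected if the two link types were drawn independently from their empirical frequencies. The function name, the log2 ratio as the "information gain", and the toy triples are all assumptions for illustration; the report may define the measure differently.

```python
import math
from collections import Counter, defaultdict


def bigram_information_gain(edges):
    """edges: iterable of (source, label, target) triples.

    Returns {(label1, label2): log2(observed / expected)} for every observed
    two-edge path source -label1-> mid -label2-> target, where `expected`
    assumes the two link types occur independently."""
    edges = list(edges)
    label_counts = Counter(label for _, label, _ in edges)
    total_edges = len(edges)

    # Labels of the edges leaving each node.
    out_labels = defaultdict(list)
    for src, label, _ in edges:
        out_labels[src].append(label)

    # Count every directed two-edge path by its pair of link types.
    bigrams = Counter()
    for _, first, mid in edges:
        for second in out_labels[mid]:
            bigrams[(first, second)] += 1
    total_bigrams = sum(bigrams.values())

    gains = {}
    for (a, b), count in bigrams.items():
        observed = count / total_bigrams
        expected = (label_counts[a] / total_edges) * (label_counts[b] / total_edges)
        gains[(a, b)] = math.log2(observed / expected)
    return gains


if __name__ == "__main__":
    # Hypothetical toy graph, not the test graph from the report.
    toy = [
        ("a", "knows", "b"), ("b", "worksAt", "c"),
        ("d", "knows", "e"), ("e", "worksAt", "f"),
    ]
    print(bigram_information_gain(toy))
```

    A positive value means the bigram occurs more often than independence of its link types would predict.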

  15. A comprehensive analysis of the La-motif protein superfamily

    PubMed Central

    Bousquet-Antonelli, Cécile; Deragon, Jean-Marc

    2009-01-01

    The extremely well-conserved La motif (LAM), in synergy with the immediately following RNA recognition motif (RRM), allows direct binding of the (genuine) La autoantigen to RNA polymerase III primary transcripts. This motif is not only found on La homologs, but also on La-related proteins (LARPs) of unrelated function. LARPs are widely found amongst eukaryotes and, although poorly characterized, appear to be RNA-binding proteins fulfilling crucial cellular functions. We searched the fully sequenced genomes of 83 eukaryotic species scattered along the tree of life for the presence of LAM-containing proteins. We observed that these proteins are absent from archaea and present in all eukaryotes (except protists from the Plasmodium genus), strongly suggesting that the LAM is an ancestral motif that emerged early after the archaea-eukarya radiation. A complete evolutionary and structural analysis of these proteins resulted in their classification into five families: the genuine La homologs and four LARP families. Unexpectedly, in each family a conserved domain representing either a classical RRM or an RRM-like motif immediately follows the LAM of most proteins. An evolutionary analysis of the LAM-RRM/RRM-L regions shows that these motifs co-evolved and should be used as a single entity to define the functional region of interaction of LARPs with their substrates. We also found two extremely well conserved motifs, named LSA and DM15, shared by LARP6 and LARP1 family members, respectively. We suggest that members of the same family are functional homologs and/or share a common molecular mode of action on different RNA baits. PMID:19299548

  16. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, P.; Ciszak, E.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits and two catalytic centers. Each catalytic center (PP:PYR) is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core (PP:PYR)(sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30)NN in the PP-domain, and the EX(sub 4)(G)PhiXXGPhi in the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  17. The Thiamin Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Dominiak, Paulina M.; Ciszak, Ewa M.

    2003-01-01

    Using databases the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR](sub 2) within these enzymes. Analysis of the structural elements of this catalytic core reveals novel definition of the common amino acid sequences, which are GXPhiX(sub 4)(G)PhiXXGQ and GDGX(sub 25-30) within the PP-domain, and the EX(sub 4)(G)PhiXXGPhi within the PYR-domain, where Phi corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.

  18. Comprehensive discovery of DNA motifs in 349 human cells and tissues reveals new features of motifs

    PubMed Central

    Zheng, Yiyu; Li, Xiaoman; Hu, Haiyan

    2015-01-01

    Comprehensive motif discovery under experimental conditions is critical for the global understanding of gene regulation. To generate a nearly complete list of human DNA motifs under given conditions, we employed a novel approach to de novo discover significant co-occurring DNA motifs in 349 human DNase I hypersensitive site datasets. We predicted 845 to 1325 motifs in each dataset, for a total of 2684 non-redundant motifs. These 2684 motifs contained 54.02 to 75.95% of the known motifs in seven large collections including TRANSFAC. In each dataset, we also discovered 43 663 to 2 013 288 motif modules, groups of motifs with their binding sites co-occurring in a significant number of short DNA regions. Compared with known interacting transcription factors in eight resources, the predicted motif modules on average included 84.23% of known interacting motifs. We further showed new features of the predicted motifs, such as motifs enriched in proximal regions rarely overlapped with motifs enriched in distal regions, motifs enriched in 5′ distal regions were often enriched in 3′ distal regions, etc. Finally, we observed that the 2684 predicted motifs classified the cell or tissue types of the datasets with an accuracy of 81.29%. The resources generated in this study are available at http://server.cs.ucf.edu/predrem/. PMID:25505144

  19. Detecting correlations among functional-sequence motifs

    NASA Astrophysics Data System (ADS)

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features.

  20. Detecting correlations among functional-sequence motifs.

    PubMed

    Pirino, Davide; Rigosa, Jacopo; Ledda, Alice; Ferretti, Luca

    2012-06-01

    Sequence motifs are words of nucleotides in DNA with biological functions, e.g., gene regulation. Identification of such words proceeds through rejection of Markov models on the expected motif frequency along the genome. Additional biological information can be extracted from the correlation structure among patterns of motif occurrences. In this paper a log-linear multivariate intensity Poisson model is estimated via expectation maximization on a set of motifs along the genome of E. coli K12. The proposed approach allows for excitatory as well as inhibitory interactions among motifs and between motifs and other genomic features like gene occurrences. Our findings confirm previous stylized facts about such types of interactions and shed new light on genome-maintenance functions of some particular motifs. We expect these methods to be applicable to a wider set of genomic features. PMID:23005179

  1. A survey of DNA motif finding algorithms

    PubMed Central

    Das, Modan K; Dai, Ho-Kwok

    2007-01-01

    Background Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments that are called motifs. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif finding. As a result, a large number of motif finding algorithms have been implemented and applied to various motif models over the past decade. This survey reviews the latest developments in DNA motif finding algorithms. Results Earlier algorithms use promoter sequences of coregulated genes from single genome and search for statistically overrepresented motifs. Recent algorithms are designed to use phylogenetic footprinting or orthologous sequences and also an integrated approach where promoter sequences of coregulated genes and phylogenetic footprinting are used. All the algorithms studied have been reported to correctly detect the motifs that have been previously detected by laboratory experimental approaches, and some algorithms were able to find novel motifs. However, most of these motif finding algorithms have been shown to work successfully in yeast and other lower organisms, but perform significantly worse in higher organisms. Conclusion Despite considerable efforts to date, DNA motif finding remains a complex challenge for biologists and computer scientists. Researchers have taken many different approaches in developing motif discovery tools and the progress made in this area of research is very encouraging. Performance comparison of different motif finding tools and identification of the best tools have proven to be a difficult task because tools are designed based on algorithms and motif models that are diverse and complex and our incomplete understanding of

  2. D-MATRIX: A web tool for constructing weight matrix of conserved DNA motifs

    PubMed Central

    Sen, Naresh; Mishra, Manoj; Khan, Feroz; Meena, Abha; Sharma, Ashok

    2009-01-01

    Despite considerable efforts to date, DNA motif prediction in whole genomes remains a challenge for researchers. Current genome-wide motif prediction tools require either a direct pattern sequence (for a single motif) or a weight matrix (for multiple motifs). Although there are known motif pattern databases and tools for genome-level prediction, there is no tool for weight matrix construction. Considering this, we developed the D-MATRIX tool, which predicts the different types of weight matrix based on a user-defined aligned motif sequence set and motif width. For retrieval of known motif sequences, the user can access commonly used databases such as TFD, RegulonDB, DBTBS and Transfac. The D-MATRIX program uses a simple statistical approach for weight matrix construction, and the matrix can be converted into different file formats according to user requirements. It provides the possibility to identify conserved motifs in co-regulated genes or a whole genome. As an example, we successfully constructed the weight matrix of the LexA transcription factor binding site with the help of known SOS-box cis-regulatory elements in the Deinococcus radiodurans genome. The algorithm is implemented in C-Sharp and wrapped in ASP.Net to maintain a user-friendly web interface. The D-MATRIX tool is accessible through the CIMAP domain network. Availability http://203.190.147.116/dmatrix/ PMID:19759861
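
    For readers unfamiliar with weight matrices, the sketch below shows one common way to derive a count matrix and a log-odds matrix from aligned sites of equal width. It is a generic textbook construction, not D-MATRIX's own algorithm; the pseudocount, the uniform background of 0.25, the function names and the example sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def count_matrix(sites):
    """Per-position base counts for aligned DNA sites of equal width."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("aligned sites must share one width")
    counts = {b: [0] * width for b in BASES}
    for site in sites:
        for pos, base in enumerate(site.upper()):
            counts[base][pos] += 1  # raises KeyError on non-ACGT symbols
    return counts


def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """log2 of (smoothed base frequency / background) at each position."""
    counts = count_matrix(sites)
    n = len(sites)
    return {
        b: [
            math.log2((c + pseudocount) / (n + 4 * pseudocount) / background)
            for c in column
        ]
        for b, column in counts.items()
    }


def score(matrix, sequence):
    """Sum of matrix entries along a candidate site of the motif width."""
    return sum(matrix[base][pos] for pos, base in enumerate(sequence.upper()))


if __name__ == "__main__":
    # Hypothetical aligned sites, not real LexA binding sites.
    sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
    pwm = log_odds_matrix(sites)
    print(score(pwm, "ACGT"), score(pwm, "TTTT"))
```

    Sliding score over a genome and keeping windows above a threshold is the usual way such a matrix is then used for site prediction.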

  3. Comparative genomic analysis of upstream miRNA regulatory motifs in Caenorhabditis.

    PubMed

    Jovelin, Richard; Krizus, Aldis; Taghizada, Bakhtiyar; Gray, Jeremy C; Phillips, Patrick C; Claycomb, Julie M; Cutter, Asher D

    2016-07-01

    MicroRNAs (miRNAs) comprise a class of short noncoding RNA molecules that play diverse developmental and physiological roles by controlling mRNA abundance and protein output of the vast majority of transcripts. Despite the importance of miRNAs in regulating gene function, we still lack a complete understanding of how miRNAs themselves are transcriptionally regulated. To fill this gap, we predicted regulatory sequences by searching for abundant short motifs located upstream of miRNAs in eight species of Caenorhabditis nematodes. We identified three conserved motifs across the Caenorhabditis phylogeny that show clear signatures of purifying selection from comparative genomics, patterns of nucleotide changes in motifs of orthologous miRNAs, and correlation between motif incidence and miRNA expression. We then validated our predictions with transgenic green fluorescent protein reporters and site-directed mutagenesis for a subset of motifs located in an enhancer region upstream of let-7. We demonstrate that a CT-dinucleotide motif is sufficient for proper expression of GFP in the seam cells of adult C. elegans, and that two other motifs play incremental roles in combination with the CT-rich motif. Thus, functional tests of sequence motifs identified through analysis of molecular evolutionary signatures provide a powerful path for efficiently characterizing the transcriptional regulation of miRNA genes. PMID:27140965

  4. The Thiamine-Pyrophosphate-Motif

    NASA Technical Reports Server (NTRS)

    Ciszak, Ewa; Dominiak, Paulina

    2004-01-01

    Thiamin pyrophosphate (TPP), a derivative of vitamin B1, is a cofactor for enzymes performing catalysis in pathways of energy production including the well known decarboxylation of α-keto acid dehydrogenases followed by transketolation. TPP-dependent enzymes constitute a structurally and functionally diverse group exhibiting multimeric subunit organization, multiple domains and two chemically equivalent catalytic centers. Annotation of functional TPP-dependent enzymes, therefore, has not been trivial due to low sequence similarity related to this complex organization. Our approach to analysis of structures of known TPP-dependent enzymes reveals for the first time features common to this group, which we have termed the TPP-motif. The TPP-motif consists of specific spatial arrangements of structural elements and their specific contacts to provide for a flip-flop, or alternate site, enzymatic mechanism of action. Analysis of structural elements entrained in the flip-flop action displayed by TPP-dependent enzymes reveals a novel definition of the common amino acid sequences. These sequences allow for annotation of TPP-dependent enzymes, thus advancing functional proteomics. Further details of three-dimensional structures of TPP-dependent enzymes will be discussed.

  5. Synthetic biology with RNA motifs.

    PubMed

    Saito, Hirohide; Inoue, Tan

    2009-02-01

    Structural motifs in naturally occurring RNAs and RNPs can be employed as new molecular parts for synthetic biology to facilitate the development of novel devices and systems that modulate cellular functions. In this review, we focus on the following: (i) experimental evolution techniques of RNA molecules in vitro and (ii) their applications for regulating gene expression systems in vivo. For experimental evolution, new artificial RNA aptamers and RNA enzymes (ribozymes) have been selected in vitro. These functional RNA molecules are likely to be applicable in the reprogramming of existing gene regulatory systems. Furthermore, they may be used for designing hypothetical RNA-based living systems in the so-called RNA world. For the regulation of gene expressions in living cells, the development of new riboswitches allows us to modulate the target gene expression in a tailor-made manner. Moreover, recently RNA-based synthetic genetic circuits have been reported by employing functional RNA molecules, expanding the repertory of synthetic biology with RNA motifs. PMID:18775792

  6. DILIMOT: discovery of linear motifs in proteins.

    PubMed

    Neduva, Victor; Russell, Robert B

    2006-07-01

    Discovery of protein functional motifs is critical in modern biology. Small segments of 3-10 residues play critical roles in protein interactions, post-translational modifications and trafficking. DILIMOT (DIscovery of LInear MOTifs) is a server for the prediction of these short linear motifs within a set of proteins. Given a set of sequences sharing a common functional feature (e.g. interaction partner or localization) the method finds statistically over-represented motifs likely to be responsible for it. The input sequences are first passed through a set of filters to remove regions unlikely to contain instances of linear motifs. Motifs are then found in the remaining sequence and ranked according to a statistic that measures over-representation and conservation across homologues in related species. The results are displayed via a visual interface for easy perusal. The server is available at http://dilimot.embl.de. PMID:16845024

  7. Bridge and brick motifs in complex networks

    NASA Astrophysics Data System (ADS)

    Huang, Chung-Yuan; Sun, Chuen-Tsai; Cheng, Chia-Ying; Hsieh, Ji-Lung

    2007-04-01

    Acknowledging the expanding role of complex networks in numerous scientific contexts, we examine significant functional and topological differences between bridge and brick motifs for predicting network behaviors and functions. After observing similarities between social networks and their genetic, ecological, and engineering counterparts, we identify a larger number of brick motifs in social networks and bridge motifs in the other three types. We conclude that bridge and brick motif content analysis can assist researchers in understanding the small-world and clustering properties of network structures when investigating network functions and behaviors.

  8. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    PubMed

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-01-01

    ChIP-Seq (chromatin immunoprecipitation sequencing) has provided an advantage for finding motifs, as ChIP-Seq experiments narrow down the motif finding to binding site locations. Recent motif finding tools facilitate motif detection by providing user-friendly Web interfaces. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommended that users use multiple motif finding Web tools that implement different algorithms for obtaining significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided our suggestions for future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data. PMID:24555784

  9. Sampling Motif-Constrained Ensembles of Networks

    NASA Astrophysics Data System (ADS)

    Fischer, Rico; Leitão, Jorge C.; Peixoto, Tiago P.; Altmann, Eduardo G.

    2015-10-01

    The statistical significance of network properties is conditioned on null models which satisfy specified properties but that are otherwise random. Exponential random graph models are a principled theoretical framework to generate such constrained ensembles, but they often fail in practice, either due to model inconsistency or due to the impossibility to sample networks from them. These problems affect the important case of networks with prescribed clustering coefficient or number of small connected subgraphs (motifs). In this Letter we use the Wang-Landau method to obtain a multicanonical sampling that overcomes both these problems. We sample, in polynomial time, networks with arbitrary degree sequences from ensembles with imposed motif counts. Applying this method to social networks, we investigate the relation between transitivity and homophily, and we quantify the correlation between different types of motifs, finding that single motifs can explain up to 60% of the variation of motif profiles.

  10. Form and function in gene regulatory networks: the structure of network motifs determines fundamental properties of their dynamical state space

    PubMed Central

    Ahnert, S. E.; Fink, T. M. A.

    2016-01-01

    Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the ‘function’ of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature. PMID:27440255
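
    The two state-space measures named in this abstract can be sketched for a small synchronous Boolean network as below. Basin entropy is taken here as the Shannon entropy (in bits) of the basin-size distribution over attractors, which is the usual definition but is an assumption about the paper's exact convention; the function names and the two example update rules are likewise hypothetical choices, not motifs or rules taken from the study.

```python
import math
from itertools import product


def attractor_of(state, update):
    """Follow synchronous updates until a state repeats; return the cycle
    as a frozenset of states (a canonical label for the attractor)."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = update(state)
    return frozenset(trajectory[seen[state]:])


def basin_entropy_and_cycles(update, n_nodes=3):
    """Return (basin entropy in bits, sorted distinct attractor lengths)."""
    basins = {}
    for state in product((0, 1), repeat=n_nodes):
        attractor = attractor_of(state, update)
        basins[attractor] = basins.get(attractor, 0) + 1
    total = 2 ** n_nodes
    entropy = -sum(
        (size / total) * math.log2(size / total) for size in basins.values()
    )
    return entropy, sorted({len(a) for a in basins})


def feed_forward_loop(state):
    # Hypothetical rule choice: A holds its value, B copies A, C = A AND B.
    a, b, _ = state
    return (a, a, a & b)


def negative_feedback_ring(state):
    # Hypothetical rule choice: A = NOT C, B copies A, C copies B.
    a, b, c = state
    return (1 - c, a, b)


if __name__ == "__main__":
    print(basin_entropy_and_cycles(feed_forward_loop))
    print(basin_entropy_and_cycles(negative_feedback_ring))
```

    Exhaustive enumeration is feasible here because a three-node network has only eight states; larger networks would need sampling.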